UBC Theses and Dissertations

Enzymology of the xyloglucan utilization system in the soil saprophyte Cellvibrio japonicus
Attia, Mohamed Awad AbdElKader, 2018

ENZYMOLOGY OF THE XYLOGLUCAN UTILIZATION SYSTEM IN THE SOIL SAPROPHYTE CELLVIBRIO JAPONICUS

by

Mohamed Awad Abdelkader Attia

M.Sc., University of Calgary, 2012

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Chemistry)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

February 2018

© Mohamed Awad Abdelkader Attia, 2018

Abstract

Xyloglucan (XyG) is a ubiquitous plant heteropolysaccharide representing up to one-quarter of the total carbohydrate content of terrestrial plant cell walls. Given its structural complexity, XyG requires a consortium of backbone-cleaving endo-xyloglucanases and sidechain-cleaving exo-glycosidases for complete saccharification. The Gram-negative soil saprophyte Cellvibrio japonicus is a treasure trove of carbohydrate-active enzymes (CAZymes) due to its robust capacity to degrade different plant polysaccharides. The XyG utilization machinery in C. japonicus is incompletely understood, despite recent characterization of associated sidechain-cleaving exo-glycosidases. I present here my attempts to identify and functionally characterize the endo-xyloglucanase(s) and the XyG-specific β-1,4 exo-glucosidase catalyzing the first and final steps, respectively, in the XyG saccharification pathway in C. japonicus. Bioinformatic analysis identified one Glycoside Hydrolase Family 74 (CjGH74), three GH5_4 (CjGH5D, CjGH5E and CjGH5F) and three GH9 (CjGH9A, CjGH9B and CjGH9C) candidates with putative endo-xyloglucanase activity. Biochemical and structural analyses of CjGH74 and the three CjGH5_4 enzymes clearly demonstrated the exquisite specificity of the four enzymes towards XyG. Scrutiny of the modular architecture of the CjGH5_4 enzyme, CjGH5F, identified the module of unknown function X181. Affinity gel electrophoresis and isothermal titration calorimetry identified the X181 module as a member of a new CBM family that exclusively binds galactose-containing polysaccharides including XyG and galactomannans, congruent with the displayed endo-xyloglucanase activity of the pendent catalytic domain. Surprisingly, reverse genetic analysis in C. japonicus showed no growth perturbation on XyG upon deletion of the four specific endo-xyloglucanases, suggesting the presence of other enzyme(s) with potential endo-xyloglucanase activity. Biochemical characterization of the GH9 enzyme CjGH9B revealed its high catalytic efficiency towards mixed linkage β-glucan and its weak side-activity against xyloglucan, which might be sufficient to rescue the quadruple deletion mutant. Bioinformatic analysis identified the four CjGH3 enzymes Bgl3A, Bgl3B, Bgl3C, and Bgl3D as candidates for the XyG-specific exo-β-glucosidase in C. japonicus. Comprehensive genetic and biochemical approaches revealed the fundamental contribution of Bgl3D to XyG utilization in C. japonicus. Together, these data shed light on the initial and final steps of xyloglucan saccharification in C. japonicus and identify useful enzymes for selective biomass deconstruction.

Lay Summary

The utilization of plant materials in the production of energy and high value products has drawn a lot of attention in recent years given the abundance of plant biomass in the environment. However, deconstruction of plant cell walls for these downstream applications poses a major challenge because of their innate structural complexity. The heterogeneous xyloglucan (XyG), a fundamental component of the plant cell wall, is essentially composed of a glucose backbone highly substituted with xylose moieties, which can be further extended by other sugar units. Interestingly, some bacteria have evolved effective strategies allowing them to utilize XyG. One of these is the soil bacterium Cellvibrio japonicus, which employs a suite of enzymes that target the backbone, as well as the side chains, of XyG. In this thesis, I summarize my attempts to identify the initial backbone-cleaving and the final debranching enzymes in the XyG degradation pathway in C. japonicus.

Preface

Chapter 1 has been adapted from a published review about the different XyG degradation systems in nature. [Mohamed A. Attia] and Harry Brumer (2016). Recent Structural Insights into the Enzymology of the Ubiquitous Plant Cell Wall Glycan Xyloglucan. Current Opinion in Structural Biology, 40:43-53. I wrote the review and made the figures. My supervisor, Prof. Harry Brumer, revised and refined the manuscript prior to final publication.

A version of Chapter 2 has been published. [Mohamed Attia], Judith Stepper, Gideon J. Davies, and Harry Brumer (2016). Functional and structural characterization of a potent GH74 endo-xyloglucanase from the soil saprophyte Cellvibrio japonicus unravels the first step of xyloglucan degradation. FEBS Journal, 283:1701-1719. I did the necessary cloning, enzyme production and biochemical characterization of CjGH74. I also contributed significantly to the writing of the manuscript. Dr. Judith Stepper solved the crystal structure of the CjGH74 enzyme.

A version of Chapter 3 has been published. [Mohamed A. Attia], Cassandra E. Nelson, Wendy Offen, Namrata Jain, Jeffrey Gardner, Gideon J. Davies, and Harry Brumer (2018). In vitro and in vivo characterization of three Cellvibrio japonicus Glycoside Hydrolase Family 5 members reveals potent xyloglucan backbone-cleaving functions. Biotechnology for Biofuels, 11:45. I performed the bioinformatic analysis, recombinant protein production, and all the biochemical characterization. I also contributed significantly to the writing of the manuscript. Cassandra E. Nelson did the transcriptomic and reverse genetic analyses. Dr.
Wendy Offen solved the crystal structure of CjGH5D. Namrata Jain performed the inactivation kinetics of CjGH5D using a specific active-site inhibitor.

In Chapter 4, I designed and performed all the experiments, along with related data analysis. Research conducted includes bioinformatic analysis, recombinant protein production, and biochemical and biophysical characterization. I wrote the entire chapter.

Chapter 5 is based on an Honours thesis project conducted by a fourth-year Biochemistry student (Kevin Mark). I designed all experiments, mentored the student, performed some biochemical characterization experiments, did all the analysis, and contributed significantly to the writing of the chapter.

A version of Chapter 6 has been published. Cassandra E. Nelson, [Mohamed A. Attia], Artur Rogowski, Carl Morland, Harry Brumer, and Jeffrey G. Gardner (2017). Comprehensive functional characterization of the Glycoside Hydrolase Family 3 enzymes from Cellvibrio japonicus reveals unique metabolic roles in biomass saccharification. Environmental Microbiology, 19:5025-5039. Cassandra Nelson did the transcriptomic and reverse-genetic analyses. I prepared the XyG-based oligosaccharide substrates, produced and biochemically characterized the recombinant enzymes in vitro, and contributed to the writing of the manuscript. Artur Rogowski and Carl Morland contributed some enzyme kinetic data.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication

Chapter 1: Introduction
  1.1 Plant biomass as a source of renewable energy and valuable products
  1.2 Complexity of plant cell wall
    1.2.1 Cellulose
    1.2.2 Hemicelluloses
      1.2.2.1 Xyloglucans (XyGs)
      1.2.2.2 Mixed linkage glucans (MLGs)
      1.2.2.3 Xylans
      1.2.2.4 Mannans
    1.2.3 Pectins
    1.2.4 Lignin
  1.3 Carbohydrate active enzymes (CAZymes)
  1.4 Mechanistic insights into glycoside hydrolases
    1.4.1 Inverting mechanism
    1.4.2 Retaining mechanisms
    1.4.3 Catalytic mode of action of GHs
  1.5 XyG deconstructing enzymes
    1.5.1 Backbone cleaving xyloglucanases (endo-xyloglucanases)
      1.5.1.1 Plant endo-xyloglucanases (glycoside hydrolase family 16)
      1.5.1.2 Microbial endo-xyloglucanases
        1.5.1.2.1 Glycoside hydrolase family 5 (GH5)
        1.5.1.2.2 Glycoside hydrolase family 9 (GH9)
        1.5.1.2.3 Glycoside hydrolase family 12 (GH12)
        1.5.1.2.4 Glycoside hydrolase family 44 (GH44)
        1.5.1.2.5 Glycoside hydrolase family 74 (GH74)
    1.5.2 XyG-specific side-chain debranching enzymes (exo-glycosidases)
      1.5.2.1 Glycoside hydrolase family 2 (GH2)
      1.5.2.2 Glycoside hydrolase family 3 (GH3)
      1.5.2.3 Glycoside hydrolase family 31 (GH31)
      1.5.2.4 Glycoside hydrolase family 35 (GH35)
      1.5.2.5 Glycoside hydrolase family 42 (GH42)
      1.5.2.6 Glycoside hydrolase family 43 (GH43)
      1.5.2.7 Glycoside hydrolase family 95 (GH95)
  1.6 Microbial XyG utilization systems
    1.6.1 The XyG utilization system of the symbiotic gut bacterium Bacteroides ovatus
    1.6.2 The XyG utilization system of the non-ruminal bacterium Ruminiclostridium cellulolyticum
    1.6.3 The XyG utilization system of the model soil saprophyte Cellvibrio japonicus
  1.7 Aim of investigation

Chapter 2: Functional and structural characterization of a potent GH74 endo-xyloglucanase
  2.1 Introduction
  2.2 Materials and Methods
    2.2.1 Bioinformatic analysis
    2.2.2 Cloning of cDNA encoding protein modules
    2.2.3 Site-directed mutagenesis
    2.2.4 Gene expression and protein purification
    2.2.5 Carbohydrate sources
    2.2.6 Carbohydrate analytics
    2.2.7 Enzyme kinetic analysis
    2.2.8 Enzyme product analysis
    2.2.9 X-ray crystallography and structure solution
    2.2.10 Carbohydrate-binding analysis of CjCBM10 and CjCBM2
  2.3 Results and Discussion
    2.3.1 Bioinformatic analysis
    2.3.2 Recombinant protein production and purification
    2.3.3 CjGH74 substrate specificity
    2.3.4 Bond cleavage specificity and mode of action of CjGH74
    2.3.5 CjGH74 crystallography
    2.3.6 Characterization of CjCBM2 and CjCBM10
  2.4 Conclusions
  2.5 Supporting information
    2.5.1 Supporting tables
    2.5.2 Supporting figures

Chapter 3: In vitro and in vivo characterization of three GH5 endo-xyloglucanases
  3.1 Introduction
  3.2 Materials and Methods
    3.2.1 Transcriptomic analysis
    3.2.2 Bioinformatic analysis
    3.2.3 Cloning of cDNA encoding protein modules
    3.2.4 Gene expression and protein purification
    3.2.5 Carbohydrate sources
    3.2.6 Carbohydrate analytics
    3.2.7 Enzyme kinetic analysis
    3.2.8 Enzyme product analysis
    3.2.9 Inhibition kinetics and active-site labeling
    3.2.10 Crystallization, X-ray crystallography and structure solution
    3.2.11 Construction of C. japonicus mutants and growth conditions
  3.3 Results and Discussion
    3.3.1 Transcriptomic analysis reveals a potential keystone endo-xyloglucanase from glycoside hydrolase (GH) family 5, subfamily 4.
    3.3.2 Bioinformatic analysis and recombinant production of GH5_4 members from C. japonicus
    3.3.3 CjGH5_4 enzymes are highly efficient, specific endo-xyloglucanases.
    3.3.4 Covalent labeling of CjGH5D with an active-site-directed inhibitor
    3.3.5 CjGH5_4 crystallography
    3.3.6 Mutational analysis of C. japonicus GH5_4 genes indicates a complex mode of action for the initial stages of xyloglucan degradation.
  3.4 Conclusion
  3.5 Supporting information
    3.5.1 Supporting tables
    3.5.2 Supporting figures

Chapter 4: Identification of a novel family of xyloglucan binding modules
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Bioinformatic analysis
    4.2.2 Plasmid construction
    4.2.3 Gene expression and protein purification
    4.2.4 Carbohydrate sources
    4.2.5 Cellulose binding capacity
    4.2.6 Affinity gel electrophoresis
    4.2.7 Activity assays
    4.2.8 Isothermal titration calorimetry (ITC)
  4.3 Results and Discussion
    4.3.1 Bioinformatic analysis
    4.3.2 Recombinant protein production and purification
    4.3.3 X181 recognizes the galactose-containing polysaccharides XyG and galactomannan
    4.3.4 Biophysical characterization of the X181 module
  4.4 Conclusions
  4.5 Supporting information
    4.5.1 Supporting tables

Chapter 5: Functional analysis of a mixed-linkage β-glucanase/xyloglucanase belonging to the glycoside hydrolase family 9
  5.1 Introduction
  5.2 Materials and Methods
    5.2.1 Bioinformatic analysis
    5.2.2 Cloning of DNA encoding GH9 modules
    5.2.3 Gene expression and protein purification
    5.2.4 Carbohydrate sources
    5.2.5 Carbohydrate analytics
    5.2.6 Enzyme kinetic analysis
    5.2.7 Enzyme product analysis
  5.3 Results and Discussion
    5.3.1 Bioinformatic analysis
    5.3.2 Recombinant protein production and purification
    5.3.3 CjGH9B substrate specificity
    5.3.4 Mode of action and bond cleavage specificity
  5.4 Conclusions
  5.5 Supporting information
    5.5.1 Supporting tables

Chapter 6: Comprehensive functional characterization of four glycoside hydrolase family 3 enzymes
  6.1 Introduction
  6.2 Materials and Methods
    6.2.1 Recombinant protein production and purification
    6.2.2 Carbohydrate sources
    6.2.3 Preparing the XyG-based oligosaccharide substrates
    6.2.4 Enzyme kinetics
    6.2.5 Carbohydrate analytics
    6.2.6 Bgl3D product analysis
    6.2.7 Growth conditions
    6.2.8 Genetic techniques
  6.3 Results
    6.3.1 Linkage specificity of C. japonicus GH3 members
      6.3.1.1 Substrate choice for in vitro enzymology
        6.3.1.1.1 Bgl3A exhibits β(1→3) and β(1→4) specificity in vitro
        6.3.1.1.2 Bgl3B is agnostic toward β(1→2), β(1→3) and β(1→4) linkages in vitro
        6.3.1.1.3 Bgl3C displays preferential β(1→3) specificity in vitro
        6.3.1.1.4 Bgl3D is a XyGO-specific β(1→4) glucosidase in vitro
      6.3.1.2 GH3 members are not universally nor equally sufficient for disaccharide utilization
    6.3.2 Functional roles of GH3 members in alternate β-glucan utilization in C. japonicus
      6.3.2.1 β(1→6)-glucosides
      6.3.2.2 β(1→2)-glucosides
      6.3.2.3 β(1→3)-glucosides
      6.3.2.4 Xyloglucan β(1→4)-glucosides
  6.4 Discussion
    6.4.1 Bgl3A has a major role in MLG and sophorose utilization and supports curdlan degradation
    6.4.2 Bgl3B underpins cellodextrin degradation and supports MLG utilization
    6.4.3 Bgl3C drives β(1→3)-glucan utilization
    6.4.4 Bgl3D is the crucial β-glucosidase for XyG utilization
  6.5 Conclusions
  6.6 Supporting information
    6.6.1 Supporting tables
    6.6.2 Supporting figures

Chapter 7: Conclusions

Bibliography

List of Tables

Table 2.1.
X-ray data and structure refinement statistics for CjGH74.
Table 3.1. Activity of CjGH5_4 enzymes against different polysaccharide substrates
Table 3.2. Kinetic parameters of CjGH5_4 enzymes for (xylo)gluco-oligosaccharide glycosides
Table 4.1. Summary of the thermodynamic parameters for FN3-X181-sfGFP obtained by isothermal titration calorimetry
Table 5.1. Specific activity and kinetic parameters of Ig-GH9B towards different polysaccharide substrates
Table 6.1. Kinetic parameters of C. japonicus GH3 enzymes against different gluco-disaccharides, cellotetraose and xyloglucan based oligosaccharides.

List of Figures

Figure 1.1. Schematic model for plant cell wall structure.
Figure 1.2. Two common branching patterns of plant xyloglucans (XyGs).
Figure 1.3. Representative structure of mixed linkage β-glucan (MLG).
Figure 1.4. A schematic representation of the structure of plant xylans.
Figure 1.5. Hemicellulosic glycans containing mannose.
Figure 1.6. Schematic representation of homogalacturonan structure.
Figure 1.7. Inverting mechanism for a β-glycosidase.
Figure 1.8. Classical Koshland retaining mechanism for a β-glycosidase.
Figure 1.9. N-acetyl group substrate-assisted hydrolytic mechanism.
Figure 1.10. Exogenous base-assisted mechanism.
Figure 1.11. Alternative nucleophile in the sialidase mechanism.
Figure 1.12. NAD+-dependent hydrolysis.
Figure 1.13. Cartoon and surface representations of the Bacteroides ovatus GH5 endo-xyloglucanase in complex with XyGO.
Figure 1.14. Cartoon and surface representations of the Aspergillus aculeatus GH12 endo-xyloglucanase in different complexes.
Figure 1.15. Cartoon and surface representations of Cellvibrio japonicus xyloglucan specific exo-glycosidases.
Figure 1.16. Bacterial Xyloglucan Utilization Loci (XyGULs).
Figure 1.17. The proposed (fucogalacto)xyloglucan utilization by Cellvibrio japonicus.
Figure 2.1. Dicot xyloglucan structure, showing sidechain polydispersity.
Figure 2.2. Modular architecture of the native C. japonicus CJA_2477 gene product.
Figure 2.3. SDS-PAGE of the purified protein constructs (cf. Figure 2.2).
Figure 2.4. pH and temperature profiles of CjGH74 with xyloglucan as a substrate.
Figure 2.5. Michaelis-Menten kinetics of CjGH74 on polysaccharide substrates.
Figure 2.6. CjGH74 xyloglucan product analysis.
Figure 2.7. MALDI-TOF analysis of the limit digest products of CjGH74 when incubated with the substrate tamarind seed XyG.
Figure 2.8. Three-dimensional structure of CjGH74 in complex with XyGOs.
Figure 2.9. Interactions of CjGH74 with two xyloglucan-derived oligosaccharides.
Figure 2.10. Binding capacity of CjCBM10-sfGFP, sfGFP-CjCBM2, sfGFP, and acetylated BSA for Avicel.
Figure 2.11. Cellulose-binding isotherms of CjCBM10-sfGFP and sfGFP-CjCBM2.
Figure 2.12. Bioinformatic analysis of CjCBM10 and CjCBM2.
Figure 3.1. XyG structure in dicot plants and C. japonicus XyG active enzymes.
Figure 3.2. Volcano plots summarizing the RNAseq data for a comparative analysis of C. japonicus cells grown on either glucose or xyloglucan.
Figure 3.3. Modular architecture of the native CjGH5_4 enzymes with the different expression constructs used in the current study.
Figure 3.4. Michaelis-Menten kinetics of CjGH5_4 enzymes on a panel of chromogenic (xylo)gluco-oligosaccharide glycosides.
Figure 3.5. Inhibition kinetics of CjGH5D with XXXG-NHCOCH2Br.
Figure 3.6. Three-dimensional structure of CjGH5D in complex with XXXG-NHCOCH2Br and XyGOs.
Figure 3.7. Divergent (wall-eyed) stereo surface representation of CjGH5D-GXLG showing regions of sequence conservation.
Figure 3.8. Growth analysis of in-frame deletions of GH5_4, and GH74 mutant strains on xyloglucan.
Figure 3.9. Growth analysis of ΔCJA_3010 and Δgsp mutant strains when using glucose or xyloglucan.
Figure 4.1. Modular architecture of the native C. japonicus CJA_2959 gene product and different constructs used in the study.
Figure 4.2. SDS-PAGE of the purified protein constructs.
Figure 4.3. Intact protein mass spectrometry of FN3-X181-sfGFP.
Figure 4.4. Overexpression trial of FN3-X181 using the ligation independent (LIC) vectors pMCSG53, pMCSG69, and pMCSG-GST-TEV in E. coli.
Figure 4.5. Binding capacity of sfGFP, FN3-X181-sfGFP and sfGFP-CjCBM2 for Avicel.
Figure 4.6. Affinity gel electrophoresis for FN3-X181-sfGFP against different polysaccharide substrates.
Figure 4.7. Isothermal titration calorimetry of FN3-X181-sfGFP against different ligands.
Figure 5.1. The modular architecture of the C. japonicus GH9 enzymes.
................................. 137 xvi  Figure  5.2. Amino acid sequence alignment of the C.  japonicus GH9 catalytic modules with R. cellulolyticum XyG-active GH9 enzymes. ................................................................................. 138 Figure  5.3. Intact protein mass spectrometry of Ig-GH9B. ........................................................ 140 Figure  5.4. pH and temperature profiles of Ig-GH9B with barley β-glucan as a substrate. ....... 141 Figure  5.5. Michaelis-Menten kinetics of Ig-GH9B on different polysaccharide substrates. .... 143 Figure  5.6. Mode of action and bond cleavage specificity of Ig-GH9B towards BBG. ............. 145 Figure  5.7. Ig-GH9B product analysis when incubated with the substrate XyG ........................ 146 Figure  6.1. Representative structure of xyloglucan (XyG) and mixed linkage β-glucan (MLG)…….. ............................................................................................................................... 151 Figure  6.2. Growth of E. coli strains expressing individual C. japonicus GH3 genes using mono- and disaccharides. ....................................................................................................................... 160 Figure  6.3. Growth of C. japonicus wild-type and GH3 gene deletion mutants on the β(1→2) linked dissacharide sophorose..................................................................................................... 163 Figure  6.4. Growth of C. japonicus wild-type and GH3 gene deletion mutants on the β(1→3) linkage-containing substrates. ..................................................................................................... 164 Figure  6.5. Growth of C. japonicus wild-type and GH3 gene deletion mutants on xyloglucan and xylogluco-oliogsaccharides. ........................................................................................................ 167 Figure  6.6. Updated model of xyloglucan utilization by C. 
japonicus. ...................................... 171 Figure  7.1. The updated xyloglucan utilization model in C. japonicus. ..................................... 191  xvii  List of Abbreviations AAs: auxiliary activities BBG: barley β-glucan BCA: bicinchoninic acid CAZymes: carbohydrate active enzymes  CBM: carbohydrate-binding module CEs: carbohydrate esterases CNP: 2-chloro-4-nitrophenyl EDGP: Extracellular Dermal Glycoprotein GAX: glucuronarabinoxylan GH: glycoside hydrolase GHIP: Glycoside Hydrolase Inhibitor Protein GTs: Glycosyl transferases HEC: hydroxyethylcellulose HGA: homogalacturonan HPAEC-PAD: high performance anion exchange chromatography- pulsed amperometric detector  HTCS: hybrid two-component sensor IPase: Isoprimeverose-producing oligoxyloglucan hydrolases ITC: isothermal titration calorimetry KGM: konjac glucomannan MALDI-TOF: matrix assisted laser desorption ionization- time of flight MLG: mixed linkage β-glucan PGA-LM: poly-γ-glutamic acid- low molecular weight Phyre2: protein homology/analogy recognition engine. PLs: polysaccharide lyases PUL: polysaccharide utilization locus RG: rhamnogalacturonan sfGFP: super folder green fluorescent protein SGBP: surface glycan binding protein SRL: serine rich linker xviii  TBDT: TonB-dependent transporter XEH: xyloglucan endo-hydrolase XET: xyloglucan endo-transglycosylase XTH: xyloglucan endo-transglycosylase/ hydrolase XyG: xyloglucan XyGO: xylogluco-oligosaccharides XyGUL: xyloglucan utilization locus xix  Acknowledgements  I would like to express my sincere gratitude to my PhD supervisor, Dr. Harry Brumer, for giving me the opportunity to learn so many different aspects not only in research but also in life. Dr. Brumer has been always a great source of inspiration and motivation. His endless outstanding ideas never stopped guiding me throughout the entire program. I am very grateful to Dr. 
Brumer for securing a productive environment in which to pursue high-quality research and generate high-impact publications. I also thank him for constantly supporting me in presenting my data at different international conferences, a great experience that opened the door to knowledge transfer and collaborations. On the personal level, Harry has always been a good friend who gladly and patiently addressed my concerns and provided good advice. I was fortunate to have him as a research supervisor, so thank you, Harry, for the enjoyable time I spent in your lab.

I am also thankful to my PhD committee members, Dr. Stephen Withers, Dr. Lawrence McIntosh and Dr. Pierre Kennepohl, for serving on my committee and providing valuable suggestions about the future directions of my research project. I also thank them for carefully reviewing this thesis and providing useful feedback for its ultimate improvement.

I offer my enduring gratitude to Dr. Gideon Davies and his team members Dr. Judith Stepper and Mrs. Wendy Offen (University of York, UK) for their significant contribution to the different structural analyses presented in this thesis. I would also like to express my sincere appreciation to Dr. Jeffrey Gardner and his PhD student Cassandra E. Nelson (University of Maryland, Baltimore County, USA) for helping me with all the transcriptomic and reverse genetic analyses. I thank all Brumer lab members for being so friendly, supportive and helpful. Special thanks to the postdoctoral research fellows and my friends, Dr. Gregory Arnal and Dr. Guillaume Dejean. Our fruitful discussions, soccer games, and fun times will undoubtedly be missed. I am also grateful to our lab manager, Dr. Shaheen Shojania, who was a great manager and an awesome and true friend. I will definitely remember you every time I see a “babyfoot” table. Finally, I owe very special thanks to my family for their unceasing moral and financial support since I moved to Canada in 2010.
Your constant encouragement fueled my motivation and lit my path to success.

Dedication

To Mom and Dad. I could not have made it through without your love and support.

Chapter 1: Introduction¹

1.1 Plant biomass as a source of renewable energy and valuable products

Fossil fuels have been essential to the economic growth of the industrialized world since the industrial revolution. Fossil fuels still dominate the global energy market, generating about 80% of all primary energy in the world [1]. However, one of the greatest future challenges is the steady depletion of fossil fuel reserves, which exposes society to energy shortages [1, 2]. It is anticipated that coal reserves will be the only fossil fuel source beyond 2068 [3]. Moreover, fossil fuel combustion is a key contributor to the emission of greenhouse gases that drive human-induced global warming. To address these challenges, considerable research effort and resources have been invested in identifying a cleaner, renewable source of energy to substitute for the rapidly depleting fossil fuels. Recently, biofuels have been suggested as a promising alternative to petroleum in energy production. Yeast fermentation of some feedstocks, such as sugarcane, corn, and sugar beets, is a relatively simple method to produce ethanol. However, worldwide population growth and the potential risks of global warming, drought, and increased food prices necessitate biofuel production from non-feedstock crops, such as plants and plant-derived biomass [4]. Plant biomass represents an important source not only of energy but also of chemical building blocks for industrially valuable products such as carbon fibers, polymers, and industrial additives [5]. For instance, plant biomass can be used in a multi-step process to generate, on an industrial scale, the polyesters used for clothing and beverage bottles [6].
Moreover, acid-catalyzed dehydration products of the simple sugars that constitute cellulose and hemicellulose can be used effectively as precursors in the manufacture of industrially valuable nylon. Another example is biodegradable plastic, which is produced from lactic acid, a fermentation product of carbohydrates [6]. It is estimated that more than 1.3 billion tons of plant biomass could be produced annually in the United States alone by 2030 [7]. This amount is enough to produce approximately 130 billion gallons of cellulosic ethanol, equivalent to 87 billion gallons of gasoline [8], emphasizing the fundamental impact plant biomass could have in addressing future energy challenges if well exploited. Yet the recalcitrance of plants, especially plant cell walls, to hydrolysis imposes a fundamental challenge for downstream applications, since harsh chemical and environmental treatments are required to release the constituent simple fermentable sugars. Moreover, such thermochemical processes might not maintain the integrity of basic carbohydrate structures, owing to caramelisation and the formation of undesirable by-products [9]. Therefore, enzymatic degradation of plant cell walls has been envisioned as a sustainable, environmentally friendly method for cell wall deconstruction and biofuel generation [10].

¹ Adapted from: Mohamed Attia and Harry Brumer (2016). Recent Structural Insights into the Enzymology of the Ubiquitous Plant Cell Wall Glycan Xyloglucan. Current Opinion in Structural Biology. 40:43-53

1.2 Complexity of plant cell wall

The plant cell wall is crucial for the vitality of the plant cell due to its essential functions, which include protecting against pathogen invasion, giving shape to the different plant cell types, conferring rigidity and robustness on the cell while remaining dynamic and flexible, and playing important roles in recognition and signaling events [11-13].
Structurally, plant cell walls are composed primarily of polysaccharides, in addition to less abundant proteins and lignin as well as small organic and inorganic constituents [14]. The predominant polysaccharide in all plant cell walls is cellulose, a homopolymer composed of β(1→4)-D-glucan chains that interact via hydrogen bonds to form crystalline fibrils, the main load-bearing structures contributing to the strength of the plant cell wall [11, 15]. The stiff cellulose microfibrils are coated with hemicelluloses, also known as matrix glycans, which are flexible structures that either cross-link cellulose microfibrils or prevent direct microfibril-microfibril contact [15]. The cellulose-hemicellulose network is surrounded by charged polysaccharides known as pectins, which further stabilize the cellulosic network and prevent its collapse [15] (Figure 1.1). The exact composition of the hemicellulose and pectin polysaccharides varies according to cell type, plant species and age of the cell.

There are two main types of plant cell walls: primary and secondary cell walls [11]. Primary cell walls control the shape and size of the cell and act as the first line of defense against microbial attack. Moreover, primary cell walls are flexible, since they are created by growing plant cells, and they are ubiquitous to all tissues. Secondary cell walls, on the other hand, are formed inside the primary cell walls after the cessation of cell growth [11, 15]. Furthermore, they are present only in certain tissues and are typically much thicker than primary cell walls, conferring rigidity and water impermeability [11, 16]. The structural features of primary and secondary cell walls reflect the functional attributes observed in both structures: primary cell walls contain no lignin in most cases, fewer cellulose fibrils, and more pectins, which explains their flexibility.
Conversely, secondary cell walls usually comprise several layers (S1, S2 and S3) with more cellulose fibrils and a high lignin content, giving rise to the rigidity of the overall structure. This composite nature of plant cell walls therefore confers the apparent recalcitrance of plant biomass to deconstruction.

Figure 1.1. Schematic model for plant cell wall structure. Cellulose microfibrils are coated and cross-linked with hemicellulose forming a complex network that is embedded in a pectic matrix. Reproduced from [17].

1.2.1 Cellulose

Cellulose is the most abundant polymer on earth, contributing up to 55% of the dry weight of agricultural lignocellulosic biomass [18-20]. As mentioned above, cellulose constitutes the main load-bearing structures in the cell wall. Structurally, cellulose is a homopolymer of β(1→4)-D-glucan chains with the disaccharide cellobiose as the repeating unit: every other glucose molecule is rotated 180° along the chain axis. The cellulose chains are extensively bound together via inter-molecular hydrogen bonds and van der Waals forces to form cellulose microfibrils. Such strong interactions are responsible for the crystalline meta-structure of the fibrils, which can consist of thousands of cellulose chains, reaching lengths on the micrometer scale and diameters of at least 2 nm [11, 21]. The crystalline nature of cellulose makes it difficult not only for hydrolytic enzymes but also for smaller molecules such as water to penetrate the cell wall. However, between the highly crystalline regions lie some less organized amorphous regions through which cellulolytic enzymes can access their substrate. Notably, the cellulose microfibrils are intimately associated with hemicellulose to produce a very robust structure that strongly supports the living plant cell (vide supra).
1.2.2 Hemicelluloses

Hemicellulose is a broad term that covers a wide variety of hetero-polysaccharides, which can constitute up to 50% of the dry weight of plant biomass [18-20]. In the past, the term hemicelluloses defined the alkali-extractable polysaccharides from plant materials [22]. Nowadays, the hemicelluloses encompass four main structurally diverse polysaccharide categories: xyloglucans (XyGs), mixed linkage glucans, heteroxylans, and mannans [23]. Consequently, hemicelluloses can contain different pentose sugars (D-xylose and L-arabinose), hexose sugars (D-galactose, D-glucose and D-mannose), and uronic acids (D-glucuronic acid and D-galacturonic acid), either in the polysaccharide backbone or as side-chain substitutions. Hemicellulose composition varies depending on plant tissue, species and whether it is in the primary or secondary cell wall [24]. Hemicelluloses play an important role in developing and maintaining the mechanical strength of cell walls, not only by coating and cross-linking cellulose microfibrils but also by binding covalently to lignin via ester bonds [25].

1.2.2.1 Xyloglucans (XyGs)

Among the hemicellulosic fraction of plant cell walls, the xyloglucans (XyGs) constitute a ubiquitous family of structurally complex polysaccharides that represents up to 25% of the total dry weight of terrestrial plant cell walls, especially in dicots and non-commelinoid monocots [11, 23, 26]. XyG coats and tethers cellulose fibrils to confer robustness on the plant cell wall structure. This intimate association, caused by the strong binding affinity of XyG for cellulose, has been demonstrated in vitro and in co-localization studies using specific antibodies and electron microscopy [11, 27]. Structurally, XyGs are heterogeneous polysaccharides characterized by a linear β(1→4)-D-glucan backbone substituted with α(1→6)-xylopyranosyl moieties at regular intervals.
These sidechain residues can be further extended by other monosaccharide units, e.g. galactopyranosyl, fucopyranosyl, and arabinofuranosyl residues, according to plant tissue and species [28]. For the sake of simplicity, a standard nomenclature system for XyGs has been developed [29], in which G represents an unsubstituted Glc residue, X represents Xylp-α(1→6)-β-Glcp, L represents Galp-β(1→2)-Xylp-α(1→6)-β-Glcp, F represents L-Fucp-α(1→2)-Galp-β(1→2)-Xylp-α(1→6)-β-Glcp, and S represents L-Araf-α(1→2)-Xylp-α(1→6)-β-Glcp. The number of known side-chain variants has recently increased to over 20 owing to significant advances in XyG structural determination, which led to the presentation of an updated nomenclature [28, 30]. Typically, XyG chains exhibit one of two common backbone branching motifs, XXGG and XXXG, which represent four glucose residues substituted with two or three xylose residues, respectively, that might be further extended by other substitutions [31]. The fucogalacto-XyG of the XXXG type is the most common branching variety in higher land plants such as lettuce, carrot and soybean, while the less common arabino-XyG of the XXGG branching pattern is mainly found in Solanaceous plants such as tomato, potato and pepper (Figure 1.2) [32]. Due to its structural complexity, XyG degradation requires a combination of backbone-cleaving endo-xyloglucanases and a range of sidechain-cleaving exo-glycosidases [33, 34].

Figure 1.2. Two common branching patterns of plant xyloglucans (XyGs). A) XXXG-type fucogalacto-XyG typical of dicots. B) XXGG-type arabinogalacto-XyG typical of Solanaceous species. Monosaccharides are represented using the Consortium for Functional Glycomics Symbol Nomenclature (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml). Common side-chain variants are indicated together with the standard shorthand XyG nomenclature [30]. The reader is referred to references [28, 30] for an exhaustive catalog of known structures.
Reproduced from [35].

1.2.2.2 Mixed linkage glucans (MLGs)

Mixed linkage glucan (MLG) is a hemicellulose that is commonly found in the Poaceae (grasses and cereals) and other related families of the Poales order [36]. The amount of MLG in primary cell walls is strongly dependent on growth stage, since MLGs are involved in cell expansion [37, 38]. In addition to its structural role, MLG is an abundant storage polysaccharide within the endosperm cell walls [39]. Structurally, MLGs are composed of cellotriosyl and cellotetraosyl moieties (β(1→4)-linked D-glucose residues) that are connected by β(1→3) linkages at irregular intervals (25-30% of the total linkages are β(1→3) [40]) (Figure 1.3). In contrast to cellulose, which is composed solely of a β(1→4)-D-glucan backbone, MLGs are soluble in water. Recently, widespread interest in MLG-based products (e.g. wholegrain cereals) has developed due to the discovered importance of ingested dietary fibres in alleviating conditions such as inflammation, diabetes, and colorectal cancer, as well as in reducing the risk of cardiovascular disorders [41, 42].

Figure 1.3. Representative structure of mixed linkage β-glucan (MLG). General Poales MLG structure comprised of β(1→4)-linked cellotetraose and cellotriose units connected by β(1→3) linkages. Monosaccharides are represented using the Consortium for Functional Glycomics Symbol Nomenclature (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml). Reproduced from (Nelson et al 2017, in press).

1.2.2.3 Xylans

Xylans are the second most abundant polymer in nature, after cellulose, representing up to 35% of lignocellulosic biomass [43]. Like MLGs, xylans are abundant in many consumable food products such as common cereals, barley, rye and wheat bran. Xylans therefore represent a very useful source of dietary fibre. Xylans, in the form of glucuronoarabinoxylans (GAXs) (vide infra), dominate in the primary cell walls of the Poales order (e.g.
grasses), suggesting the occurrence of some hemicellulose replacement events during plant evolution [44]. Additionally, xylans, more specifically glucuronoxylans, are essential and abundant in the secondary cell walls of dicot plants [23]. Owing to their ubiquity, xylans represent a valuable source of renewable carbon with potential industrial applications. On the structural level, xylans are heteropolysaccharides composed of a β-(1→4)-linked xylose backbone that is further substituted in a species-dependent fashion. Unlike XyGs, xylans lack repeating motifs. Instead, they show a huge diversity of ramification patterns. In the GAXs found mainly in the primary cell walls of commelinoid monocots, α-L-Araf residues are added strictly to the O-3 position of the xylosyl moieties of the backbone (Figure 1.4A) [11]. The α-L-Araf residue can be further substituted at its O-5 position with feruloyl moieties approximately every 50 xylose units of the backbone. Moreover, α-D-glucuronic acid decorations are only rarely added to the O-2 positions of the xylosyl residues. On the other hand, GAXs in the primary cell walls of non-commelinoid monocots and all dicots display α-L-Araf substitution on both the O-2 and O-3 positions of the backbone xyloses (Figure 1.4B). Glucuronoxylans are major components of the secondary cell walls of dicot plants, representing about 20-30% of their dry weight [23]. In glucuronoxylans, as the name implies, the backbone xylosyl residues are still substituted with α-D-glucuronic acid, while the arabinose ramifications are absent.

Figure 1.4. A schematic representation of the structure of plant xylans. A) Commelinoid glucuronoarabinoxylans. In the GAX from commelinoid monocot walls, the α-L-Ara and α-D-GlcA units are added strictly to the O-3 and O-2 positions of the backbone xylosyl residues, respectively.
Moreover, feruloyl groups (and sometimes other hydroxycinnamic acids) are esterified to the O-5 position of the α-L-Ara units and are spaced about every 50 Xyl units of the backbone. B) Other glucuronoarabinoxylans. In the GAX from the non-commelinoid monocots and all dicots, the α-L-Ara units are attached to the O-2 position as well as to the O-3 position. Similar to the commelinoid GAX, the α-D-GlcA units are attached only at the O-2 position of the xylosyl backbone moieties. Monosaccharides are represented using the Consortium for Functional Glycomics Symbol Nomenclature (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml).

1.2.2.4 Mannans

Mannans consist of a linear β-(1→4)-linked mannose-containing backbone and are mainly found in charophytes [45, 46]. Mannose can be the sole component of the backbone, as in mannans and galactomannans, or it can alternate with glucose residues in a non-repeating pattern, as in glucomannans and galactoglucomannans (Figure 1.5). Pure mannans are rare in nature and have been found only in the bulb of the Orchid oncidium [47]. Galactomannans are mainly used as storage polysaccharides in different plants and are composed of a linear β-(1→4) mannan backbone that is further substituted with α-(1→6) galactopyranosyl residues (Figure 1.5B) [24, 48]. As mentioned above, galactoglucomannans and glucomannans have backbones that contain alternating stretches of β-(1→4)-glucose and β-(1→4)-mannose residues, with only galactoglucomannans carrying the α-(1→6) galactopyranosyl ramifications (Figure 1.5C). Glucomannans and galactoglucomannans are the major hemicellulosic components of the secondary cell walls of softwoods: galactoglucomannans can represent up to 25% of their dry mass [49].
In addition to the α-(1→6) galactopyranosyl substitution, galactoglucomannans are partially acetylated at C2 or C3, with an acetyl group content of 6%, which corresponds on average to one acetyl group per 3–4 backbone hexose units [50].

1.2.3 Pectins

While significantly reduced or absent in secondary cell walls, pectins are major components of primary cell walls, representing up to one third of all primary cell wall macromolecules [51]. In the cell wall, cellulose microfibrils, with their coating and cross-linking hemicelluloses, are completely embedded in a pectic matrix which controls cell wall porosity, modulates pH and ion balance, regulates cell-cell adhesion, and plays an important role in cellular recognition and signal transduction [11]. Pectins have traditionally been recognized as materials that can be easily extracted from cell walls by hot acids or chelators [23]. Structurally, pectins are highly heterogeneous, branched and highly hydrated polysaccharides that are rich in D-galacturonic acid, which forms the backbone of the three polysaccharide pectic domains found in all pectin species: homogalacturonan (HGA), rhamnogalacturonan-I (RG-I) and rhamnogalacturonan-II (RG-II). The three pectin domains are believed to be covalently linked to form a pectic network in the primary cell wall matrix.

Homogalacturonan (HGA) is the most common form of pectin, composed of 100-200 α-(1→4)-linked galacturonic acid residues that can be methyl-esterified at the C-6 position and/or acetylated at the O-2 or O-3 positions (Figure 1.6) [51]. Rhamnogalacturonan-I (RG-I) and rhamnogalacturonan-II (RG-II) are more complex types of pectin with many different monosaccharide ramifications and linkages.
For example, the rod-like heteropolymer RG-I consists of as many as 100 repeats of the disaccharide (1→2)-α-L-rhamnose-(1→4)-α-D-galacturonic acid, in which the L-rhamnosyl residue can be further substituted at the C-4 position with side chains of 50 residues or more of arabinans, galactans, or arabinogalactans, giving rise to a highly variable family of polysaccharides [51]. Although RG-II is structurally distinct from RG-I, they still share the HGA backbone. However, RG-II is highly decorated with side chains including at least 12 different sugar types and more than 20 diverse glycosidic linkages, which makes RG-II the most structurally complex type of pectin [52].

Figure 1.5. Hemicellulosic glycans containing mannose. A) Pure mannans. B) Galactomannans with a backbone of exclusive β-1,4 mannosyl residues with α-1,6 galactosyl substitutions. C) (Galacto)Glucomannans with a backbone of a roughly equimolar mixture of β-1,4 mannosyl and β-1,4 glucosyl residues. The mannosyl residues are substituted with α-1,6 galactosyl at different intervals. Monosaccharides are represented using the Consortium for Functional Glycomics Symbol Nomenclature (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml).

Figure 1.6. Schematic representation of homogalacturonan structure. The α-1,4-linked galacturonic acid residues of the backbone can be methyl-esterified at the C-6 position and/or acetylated at the O-2 or O-3 positions.

1.2.4 Lignin

Lignins are hydrophobic phenolic macromolecules that are unique to secondary cell walls. In addition to providing rigidity to plant cells and protection against cellulose- and hemicellulose-degrading enzymes [53, 54], lignins ensure the transport of water and nutrients throughout the cell wall. Structurally, lignins are composed of three major building blocks biosynthesized from phenylalanine: p-coumaryl alcohol, coniferyl alcohol, and sinapyl alcohol.
The basic monomer building blocks are joined together via radical reactions to build a strong network which is always associated with plant cell wall carbohydrates to provide robustness and protection [54]. Because of lignin's inhibitory effect on plant biomass-degrading enzymes, it is now evident that reducing lignin content, via transgenic approaches or by selecting low-lignin varieties, can facilitate saccharification for biofuels by enhancing the release of cell wall sugars [55, 56].

1.3 Carbohydrate active enzymes (CAZymes)

Carbohydrates are extremely abundant in nature due to their involvement in many biological functions in living organisms. In the plant cell wall, carbohydrates are the most abundant component, with functions ranging from protection and support to storage as carbon reserves. Carbohydrates are built from similar building blocks. However, due to the possible variations in the type of linkage and the stereochemical configuration of the hydroxyl groups, in addition to the enormous diversity of the attached non-carbohydrate substituents, a limitless number of structural combinations can be obtained. The biosynthesis and breakdown of these structurally diverse complex carbohydrates are controlled by Carbohydrate Active Enzymes (CAZymes). CAZymes are grouped, based on similarity in their primary amino acid sequences, into distinct families that are deposited in the CAZy database (www.cazy.org) [57]. Notably, this classification provides extremely useful insights into the structural features, evolutionary relationships and mechanisms of action of different CAZymes [57]. Remarkably, a limited number of CAZyme scaffolds handles the construction and breakdown of the tremendous number of possible carbohydrate structures via the evolution of different substrate specificities.
A drawback of the classification, however, is the difficulty of substrate prediction due to the presence of enzymes with different specificities within the same family [57]. This challenge can be addressed by defining subfamilies within the same family [58-60]. As such, the search for new substrate specificities can begin with the uncharacterized subfamilies [57]. The CAZy database represents a very useful tool for enzyme discovery and it is constantly updated as more sequences are functionally and structurally characterized. To date, all the sequences in the database are grouped into five different enzyme categories, in addition to one class of non-catalytic carbohydrate binding modules, as follows:

Glycoside hydrolases (GHs), or glycosidases or glycosyl hydrolases, represent the biggest enzyme category in the CAZy database with 145 enzyme families comprising 449,520 known and predicted catalytic modules as of June 2017 (www.cazy.org) [57]. GHs catalyze the hydrolysis of glycosidic linkages between two or more carbohydrates or between a carbohydrate and a non-carbohydrate component. The catalytic mechanisms of this class of enzymes will be described below in detail.

Glycosyl transferases (GTs) are responsible for the construction of di-, oligo-, and polysaccharides via the formation of glycosidic linkages. GTs catalyze these reactions by transferring sugar moieties from glycosyl donors, such as activated sugar phosphates, to a nucleophilic group, usually an alcohol acceptor [61-63]. As of June 2017, GTs are classified into 103 families comprising 325,643 predicted catalytic modules (www.cazy.org) [57].

Polysaccharide lyases (PLs) are responsible for the non-hydrolytic cleavage of carbohydrates.
They utilize a β-elimination mechanism to cleave uronic acid-containing polysaccharides, such as pectins and algal polysaccharides, to give an unsaturated hexenuronic acid moiety as well as a new reducing end at the site of cleavage [64-66]. This class comprises 26 families with 17,325 predicted catalytic sequences as of June 2017 (www.cazy.org) [57].

Carbohydrate esterases (CEs) are classified into 16 families with 44,135 total catalytic modules (June 2017) (www.cazy.org) [57]. CEs catalyze the cleavage of ester bonds found in acyl-substituted carbohydrates such as pectin methyl esters and acetylated xylans. The deacylation reactions are commonly catalyzed via a Ser-His-Asp catalytic triad, which is also employed by classical lipases and serine proteases [67]. CEs can also utilize a less common metal ion-dependent deacylation mechanism, as in the case of the acetylxylan esterases from Streptomyces lividans and Clostridium thermocellum [68].

The class of enzymes with auxiliary activities (AAs) encompasses redox enzymes that cleave glycosidic linkages by oxidation. It includes lytic polysaccharide monooxygenases (LPMOs), which act on the backbone of crystalline polysaccharides, as well as lignolytic enzymes. Although lignin-degrading enzymes and LPMOs have different substrate specificities, they have been grouped together in the AAs category based on their common mechanism of action and due to the intimate association between lignin and plant cell wall polysaccharides [25]. The AAs classification is relatively new; it was initiated by the discovery that members of GH61 and CBM33 are in fact metal ion-dependent lytic polysaccharide monooxygenases and not GHs and CBMs, respectively [25]. To date, the AAs category is classified into 13 families with 12,585 catalytic module sequences (www.cazy.org) [57].
Carbohydrate binding modules (CBMs) bring the appended catalytic domains into close proximity with their respective substrates, leading to more efficient degradation of the polysaccharides [69]. CBMs are usually observed within CAZymes, with rare exceptions where they are found independently. CBMs were previously classified as Cellulose Binding Domains (CBDs) because of the cellulose binding capacity of the first identified members [70, 71]. However, the continuous discovery of other modules that display different folds and that bind carbohydrate substrates other than cellulose entailed the reclassification of this family with a proper nomenclature (reviewed in [72]). To date, CBMs are classified based on their amino acid sequences into 81 families with 101,641 known sequences (www.cazy.org) [57].

1.4 Mechanistic insights into glycoside hydrolases

Since the identification and characterization of unique GHs is the main scope of this thesis, I will discuss this class of enzymes in some detail. As mentioned above, GHs are catabolic enzymes that catalyze the hydrolysis of glycosidic linkages within different carbohydrate substrates. In terms of abundance, cellulose and chitin are the two most dominant organic substances in living organisms [73], which reflects the ubiquity of carbohydrates on earth. Although glycosidic linkages are among the most stable covalent bonds in natural biopolymers, GHs can enhance their rate of hydrolysis by a factor of 10¹⁷ [74], which makes GHs superb catalysts. Given the abundance of carbohydrates in nature, GH genes are found in nearly all living organisms, with the exception of some unicellular parasitic eukaryotes and some archaeans [75]. As mentioned previously, the GH category is the largest in the CAZy classification with 145 families identified to date (www.cazy.org) [57].
Notably, the family classification is based on amino acid sequence identity, which can also provide insights into protein fold and mechanism of action. When it comes to substrate specificity, the classification can only provide predictions, which emphasizes the importance of the biochemical characterization of GH family members. A common example is the polyspecificity observed in the GH5 family [58], which contains more than 20 different activities to date (www.cazy.org) [57]. In terms of catalytic mechanisms, GHs are classified into inverting and retaining enzymes according to the anomeric configuration of the product with respect to that of the substrate [76]. In the following section, I will discuss the different catalytic mechanisms described to date for the GHs.

1.4.1 Inverting mechanism

The inverting GHs employ two carboxylic amino acid residues, typically 10 Å apart, in a one-step single displacement mechanism leading to the net inversion of the anomeric configuration. Mechanistically, one of the catalytic carboxylic residues acts as a general base while the other acts as a general acid. Hydrolysis takes place via the nucleophilic attack of a water molecule, after it is deprotonated by the catalytic base, while concomitantly the leaving group departs and the glycosidic oxygen is protonated by the general acid (Figure 1.7) [77, 78].

Figure 1.7. Inverting mechanism for a β-glycosidase. Reproduced from [79].

1.4.2 Retaining mechanisms

Similar to the inverting mechanism, the classical Koshland retaining mechanism also employs two catalytic carboxylic residues. However, the anomeric configuration of the substrate is retained in a double-displacement mechanism involving a glycosyl-enzyme intermediate. Moreover, in contrast to the inverting mechanism, one of the two catalytic residues acts as a nucleophile while the other acts as a general acid/base, and the two are spaced ~5.5 Å apart.
In the first step (the glycosylation step), nucleophilic attack of the catalytic nucleophile at the anomeric position takes place concomitantly with the departure of the leaving group, which is protonated with the aid of the acid/base catalytic residue. This step leads to the formation of a covalent glycosyl-enzyme intermediate. In the second step (the deglycosylation step), a water molecule, after being deprotonated with the assistance of the acid/base residue, attacks the anomeric position to release the enzyme (Figure 1.8).

Figure 1.8. Classical Koshland retaining mechanism for a β-glycosidase. Reproduced from [79].

Deviating from the classical Koshland mechanism, some GHs, such as those belonging to families 18, 20, 25, 56, 84, and 85, employ substrate-assisted catalysis. These enzymes have no catalytic nucleophile. Instead, they utilize the 2-acetamido group within their cognate substrates as an intramolecular nucleophile to form an oxazoline intermediate [80-82]. The charge development in the transition state is, however, stabilized by a carboxylate residue in the active site of the enzyme (Figure 1.9).

Figure 1.9. N-acetyl group substrate-assisted hydrolytic mechanism. Reproduced from [79].

Other enzymes such as the myrosinases, GH1 enzymes catalyzing the hydrolysis of anionic thioglycosides (glucosinolates) found in plants, harness an exogenous base-assisted mechanism. These enzymes lack the acid/base carboxylic residue involved in the classical Koshland retaining mechanism. Instead, it is replaced by a glutamine residue, which probably decreases the charge repulsion with the anionic aglycon sulfate. In this mechanism, the glycosylation step does not require the assistance of an acid/base catalytic residue since the unusual aglycon is an adequately good leaving group.
However, in the deglycosylation step, the enzyme utilizes an exogenous base such as the co-enzyme L-ascorbate to deprotonate a water molecule prior to the nucleophilic attack on the anomeric carbon [83] (Figure 1.10).

Figure 1.10. Exogenous base-assisted mechanism. Reproduced from [79].

Sialidases and trans-sialidases of GH33 and GH34 [84, 85], as well as the 2-keto-3-deoxy-D-lyxo-heptulosaric acid hydrolases of GH143 [86], utilize a tyrosine as a catalytic nucleophile (alternative nucleophile) rather than the carboxylate residue of the typical retaining mechanism (Figure 1.11). This is attributed to the negatively charged carboxylate group at the anomeric carbon of the substrates of these enzymes, which could lead to charge repulsion if a negatively charged carboxylate nucleophile were involved in catalysis. The tyrosine catalytic nucleophile is, however, believed to be activated by a neighboring base residue to increase its nucleophilicity.

Figure 1.11. Alternative nucleophile in the sialidase mechanism. Reproduced from [79].

The last unusual retaining mechanism was observed in the GH4 and GH109 members, which utilize an NAD+ cofactor throughout catalysis. In this mechanism, oxocarbenium ion-like transition states are not formed. Instead, hydrolysis proceeds via elimination and redox steps (Figure 1.12) [87, 88].

Figure 1.12. NAD+-dependent hydrolysis. Reproduced from [79].

1.4.3 Catalytic mode of action of GHs

GHs can also be classified based on the site of cleavage into exo-acting and endo-acting enzymes [77]. Exo-acting enzymes have an active site pocket which recognizes and cleaves off a monosaccharide or a short oligosaccharide from the end, usually the non-reducing end with a few exceptions [89, 90], of their target substrates. On the other hand, endo-acting GHs do not typically have an active site pocket.
Instead, they have an active site cleft, which is capable of recognizing and accommodating its cognate substrate to cleave internal bonds of the polysaccharide backbone. To facilitate catalysis, the enzyme utilizes a number of subsites to bind its substrate. Each subsite is represented by a group of amino acid residues in the active site cleft that recognizes and binds an individual sugar moiety along the polysaccharide chain via stacking and hydrogen bond interactions. A subsite nomenclature system has been developed in which subsites are labeled from the point of cleavage in the polysaccharide chain, extending from -1, -2, -3, etc., towards the non-reducing end, and from +1, +2, +3, etc., towards the reducing end of the chain [91]. Based on the mode of action, the endo-acting GHs can be classified into endo-dissociative and endo-processive enzymes. In the endo-dissociative mode of action, the hydrolytic enzyme desorbs from the polysaccharide chain after each hydrolytic cleavage. Therefore, the enzyme produces a random mixture of mid-range molecular weight products at the beginning of the reaction. In contrast, an endo-processive enzyme accommodates and binds the polysaccharide chain in the active site cleft, catalyzes the cleavage, and then slides along the polysaccharide chain without desorption to catalyze subsequent hydrolytic events. This mechanism is supported by the tunnel-like active site structures recognized in some enzymes such as cellobiohydrolases I and II from Trichoderma reesei [92-94]. Therefore, an endo-processive enzyme would possess a hydrolysis pattern that resembles that of an exo-enzyme [95] (vide supra).

1.5 XyG deconstructing enzymes

Among CAZymes that degrade the plant cell wall matrix glycans, there is a growing interest in the functional and structural characterization of xyloglucan (XyG)-specific endo-glucanases and exo-glycosidases.
This interest is fundamentally motivated by the ubiquity and abundance of this family of complex polysaccharides across plant lineages [28, 30, 96], and by demonstrable application potential in biofuels and biomaterials production [97-100]. Numerous advances have been made in the biochemistry and structural biology of XyG hydrolases. It is now evident that different microbes, including terrestrial and gut bacteria, have developed complex machineries that allow the full saccharification of XyGs in the soil and diet, respectively (see section 1.6). In addition, plants have also been shown to harness XyG-active enzymes in the rearrangement of their cell wall structure. In light of the structural complexity of XyG (Figure 1.2), its complete saccharification to the constituent monosaccharides requires a complex interplay between backbone-cleaving endo-xyloglucanases and a range of side-chain-cleaving exo-glycosidases. As XyG degradation is the focus of this thesis, the XyG-degrading enzymes in plants and microbes will be discussed with an emphasis on the coordinated XyG utilization systems identified to date.

1.5.1 Backbone cleaving xyloglucanases (endo-xyloglucanases)

The endo-xyloglucanases target the β-1,4 linkages within the XyG backbone after accommodating this polymeric substrate in a wide active site cleft well-adapted for recognition and catalysis. In the next section, I will summarize the different GH families with reported endo-xyloglucanase activity in plants and microbes.

1.5.1.1 Plant endo-xyloglucanases (glycoside hydrolase family 16)

Plant xyloglucan endo-transglycosylase/hydrolase (XTH) enzymes (reviewed in [101]) play a key role in the cell wall biosynthesis and remodeling involved in gravitropic responses, seed germination, fruit ripening, and tissue expansion [101-103]. Plant XTHs have only been identified in the GH16 family, which contains, in addition to XTHs, a wide range of microbial endoglucanases and endogalactanases [101].
Previous phylogenetic analyses of the GH16 family identified bacterial licheninases (EC 3.2.1.73), which hydrolyze β(1→4) bonds in mixed-linkage glucans, as the closest relatives to XTHs [104]. All GH16 members, including XTHs, are predicted to share the same overall β-jelly-roll protein fold. However, compared to licheninases, the active site cleft of the XTH gene products has undergone a major loop deletion at the negative subsites and a large C-terminal extension at the positive subsites, producing a much wider and a further elongated substrate-binding cleft, respectively, well-suited for XyG recognition and binding [101]. The molecular phylogeny of XTH genes was divided into 3 groups, with xyloglucan endo-hydrolase (XEH) activity only demonstrated in group III-A [101, 103]. Mutagenesis and quantitative kinetic analysis, along with superposition of the crystal structures of the group I/II xyloglucan endo-transglycosylase (XET), PttXET16-34 [105], and the group III-A XEH, TmNXG1, identified a short loop extension at the active site of TmNXG1 as a key structural determinant for the XEH activity [102]. Recently, mixed-function endoglucanases from black cottonwood, Populus trichocarpa (PtEG16) [106], and Vitis vinifera (VvEG16) [107] have been biochemically and structurally characterized, shedding light on a small clade of GH16 members intermediate between the bacterial licheninases and plant XTH gene products. Interestingly, the EG16 gene products lack the large C-terminal extension characteristic of the XTH gene products, while they demonstrate broad substrate specificity with nearly equal capacity to degrade the linear β-glucans and the branched XyGs [106]. Phylogenetic analysis clustered PtEG16 and homologs, which, unlike XTHs, are represented in limited numbers in plants, in a major distinct clade intermediate between XTH gene products and their putative ancestors, licheninases [106].
Based on those observations, the newly identified clade comprising the PtEG16-like sequences is believed to represent an ancestral link in the evolution of the XTH subfamily [106]. Nonetheless, given their low abundance in higher plant genomes, the biological significance of PtEG16-like members remains to be investigated.

1.5.1.2 Microbial endo-xyloglucanases

Unlike plant endo-xyloglucanases, which are only observed in the GH16 family, bacteria have evolved a large suite of hydrolytic enzymes from different GH families that selectively target the plant polysaccharide XyG.

1.5.1.2.1 Glycoside hydrolase family 5 (GH5)

GH5 is a large family with more than 7800 catalytic modules identified in bacteria and more than 20 substrate specificities (www.cazy.org) [57]. GH5 is classified into 51 subfamilies, with endo-xyloglucanase activity only demonstrated in subfamily 4 (GH5_4) [58]. GH5 enzymes employ a typical retaining hydrolytic mechanism utilizing two catalytic glutamate residues, of which the first acts as a catalytic nucleophile, while the second acts as a catalytic acid/base (see section 1.4.2). Structurally, GH5 endo-xyloglucanases display an (α/β)8 barrel fold with an active site cleft running across the entire surface of the protein, which accommodates the XyG polymeric substrate. Paenibacillus pabuli XG5 (PDB: 2JEQ) was the first GH5_4 endo-xyloglucanase whose structure was elucidated [108]. Recently, an in-depth biochemical, structural and reverse genetic investigation revealed that the vanguard outer membrane-anchored GH5_4 enzyme from the gut symbiont Bacteroides ovatus, BoGH5, is indispensable to the XyG utilization capacity of this gut bacterium. This is explained by the fundamental role of BoGH5, which initiates the backbone hydrolysis of XyG at the cell surface, leading to the generation of the Glc4-based xylogluco-oligosaccharides (XyGOs), the substrates for the downstream debranching enzymes encoded by the B. ovatus genome (see section 1.6.1) [34]. Like other GH5_4 members, BoGH5 displays an (α/β)8 barrel fold with a broad active-site cleft (Figure 1.13, PDB: 3ZMR) that accommodates the distinct side-chain branching of plant-specific galacto-, fucogalacto-, and arabinogalacto-XyG polysaccharides with near-equivalent hydrolysis kinetics. In terms of backbone regio-specificity, BoGH5 hydrolyzes XyG at C-1 of unbranched Glc(1→4) residues (G in the standard linear nomenclature [30]), which is the most common point of cleavage by endo-xyloglucanases across all GH families. Elegant structure-function analysis of two GH5_4 endo-xyloglucanases, XEG5A and XEG5B, from a bovine rumen metagenomic library has recently shed light on the molecular basis for this regio-specificity [109]. Whereas XEG5A cleaves XyG at unbranched G units, XEG5B can hydrolyze the backbone between branched [Xyl(1→6)]-Glc(1→4) units (X in the standard linear nomenclature [30]). Structural analysis of XEG5A and XEG5B, in comparison to previously solved GH5_4 endo-xyloglucanases including BoGH5, revealed that most members of this subfamily have a constricted subsite -1 immediately adjacent to the catalytic residues (see [91] for CAZyme subsite nomenclature). In contrast, XEG5B exhibits a particular widening of the active-site cleft (PDB: 4W8B), and is thus capable of accommodating the branched X unit at this position [109]. Moreover, this comparative analysis also revealed a +1 subsite pocket, conserved among all xyloglucan-specific GH5_4 members, that specifically cradles an X unit on the other side of the catalytic residues. Other recent structural enzymology has revealed that further subtle primary structure variation among GH5_4 members can significantly attenuate activity toward the highly branched XyGs.
An ensemble of (xylo)gluco-oligosaccharide complexes spanning the entire active-site cleft of a predominant β(1→3)/β(1→4) mixed-linkage endo-glucanase from the rumen Bacteroidetes Prevotella bryantii B14 (PDB: 3VDH, 5D9M, 5D9N, 5D9O, 5D9P) has revealed how xylosyl branching is disfavored in the positive subsites of this enzyme, which nonetheless tolerates XyG as a substrate [110]. Even more striking, crystallography of a strict unbranched endo-glucanase from a cow rumen metagenomic PUL illuminated a tightly constricted active-site cleft that exhibited only trace activity toward the highly branched XyG polysaccharide (PDB: 4YHE, 4YHG; reference [111] provides a comprehensive comparative structural analysis).

Figure 1.13. Cartoon and surface representations of the Bacteroides ovatus GH5 endo-xyloglucanase in complex with XyGO. The catalytic domain is shown in cyan and the N-terminal extending domain is highlighted in pink. The XyGOs are shown in yellow and red sticks. Reproduced from [35].

1.5.1.2.2 Glycoside hydrolase family 9 (GH9)

GH9 is the second largest “cellulase” family (formerly Cellulase Family E [57]), comprising more than 2500 catalytic domains, of which 170 are biochemically characterized and only 16 structures have been determined to date (www.cazy.org) [57]. The GH9 family demonstrates wide substrate specificity that includes general endo-glucanase (EC 3.2.1.4), β-glucosidase (EC 3.2.1.21), licheninase/endo-β-1,3-1,4-glucanase (EC 3.2.1.73), exo-β-1,4-glucanase/cellodextrinase (EC 3.2.1.74), cellobiohydrolase (EC 3.2.1.91), exo-β-glucosaminidase (EC 3.2.1.165), and xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151) [57]. Notably, the endo-xyloglucanase activity was added to the spectrum of the GH9 family only recently, when the first strict xyloglucan-specific GH9 endo-hydrolase (Cel9X) was functionally characterized from the terrestrial bacterium Ruminiclostridium cellulolyticum (see section 1.6.2) [112].
The GH9 enzymes catalyze the hydrolytic cleavage of their substrates through an inverting mechanism utilizing a catalytic aspartate residue as a general base and a catalytic glutamate residue as a general acid (see section 1.4.1). Although the structural determinants required for binding are yet to be discovered for the GH9 endo-xyloglucanases, the GH9 enzyme structures solved thus far display an (α/α)6-barrel topology [113-119]. An interesting structural feature of the GH9 enzymes is the presence of an N-terminal immunoglobulin (Ig)-like domain with a fold independent of the GH9 catalytic domain. The biological significance of the Ig-like domain has yet to be identified. However, it has been shown that GH9 catalytic activity can be completely abolished upon deletion of the N-terminal Ig-like domain [116, 120], which might indicate an essential role in creating a functional conformation of the active site [119].

1.5.1.2.3 Glycoside hydrolase family 12 (GH12)

The GH12 family comprises more than 730 sequences, 68 characterized enzymes, and 16 solved structures, of which only 3 are endo-xyloglucanases (www.cazy.org) [57]. GH12 is a classic retaining “cellulase” family (formerly Cellulase Family H [57]), with some members possessing significant or near-exclusive endo-xyloglucanase activity. All GH12 endo-xyloglucanases characterized thus far cleave the XyG backbone at the anomeric position of unbranched Glc(1→4) residues following a typical retaining mechanism [108, 121-124]. The bacterial GH12 endo-xyloglucanase from Bacillus licheniformis, BlXG12, was functionally and structurally characterized as the first representative example of endo-xyloglucanases in this family [108]. Biochemical investigation of BlXG12 demonstrated a broad substrate specificity, with slightly higher activity on XyG compared to the soluble artificial substrate carboxymethylcellulose [108].
Structural analysis of BlXG12 (PDB: 2JEN) identified for the first time the structural features for XyG recognition in the active site cleft. These features include the presence of a serine residue at the -3 subsite, which forms a hydrogen bond with the -3ʹ xylose, and the lack of the tyrosine and arginine blocking residues at the +2 subsite that are present in the closest endo-glucanase homolog [108]. Compared to the bacterial BlXG12, the more abundant fungal GH12 endo-xyloglucanases have been shown to be more specific towards XyG [121-123]. More recently, crystallography of a highly specific endo-xyloglucanase from Aspergillus aculeatus, in individual complexes with a XyGO product (PDB: 3VL9) and a specific Glycoside Hydrolase Inhibitor Protein (GHIP) from carrot (Daucus carota, PDB: 3VLB), revealed the molecular basis of plant defense against this potentially phytopathogenic fungus [125]. The GHIP binds competitively to block the GH12 endo-xyloglucanase active-site cleft, including the insertion of two conserved arginine residues that form salt bridges with the catalytic carboxylate residues of the enzyme (Figure 1.14, PDB: 3VLB).

1.5.1.2.4 Glycoside hydrolase family 44 (GH44)

GH44, formerly known as Cellulase Family J, is a little-explored glycoside hydrolase family with about 140 catalytic domains, of which only 4 are structurally characterized to date (www.cazy.org) [57]. GH44 displays very narrow substrate specificity, with only endo-glucanase (EC 3.2.1.4) and xyloglucanase (EC 3.2.1.151) activities reported [57]. Similar to the GH5 and GH12 families, GH44 utilizes a retaining cleavage mechanism that involves two catalytic glutamate residues.

Figure 1.14. Cartoon and surface representations of the Aspergillus aculeatus GH12 endo-xyloglucanase in different complexes. AaGH12 is complexed with XyGO (PDB: 3VL9) and carrot extracellular dermal glycoprotein (EDGP) (PDB: 3VLB) [125].
The catalytic domain is shown in cyan and EDGP is highlighted in pink. The two catalytic glutamate side-chains of AaGH12 are shown in green/red and the two conserved arginine side-chains in EDGP are shown in yellow/red. Reproduced from [35].

It is noteworthy that among the four GH44 structures determined thus far, the GH44 from Paenibacillus polymyxa (PpXG44) is the only xyloglucan-specific endo-glucanase structurally characterized [126]. PpXG44 displays the (β/α)8 fold characteristic of the GH44 family [126]. Interestingly, compared to most characterized endo-xyloglucanases, biochemical characterization of PpXG44 revealed an unusual bond-cleavage specificity towards XyG, as indicated by the products XXX and GXXXG resulting from the incubation of the enzyme with the long XyGO XXXGXXXG [126]. The accommodation of the X motif in the -1 subsite and the preferential binding of the substrate in the -3 to +5 subsites were further supported by the crystal structure of the enzyme, which clearly demonstrated the potential steric clash that hinders the accommodation of a xylosyl residue in the -4′ subsite of the enzyme [126]. Notably, a similar mode of action was observed in the GH44 endo-xyloglucanase (Cel44O), which is involved as a backbone cleaver in the initial step of the xyloglucan saccharification pathway in the cellulosome-producing bacterium R. cellulolyticum (see section 1.6.2) [127].

1.5.1.2.5 Glycoside hydrolase family 74 (GH74)

GH74 is another small family, comprising more than 360 catalytic domains with only 17 enzymes characterized to date (www.cazy.org) [57]. The GH74 family exhibits a limited number of reported catalytic activities that include general endo-glucanase (EC 3.2.1.4), xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151), and oligoxyloglucan reducing end-specific cellobiohydrolase (EC 3.2.1.150) activities.
Notably, the endo-xyloglucanases represent the majority of the well-characterized fungal and bacterial enzymes within this family (http://www.cazy.org/GH74_characterized.html, [57]). Commensurate with the high XyG specificity of this family, a recent study revealed the essential contribution of the GH74 enzyme (Xgh74) to XyG saccharification in the Gram-positive bacterium R. cellulolyticum. This was indicated by the upregulation of the Xgh74 gene, along with the xyloglucan utilization locus (XyGUL) genes, when the bacterium was grown on XyG or XyGOs (see section 1.6.2) [127]. Mechanistically, the GH74 enzymes utilize an inverting hydrolytic mechanism driven by two catalytic aspartate residues, one of which acts as a general base while the other acts as a general acid (see section 1.4.1). Structurally, the GH74 endo-xyloglucanase enzymes consist of two seven-bladed β-propeller domains that are connected by two short loops where the active site is located [128-130]. With regard to the mode of action of the GH74 endo-xyloglucanases, a particularly insightful structure-function analysis by Yaoi, Kaneko, and co-workers has indicated that some GH74 endo-xyloglucanases may demonstrate processivity due to the presence of structurally conserved tryptophan residues that line the active-site cleft [131, 132]. In particular, W318 and W319 compose part of the positive subsites in a Paenibacillus sp. GH74 member, and site-directed mutagenesis has unambiguously indicated that these provide the greatest contribution to the endo-processive mode of action. Most of the GH74 endo-xyloglucanases accommodate the unsubstituted glucose residues of the XyG chains in the -1 subsite [129, 130, 133-135], although it has recently been shown that some members cleave between the substituted glucose residues [132, 136-138].
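The backbone regio-specificities described in this section can be illustrated with a minimal sketch that enumerates single cleavage events on a XyGO written in the standard linear nomenclature (G for an unbranched glucose, X for a xylosylated glucose, non-reducing end first). The function name `cleavage_products` and the parameters `min_minus` and `min_plus` are hypothetical and introduced only for illustration; in particular, requiring a minimum number of occupied positive subsites is a crude stand-in for the reported binding preference of PpXG44, not a model of the enzyme.

```python
from typing import List, Optional, Tuple


def cleavage_products(
    oligo: str,
    minus1: str,
    plus1: Optional[str] = None,
    min_minus: int = 1,
    min_plus: int = 1,
) -> List[Tuple[str, str]]:
    """List the product pairs from single cleavage events on a XyGO.

    The oligosaccharide is written non-reducing end first, one letter per
    backbone glucose (G = unbranched, X = xylosylated). The bond between
    positions i and i+1 is cleaved when the unit at i matches `minus1`
    (the -1 subsite) and, if given, the unit at i+1 matches `plus1`
    (the +1 subsite). `min_minus` and `min_plus` (hypothetical knobs) set
    the minimum number of backbone units on each side of the cleaved bond.
    """
    products = []
    for i in range(len(oligo) - 1):
        left, right = oligo[: i + 1], oligo[i + 1:]
        if oligo[i] != minus1:
            continue
        if plus1 is not None and oligo[i + 1] != plus1:
            continue
        if len(left) < min_minus or len(right) < min_plus:
            continue
        products.append((left, right))
    return products


if __name__ == "__main__":
    substrate = "XXXGXXXG"
    # Common specificity: unbranched G in the -1 subsite.
    print(cleavage_products(substrate, minus1="G"))
    # PpXG44-like specificity: X in -1, G in +1, five positive subsites filled.
    print(cleavage_products(substrate, minus1="X", plus1="G", min_plus=5))
```

Under these assumptions, cleavage with G in the -1 subsite splits XXXGXXXG into two XXXG units, whereas the X-in-(-1) variant reproduces the XXX and GXXXG products reported for PpXG44 [126].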
1.5.2 XyG-specific side-chain debranching enzymes (exo-glycosidases)

The XyG-specific exo-glycosidases include all the enzymes working downstream of the backbone cleavers. They selectively accept the XyGO substrates and cleave off terminal sugar residues for final processing of the XyG substrates. In the next section, I will discuss selected families with exo-glycosidase activities central to XyG degradation.

1.5.2.1 Glycoside hydrolase family 2 (GH2)

The GH2 family comprises more than 10400 catalytic domains, with a predominant β-galactosidase activity (EC 3.2.1.23) observed within the 147 characterized members, although other activities such as β-mannosidase, β-glucuronidase, α-L-arabinofuranosidase, mannosylglycoprotein endo-β-mannosidase, and exo-β-glucosaminidase have also been observed in the family (www.cazy.org) [57]. Recently, a GH2 β-galactosidase from the gut symbiont B. ovatus (BoGH2A) was found to be fundamentally involved in the XyG utilization system of this gut bacterium (see section 1.6.1) [34]. Biochemical characterization of BoGH2A confirmed the β-galactosidase activity of the enzyme and suggested its potential ability to remove the β-galactosyl residues from the arabino-galacto XyGOs [34, 139]. Nevertheless, the three-dimensional structures of the XyG-specific β-galactosidases from the GH2 family are yet to be elucidated.

1.5.2.2 Glycoside hydrolase family 3 (GH3)

GH3 is a broad family with more than 16000 retaining enzymes and about 11 catalytic activities identified to date (www.cazy.org) [57]. Of these, the β-1,4 exo-glucosidase activity (EC 3.2.1.21) is fundamental to the ultimate step of the XyG saccharification pathway. In the human gut microbe B. ovatus, the two GH3 exo-glucosidase enzymes BoGH3A and BoGH3B are integral components of the XyG utilization machinery despite their biochemical preference for unsubstituted cello-oligosaccharide substrates (see section 1.6.1) [34, 139].
Likewise, the GH3 β-glucosidase, Glu3A, from the Gram-positive bacterium R. cellulolyticum has been shown to be primarily involved in XyG saccharification, as indicated by its upregulation along with other XyG-active enzymes when the organism is grown on XyG or XyGOs (see section 1.6.2) [127]. To identify structural determinants responsible for substrate recognition and catalysis, the crystal structure of BoGH3B was recently determined. Similar to the previously characterized structures of GH3 β-glucosidases [140, 141], BoGH3B is composed of a three-domain architecture that involves an N-terminal barrel-like domain where the active site pocket is mainly formed, a central α/β sandwich domain, and a C-terminal fibronectin type-III-like domain [139]. Because of the relatively poor catalytic efficiency of the characterized BoGH3 enzymes toward XyG-based substrates, I have been investigating whether other members of the GH3 family have evolved to target XyG with higher selectivity and specificity (see chapter 6). In addition to β-glucosidase activity, a very interesting activity in the GH3 family that has not been widely explored is the isoprimeverose-producing oligoxyloglucan hydrolase (IPase, EC 3.2.1.120) activity. The IPase enzymes are unique β-glucosidases that release Xyl(1→6)Glc units (X in the standard linear nomenclature [30]) from the non-reducing end of XyGOs, allowing an alternative route to XyG saccharification vis-à-vis the strictly concerted action of monosaccharide-releasing exo-glycosidases. The IPase activity was first reported in 1985 from the fungus Aspergillus oryzae commercial enzyme preparation “Driselase®” (e.g., Sigma-Aldrich cat. no. D8037) [142]. However, the gene sequence encoding that activity remained unidentified until recently [143]. To date, only one bacterial [144] and one fungal [143] GH3 IPase enzyme, from Oerskovia sp. Y1 and A. oryzae, respectively, have been isolated and identified.
Biochemical characterization of both GH3 enzymes confirmed their catalytic ability to remove the terminal X from XyGOs, while not accepting cellobiose as a substrate, indicating the strict recognition of the X unit at the non-reducing end of the substrate [143, 144]. Moreover, gene expression studies demonstrated the association of the A. oryzae IPase with XyGO degradation, as indicated by the upregulation of the IPase-encoding gene in the presence of XyGOs. This recent identification of specific encoding sequences will significantly enable future efforts to elucidate the structural basis of the unique recognition of Xyl(1→6)Glc units by IPases, in the context of phylogenetically related GH3 exo-β-glucosidases and exo-β-xylosidases [143].

1.5.2.3 Glycoside hydrolase family 31 (GH31)
The GH31 family encompasses about 6900 sequences and more than 10 activities that mainly include α-glucosidase, α-xylosidase and α-galactosidase activities (www.cazy.org) [57]. Of these, the α-xylosidase enzymes have been found indispensable to the XyG utilization pathways in the gut symbiont B. ovatus and the soil saprophyte Cellvibrio japonicus [34, 145]. Moreover, the upregulation of the GH31 α-xylosidase gene Xyl31A in R. cellulolyticum upon bacterial growth on XyG or XyGOs clearly indicates its central role in the XyG deconstruction pathway (see sections 1.6.1, 1.6.2, and 1.6.3) [127]. In terms of regio-specificity, the α-xylosidase enzymes remove terminal unsubstituted xylose residues strictly from the non-reducing ends of the processed XyGOs, thus exposing the backbone β-1,4-linked glucose moieties to the action of β-glucosidases [145]. Therefore, α-xylosidase and β-glucosidase enzymes work in sequential cycles to release the constituent xylose and glucose residues from the processed XyGOs in the final step of the saccharification pathway.
To identify structural features allowing the XyG-specific α-xylosidases to accommodate the highly branched XyGO substrates, crystal structures of the BoGH31 and CjXyl31A enzymes have been recently solved [139, 145]. Indeed, crystallography of the two similar enzymes BoGH31 and CjXyl31A revealed how an N-terminal “PA14” domain insertion, within the multimodular architecture of the GH31 enzymes, extends the active-site pocket to provide a platform to accommodate longer XyGO substrates (Figure 1.15A) [139, 145]. Congruently, both CjXyl31A and BoGH31 have extremely poor activity on the disaccharide isoprimeverose (Xyl(1→6)Glc) versus longer XyGOs, which emphasizes the role of the side-chains in substrate recognition and binding [139, 145].

1.5.2.4 Glycoside hydrolase family 35 (GH35)
GH35 comprises more than 1900 catalytic domains with only 72 characterized to date (www.cazy.org) [57]. GH35 exhibits a narrow spectrum of activities, with the β-galactosidase activity (EC 3.2.1.23) dominating in the family [57]. Since different types of XyG structures contain numerous β-galactosyl substitutions, a XyG-specific β-galactosidase is required for efficient XyG deconstruction in XyG-utilizing microbes. Congruently, the GH35 β-galactosidase CjBgl35 was found essential for the XyG-utilization capacity of the soil saprophyte C. japonicus, as indicated by transcriptomic analysis and reverse genetics [33]. Similar to CjXyl31A, the three-dimensional structure of the GH35 β-galactosidase CjBgl35 has revealed unique enzyme structural adaptations to the cognate complex XyGO substrates. The unusual two-domain structure of CjBgl35 consists of an N-terminal (β/α)8 barrel catalytic domain intimately appended to a mixed α/β domain, which together constitute a monolithic structure (Figure 1.15B) [33].
Co-crystallization of the CjGH35 β-galactosidase with 1-deoxygalactonojirimycin (DGJ) demonstrated that the catalytic domain presents a large active-site cavity with a putative +1 subsite to accommodate the Xyl residue to which Gal is linked in various XyGOs (Figure 1.15B). Observed steric interactions in the CjGH35:DGJ complex also rationalized the strict exo-hydrolytic action of CjGH35, which requires an efficient α-L-fucosidase to first remove terminal Fucα(1-2) residues in the concerted, stepwise saccharification of fucogalacto-XyG (see section 1.6.3) [33].

Figure 1.15. Cartoon and surface representations of Cellvibrio japonicus xyloglucan-specific exo-glycosidases. A) α-xylosidase CjGH31:5-fluoro-β-D-xylopyranosyl covalent complex (PDB: 2XVK) [145]. B) β-galactosidase CjGH35:1-deoxygalactonojirimycin (DGJ) complex (PDB: 4D1J) [33]. GH catalytic domains are shown in cyan and the PA14 domain (panel A) is highlighted in pink; other domains are shown in grey. Carbohydrate ligands are shown as yellow and red sticks. Reproduced from [35].

1.5.2.5 Glycoside hydrolase family 42 (GH42)
GH42 displays a very limited number of activities that only include a predominant β-galactosidase activity and a less prevalent α-L-arabinopyranosidase activity (www.cazy.org) [57]. GH42 encompasses more than 2300 catalytic domains with only 65 enzymes characterized thus far [57]. Recently, genomic and transcriptomic analysis in the Gram-positive R. cellulolyticum revealed the central role of the GH42 β-galactosidase (Gal42A) in XyG utilization (see section 1.6.2) [127]. It should be noted that only 8 β-galactosidase structures have been determined in the GH42 family to date (www.cazy.org) [57]. However, none of the structurally determined enzymes selectively targets the highly branched XyGO substrates. Therefore, structural determinants adapted for XyG recognition in the XyG-specific GH42 β-galactosidases are yet to be investigated.
1.5.2.6 Glycoside hydrolase family 43 (GH43)
GH43 is a wide family displaying about 9 activities and encompassing 8600 inverting catalytic domains identified to date (www.cazy.org) [57]. Recently, the GH43 family was classified into 37 subfamilies based on a phylogenetic analysis that was further supported by enzyme functional characterization studies [146]. Notably, the dominant β-D-xylosidase (EC 3.2.1.37) and α-L-arabinofuranosidase (EC 3.2.1.55) activities are observed in all the polyspecific subfamilies identified to date within the GH43 family [57, 146]. The α-L-arabinofuranosidase activity is particularly involved in the removal of XyG side-chains, specifically in the arabino-galacto XyG mainly found in Solanaceous plants. Interestingly, two GH43 α-L-arabinofuranosidases (BoGH43A and BoGH43B) were found integral to the XyG utilization machinery of B. ovatus (see section 1.6.1). Biochemical characterization of BoGH43A and BoGH43B indeed supported their potential role in the removal of the α-L-arabinofuranosyl residues from the XyGO substrates [34]. On a structural level, although BoGH43A and BoGH43B share only 41% sequence identity, they display extremely similar structures with an N-terminal 5-bladed β-propeller catalytic domain characteristic of the GH43 family, in addition to a C-terminal β-sandwich domain [139]. Co-crystallization of BoGH43A with different putative α-L-arabinofuranosidase inhibitors clearly showed a shallow enclosed pocket well adapted for accommodating the α-L-arabinofuranosyl residues of the XyGO substrates [139]. Moreover, structural analysis of both BoGH43A and BoGH43B clearly identified key amino acid residues that are involved in substrate binding via hydrogen bonding and stacking interactions [139].
1.5.2.7 Glycoside hydrolase family 95 (GH95)
GH95 is a small, not widely explored family encompassing about 1300 catalytic domains, of which only 8 are biochemically characterized and 3 are structurally determined to date (www.cazy.org) [57]. The inverting GH95 family displays a dominant α-1,2-L-fucosidase (EC 3.2.1.63) activity in addition to the less abundant α-L-galactosidase activity. Recently, transcriptomic analysis in C. japonicus revealed the upregulation of a GH95 α-fucosidase gene (CjAfc95A), along with other XyG utilization genes, when the organism was grown on XyGOs (see section 1.6.3) [33]. Biochemical characterization of CjAfc95A revealed the strong capacity of the enzyme to remove the terminal α-1,2-L-fucosyl side-chains from the XyGO substrates, which underpins the fundamental role of the enzyme in the downstream deconstruction of XyG [33]. Notably, structural analysis is still required to identify the structural features conferring the substrate specificity of the XyG-specific α-fucosidase (CjAfc95A).

1.6 Microbial XyG utilization systems
Certain environmental microbes are capable of the full saccharification of XyG via an energy-saving strategy in which all or some of the genes involved in the deconstruction pathway are co-regulated and co-localized in complex XyG utilization loci, XyGULs. The XyGULs encode a combination of XyG-degrading enzymes, specific XyGO transporters, and possibly specific non-catalytic XyG-binding proteins for XyG binding and acquisition. In the next section, I will briefly discuss the XyG utilization systems identified to date, with an emphasis on the soil saprophyte Cellvibrio japonicus.

1.6.1 The XyG utilization system of the symbiotic gut bacterium Bacteroides ovatus
The capacity of the gut microbiota to degrade non-starch polysaccharides, known as “dietary fibre”, including XyGs, has been recently explored.
Remarkably, a single complex XyGUL indispensable to growth on XyG has been identified and characterized from the colonic Gram-negative symbiont B. ovatus [34]. This XyGUL confers on the human host the ability to derive nutrition from the abundant XyG in fruit and vegetable cell walls, given the lack of human genome-encoded enzymes that can actively utilize this substrate [147]. The contiguous XyGUL in B. ovatus encodes two surface glycan-binding proteins (SGBPs), a TonB-dependent transporter (TBDT), and an inner membrane hybrid two-component sensor (HTCS), in addition to eight glycoside hydrolases that address the monosaccharide compositional and linkage diversity of dietary XyGs. These eight GHs include endo-xyloglucanase GH9 and GH5, α-xylosidase GH31, α-L-arabinofuranosidase GH43A and GH43B, β-galactosidase GH2, and β-glucosidase GH3A and GH3B enzymes (Figure 1.16A) [34, 148]. Localization studies and biochemical characterization of all B. ovatus XyGUL enzymes evidenced the extracellular backbone cleavage of XyG catalyzed by the vanguard GH5 (BoGH5) (see section 1.5.1.2.1). The resulting XyGOs can then be transported by the TBDT to the periplasm, where the exo-acting enzymes are localized, to achieve final processing of the imported substrates [34].

Figure 1.16. Bacterial Xyloglucan Utilization Loci (XyGULs). A) Human gut symbiont Bacteroides ovatus XyGUL [34]. B) Environmental R. cellulolyticum XyGUL [127]. C) Soil saprophyte Cellvibrio japonicus XyGUL [33]. Genes encoding backbone-cleaving endo-xyloglucanases (GH5, GH9s, GH74, GH44) are indicated in navy blue and genes encoding side-chain-cleaving exo-glycosidases (GH3 β-glucosidases; GH2, GH35, and GH42 β-galactosidases; GH31 α-xylosidases; GH43 α-L-arabinofuranosidases; and GH95 α-L-fucosidase) are in cyan.
Genes encoding carbohydrate-binding, sensor/regulator, and transporter proteins are indicated as follows: surface glycan binding protein (SGBP), yellow; hybrid two-component system (HTCS), magenta; two-component system, orange; SusC-like TonB-dependent transporters (TBDTs), green; ATP-binding cassette (ABC) transporters comprising transmembrane domain (TMD) proteins and solute binding protein (SBP), brown. Reproduced from [35].

1.6.2 The XyG utilization system of the non-ruminal bacterium Ruminiclostridium cellulolyticum
Similar to B. ovatus, the Gram-positive anaerobic model organism R. cellulolyticum has developed an elegant XyG degradation machinery, which has been fully unravelled in a recent study [127]. In that system, extracellular cellulosomes (heterogeneous protein complexes of more than 1 MDa, reviewed in [149]) are in charge of the backbone cleavage of XyG, while intracellular debranching enzymes encoded by a XyGUL are responsible for the further deconstruction of the transported XyGOs [112, 127]. At least four backbone-cleaving xyloglucanases, including 2 GH9s (Cel9X, Cel9U), 1 GH74 (Xgh74), and 1 GH44 (Cel44O), are incorporated in the R. cellulolyticum cellulosomal structures, initiating the first step in the XyG saccharification pathway [127]. Although endo-xyloglucanase activity had previously been reported only in the GH5, GH12, GH16, GH44, and GH74 families [150, 151], Cel9X has been strikingly identified as the first strict endo-xyloglucanase enzyme belonging to the GH9 family (see section 1.5.1.2.2) [112]. In addition to the cellulosomal backbone-cleaving xyloglucanases, the R. cellulolyticum genome contains a XyGUL that is responsible for the transport and deconstruction of the XyGOs, the degradation products of the extracellular endo-xyloglucanases.
This complex gene locus encodes a highly specific ATP-binding cassette transporter (ABC transporter), a two-component sensor, and 3 cytoplasmic exo-xyloglucanases that include a GH42 β-galactosidase (Gal42A), a GH31 α-xylosidase (Xyl31A), and a GH3 β-glucosidase (GH3A) (Figure 1.16B) [127]. Indeed, the ABC transporter imports the extracellularly generated XyGOs to the cytoplasm, where the other exo-enzymes work in a concerted fashion to release the simple sugars galactose, xylose, glucose and cellobiose [127].

1.6.3 The XyG utilization system of the model soil saprophyte Cellvibrio japonicus
Cellvibrio japonicus (formerly, Pseudomonas fluorescens subsp. cellulosa) is a Gram-negative saprophytic bacterium first isolated from a Japanese soil in 1952 [152]. The 16S rRNA sequence, however, revealed the lack of high similarity between this soil bacterium and the Pseudomonas genus [153]. Instead, it was more closely related to the genus Cellvibrio, and it was therefore recently reclassified under its current name [154]. Twenty-five years of study, spearheaded by Harry Gilbert and Geoffrey Hazlewood [155] and continued through recent whole-genome sequencing [156] and genetic methods development [157], underscore the importance of C. japonicus as a model organism for CAZyme discovery. Indeed, C. japonicus was rapidly considered a very useful platform to study plant polysaccharide deconstruction due to its strong ability to degrade nearly all plant cell wall polysaccharides, including cellulose, xylan, mannan, arabinan and XyG [158]. Moreover, the industrial capacity for ethanol production could be achieved in C. japonicus via its genetic manipulation, thus extending the industrial potential of this model organism [159, 160]. In contrast to Clostridia, which mostly secrete CAZymes in the form of outer-membrane-anchored protein complexes (cellulosomes), C.
japonicus secretes many of its degrading enzymes into the extracellular environment, via a type II secretion system [159], to allow the depolymerization of the biomass [161]. Phylogenetically distinct from the human gut Bacteroidetes, C. japonicus generally lacks genetically co-localized sets of CAZymes (reviewed in [33]). Strikingly, however, reverse genetic and biochemical analyses recently demonstrated that fucogalacto-XyG utilization in C. japonicus is critically dependent on a “mini”-XyGUL encoding three periplasmic, side-chain-cleaving GHs (a GH31 α-xylosidase, a GH35 β-galactosidase, and a GH95 α-L-fucosidase), and a predicted outer-membrane TBDT (Figure 1.16C) [33]. Based on the extensive biochemical and localization studies of the XyG utilization system in C. japonicus conducted by a former PhD student in our lab, Johan Larsbrink, a working model explaining the mechanics of the process was proposed (Figure 1.17). In this model, XyGOs are generated extracellularly by the action of endo-xyloglucanase(s) and imported, probably via a TonB-dependent transporter, to the periplasmic space where the identified exo-xyloglucanases are localized (Figure 1.17) [33, 145, 162]. Notably, the C. japonicus XyGUL is fundamentally incomplete, as it does not encode the endo-xyloglucanase(s) necessary for the generation of the XyGOs, the substrates of the downstream exo-enzymes. Moreover, it lacks the β-glucosidase gene involved in the final step of XyG utilization. Therefore, the work done in this thesis targets these unknown enzymes in order to illuminate the first and final steps of the XyG saccharification pathway in the soil saprophyte C. japonicus.

1.7 Aim of investigation
The main objectives of this thesis are the identification and characterization of (1) the endo-xyloglucanase(s) and (2) the β-glucosidase(s) involved in the first and ultimate steps of the XyG utilization pathway, respectively, in C. japonicus.
A multidisciplinary approach utilizing in-depth genomic, biochemical, and three-dimensional structural analyses will be implemented to achieve the intended outcomes. Identifying the missing enzymes in the process will not only elucidate the full XyG saccharification pathway in C. japonicus, but will also expand our repertoire of CAZymes that can be ultimately harnessed in innovative applications, including the production of energy and high-value products from the ubiquitous plant biomass.

Figure 1.17. The proposed (fucogalacto)xyloglucan utilization by Cellvibrio japonicus. Extracellular endo-xyloglucanases depolymerize XyG into XyGOs, which are transported via a TonB-dependent transporter to the periplasm before they are deconstructed by a group of XyG-specific exo-glycosidases into simple sugars. Monosaccharides are represented using the Consortium for Functional Glycomics Symbol Nomenclature (http://www.functionalglycomics.org/static/consortium/Nomenclature.shtml). Reproduced from [33].

To accomplish my first aim, which targets the endo-xyloglucanase enzyme(s) necessary for the fundamental XyG backbone cleavage in C. japonicus, a comprehensive bioinformatic analysis of the C. japonicus genome was pursued to identify the potential candidates. Since endo-xyloglucanase activity has only been identified in families GH5, GH9, GH12, GH16, GH44, and GH74 (see section 1.5.1), I selected members belonging to these families in C. japonicus for further biochemical and structural investigations. Therefore, this thesis summarizes my attempts to functionally characterize one GH74 (CjGH74), three GH5_4 (CjGH5D, CjGH5E, and CjGH5F) and three GH9 (CjGH9A, CjGH9B and CjGH9C) enzymes in C. japonicus, in order to illuminate the key first step in the XyG saccharification pathway.
My analyses involved enzyme modular dissection, molecular cloning, heterologous overexpression, enzyme biochemical characterization employing a panel of different substrates, enzyme structural analyses, and reverse genetics to confirm the biochemical function in the biological context.

My second objective was the identification and functional characterization of the XyG-specific β-glucosidase, which works in a concerted fashion with the α-xylosidase CjXyl31A in the final breakdown steps of XyGOs in C. japonicus. Because different GH3 β-glucosidase members have been found central to the XyG utilization systems of gut and terrestrial bacteria (see section 1.5.2.2), the investigation expanded to include the four GH3 exo-glucosidases (Bgl3A, Bgl3B, Bgl3C and Bgl3D) from C. japonicus. I will therefore discuss my efforts to identify and characterize the key β-glucosidase enzyme catalyzing the last step in the XyG utilization pathway in C. japonicus. This comprehensive study combined reverse genetics and biochemistry, facilitated by XyG-customized substrates, to unravel the last missing piece of the XyG degradation model in this soil saprophyte.

Chapter 2: Functional and structural characterization of a potent GH74 endo-xyloglucanase²

2.1 Introduction
One of the greatest future challenges facing humanity is reducing dependency on fossil petroleum reserves in light of increasing concerns regarding environmental impacts stemming from the use of non-renewable carbon sources. To address this challenge, the production of fuels, chemical feedstocks, and materials from plant-derived biomass has been suggested as a promising alternative for meeting increasing global demands for both industrial commodities and consumer products [6, 163].
However, plant cell walls are both chemically and structurally complex, which hinders fractionation of this abundant carbon resource into product streams, including fibres, polysaccharides, polyphenolics (lignins), and simple sugars, in modern “biorefineries” [5, 164]. Plant cell wall deconstruction using microbial enzymes, which have uniquely co-evolved with their recalcitrant substrate, is widely envisioned as a way to enable efficient valorisation of biomass feedstocks with minimal by-product generation [10]. Plant cell walls are primarily composed of polysaccharides, which often comprise >70% of the dry weight, in addition to lignins and a minor amount of structural proteins. Cellulose, a β(1→4)-D-glucan that forms partially crystalline microfibrils, is the most abundant polysaccharide in all plant cell walls. Cellulose microfibrils are embedded in a matrix of amorphous glycans, including the broad classes of hemicelluloses (predominantly neutral polysaccharides) and pectins (galacturonate-containing anionic polysaccharides) [11, 23]. Among the hemicelluloses, the xyloglucans (XyGs) constitute a ubiquitous family of structurally complex polysaccharides that can represent up to 25% of total dry weight of terrestrial plant cell walls, especially in dicots and non-commelinoid monocots [11, 23, 26]. All XyGs share a common β(1→4)-D-glucan backbone that is regularly branched with α(1→6)-xylopyranosyl residues; typical substitution patterns result in XXXG and XXGG motifs (nomenclature according to [30], where G represents an unbranched β-Glc-(1→4) unit and X represents a branched [α-Xyl-(1→6)-]-β-Glc-(1→4) unit).

² Adapted from: Mohamed Attia, Judith Stepper, Gideon J. Davies, and Harry Brumer (2016). Functional and structural characterization of a potent GH74 endo-xyloglucanase from the soil saprophyte Cellvibrio japonicus unravels the first step of xyloglucan degradation. FEBS Journal. 283:1701-1719

These sidechain residues can be further extended by other monosaccharide units, e.g. galactopyranosyl, fucopyranosyl, and arabinofuranosyl residues, according to plant tissue and species (Figure 2.1 [28]). Due to this structural complexity, complete XyG saccharification requires a consortium of endo-acting and exo-acting glycoside hydrolases [33, 34].

Figure 2.1. Dicot xyloglucan structure, showing sidechain polydispersity. See [6] for other species-dependent branching motifs.

The saprophytic Gram-negative bacterium, Cellvibrio japonicus (formerly, Pseudomonas fluorescens subsp. cellulosa), has emerged as a useful source for the discovery of specific carbohydrate-active enzymes (CAZymes [57]), due to its ability to utilize a wide range of plant cell wall polysaccharides, including XyGs [33, 152, 155, 156, 161]. Commensurate with these observations, the C. japonicus genome encodes a large number of confirmed and predicted CAZymes for plant polysaccharide saccharification [156]. Recently, the development of genetic tools for C. japonicus has greatly enabled its use as a model for enzyme discovery and metabolic engineering [157, 159, 160, 165]. We recently identified and characterised a multicomponent gene locus in C. japonicus that is uniquely responsible for XyG utilization (XyGUL) [33, 145]. This locus (CJA2706-2710) encodes all of the exo-glycosidases required for the cleavage of the branches of dicot (fucogalacto)xyloglucan, namely a Glycoside Hydrolase Family 95 (GH95) α-L-fucosidase, a GH35 β-galactosidase, and a GH31 α-xylosidase, as well as a predicted TonB-dependent transporter (TBDT) [33, 145, 166]. However, this locus does not encode an endo-xyloglucanase, which is required for the initial generation of xylogluco-oligosaccharides (XyGOs) as substrates for these exo-glycosidases. To identify possible candidate endo-xyloglucanases, we surveyed the annotated C.
japonicus genome and identified the sole GH74 member, Gly74A, encoded by locus CJA_2477 [156], as an initial target, due to the predominance of this specificity among members of this family. Notably, CJA_2477 encodes a multi-modular enzyme comprising an N-terminal GH74 module in train with a Carbohydrate-Binding Module Family 10 (CBM10) member and a CBM2 member (CjGH74-CBM10-CBM2). We present here the comprehensive functional and structural characterization of these modules, which provides new insight into the molecular basis for xyloglucan utilization by C. japonicus.

2.2 Materials and Methods

2.2.1 Bioinformatic analysis
The full-length protein encoded by ORF CJA_2477 in the C. japonicus genome (CjGH74-CBM10-CBM2) was screened for the presence of a signal peptide using SignalP 4.0 [167]. The modular architecture of CjGH74-CBM10-CBM2 was obtained from BLASTP analysis and additional alignment with representative GH and CBM modules from the CAZy Database [57] using ClustalW [168]. Structural models for CjCBM10 and CjCBM2 were generated using the Protein homology/analogy recognition engine (Phyre2) [169].

2.2.2 Cloning of cDNA encoding protein modules
cDNA encoding CjGH74-CBM10-CBM2, CjGH74-CBM10, CjGH74, CjCBM10, and CjCBM2 were PCR amplified from C. japonicus genomic DNA; all constructs were designed such that the native predicted signal peptide was removed (PCR primers are listed in Table 2.S1). The amplified products were double-digested with NheI and XhoI, gel purified and ligated to the respective sites of pET28a to fuse an N-terminal 6x His-Tag. Overlapping extension PCR was used to make CjCBMs-GFP fusion constructs. CjCBM10 and CjCBM2 were individually fused to an N-terminal super-folder green fluorescent protein (sfGFP) gene sequence with a TEV cleavage site connecting the two modules. TEV-CBM10 and TEV-CBM2 were PCR amplified from C.
japonicus genomic DNA, while sfGFP was amplified from a pBAD vector harboring the sfGFP gene (GenBank accession number AGT98536.1) (Table 2.S1) [170]. Purified sfGFP was mixed individually with the purified TEV-CBM10 and TEV-CBM2, and the respective fusion products sfGFP-CjCBM10 and sfGFP-CjCBM2 were obtained by PCR amplification. CjCBM10 was also fused to a C-terminal sfGFP module with a connecting TEV cleavage site, using overlapping extension PCR; the amplified CjCBM10-TEV and sfGFP were mixed and the fused CjCBM10-sfGFP was obtained by PCR amplification (Table 2.S1). CjCBM10-sfGFP, sfGFP-CjCBM10, and sfGFP-CjCBM2 purified DNA fragments were double digested with NheI and XhoI, then cloned into the respective sites of the pET28a E. coli expression vector. Successful cloning was confirmed by restriction mapping and PCR. Q5 high-fidelity DNA polymerase was used for all the PCR amplifications.

2.2.3 Site-directed mutagenesis
CjGH74(D70A) and CjGH74(D483A) single mutants were generated using the PCR-based QuickChange II Site-Directed Mutagenesis Kit (Agilent, USA) following the manufacturer's protocol and using pET28a::CjGH74 as template DNA (Table 2.S1). The resulting constructs were sequenced to confirm the desired mutations.

2.2.4 Gene expression and protein purification
Constructs were individually transformed into chemically competent E. coli Rosetta DE3 cells. Colonies were grown on LB solid media containing kanamycin (50 µg mL-1) and chloramphenicol (30 µg mL-1). One colony of the transformed E. coli cells was inoculated in 5 mL of LB medium containing the same antibiotics and grown overnight at 37 °C (200 rpm). The whole overnight culture was used to inoculate 500 mL of LB liquid medium containing the same antibiotics. Cultures were grown at 37 °C (200 rpm) until D600 = 0.6. Overexpression was induced by adding IPTG to a final concentration of 0.1 mM. After induction, cultures were grown overnight at 16 °C (200 rpm).
Cultures were then centrifuged and pellets were resuspended in 5 mL of E. coli lysis buffer containing 20 mM HEPES, pH 7.0, 500 mM NaCl, 40 mM imidazole, 5% glycerol, 1 mM DTT and 1 mM PMSF. Cells were then disrupted by sonication and the clear supernatant was separated by centrifugation at 4 °C (4200 g for 45 minutes). Recombinant proteins were purified from the clear soluble lysates using a Ni2+-affinity column utilizing a gradient elution up to 100% elution buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 500 mM imidazole, and 5% glycerol in an FPLC system. Purity of the recombinant proteins was determined by visualizing the protein contents of the fractions on SDS-PAGE. Pure fractions were pooled, concentrated, and buffer exchanged against 50 mM citrate buffer (pH 6.5) in the case of CjGH74, and 20 mM HEPES buffer, pH 7.1, 1 mM EDTA, 1 mM DTT, 1 mM PMSF, and 10% glycerol in the case of sfGFP-fused CjCBMs. Protein concentrations were then determined using an Epoch Micro-Volume Spectrophotometer System (BioTek®, USA) at 280 nm, and identities of the expressed proteins were confirmed by intact mass spectrometry. Overexpression and purification of the active site mutants, CjGH74(D70A) and CjGH74(D483A), were conducted following the same protocol used for the wild type enzyme. The fidelity of protein production was confirmed by intact protein mass spectrometry [171].

2.2.5 Carbohydrate sources
Tamarind seed XyG, konjac glucomannan (KGM), barley β-glucan (BBG), wheat flour arabinoxylan, and beechwood xylan were purchased from Megazyme® (Bray, Ireland). Hydroxyethylcellulose (HEC) was purchased from Amresco® (Solon, USA). Carboxymethyl cellulose (CMC) was purchased from Acros Organics (New Jersey, USA). Guar gum was purchased from Sigma Aldrich® (St. Louis, USA). Xanthan gum was purchased from Spectrum® (New Brunswick, USA). Avicel® PH-101 was purchased from Fluka® (St. Gallen, Switzerland).
2-Chloro-4-nitrophenyl (CNP)-β-D-cellobioside (GG-β-CNP) and CNP-β-D-cellotrioside (GGG-β-CNP) were purchased from Megazyme®. XXXG-β-Resorufin (resorufinyl [α-D-xylopyranosyl-(1→6)]-β-D-glucopyranosyl-(1→4)-[α-D-xylopyranosyl-(1→6)]-β-D-glucopyranosyl-(1→4)-[α-D-xylopyranosyl-(1→6)]-β-D-glucopyranosyl-(1→4)-β-D-glucopyranoside) and XXXG-β-CNP were prepared as previously described [172, 173]. Glc4-based XyGOs (XXXG, XLXG, XXLG, and XLLG; nomenclature according to [30]) and Glc8-based XyGOs were prepared from XyG powder (Innovassynth Technologies, Maharashtra, India) as previously described [150].

2.2.6 Carbohydrate analytics
High Performance Anion-Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD) was performed on a Dionex ICS-5000 DC HPLC system operated by the Chromeleon software version 7 (Dionex) using a Dionex Carbopac PA200 column. Solvent A was double-distilled water, solvent B was 1 M sodium hydroxide, and solvent C was 1 M sodium acetate. The gradient used was: 0–5 min, 10% solvent B and 3.5% solvent C; 5–12 min, 10% B and a linear gradient from 3.5–30% C; 12–12.1 min, 50% B and 50% C; 12.1–13 min, an exponential gradient of NaOH and NaOAc (sodium acetate) back to initial conditions; and 13–17 min, initial conditions.

Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry was performed on a Bruker Daltonics Autoflex System (Billerica, USA). The matrix, 2,5-dihydroxybenzoic acid, was dissolved in 50% methanol in water to a final concentration of 10 mg mL⁻¹. Oligosaccharide samples were mixed 1:1 (v/v) with the matrix solution. One µL of this solution was placed on a Bruker MTP 384 ground steel MALDI plate and left to air dry for two hours prior to analysis.

2.2.7 Enzyme kinetic analysis
All enzyme activities toward polysaccharides were determined using a bicinchoninic acid (BCA) reducing-sugar assay [174].
The effect of temperature on activity was determined by incubating the recombinant catalytic domain (0.049 µg) with tamarind seed xyloglucan at a final concentration of 1 mg mL⁻¹. Phosphate buffer (pH 6.5) was used at a final concentration of 50 mM in a total reaction volume of 200 µL. Reaction mixtures were incubated for 10 minutes at temperatures ranging from 30 °C to 90 °C prior to the BCA assay. To determine the pH-rate profile, the same enzyme reaction conditions were used at 65 °C, with 50 mM of the following buffers: citrate (pH 3–6.5), phosphate (pH 6–8), and glycine (pH 8.5). To determine specific activity values of CjGH74, final concentrations of 1.33, 5.9, 8.9, and 53.2 nM of the purified recombinant catalytic module were incubated with tamarind seed XyG (1 mg mL⁻¹), HEC (2 mg mL⁻¹), BBG (2 mg mL⁻¹), and CMC (2 mg mL⁻¹), respectively, in 200 µL reaction mixtures containing 50 mM phosphate buffer (pH 6.5). Reaction mixtures were incubated at 65 °C for 10 minutes prior to the BCA assay.

To determine Michaelis-Menten parameters for XyG, seven different concentrations of XyG were used over the range 0.025 to 1 mg mL⁻¹. The recombinant enzyme (0.022 µg) was incubated with each XyG concentration at 65 °C for 10 min in a 200 µL final reaction mixture containing 50 mM phosphate buffer (pH 6.5). Using the same enzyme assay conditions, 0.25–2.0 mg mL⁻¹ HEC and 0.5–2.0 mg mL⁻¹ BBG were incubated with 0.096 and 0.086 µg of recombinant CjGH74, respectively. Km and kcat values were determined by non-linear fitting of the Michaelis-Menten equation to the data in Sigmaplot® (Systat Software Inc.). To test enzyme activity on the chromogenic substrates GG-β-CNP, GGG-β-CNP, XXXG-β-Resorufin, and XXXG-β-CNP, up to 1 µM of the enzyme was incubated with 1 mM of the substrate in a final reaction mixture of 250 µL containing 50 mM phosphate buffer (pH 6.5).
The solution was incubated at 65 °C and was monitored continuously for the release of the aglycone with a Cary50 UV–visible spectrophotometer (Varian) at 410 nm for CNP-based substrates and 571 nm for XXXG-β-Resorufin [172, 173].

2.2.8 Enzyme product analysis
To determine the limit-digest products of CjGH74, 4.9 µg of the recombinant enzyme was incubated with tamarind seed XyG at a final concentration of 0.25 mg mL⁻¹ for 4 hours (50 °C) in a 200 µL reaction mixture that contained 50 mM phosphate buffer (pH 6.5). The reaction mixture was then diluted 5-fold prior to product analysis by HPAEC-PAD. To determine the mode of action of the enzyme, 0.01 µg of CjGH74 was incubated at 65 °C with tamarind seed XyG at a final concentration of 1 mg mL⁻¹ in a 200 µL reaction volume containing 50 mM phosphate buffer (pH 6.5). The reaction was stopped at different time points by adding 100 µL of NH4OH. Reaction mixtures were then diluted 20-fold with water prior to product analysis by HPAEC-PAD.

2.2.9 X-ray crystallography and structure solution
CjGH74 was crystallized by vapour diffusion using the sitting drop method from 0.1 M Tris (pH 7.8), 0.3 M potassium bromide, 8% PGA-LM (poly-γ-glutamic acid, low molecular weight), with subsequent seeding used to improve initial crystals. Crystals were obtained at 19 °C from equal volumes of protein solution (5, 7 or 10 mg mL⁻¹) and mother liquor. This condition yielded a P2₁2₁2 crystal form of apo-CjGH74 (cell dimensions a = 87.89 Å, b = 131.91 Å, c = 73.17 Å), which was used for structure solution. Crystals of the XyGO complex were obtained by soaking CjGH74 crystals for 10 min in mother liquor supplemented with a XyGO mixture containing XXXG, XLXG, XXLG, and XLLG. Crystals were cryo-protected in a solution containing the mother liquor with 25% (v/v) ethylene glycol and harvested into rayon fibre loops prior to flash-freezing in liquid nitrogen.
X-ray data were collected at the Diamond Light Source and processed using XIA2 [175] implementations of XDS [176] or MOSFLM [177].

Native "apo" data were collected to 2.3 Å. The structure was solved by molecular replacement using the CCP4 [178] implementation of the MOLREP [179] program, with a single molecule of the Acidothermus cellulolyticus GH74 glycoside hydrolase as the search model (PDB code: 4LGN). This structure, with one molecule in the asymmetric unit, was refined using REFMAC to a model with an initial Rcryst/Rfree of 46/45%. Model building using COOT [180] and refinement using REFMAC yielded a final model with Rcryst/Rfree of 22/28%. The structure of the CjGH74:XyGO complex was refined at a resolution of 2 Å, using apo-CjGH74 as the starting model, with XyGOs added manually in COOT and refined using REFMAC. Carbohydrate conformations were confirmed using Privateer [181].

The CjGH74 catalytic mutants CjGH74(D483A) and CjGH74(D70A) were crystallised by vapour diffusion using the sitting drop method from 0.1 M sodium acetate (pH 5.0), 0.6 M sodium formate, 8% (w/v) PGA-LM with seeding. The combination of equal volumes of protein solution (5 mg mL⁻¹) and mother liquor at 19 °C resulted in a P2₁ crystal form of CjGH74(D483A) (cell dimensions a = 84.12 Å, b = 94.29 Å, c = 105.56 Å, β = 103°) and a P2₁ crystal form of CjGH74(D70A) (cell dimensions a = 84.02 Å, b = 94.15 Å, c = 105.67 Å, β = 103°). The crystals were harvested and data sets collected as for the apo-CjGH74 structure. Data were collected to 1.52 Å for CjGH74(D483A) and to 1.71 Å for CjGH74(D70A). The structures of the catalytic mutants were solved using the apo-CjGH74 model, with the mutations corrected in COOT.
2.2.10 Carbohydrate-binding analysis of CjCBM10 and CjCBM2
Affinity electrophoresis was performed at room temperature for approximately 90 minutes (100 V) on native 10% (w/v) polyacrylamide gels containing the tested polysaccharide (XyG, HEC, or β-glucan) at a final concentration of 0.1% [182]. For qualitative analysis of the cellulose-binding capacity of both CBMs, 100 µg of the recombinant proteins, sfGFP-CjCBM2 and CjCBM10-sfGFP, were mixed with 10 mg of Avicel PH-101 in a 200 µL reaction volume containing 50 mM phosphate buffer (pH 7.0) [183]. Mixtures were incubated on ice with gentle agitation for 1 hour and then centrifuged for 5 minutes (14000 rpm). Clear supernatants were removed and mixed with SDS-PAGE loading dye, and 4 µL were analysed by SDS-PAGE. Avicel pellets were washed twice with 250 µL of 50 mM phosphate buffer (pH 7.0), then resuspended in 200 µL of the same buffer. SDS-PAGE loading dye (50 µL) was added, and bound proteins were released by boiling for 10 minutes before 4 µL were removed and subjected to SDS-PAGE. Binding isotherms were obtained using a method adapted from that of Doi and co-workers [184]. Initial protein concentrations of CjCBM10-sfGFP, sfGFP-CjCBM2, and sfGFP were determined using the calculated extinction coefficients, ε, of 39880, 52370, and 18910 M⁻¹ cm⁻¹, respectively, at λ = 280 nm. In 1.5 mL Eppendorf tubes, the purified recombinant proteins (5 to 100 µg) were mixed with Avicel (1% w/v) in a 1 mL total volume containing 50 mM phosphate buffer (pH 7.0) and BSA at a final concentration of 0.1 mg mL⁻¹ to minimize non-specific adsorption of the recombinant proteins to the hydrophobic surface of the tubes. Mixtures were incubated at 4 °C for 1 hour with continuous end-over-end rotation.
Mixtures were then centrifuged for 5 minutes (18800 × g), and the concentration of unbound protein was determined by fluorescence using an Infinite M1000 Pro multifunction plate reader (Tecan Ltd., Morrisville, NC) with an excitation filter of 450 nm and an emission filter of 510 nm. Fluorescence readings were converted to protein concentrations from a linear fit using sfGFP-CjCBM2 (2.45 nM to 1.23 µM) as a standard. The amount of protein bound to Avicel was calculated by subtracting the unbound protein concentration from the total protein concentration used. Kd values were determined by fitting the equation [PC] = [PC]max[FP] / (Kd + [FP]) (where [PC] and [FP] are the bound and unbound protein concentrations, respectively [184]) to the binding isotherm data using Sigmaplot®. Purified recombinant sfGFP (20 to 150 µg) was used as a negative control.

2.3 Results and Discussion
2.3.1 Bioinformatic analysis
Endo-xyloglucanase activity, which cleaves β(1→4)-D-glucosidic linkages in the XyG backbone, has been previously demonstrated in the GH5, GH9, GH12, GH16, GH44, and GH74 families [112, 150, 151]. C. japonicus encodes neither GH12 nor GH44 members, but encodes multiple GH5 (n = 15), GH9 (n = 3), and GH16 (n = 9) members (http://www.cazy.org/genomes.html). In light of the known polyspecificity of these families, which makes prediction of substrate specificity challenging, we focused our attention on the sole GH74 member, which is encoded by locus CJA_2477 (GenBank ACE84745.1, annotated as CjGly74A [156]). GH74 enzymes studied thus far exhibit a comparatively limited range of catalytic activities, including general endo-glucanase (EC 3.2.1.4), xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151), and oligoxyloglucan reducing end-specific cellobiohydrolase (EC 3.2.1.150) activities. Of these, endo-xyloglucanases dominate the well-characterised examples, across bacterial and fungal sources (http://www.cazy.org/GH74_characterized.html, [57]).
Analysis of the primary structure of the CJA_2477 gene product indicated a novel modular architecture composed of a catalytic GH74 module, followed in series by two carbohydrate binding modules (CBMs): CBM10 and CBM2. Further sequence analysis and homology modelling, using the Phyre2 server, allowed us to define the modular boundaries as: GH74 (Ala34-Gly766), CBM10 (Lys822-Gly871), and CBM2 (Val917-Gln1017). These modules are connected by serine-rich linkers commonly found among multi-modular CAZymes (Figure 2.2) [185, 186]. SignalP predicted an extracellular secretion signal peptide (Met1-Ala33) at the N-terminus of CjGH74-CBM10-CBM2. Amino acid alignment of CjGH74 with other characterized GH74 xyloglucanase catalytic modules revealed generally high sequence conservation (40 to 61% identity) and strict conservation of the two catalytic aspartate residues (D70 and D483 in CjGH74, Figure 2.S1).

Figure 2.2. Modular architecture of the native C. japonicus CJA_2477 gene product. A) The full-length gene product is composed of a signal peptide, a GH74 catalytic domain, and two carbohydrate binding modules: CBM10 and CBM2. The GH74, CBM10, and CBM2 modules are connected by serine-rich linkers. B) Recombinant proteins produced for characterisation. Super-folder GFP (sfGFP) is connected to the different CjCBMs by a TEV cleavage site.

2.3.2 Recombinant protein production and purification
Our initial attempts to recombinantly produce and purify the full-length CjGH74-CBM10-CBM2 and the truncated CjGH74-CBM10 constructs were met with difficulty, due to apparent proteolytic degradation at the serine-rich linkers (Figure 2.3A and 2.3B, cf. Figure 2.2B). In contrast, the catalytic module CjGH74 (Figure 2.2B) was successfully produced independently and purified intact (calculated mass, 80811.3 Da; observed by ESI-MS, 80812.0 Da), with a typical production yield of 30 mg L⁻¹ (Figure 2.3C).
CjGH74 was therefore used for all subsequent biochemical and structural characterization.

Figure 2.3. SDS-PAGE of the purified protein constructs (cf. Figure 2.2). A) Purified recombinant CjGH74-CBM10-CBM2. B) Purified recombinant CjGH74-CBM10. C) Purified recombinant CjGH74. D) Purified recombinant CjCBM10-sfGFP, sfGFP-CjCBM2, and sfGFP. The calculated molecular masses of the recombinant CjGH74-CBM10-CBM2, CjGH74-CBM10, CjGH74, CjCBM10-sfGFP, sfGFP-CjCBM2, and sfGFP are 105.9 kDa, 91.2 kDa, 80.8 kDa, 35.6 kDa, 40.7 kDa, and 27.8 kDa, respectively.

To enable the functional characterization of the CBM10 and CBM2 modules, we attempted the recombinant production of each in E. coli. However, we were unable to produce either CBM10 or CBM2 independently: CjCBM2 was mainly targeted to inclusion bodies, while CjCBM10 was not detectably produced. Indeed, a particular challenge in CBM characterization is the difficulty in obtaining soluble recombinant protein in E. coli due to improper protein folding [184]. Recombinant CBM-fusion technology has been used extensively for various applications [187], and we were particularly inspired by the specific use of CBM-GFP fusion constructs to facilitate the production and purification of CBMs and CBM-tagged fusion proteins [188-192], and to make probes for the determination of surface accessibility of cellulose [193, 194]. Hence, both CBMs were fused to an N-terminal sfGFP domain (Figure 2.2B), which successfully enabled the production of sfGFP-CjCBM2 in a soluble form, with a typical production yield of 8.0 mg L⁻¹ (Figure 2.3D). However, sfGFP-CjCBM10 was proteolytically degraded during the purification process; attempts to use low-temperature purification and protease inhibitors to enhance stability were unsuccessful. Instead, fusion of sfGFP to the C-terminus of CBM10 (Figure 2.2B) enabled purification of intact protein, albeit in modest yields (typically 2.5 mg L⁻¹, Figure 2.3D).
2.3.3 CjGH74 substrate specificity
Anticipating that CjGH74 would be maximally active on XyG, we used this polysaccharide to determine the pH and temperature optima of the catalytic module. A bell-shaped pH-rate profile was obtained, with the highest enzymatic activity observed at pH 6.5 in 50 mM phosphate buffer (Figure 2.4A). Incubation of CjGH74 with XyG in this buffer at a range of temperatures over 10 min revealed that the recombinant enzyme had the highest activity at 65 °C (Figure 2.4B), which compares favorably with other characterized GH74 members [95, 132-134, 137, 195-197].

In screening the substrate specificity of CjGH74, the recombinant catalytic module indeed demonstrated a strong preference for XyG as a natural substrate, as reflected in a high specific activity value (51.3 ± 2.1 µmol min⁻¹ mg⁻¹). In comparison, CjGH74 showed ca. 50-fold lower specific activity for the natural mixed-linkage (1→3)/(1→4)-β-glucan from barley (BBG, 1.03 ± 0.11 µmol min⁻¹ mg⁻¹) at the highest tested substrate concentration (2 mg mL⁻¹). Similarly, CjGH74 showed ca. 24-fold and 165-fold lower specific activities for the artificial polysaccharide derivatives hydroxyethyl cellulose (HEC, 2.1 ± 0.05 µmol min⁻¹ mg⁻¹) and carboxymethyl cellulose (CMC, 0.31 ± 0.02 µmol min⁻¹ mg⁻¹), respectively, at 2 mg mL⁻¹. No endo-mannanase activity was detected on guar galactomannan or konjac glucomannan, no endo-xylanase activity was observed on beechwood xylan or wheat flour arabinoxylan, and no endo-xanthanase activity was observed on xanthan gum. Despite this apparent specificity for XyG, CjGH74 was unable to release the chromophoric aglycones 2-chloro-4-nitrophenol (CNP) and resorufin from the artificial substrates XXXG-β-CNP [173] and XXXG-β-Resorufin [172], perhaps due to a lack of productive positive subsite interactions. CjGH74 was also unable to hydrolyze the shorter chromogenic substrates GG-β-CNP and GGG-β-CNP.

Figure 2.4.
pH and temperature profiles of CjGH74 with xyloglucan as a substrate. A) pH-rate profile of CjGH74 with tamarind seed XyG as a substrate. Blue squares, citrate buffer; red circles, phosphate buffer; and magenta triangle, glycine buffer. Lines were drawn to guide the eye and have no physical significance. B) CjGH74 temperature-activity profile with the substrate XyG. Error bars represent standard errors of the mean for 4 replicates.

To further refine our determination of substrate specificity, recombinant CjGH74 was subjected to Michaelis-Menten analysis using XyG, HEC, and BBG as substrates. The Km and kcat values for XyG were 0.08 ± 0.01 mg mL⁻¹ and 77.6 ± 2.2 s⁻¹, respectively (Figure 2.5A). Compared to known functionally characterized GH74 xyloglucanases, which have Km values ranging from 0.25 to 1.2 mg mL⁻¹ for tamarind seed XyG, the Km value of CjGH74 is notably low [129, 130, 133, 135, 195, 197]. Moreover, the turnover number of CjGH74 is among the highest observed for characterized GH74 endo-xyloglucanases, which underscores the application potential of this enzyme. In contrast, Michaelis-Menten parameters for HEC (Figure 2.5B; Km 1.2 ± 0.1 mg mL⁻¹, kcat 4.6 ± 0.2 s⁻¹) and BBG (Figure 2.5C; Km 4.7 ± 2.5 mg mL⁻¹, kcat 4.7 ± 1.9 s⁻¹) indicated a particularly low hydrolytic ability toward these polysaccharides. kcat/Km values indicate that CjGH74 has about 250-fold and 970-fold higher specificity for XyG than for the artificial derivative HEC and the mixed-linkage BBG, respectively. These results showcase the fundamental importance of α(1→6)-xylopyranosyl substitutions on the strictly (1→4)-glucan backbone of the natural substrate.

Figure 2.5. Michaelis-Menten kinetics of CjGH74 on polysaccharide substrates. A) XyG. B) HEC. C) BBG. Bars represent standard errors based on 3 replicates.
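The fold-preference figures quoted above follow directly from the reported Km and kcat values. The short Python sketch below recomputes kcat/Km for each substrate and the resulting XyG preference ratios; the function and dictionary names are illustrative only and are not taken from the original analysis, which was performed in Sigmaplot®. Uncertainties on the fitted parameters are ignored here.

```python
# Kinetic constants as reported in the text:
# Km in mg/mL, kcat in 1/s (standard errors omitted).
KINETICS = {
    "XyG": {"km": 0.08, "kcat": 77.6},
    "HEC": {"km": 1.2, "kcat": 4.6},
    "BBG": {"km": 4.7, "kcat": 4.7},
}


def catalytic_efficiency(kcat, km):
    """Return the specificity constant kcat/Km in mL mg^-1 s^-1."""
    return kcat / km


def fold_preference(reference, other, table=KINETICS):
    """Ratio of kcat/Km for `reference` over kcat/Km for `other`."""
    ref = catalytic_efficiency(**table[reference])
    alt = catalytic_efficiency(**table[other])
    return ref / alt


if __name__ == "__main__":
    for substrate in ("HEC", "BBG"):
        ratio = fold_preference("XyG", substrate)
        print(f"XyG vs {substrate}: {ratio:.0f}-fold higher kcat/Km")
```

Because Km is expressed in mg mL⁻¹ for polymeric substrates, these ratios compare efficiencies on a mass-concentration basis rather than a molar one.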
2.3.4 Bond cleavage specificity and mode of action of CjGH74
Limit-digest analysis by HPAEC-PAD and MALDI-TOF of CjGH74 acting on tamarind XyG revealed that the catalytic module hydrolyzes the polysaccharide at unbranched backbone glucosyl residues (Figure 2.1) to generate the oligosaccharides XXXG, XLXG, XXLG, and XLLG, which differ in the degree of sidechain galactosylation (Figure 2.6, Figure 2.7). This is the most common cleavage pattern observed for GH74 endo-xyloglucanases [129, 130, 133-135], although certain members are known to cleave between branched backbone glucosyl units [132, 136-138].

It has recently been reported that four tryptophan residues in the active site of the GH74 endo-xyloglucanase XEG74 from Paenibacillus are responsible for the apparent endo-processive activity of this enzyme. Specifically, kinetic analyses indicate that W318 and W319 in the positive subsites play key roles, whereas W61 and W64 in the negative subsites contribute much less to the endo-processive mode of action [131]. These residues are strictly conserved in CjGH74 (Figure 2.S1). However, time-course analysis of the CjGH74 XyG hydrolysis products did not reveal a rapid accumulation of short oligosaccharides (i.e., XXXG, XLXG, XXLG, and XLLG) in the early stages of the reaction (Figure 2.6B, cf. Figure 2.6A), as would be expected for endo-processive activity. Rather, CjGH74 appears to act as a typical endo-dissociative xyloglucanase, which indicates that the presence of the four tryptophan residues may not be strictly diagnostic of the mode of action of GH74 members.

2.3.5 CjGH74 crystallography
To further illuminate the structural basis for CjGH74 specificity, we solved the tertiary structure of the catalytic module in wild-type, catalytic mutant, and product-complexed forms (Table 2.1).
The structural model of apo-CjGH74 (PDB ID 5FKR) comprises residues Pro35 to Ala765 and was refined to crystallographic R-factors (Rcryst/Rfree) of 22/28% at 2.3 Å resolution. Data collection and refinement statistics are summarized in Table 2.1.

Figure 2.6. CjGH74 xyloglucan product analysis. A) HPAEC-PAD analysis of the limit-digestion products of xyloglucan hydrolysis by the GH74 module. B) HPAEC-PAD analysis of the hydrolysis time course.

Figure 2.7. MALDI-TOF analysis of the limit-digest products of CjGH74 incubated with tamarind seed XyG. The observed masses of the three major peaks were 1086.47, 1248.54 and 1410.63, which correspond to the [M+Na]⁺ ions of XXXG (calculated: 1085.9), XLXG/XXLG (calculated: 1248.05), and XLLG (calculated: 1410.19), respectively.

Table 2.1. X-ray data and structure refinement statistics for CjGH74.

The asymmetric unit contains only one copy of the polypeptide chain. As expected from the sequence homology with the endo-xyloglucanases Cel74A from Acidothermus cellulolyticus (PDB code: 4LGN) and Xgh74A from Clostridium thermocellum (PDB codes: 2CN2, 2CN3), CjGH74 has a very similar secondary structure, consisting of two seven-bladed β-propeller domains (Figure 2.8) [128, 129]. Structure alignment of apo-CjGH74 using PDBeFOLD [198] shows that the closest structure is 4LGN, with 720 residues overlapping (59.9% amino acid identity and 95% secondary structure identity), a Cα RMSD of 0.905 Å, and a PDBeFOLD Q score of 0.880.
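The calculated [M+Na]⁺ values quoted in the Figure 2.7 caption can be reproduced from the sugar composition of each oligosaccharide. The following Python sketch is a minimal illustration, assuming that average (rather than monoisotopic) masses were used, since these match the quoted values; the residue masses are standard average values and the names are hypothetical, not part of the original analysis.

```python
# Average masses in Da. Assumption: average rather than monoisotopic masses,
# chosen because they match the "calculated" values in the Figure 2.7 caption.
HEXOSE_RESIDUE = 162.1406   # anhydro-hexose, C6H10O5 (Glc or Gal in a glycan)
PENTOSE_RESIDUE = 132.1146  # anhydro-pentose, C5H8O4 (Xyl in a glycan)
WATER = 18.0153             # added once for the free reducing-end oligosaccharide
SODIUM = 22.9898            # adduct mass (electron mass neglected)

# (number of hexoses, number of pentoses) for each Glc4-based XyGO:
# X = Glc + Xyl, L = Glc + Xyl + Gal, G = unbranched Glc.
XYGO_COMPOSITION = {
    "XXXG": (4, 3),
    "XLXG/XXLG": (5, 3),
    "XLLG": (6, 3),
}


def sodiated_mass(n_hexose, n_pentose):
    """Average mass of the [M+Na]+ adduct for a hexose/pentose oligosaccharide."""
    neutral = n_hexose * HEXOSE_RESIDUE + n_pentose * PENTOSE_RESIDUE + WATER
    return neutral + SODIUM


if __name__ == "__main__":
    for name, (n_hex, n_pent) in XYGO_COMPOSITION.items():
        print(f"{name}: [M+Na]+ = {sodiated_mass(n_hex, n_pent):.2f}")
```

Each additional galactose (X to L) adds one hexose residue, which accounts for the spacing of roughly 162 between the three major peaks.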
                        apo-structure           XLLG & GXLG             D483A                   D70A
Data collection
  Space group           P2₁2₁2                  P2₁2₁2                  P2₁                     P2₁
  Cell dimensions
    a, b, c (Å)         87.89, 131.91, 73.17    89.36, 127.69, 73.28    84.12, 94.29, 105.56    84.02, 94.15, 105.67
    α, β, γ (°)         90.0, 90.0, 90.0        90.0, 90.0, 90.0        90.0, 103.2, 90.0       90.0, 103.3, 90.0
  Resolution (Å)        2.28 (2.34-2.28)        1.99 (2.04-1.99)        1.52 (1.56-1.52)        1.71 (1.75-1.71)
  Rmerge                0.129 (0.62)            0.130 (0.579)           0.087 (0.723)           0.103 (0.731)
  Mean (I / σI)†        11.4 (2.8)              11.5 (3.0)              10.7 (1.7)              11.0 (1.6)
  Completeness (%)      99.8 (99.6)             100.0 (100.0)           99.9 (99.9)             99.8 (99.7)
  Multiplicity          6.5 (6.6)               6.6 (6.5)               4.1 (4.1)               4.1 (4.2)
Refinement
  Rcryst/Rfree (%)      22/28                   17/21                   17/19                   19/23
  r.m.s.d. bonds (Å)    0.014                   0.018                   0.02                    0.019
  r.m.s.d. angles (°)   1.6                     1.9                     1.9                     1.8
  Mean B protein (Å²)   19                      19.0                    13                      15
PDB code                5FKR                    5FKS                    5FKT                    5FKQ

Two loops in CjGH74 particularly stand out in comparison to CtXgh74A: the loop Asp524-Asp527 at the top end of the positive subsite is shorter, and the loop Tyr663-Glu683 at the end of the negative subsite is longer. The loop comprising Asp528-Gly530 is disordered and differs slightly between the apo structure and the XyGO-complex structure (PDB ID 5FKS, described below). Furthermore, eight cis-peptides are present, which either fit the density (Gly161 and Gln350) or fit the density and come after a proline residue (Gly296, Trp354, Tyr378, Ala404, Ser408 and Thr502). Enzymes of the GH74 family catalyse hydrolysis with inversion of anomeric configuration, so that the β-linkage of the substrate yields a product with an α-configured anomeric hydroxyl [57]. The inverting mechanism is mediated by a catalytic base residue, which deprotonates a water molecule to activate it for direct nucleophilic attack at the anomeric carbon, and a catalytic acid residue, which protonates the leaving group to facilitate its departure [77, 78]. The substrate-binding site of CjGH74 lies in an open cleft at the intersection of the N- and C-terminal domains.
The catalytic residues, Asp70 (catalytic base) and Asp483 (catalytic acid), are located on opposite sides in the middle of the active-centre cleft, about 8 Å apart (Figure 2.8). Site-directed mutagenesis of either of the conserved aspartate residues (Figure 2.S1) reduces enzymatic activity by more than 10,000-fold compared to the wild-type enzyme, based on the limit of detection of our assay (data not shown). The catalytic acid mutant CjGH74(D483A) comprises two molecules in the asymmetric unit, with chain A ranging from Pro35 to Ala765 and chain B from Met33 to Ala765 (PDB ID 5FKT, Table 2.1). The final crystallographic R-factors (Rcryst/Rfree) for CjGH74(D483A) are 17/19% at 1.52 Å resolution. The catalytic base mutant, CjGH74(D70A), also comprises two molecules in the asymmetric unit, ranging from Met33 to Ala765 for chain A and from Pro35 to Ala765 for chain B (PDB ID 5FKQ, Table 2.1). CjGH74(D70A) was refined to final crystallographic R-factors (Rcryst/Rfree) of 19/23% at 1.71 Å resolution. We were unsuccessful in obtaining Michaelis (ES) complexes with Glc8-based XyGOs with either catalytic mutant, despite extensive effort.

In contrast, crystals of CjGH74 in complex with Glc4-based XyGOs were obtained by soaking native crystals and diffracted to 2 Å (PDB ID 5FKS, Table 2.1). The electron density map displays well-defined density for sixteen sugar rings, corresponding to a molecule of GXLG bound in the negative subsites and a molecule of XLLG bound in the positive subsites (Figure 2.9A), despite soaking with a mixture of XyGOs from a limit digestion of tamarind XyG (18% XXXG, 9% XLXG, 31% XXLG, 42% XLLG).

Figure 2.8. Three-dimensional structure of CjGH74 in complex with XyGOs. A) Divergent (wall-eyed) stereo cartoon of the structure, colour ramped from the N-terminus (blue) to the C-terminus (red) with the ligands represented as cylinders.
B) View down the active-site cleft, with the N- and C-terminal domains coloured gold and dark cyan, respectively. C) Enzyme surface representation. In all panels, XyGOs in the active-site cleft are represented as sticks. This figure was drawn with CCP4MG [199].

Figure 2.9. Interactions of CjGH74 with two xyloglucan-derived oligosaccharides. A) Observed 2Fo - Fc electron density for the XyGOs (green) in the active site. B) XyGOs (green) binding to CjGH74, with the residues making direct interactions shown in grey. Hydrogen bond interactions are shown between the residues and individual carbohydrate moieties of the XyGOs. This figure was drawn with CCP4MG [199].

The lack of electron density for the expected xylose moiety at the -4′ subsite might be attributed to substrate flexibility. That xylose is positioned at the solvent-exposed, open end of the active-site cleft, where there are no obvious potential interactions with either other sugar rings or amino acids of the polypeptide chain. Most of the polypeptide residues interacting with the xyloglucan oligosaccharides in CjGH74 (Figure 2.9B) are equivalent to those in CtXgh74A [129]; however, there are some exceptions. In CtXgh74A, the +2 xylose is 'stacked' on the side chain of Trp395, which is replaced by Phe397 in CjGH74. Lys399 in CtXgh74A is replaced by Pro402 in CjGH74, resulting in a larger cleft that can accommodate a galactose residue at the +3 position. There is no equivalent in CjGH74 of Asp524 in CtXgh74A, as the loop is shorter in CjGH74; however, Xyl+1 might form hydrogen bonds with Gal+2 and Glc+2 instead. Asp731 forms hydrogen bonds with Xyl-3 in CtXgh74A; as there is no equivalent residue in CjGH74, the -3 xylose might instead form hydrogen bonds with Asn735 (see also Figure 2.S1).
2.3.6 Characterization of CjCBM2 and CjCBM10
Carbohydrate binding modules (CBMs) in CAZymes function generally to bring appended catalytic domains into close proximity with their respective substrates, thereby leading to more efficient degradation of polysaccharides [72, 200-203]. To date, ca. 60000 CBMs have been classified, based on amino acid sequence similarity, into 71 families (http://www.cazy.org/Carbohydrate-Binding-Modules.html) in the CAZy Database [57]. Bioinformatic analysis indicates that locus CJA_2477 encodes CBM10 and CBM2 modules immediately following the GH74 catalytic module, with the three modules separated by serine-rich linker sequences (Figure 2.2A). Both CBMs are predicted to be type A CBMs, which present flat surfaces for binding to insoluble, crystalline polysaccharides such as cellulose and chitin [72]. In light of the clear specificity of the GH74 module for the amorphous plant cell wall matrix glycan XyG, we performed a detailed biochemical and structural analysis of both CBM modules, which were produced as super-folder green fluorescent protein (sfGFP) fusions. Native affinity gel electrophoresis [182] indicated that CjCBM10-sfGFP and sfGFP-CjCBM2 (Figure 2.2B) have no significant affinity for the soluble polysaccharide substrates of the GH74 catalytic module, XyG, BBG, and HEC (data not shown). The cellulose-binding capacities of CjCBM10-sfGFP and sfGFP-CjCBM2 were then investigated by incubating the recombinant proteins with an aqueous suspension of Avicel and subjecting bound and unbound protein fractions to SDS-PAGE. CjCBM10-sfGFP and sfGFP-CjCBM2 were predominantly retained on the insoluble substrate, in contrast to independent sfGFP and BSA control samples, thereby demonstrating a strong affinity of both CBMs for microcrystalline cellulose (Figure 2.10).
The lack of XyG binding by the CBM2 and CBM10 modules notably contrasts with a recent study indicating that some type A CBMs may exhibit cross-specificity for XyG and crystalline cellulose [204]. Also notable, the C-terminal CBM2 appended to the Streptomyces avermitilis GH74A catalytic module exhibited no affinity towards the crystalline substrates chitin and cellulose, but rather bound soluble polysaccharides with a β-1,4-glucan backbone [132]. These findings suggest that bioinformatic analysis is not always sufficient to unambiguously predict CBM function, which underscores the importance of careful biochemical and biophysical characterization of these modules.

Figure 2.10. Binding capacity of CjCBM10-sfGFP, sfGFP-CjCBM2, sfGFP, and acetylated BSA for Avicel. Recombinant CjCBM10-sfGFP (35.6 kDa; lanes 1, 2), sfGFP-CjCBM2 (40.7 kDa; lanes 3, 4), sfGFP (27.7 kDa; lanes 5, 6), and acetylated BSA (66 kDa; lanes 7, 8) were incubated with Avicel and unbound proteins were removed by centrifugation (lanes 1, 3, 5, and 7). Bound proteins were released from Avicel by boiling in SDS (lanes 2, 4, 6, and 8).

The dissociation constant (Kd) was subsequently determined for the sfGFP-CBM fusion proteins to quantify this affinity more precisely. Analysis of the binding isotherms shown in Figure 2.11 revealed that CjCBM10-sfGFP exhibited a Kd value of 1.5 ± 0.27 µM and sfGFP-CjCBM2 exhibited a Kd value of 0.39 ± 0.02 µM for Avicel; sfGFP on its own had no detectable binding at loadings of up to 150 µg protein and 10 mg Avicel. These values lie within the range of previously characterized members of CBM2 and CBM10 [183, 205]. [PC]max values (the maximum amount of the recombinant protein that binds to Avicel, see Materials and Methods) were 0.5 ± 0.08 µmol/g cellulose and 0.16 ± 0.004 µmol/g cellulose for CjCBM10-sfGFP and sfGFP-CjCBM2, respectively. It should be noted, however, that [PC]max values are dependent on the nature of the CBM fusion protein.
For example, the [PC]max value for a maltose-binding protein (MBP)-Clostridium cellulovorans CBM3A fusion was 17 times lower than that of the independent recombinant CBM, presumably due to steric effects [184]. On the other hand, CBMs usually occur naturally in tandem with catalytic domains. Thus, analysis of protein-CBM fusions, such as GFP-CBM constructs, may better approximate true substrate binding capacity than analysis of independent CBMs. Regardless, CBM-GFP fusions are extremely useful tools to characterize CBMs that are otherwise recalcitrant to independent production and purification. The GFP tag, in particular, provides a convenient reporter for accurate protein quantitation by fluorimetry, which also allows the inclusion of BSA to prevent non-specific adsorption without interfering with detection.

Figure 2.11. Cellulose-binding isotherms of CjCBM10-sfGFP and sfGFP-CjCBM2. A) CjCBM10-sfGFP. B) sfGFP-CjCBM2. No binding of independently produced sfGFP was observed, including at concentrations higher than the maxima used for the fusion proteins. Representative data from single determinations are shown; each data set was repeated at least twice on different days.

Currently, only one structure of a CBM10 has been solved (by NMR), while six independent structures of CBM2 members have been solved (4 by NMR, 1 by crystallography, and 1 by both methods) [57, 183, 206-210]. Within CBM10, three aromatic residues, Tyr8, Trp22, and Trp24, in the CBM10 from C. japonicus xylanase A [CjCBM10(XylA)] have been shown to be crucial for crystalline cellulose binding through hydrophobic stacking interactions [206, 211]. Amino acid sequence alignment and structural homology modeling indicate that these residues are strictly conserved in CjCBM10 (Figure 2.12A and 2.12B), which rationalizes the observed common substrate specificity (Figure 2.10, Figure 2.11).
Similarly, the three aromatic residues Trp929, Trp966, and Trp983 in CjCBM2 were found to be homologous to Trp393, Trp430, and Tyr448 in the previously characterized CBM2 from Clostridium cellulovorans endoglucanase D [CcCBM2(EndD)], as well as to Trp17, Trp54, and Trp72 of Cellulomonas fimi CBM2 [CfCBM2] (Figure 2.12C and 2.12D), which have been shown to be pivotal in cellulose binding [209, 210].

2.4 Conclusions

Cellvibrio japonicus represents a useful model to understand the saprophytic degradation of xyloglucan, one of the most abundant plant cell wall glycans in the biosphere and a key reservoir in the global carbon cycle. Our recent identification of a multi-gene locus encoding periplasm-localized exo-glycosidases, namely an α-xylosidase, a β-galactosidase, and an α-L-fucosidase, in addition to a TonB-dependent transporter (TBDT), provided only a partial picture of XyG utilization by C. japonicus [33]. Here, functional and structural characterization of the multi-modular CjGH74-CBM10-CBM2 gene product of locus CJA_2477 sheds new light on the requisite initial extracellular backbone hydrolysis step. In particular, the demonstrably high specific activity of the GH74 catalytic module for XyG evidences the catalytic proficiency of this enzyme. Additionally, the orthogonal specificity of the pendant CBMs for crystalline cellulose is readily rationalized in light of the composite nature of the plant cell wall, in which cellulose microfibrils and amorphous xyloglucans are intimately associated [11]. As it is presently unknown whether CjGH74 is the only endo-xyloglucanase produced by C. japonicus, our future efforts will be directed towards the identification and functional characterization of other CAZymes with complementary activities via detailed genetic and transcriptomic studies. Moreover, the regulatory elements controlling the expression of all C. japonicus xyloglucan-active enzymes and proteins remain to be elucidated.

Figure 2.12. Bioinformatic analysis of CjCBM10 and CjCBM2. A) Amino acid sequence alignment of CjCBM10 and C. japonicus xylanase A CBM10 (PDB code: 1E8R). B) Superposition of CjCBM10 model structure (cyan) with C. japonicus xylanase A CBM10 (magenta) (PDB code: 1E8R). C) Amino acid sequence alignment of CjCBM2 with C. cellulovorans endoglucanase D CBM2 (PDB code: 3NDY) and Cellulomonas fimi CBM2 (PDB code: 1EXG). D) Superposition of CjCBM2 model structure (cyan) with C. cellulovorans endoglucanase D CBM2 (magenta) (PDB code: 3NDY). Amino acid alignments were generated using ClustalW [168]; aromatic residues involved in cellulose binding are marked with asterisks. Numbering of the aligned proteins is based on the amino acid sequence of the full-length CjGH74 enzyme (CjGH74-CBM10-CBM2). Three-dimensional homology models were generated using the Phyre2 online server [169]; aromatic amino acids involved in cellulose binding are shown as sticks and only CjCBM10 and CjCBM2 aromatic binding residues are labelled.

The xyloglucan utilization system of C. japonicus, as has been further defined here by the characterization of the CjGH74-CBM10-CBM2 endo-xyloglucanase, shows notable differences to that of the human gut symbiont Bacteroides ovatus [34]. The contiguous B. ovatus Xyloglucan Utilization Locus (XyGUL) encodes eight glycoside hydrolases (including both endo-xyloglucanases and sidechain-cleaving exo-glycosidases), in addition to outer membrane-localized XyG-binding proteins, a TBDT, and an inner membrane hybrid two-component sensor [34]. In sharp contrast, the analogous C. japonicus XyGUL (CJA_2706-2710) is far less complete, encoding only three exo-glycosidases and a predicted TBDT; a key difference between the two systems is the lack of endo-xyloglucanase-encoding genes in the C. japonicus XyG utilization locus (reviewed in [212]). In this case, the locus (CJA_2477) encoding the vanguard CjGH74 enzyme, which is required to cleave the XyG polysaccharide into its component oligosaccharides, is distant (~337 kb) from the C. japonicus XyGUL. As discussed previously, genomic co-localization of GHs is in fact rare in C. japonicus, with only a few examples of similarly “incomplete” polysaccharide utilization loci [33].

A further key distinction between the B. ovatus and C. japonicus XyG utilization systems concerns enzyme localization in these Gram-negative bacteria: B. ovatus tethers its endo-xyloglucanases to the external surface of the outer membrane by N-terminal lipidation [34]. In contrast, the CjGH74-CBM10-CBM2 of C. japonicus is predicted to be freely secreted into the environment, as evidenced by its signal peptide. In both bacteria, the exo-glycosidases are localized to the periplasm, where ultimate saccharification of oligosaccharides transported through the TBDT occurs prior to primary metabolism in the cytosol [33, 34, 212].

In light of continuing interest in better utilizing renewable plant biomass to produce fuels and materials, the identification and functional characterization of coordinated microbial systems of CAZymes represents a potentially powerful approach to identify new biotechnological tools for biomass deconstruction and analysis [6, 10, 163, 164]. It has been shown that incorporating accessory enzymes such as hemicellulases and lytic polysaccharide monooxygenases in cellulase mixtures enhances conversion to monosaccharides, thereby allowing reduced enzyme loading and crucially reducing process costs [213-216]. Although the use of xyloglucan-active enzymes as accessory enzymes has not yet been extensively explored, data indicate that endo-xyloglucanases and α-xylosidases can synergistically increase the hydrolytic capacity of cellulases [97, 98], perhaps by removing XyG that is intimately associated with cellulose microfibrils.
As such, the complete xyloglucanolytic system of C. japonicus, comprising the GH74 endo-xyloglucanase and sidechain-cleaving exo-glycosidases of the XyGUL, may represent an attractive system for improving the saccharification efficiency of specific biomass sources. Likewise, C. japonicus has been demonstrated as a platform for metabolic engineering to produce fuels or other chemicals, because it possesses the machinery required for the utilization of different plant polysaccharides [156, 157, 159, 160]. As such, the comprehensive characterization of the complete CAZome of C. japonicus is central to further advances. In addition to contributing to this effort, the structural genomics study presented here will fundamentally inform future analyses of similar enzymes within microbial genomes.

2.5 Supporting information

2.5.1 Supporting tables

Table 2.S1. Primer sequences used in the current study.

Primer | Oligonucleotide sequence | Recombinant protein
CjGH74-Full-NheI-F | 5ˊ-GACCGCTAGCATGGCCCCGTCGGAAAATTACACC-3ˊ | CjGH74-CBM10-CBM2
CjGH74-Full-XhoI-R | 5ˊ-GGTCCTCGAGTTACTGGCAGGGTGTTCCTGTTAC-3ˊ | CjGH74-CBM10-CBM2
CjGH74-Full-NheI-F | 5ˊ-GACCGCTAGCATGGCCCCGTCGGAAAATTACACC-3ˊ | CjGH74-CBM10
CjGH74-CBM10-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCCGCTGCCCTCAATGCCAAAG-3ˊ | CjGH74-CBM10
CjGH74-Full-NheI-F | 5ˊ-GACCGCTAGCATGGCCCCGTCGGAAAATTACACC-3ˊ | CjGH74
CjGH74-XhoI-R | 5ˊ-GGTCCTCGAGTTATCCTGCAGAATCACCGTACAG-3ˊ | CjGH74
CjCBM10-NheI-F | 5ˊ-GACCGCTAGCATGCTCAGTGGCGAACGCTGCAACTG-3ˊ | CjCBM10
CjGH74-CBM10-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCCGCTGCCCTCAATGCCAAAG-3ˊ | CjCBM10
CjCBM2-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGTGGTGCTTGCACCTATG-3ˊ | CjCBM2
CjGH74-Full-XhoI-R | 5ˊ-GGTCCTCGAGTTACTGGCAGGGTGTTCCTGTTAC-3ˊ | CjCBM2
N-terminus-sfGFP-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGCAAAGGTGAAGAAC-3ˊ | N-terminus-sfGFP
N-terminus-sfGFP-R | 5ˊ-GAAAATAAAGATTCTCGCTGCCTTTATACAGTTC-3ˊ | N-terminus-sfGFP
TEV-CjCBM10-F | 5ˊ-CTGTATAAAGGCAGCGAGAATCTTTATTTTCAGGGCCTCAGTGGCGAACGCTGC-3ˊ | TEV-CjCBM10
CjGH74-CBM10-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCCGCTGCCCTCAATGCCAAAG-3ˊ | TEV-CjCBM10
N-terminus-sfGFP-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGCAAAGGTGAAGAAC-3ˊ | sfGFP-CjCBM10
CjGH74-CBM10-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCCGCTGCCCTCAATGCCAAAG-3ˊ | sfGFP-CjCBM10
TEV-CjCBM2-F | 5ˊ-CTGTATAAAGGCAGCGAGAATCTTTATTTTCAGGGCGTTAGTGGTGCTTGCACC-3ˊ | TEV-CjCBM2
CjGH74-Full-XhoI-R | 5ˊ-GGTCCTCGAGTTACTGGCAGGGTGTTCCTGTTAC-3ˊ | TEV-CjCBM2
N-terminus-sfGFP-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGCAAAGGTGAAGAAC-3ˊ | sfGFP-CjCBM2
CjGH74-Full-XhoI-R | 5ˊ-GGTCCTCGAGTTACTGGCAGGGTGTTCCTGTTAC-3ˊ | sfGFP-CjCBM2
CjCBM10-NheI-F | 5ˊ-GACCGCTAGCATGCTCAGTGGCGAACGCTGCAACTG-3ˊ | CjCBM10-TEV
CjCBM10-TEV-R | 5ˊ-CTTCACCTTTGCTAACGCCCTGAAAATAAAGATTCTCGCCGCTGCCCTCAATGCCAAAG-3ˊ | CjCBM10-TEV
C-terminus-sfGFP-F | 5ˊ-CTTTATTTTCAGGGCGTTAGCAAAGGTGAAGAAC-3ˊ | C-terminus sfGFP
C-terminus-sfGFP-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCTGCCTTTATACAGTTCATC-3ˊ | C-terminus sfGFP
CjCBM10-NheI-F | 5ˊ-GACCGCTAGCATGCTCAGTGGCGAACGCTGCAACTG-3ˊ | CjCBM10-sfGFP
C-terminus-sfGFP-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCTGCCTTTATACAGTTCATC-3ˊ | CjCBM10-sfGFP
CjGH74-D70A-F | 5ˊ-ATTTACGCGCGTACTGCCATCGGTGGTGCCTATC-3ˊ | CjGH74(D70A)
CjGH74-D70A-R | 5ˊ-GATAGGCACCACCGATGGCAGTACGCGCGTAAAT-3ˊ | CjGH74(D70A)
CjGH74-D483A-F | 5ˊ-TATTCTGCCCTGGGTGCCATTGGCGGCTTCCGC-3ˊ | CjGH74(D483A)
CjGH74-D483A-R | 5ˊ-GCGGAAGCCGCCAATGGCACCCAGGGCAGAATA-3ˊ | CjGH74(D483A)

2.5.2 Supporting figures

                      10        20        30        40        50        60        70        80
              ....|....|....|....|....|....|....|....|....|....|....|....|....|...*|.*..|....|
CjGH74        --------APSENYTWKNVRID-GGGFVPGIIFNQKEADLIYARTDIGGAYRWNSATSSWIPLLDWVGWDNWGWNGVMSL
BAE44527.1    --------APSEPYTWKNVVTGAGGGFVPGIIFNESEKDLIYARTDIGGAYRWNPANESWIPLTDFVGWDDWNKNGVDAL
Q70DK5        --------VTSVPYKWDNVVIGGGGGFMPGIVFNETEKDLIYARADIGGAYRWDPSTETWIPLLDHFQMDEYSYYGVESI
BAC70285      -----AETTAGPSYRWRNAVIG-GTGFVTGVLFHPSVRGLAYARTDIGGAYRWDDRGARWTPLIDHLGWDDWNLLGVEAM
BAC69567      -ASAPTATIAADTYSWKNARVD-GGGFVPGIVFNRSEKNLAYARTDIGGAYRWAESSKTWTPLLDSVGWSDWGHTGVVSL
NP_630626     AEPAPRAAVAADSYTWKNARID-GGGFVPGIVFNRTEKDLAYARTDIGGAYRWQEESHTWTPLLDHVGWDDWGHTGVVAL
WP_011292038  ----------TTGYTWRNVEIV-GGGFVPGIVFNQSEPDLIYARTDIGGAYRWDPATERWIPLLDHVGWDDWGHSGVVSI

                      90       100       110       120       130       140       150       160
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        ATDAADPNRVYAAVGMYTNTWDPNNGAILRSTDRGNTWQATPLPFKVGGNMPGRGMGERLAIDPNRNSIIYYGAEGGNGL
BAE44527.1    ATDPVDPDRVYLAVGTYTNSWDKNNGAILRSTDRGDTWQTTTLPFKVGGNMPGRSMGERLVVDPNDNRILYFGARSGNGL
Q70DK5        ATDPVDPNRVYIVAGMYTNDWLPNMGAILRSTDRGETWEKTILPFKMGGNMPGRSMGERLAIDPNDNRILYLGTRCGNGL
BAC70285      AVDPTHPDRLYLAVGTYAQSWAG-NGAVLRSEDRGATWTRTDLTVKLGGNEDGRGAGERLLVDPRDSDTLWLGTR-HDGL
BAC69567      ASDSVDPNKVYAAVGTYTNSWDPGNGAVLRSGDRGASWQKTDLPFKLGGNMPGRGMGERLAVDPNRNSVLYLGAPSGKGL
NP_630626     ASDAVDPDRVYAAVGTYTNDWDPTNGAVLRSADRGASWEKADLPFKLGGNMPGRGMGERLAVDPHDNDVLYLGAPSGHGL
WP_011292038  ATDPVDPDRVYAAVGTYTNDWDPNNGAIKRSTDRGETWETTELPFKLGGNMPGRGMGERLAIDPNDNSVLYLGAPSGHGL

                     170       180       190       200       210       220       230       240
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        WRSTDYGATWAKVSSFTNGGNYAQDPNDPNDYLNKIQGVVWVTFDPASGS-AGNTSQVIYVGVADTQNAIYRSTDGGTTW
BAE44527.1    WRSSDYGATWSKVTSFPNPGTYVQDPAN--EYGSDIVGLAWITFDKSSGQ-VGQATQTIYVGVADTAQSIYRSTDGGATW
Q70DK5        WRSTDYGVTWSKVESFPNPGTYIYDPNF--DYTKDIIGVVWVVFDKSSST-PGNPTKTIYVGVADKNESIYRSTDGGVTW
BAC70285      LKSTDRGATWAAATAFPAKAN------------SSGQGVVFLVAAGRTVY-AGWGDGDGTSGTAN----LYRTADG-TTW
BAC69567      WRSTDSGASWSQVTDFPNVGTYVQDATDTSGYASDNQGIVWVTFDESTGS-PGSSTRTVYVGVADKDNSVYRSTDAGATW
NP_630626     WRSTDAGVTWSEVTAFPNPGNYAQDPNDTSGYASDNQGITWVTFDESTGGGAGTATRTLYVGVADKENAVYRSTDAGATW
WP_011292038  WKSTDYGKTWQKVTSFPNPGNYVADPSDVGGYLGDNQGVVWVVFDPTSSS-PGHVTKDIYVGVADKQNTVYRSTDGGQTW

                     250       260       270       280       290       300       310       320
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        SRLAGQPTG---FLPHKGVYDAVNGVLYIAYSDTGGPYDG-AKGDVWKFTASSGTWTNISPI-----PSSSSDLYFGYSG
BAE44527.1    TAVPGQPTG---YLPHHGVLDAD-GSLYITYSNGVGPYDG-TKGDVWKLNTSTGAWTNISPI-----PSSSADNYFGYGG
Q70DK5        KAVPGQPKG---LLPHHGVLASN-GMLYITYGDTCGPYDGNGKGQVWKFNTRTGEWIDITPI-----PYSSSDNRFCFAG
BAC70285      GAVPGRPSGTSAKVPLRAAYDTHTRELYVTYGDAPGPGGQ-SDGSVHKLRTATGTWTEVTPVKPGGTTSDGSADTFAYGG
BAC69567      SRLAGQPTG---HLAHKGVLDAANGCLYLAYSDKGGPYDG-GKGQLWRYTTKTGTWTNIS-------PVAEADTYYGFSG
NP_630626     ERLAGQPTG---YLAHKGVLDAENGYLYLAYSDTGGPYDG-GKGRLYRYATATGTWTDIS-------PAAEADTYYGFSG
WP_011292038  ERIPGQPTG---FLAQKGVFDHVNGLLYIATSDTGGPYDG-SDGEVWRYDTTTGTWTDIT-------PADPDGFEYGFSG

                     330       340       350       360       370       380       390       400
              ....|....|....|...**....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        LTIDRKNPNTLMVASQIAWWPDAVFFRSTNGGASWTRIWDWTSYPSRSFRYTMDITEVPWLNFGNSNPVAPEVSPKLGWM
BAE44527.1    LAVDAQEPGTLMVATLNSWWPDAILFRSKDGGTTWTRIWEFDGYPNRKFRYTQNISAAPWLTFG-TTPAPPEVSPKLGWM
Q70DK5        LAVDRQNPDIIMVTSMNAWWPDEYIFRSTDGGATWKNIWEWGMYPERILHYEIDISAAPWLDWG-TEKQLPEINPKLGWM
BAC70285      VAVDARRPGTLVVSTNNRWADGDTVFRSTDGGRTWTSLKDAAVF---------DVSETPFLDWG-------DDKPKFGWW
BAC69567      LTVDRQHPGTVMATAYSSWWPDTQLFRSTDSGGTWTKAWDYTSYPSRSNRFTMDVSSSPWLTWG-ANPAPPEQTPKLGWM
NP_630626     LTVDRQRPGTVMATAYSSWWPDTQIFRSTDSGATWSQAWSYTSYPDRENRYTMDVSSSPWLTWG-ANPAPPEQTPKLGWM
WP_011292038  LTIDRQNPDTIMVVSQILWWPDIQIWRSTDRGETWSRIWEFSGYPDRTLRYNHDISAAPWLDFN-RQDNPPEVSPKLGWM

                     410       420       430       440       450       460       470       480
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        NESVEIDPHNSNRLMYGTGATIYATENLTSWDS-GGQILLKPMVKGLEETAVLDVVSPPVG-APVYSALGDIGGFRHDDL
BAE44527.1    IGDLEIDPFDSDRMMYGTGATIYGTNNLTNWDN-NEKIDISVMAKGVEEMAVLDLVSPPSG-AHLVSGLGDVNGFRHDDL
Q70DK5        IGDIEIDPFNSDRMMYVTGATIYGCDNLTDWDR-GGKVKIEVKATGIEECAVLDLVSPPEG-APLVSAVGDLVGFVHDDL
BAC70285      IQALAVDPYDSQHVVYGTGATLYGTRDLKRWAP---------RIRGLEESAVRQLISPPVGEAHLISGLGDIGVMYHERL
BAC69567      TESLEIDPFDSARMMYGTGATVYGTDNLTNWDS-GSQFTIKPMARGLEETAVNDLASPPSGGAQLFSALGDIGGFRHTDL
NP_630626     TEALEIDPFDSDRMMYGTGATVYGTENLTNWDDEGGTFAVEPMVRGLEETAVNDLASPPSG-APLLSALGDVGGFRHTSL
WP_011292038  TQAFEIDPFNSDRMLYGTGATIYGSDNLTNWDE-GKKIDIKVRAQGIEETAVQDLIAPPGD-TELVSALGDIGGFVHDDI

                     490       500       510       520       530       540       550       560
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        TKVPT-SMYTTPNFSSTTSIDFAELQPATMVRVGNLDS-----GGGIGVTTNAGGSWWQG-QNPPGVTSG-GNVALAADG
BAE44527.1    DQPPA-KMFSSPNYASTESLDFAELNPSTMVRVGKADYAADPNAKSIGLSSDGGTNWYKANAEPAGTAGG-GTVAISSDG
Q70DK5        KVGPK-KMH-VPSYSSGTGIDYAELVPNFMALVAKADLY---DVKKISFSYDGGRNWFQPPNEAPNSVGG-GSVAVAADA
BAC70285      TASPSRGMATNPVFGSATGLAQAAARPAYVVRTGWGDHG------NGAYSHDGGRTWAPFEAQPDIAKDAPGPIATSADG
BAC69567      TTVPS-LMYTSPNFTTSTSLDYAETDPGTVVRVGNLDS-----GPHVAFSTDNGANWFAG-ADPSGVSGG-GTVAAASDG
NP_630626     TEVPS-MMYTSPNFTSTTSLDFAETKPDVVVRAGNLDS-----GPHIAFSTDNGANWFGG-TDPSGVSGG-GTVAAGADG
WP_011292038  TVVPD-AMFDSPFHGNTRSIDFAELNPSVMARVGEAVDGE--VDSHIGISTSGGSHWWAG-QEPSGVTGA-GTVAVNADG

                     570       580       590       600       610       620       630       640
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        GAIVWAPGGS----TNVYLSTTFGSTWTAISALPAGAVIEADRVNPNKFYA--LANGTFYVSTNKGASFSATVTAGIP-A
BAE44527.1    SKLVWSTSD-----KGVHYSSTGGNSWTASTGIPAQAKVISDRVNPNKFYG--FAAGKIYVSVNGGVSFSQTAAAGLPVD
Q70DK5        KSVIWTPEN-----ASPAVTTDNGNSWKVCTNLGMGAVVASDRVNGKKFYA--FYNGKFYISTDGGLTFTDTKAPQLPKS
BAC70285      GTLLWSFVHWDGTTYAAHRSTDNGASWSEVSSFPKGATPVADPADPTRFYAYDFDNGTLYASTDSGRSFTARAGGLPSGD
BAC69567      SRFVWSPAG-----TGVQYTTGFGTSWSASAGLPAGAIVESDRVDPKTFYG--FKSGRFYVSSDGGATFTASAATGLPSG
NP_630626     SRFVWSPEG-----AGVQYTTGFGTSWQASTGLPAGAIVESDRVNPATFYG--FKSGRFYVSTDGGATFTASAATGLPAG
WP_011292038  SRIVWSPDG-----TGVHYSTTLGSSWTPSQGVPAGARVEADRVNPDKFYA--FANGTFYTSTDGGATFTKSSAAGLPTK

                     650       660       670       680       690       700       710       720
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        AARKFKAVYGREGDIWLAGG-SSTTTYGLWRSTNSGASFTKLASVQEADNVTFGKAATGATYPAIYIIGKVDNVRGVFRS
BAE44527.1    GNADLDAVPGVEGELWFAGGNEDGGPYGLWHSTDSGASFAKLSNVEEADSIGFGKAAPGRNSAALYAVAQIDGTRGFFRS
Q70DK5        VN-KIKAVPGKEGHVWLAAR--EGG---LWRSTDGGYTFEKLSNVDTAHVVGFGKAAPGQDYMAIYITGKIDNVLGFFRS
BAC70285      SQFKLVAAPGRSGDLWLSAK-----WNGLYRSTDGGDTFARIDSCWASYTLGFGKAADGADYPAIYQVGSTETITAVYRS
BAC69567      DSVRFKALPGTKGDIWLAGG-ASDGAYGLWHSTDGGAAFTKLATVDQADTIGFGKAATGASYQTLYTSAKIGGVRGIFRS
NP_630626     DGVRFKALPGGEGDVWLAGG-AADGPYGLWHSTDGGGTFTRLPGVDAADTVGFGKAAPGASYQTLFTSAEIGGVRGIFRS
WP_011292038  GNIRFAAVPGHEGDIWLAGG-ETNSTYGMWRSTDSGATFTRITAVDEGDVVGFGKPAPGRSYPAVYTSSKINGVRGIFRS

                     730       740       750       760       770       780       790
              ....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH74        TNEGASWVRINDDQRQYGNFGEAISGDPRIYGRLYLGTNGRGLLYGDS--------AG------------
BAE44527.1    DDGGASWVRINDDAHQY-----------------------------------------------------
Q70DK5        DDAGKTWVRINDDEHGYGAVDTAITGDPRVYGRVYIATNGRGIVYGEP----ASDEPVPTPPQVDKGLVG
BAC70285      DDAARTWVRINDDAHQWGWIGEAVVGDPRIHGRVYLATNGRGIQYGEPV---------------------
BAC69567      TDKGASWTRVNDDAHQWGWTGAAITGDPRVYGRVYVSTNGRGIVYGDT----AGSSDGGGTEPAPTGAC-
NP_630626     TDAGATWTRVNDDAHQWGWTGAAITGDPRVYGRVYVATNGRGVIYGDTSDTGGGTDPGPGPDPTPTGA--
WP_011292038  DDAGTTWVRINDDQHQWAWTGAAITGDPDVYGRVYIGTNGRGVIVGDL--------DG------------

Figure 2.S1. Amino acid sequence alignment of the C. japonicus GH74 catalytic module (CjGH74) with other characterized GH74 members. BAE44527.1, Paenibacillus XEG74; Q70DK5, C. thermocellum xyloglucanase (xgh74A); BAC70285 and BAC69567, Streptomyces avermitilis xyloglucanases; NP_630626, Streptomyces coelicolor xyloglucanase; and WP_011292038, Thermobifida fusca xyloglucanase. Red arrows indicate the conserved catalytic aspartate residues. Conserved active-site tryptophan residues are indicated with red asterisks [131].

Chapter 3: In vitro and in vivo characterization of three GH5 endo-xyloglucanases³

3.1 Introduction

Renewable plant biomass is envisioned as a promising alternative to fossil petroleum in the production of liquid fuels and high-value chemicals [6, 163]. However, plant cell walls are chemically and structurally complex in nature, and thus require harsh thermo-chemical treatment to yield fermentable sugars. Such processes often generate undesirable by-products that inhibit subsequent microbial conversion [9].
In light of their ability to catalyze the degradation of recalcitrant plant cell walls under ambient conditions, enzymes from saprophytic micro-organisms constitute an attractive palette of biocatalysts for improved biomass saccharification [10]. The discovery and characterization of new enzymes from saprophytes is thus central to advancing biotechnology and, not least, underpins fundamental understanding of the biological roles of these micro-organisms in the global carbon cycle.

The Gram-negative bacterium Cellvibrio japonicus Ueda107 (formerly Pseudomonas fluorescens subsp. cellulosa) has emerged as a model saprophytic micro-organism with a demonstrated ability to utilize nearly all plant cell wall polysaccharides, including cellulose, xylans, mannans, arabinans, and pectins [155, 158]. Indeed, sequencing of the C. japonicus genome in 2008 revealed a vast array of carbohydrate-active enzymes (CAZymes [57]) predicted to be involved in plant cell wall saccharification [156]. The recent development of genome editing techniques for C. japonicus has further advanced the biology and bioengineering of this bacterium in biomass conversion [33, 157, 159, 160, 165].

We have recently sought to elucidate the xyloglucan (XyG) utilization system of C. japonicus, in light of the ubiquity and abundance of this family of cell wall matrix polysaccharides across the plant kingdom [11, 217]. In dicots, XyGs may constitute up to 25% of the primary cell wall dry weight, with lower amounts found in conifers (10%) and grasses (<5%) [23, 26]. Structurally, XyGs have brush-like architectures built upon a linear, cellulosic β(1→4)-D-glucan backbone that is extensively branched with α(1→6)-xylopyranosyl residues at regular intervals. Further elaboration of these branch points with diverse monosaccharides and acetyl groups is dependent on the species and tissue of origin [32, 218]; presently ca. 20 distinct sidechain saccharide compositions are known [30, 96]. The structure of the canonical dicot (fucogalacto)xyloglucan is shown in Figure 3.1. Due to this structural complexity, complete XyG saccharification requires the concerted action of numerous backbone-cleaving endo-xyloglucanases and side-chain-cleaving exo-glycosidases [34, 35].

To this end, we functionally characterized the products of a multi-gene XyG utilization locus in the C. japonicus genome, which encodes the three exo-glycosidases required for (fucogalacto)xyloglucan sidechain cleavage (a GH95 α-L-fucosidase, a GH35 β-galactosidase, and a GH31 α-xylosidase) together with a predicted TonB-dependent transporter (TBDT) (Figure 3.1B) [33, 145, 166]. Noting that this locus lacked an associated endo-xyloglucanase, we subsequently provided biochemical and structural evidence that the lone, secreted C. japonicus GH74 member (encoded by CJA_2477, Figure 3.1B) could efficiently generate the Glc4-based XyG oligosaccharides (XyGOs) required by the downstream exo-glycosidases [219].

Figure 3.1. XyG structure in dicot plants and C. japonicus XyG-active enzymes. A) Schematic representation of dicot XXXG-type fucogalacto-XyG with the different possible ramifications. Nomenclature is according to [30]. B) Soil saprophyte Cellvibrio japonicus XyGUL in addition to the distant GH74 and GH5 endo-xyloglucanases. Genes encoding backbone-cleaving endo-xyloglucanases (GH5 and GH74) are indicated in navy blue, genes encoding side-chain-cleaving exo-glycosidases (GH35 β-galactosidase, GH31 α-xylosidase, and GH95 α-L-fucosidase) are in cyan, and the TonB-dependent transporter (TBDT) is shown in green.

As we now show, genetic deletion of this GH74 endo-xyloglucanase did not, however, impede the growth of C. japonicus on the polysaccharide, which suggested the involvement of additional, unidentified endo-xyloglucanases. Hence, we also explored the in vitro and in vivo function of three candidate endo-xyloglucanases from GH5 subfamily 4 (GH5_4) [58], guided by bioinformatic and transcriptomic analyses. Utilizing a combination of reverse genetics, enzymology, and structural biology, the present study provides new insight into the upstream deconstruction of XyG by C. japonicus.

³ Adapted from: Mohamed A. Attia, Cassandra Nelson, Wendy Offen, Namrata Jain, Jeffrey Gardner, Gideon J. Davies, and Harry Brumer. In vitro and in vivo characterization of three Cellvibrio japonicus Glycoside Hydrolase Family 5 members reveals potent xyloglucan backbone-cleaving functions. Biotechnology for Biofuels, 11:45.

3.2 Materials and Methods

3.2.1 Transcriptomic analysis

RNAseq sampling and analysis were performed as previously described [157, 165]. Briefly, C. japonicus cultures were grown in 500 mL flasks at 30 °C with shaking at 200 rpm. OD600 was measured every hour to monitor growth and samples were taken during exponential and stationary phase. Within two minutes of sampling, metabolism was stopped using a phenol/ethanol solution (5%/95%). The samples were immediately pelleted by centrifugation at 8000 g at 4 °C for five minutes. The supernatant was discarded and cell pellets were then flash frozen using a dry ice/ethanol bath and stored at -80 °C. RNA extraction, library preparation, multiplexing, and sequencing were performed by GeneWIZ (South Plainfield, NJ). Sequencing was performed on an Illumina HiSeq 2500 with 50 bp single reads, with at least 10 million reads generated per sample. The raw data have been submitted to GEO (GSE109594).

3.2.2 Bioinformatic analysis

The full-length proteins encoded by ORFs CJA_3010 (CjGH5D), CJA_3337 (CjCBM2-CBM10-GH5E) and CJA_2959 (CjFN3-GH5F) in the C. japonicus genome were screened for the presence of a signal peptide using SignalP 4.0 [167] and LipoP 1.0 [220]. The modular architecture of the three enzymes was obtained from BLASTP analysis and additional alignment with representative GH and CBM modules from the CAZy Database [57] using ClustalW [168].

3.2.3 Cloning of cDNA encoding protein modules

cDNA encoding the full-length enzymes CjSRL-GH5D, CjCBM2-CBM10-GH5E and CjFN3-GH5F, in addition to the catalytic domains CjGH5D, CjGH5E and CjGH5F, was PCR amplified from C. japonicus genomic DNA; all constructs were designed such that the native predicted signal peptide was removed (PCR primers are listed in Table 3.S1). The amplified CjSRL-GH5D, CjCBM2-CBM10-GH5E, CjFN3-GH5F, CjGH5D and CjGH5F products were double-digested with NheI and XhoI, gel purified and ligated into the respective sites of pET28a to fuse an N-terminal 6x His-tag. The amplified CjGH5E product was ligated into an SspI-linearized pMCSG53 vector using a Ligation Independent Cloning (LIC) strategy [221]. Successful cloning was confirmed by PCR and plasmid DNA sequencing. Q5 high-fidelity DNA polymerase was used for all PCR amplifications.

3.2.4 Gene expression and protein purification

Constructs were individually transformed into chemically competent E. coli Rosetta (DE3) cells. Colonies were grown on LB solid media containing kanamycin (50 µg mL-1) and chloramphenicol (30 µg mL-1) [CjSRL-GH5D, CjCBM2-CBM10-GH5E, CjFN3-GH5F, CjGH5D and CjGH5F], or containing ampicillin (50 µg mL-1) and chloramphenicol (30 µg mL-1) [CjGH5E]. One colony of the transformed E. coli cells was inoculated into 5 mL of LB medium containing the same antibiotics and grown overnight at 37 °C (200 rpm). The whole overnight culture was used to inoculate 500 mL of TB liquid medium containing the appropriate antibiotics. Cultures were grown at 37 °C (200 rpm) until OD600 = 0.6. Overexpression was induced by adding IPTG to a final concentration of 0.1 mM.
After induction, cultures were grown overnight at 16 °C (200 rpm). Cultures were then centrifuged and pellets were resuspended in 10 mL of E. coli lysis buffer containing 20 mM HEPES, pH 7.0, 500 mM NaCl, 40 mM imidazole, 5% glycerol, 1 mM DTT and 1 mM PMSF. Cells were then disrupted by sonication and the clear supernatant was separated by centrifugation at 4 °C (4200 g for 45 minutes). Recombinant proteins were purified from the clear soluble lysates on an FPLC system using a Ni2+-affinity column with gradient elution up to 100% elution buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 500 mM imidazole, and 5% glycerol. Purity of the recombinant proteins was determined by visualizing the protein contents of the fractions on SDS-PAGE. Pure fractions were pooled, concentrated, and buffer exchanged against 50 mM phosphate buffer (pH 7.0) containing 10% glycerol. Protein concentrations were then determined at 280 nm using an Epoch Micro-Volume Spectrophotometer System (BioTek®, USA), and identities of the expressed proteins were confirmed by intact mass spectrometry [171]. Purified proteins were then aliquoted and stored at -80 °C until needed.

3.2.5 Carbohydrate sources

Tamarind seed XyG, konjac glucomannan (KGM), barley β-glucan (BBG), wheat flour arabinoxylan, and beechwood xylan were purchased from Megazyme® (Bray, Ireland). Hydroxyethylcellulose (HEC) was purchased from Amresco® (Solon, USA). Carboxymethyl cellulose (CMC) was purchased from Acros Organics (New Jersey, USA). Guar gum was purchased from Sigma Aldrich® (St. Louis, USA). Xanthan gum was purchased from Spectrum® (New Brunswick, USA). 2-Chloro-4-nitrophenyl (CNP)-β-D-cellotrioside (GGG-β-CNP) and CNP-β-D-cellotetraoside (GGGG-β-CNP) were purchased from Megazyme®. XXXG-β-CNP and XLLG-β-CNP were prepared as previously described [172, 173]. Glc4-based XyGOs (XXXG, XLXG, XXLG, and XLLG; nomenclature according to [30]) and Glc8-based XyGOs were prepared from XyG powder (Innovassynth Technologies, Maharashtra, India) as previously described [150].

3.2.6 Carbohydrate analytics

High Performance Anion-Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD) was performed on a Dionex ICS-5000 DC HPLC system operated by the Chromeleon software version 7 (Dionex) using a Dionex Carbopac PA200 column. Solvent A was double-distilled water, solvent B was 1 M sodium hydroxide (NaOH), and solvent C was 1 M sodium acetate (NaOAc). The gradient used was: 0–4 min, 10% solvent B and 2.5% solvent C; 4–24 min, 10% B and a linear gradient from 2.5–25% C; 24–24.1 min, 50% B and 50% C; 24.1–25 min, an exponential gradient of NaOH and NaOAc back to initial conditions; and 25–31 min, initial conditions.

Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry was performed on a Bruker Daltonics Autoflex System (Billerica, USA). The matrix, 2,5-dihydroxybenzoic acid, was dissolved in 50% methanol in water to a final concentration of 10 mg mL-1. Oligosaccharide samples were mixed 1:1 (v/v) with the matrix solution. One µL of this solution was placed on a Bruker MTP 384 ground steel MALDI plate and left to air dry for two hours prior to analysis.

3.2.7 Enzyme kinetic analysis

All enzyme activities toward polysaccharides were determined using a bicinchoninic acid (BCA) reducing-sugar assay [174]. The effect of temperature on xyloglucanase activity was determined by incubating the recombinant catalytic domains, CjGH5D (0.098 µg), CjGH5E (0.086 µg), or CjGH5F (0.017 µg), with tamarind seed xyloglucan at a final concentration of 1 mg mL-1. Citrate buffer (pH 6, CjGH5D and CjGH5F) or phosphate buffer (pH 7.5, CjGH5E) was used at a final concentration of 50 mM in a total reaction volume of 200 µL. Reaction mixtures were incubated for 10 minutes at temperatures ranging from 25 °C to 80 °C prior to the BCA assay. To determine the pH-rate profile, the same XyG concentration was incubated with the same enzyme amounts, except for CjGH5D (0.049 µg), for 10 minutes at 50 °C (CjGH5D, CjGH5F) or 55 °C (CjGH5E), with a 50 mM final concentration of the following buffers: citrate (pH 3-6.5), phosphate (pH 6.5-8), and glycine (pH 8.5-9).

For qualitative activity assessment against the other polysaccharide substrates, 1 µg of each recombinant enzyme was added to XyG, HEC, CMC, BBG, KGM, wheat flour arabinoxylan, beechwood xylan, xanthan gum, or guar gum at a final concentration of 2 mg mL-1 in 200 µL reaction volumes containing 50 mM phosphate buffer (pH 7.5: CjGH5D and CjGH5E, or pH 7: CjGH5F). Mixtures were then incubated at 50 °C (CjGH5D and CjGH5F) or 55 °C (CjGH5E) for 10 minutes before the generated reducing ends were detected using the BCA assay.

To determine specific activity values of the CjGH5 enzymes toward XyG, final concentrations of 0.75, 2.59, and 0.71 nM of the recombinant purified catalytic modules CjGH5D, CjGH5E, and CjGH5F, respectively, were incubated with tamarind seed XyG (1 mg mL-1) in 200 µL reaction mixtures containing 50 mM phosphate buffer (pH 7.5: CjGH5D and CjGH5E, or pH 7: CjGH5F). Likewise, specific activity values toward HEC were obtained by incubating CjGH5E and CjGH5F at final concentrations of 1.04 and 0.65 µM, respectively, with 2 mg mL-1 HEC in 200 µL reaction mixtures containing 50 mM phosphate buffer (pH 7.5: CjGH5E, or pH 7: CjGH5F). For specific activity toward CMC, a final concentration of 1.04 µM of the purified catalytic module CjGH5E was incubated with CMC (2 mg mL-1) in a 200 µL reaction volume containing 50 mM phosphate buffer (pH 7.5). All reaction mixtures were incubated at 50 °C (CjGH5D and CjGH5F) or 55 °C (CjGH5E) for 10 minutes prior to the BCA assay, and all assays were performed in triplicate.
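As a rough illustration of how a rate from such fixed-time assays can be normalized to the enzyme concentration, the following Python sketch converts a reducing-end concentration (as would be read from a BCA standard curve) into an apparent turnover number. The product concentration is a hypothetical placeholder, not a measured value, and the function name is illustrative; only the enzyme concentration (0.75 nM CjGH5D) and the 10-minute incubation are taken from the protocol above. The sketch assumes that product formation is linear over the incubation period.

```python
def apparent_turnover_per_min(product_uM, minutes, enzyme_nM):
    """Return the rate of reducing-end formation per enzyme molecule (min^-1).

    product_uM: reducing ends formed, in µM (from a BCA standard curve)
    minutes:    incubation time
    enzyme_nM:  final enzyme concentration in the reaction, in nM
    """
    if minutes <= 0 or enzyme_nM <= 0:
        raise ValueError("incubation time and enzyme concentration must be positive")
    rate_uM_per_min = product_uM / minutes   # initial-rate approximation
    enzyme_uM = enzyme_nM / 1000.0           # nM -> µM
    return rate_uM_per_min / enzyme_uM


# Hypothetical reading: 15 µM reducing ends after 10 min with 0.75 nM enzyme.
HYPOTHETICAL_PRODUCT_UM = 15.0
turnover = apparent_turnover_per_min(HYPOTHETICAL_PRODUCT_UM, minutes=10.0, enzyme_nM=0.75)
print(f"apparent turnover: {turnover:.0f} min^-1")
```

Expressing the rate per mole of enzyme rather than per milligram avoids the need for a molecular mass, which is not stated in this section.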
To determine Michaelis-Menten parameters for XyG, eight different concentrations of XyG solutions were used over the range 0.025 to 1 mg mL-1. The recombinant enzyme CjGH5D (0.007 µg), CjGH5E (0.022 µg), and CjGH5F (0.006 µg) was individually incubated with each XyG concentration at 50 ⁰C (CjGH5D and CjGH5F) or 55 ⁰C (CjGH5E) for 10 min in a 200 µl final reaction mixture containing 50 mM phosphate buffer (pH 7.5: CjGH5D and CjGH5E, or pH 7: CjGH5F). Km and kcat values were determined by non-linear fitting of the Michaelis-Menten equation to the data in Sigmaplot® (Systat software Inc.)  To identify Michaelis-Menten constants for the chromogenic substrates, different dilution series were established to give final concentration ranges of 0.0625- 8 mM (CNP-β-GGG), 0.0625- 8 mM (CNP-β-GGGG), 0.002-4 mM (XXXG-β-CNP), and 0.002- 2 mM (XLLG-β-CNP). Substrate mixtures (225 µL) containing 50 mM phosphate buffer with the optimum pH of the enzyme were pre-incubated for 10 minutes at the optimum temperature of the enzyme (vide supra). Twenty five µL of 10X CjGH5D (to give 30-3800 nM final concentration according to the tested substrate), CjGH5E (3- 260 nM), and CjGH5F (4- 650 nM) was added to the substrate mixtures before the release of the aglycone was continuously monitored by measuring the change in absorbance at 405 nM for 2 minutes in a Cary50 UV–visible spectrophotometer (Varian). CNP molar extinction coefficients were determined to be 17288 M-1cm-1 in 50 mM phosphate buffer pH 7 and 17741 M-1cm-1 in 50 mM phosphate buffer pH 7.5. 3.2.8 Enzyme product analysis  To determine the limit-digest products of the CjGH5s, 5 µg of each recombinant enzyme was incubated with tamarind seed XyG at final concentration of 0.25 mg mL-1 for 7 hours (40 ⁰C) in a 200 µL reaction mixture that contained 50 mM phosphate buffer of the optimum pH of the tested enzyme (pH 7.5: CjGH5D and CjGH5E, or pH 7: CjGH5F). 
The reaction mixture was then diluted 5-fold prior to product analysis by HPAEC-PAD. To determine the mode of action of the enzymes, 0.01 µg of each CjGH5 was incubated at 40 °C with 1 mg mL-1 final concentration of tamarind seed XyG in 200 µL reaction volumes containing the same buffers used in the limit-digest analysis. The reaction was stopped at different time points by adding 100 µL of NH4OH. Reaction mixtures were then diluted 2-fold with water prior to product analysis by HPAEC-PAD.

3.2.9 Inhibition kinetics and active-site labeling

Inhibition kinetic parameters were determined as previously described [222]. Briefly, CjGH5D at a final concentration of 0.23 µM was incubated with a series of concentrations (0.5-16 mM) of XXXG-NHCOCH2Br at 40 °C in 20 mM phosphate buffer (pH 7.5) for up to 90 minutes. BSA was added to the inhibition mix to a final concentration of 0.1 mg mL-1 to prevent non-specific loss of activity. Small samples (10 µL) of the incubate were periodically diluted 1:100 in 20 mM phosphate buffer (pH 7.5), and 100 µL of the diluted incubate was added to 100 µL of the pre-incubated substrate XXXG-CNP at 40 °C (0.1 mM final substrate concentration in the assay). Residual activity of the enzyme was determined by measuring the rate of release of the chromophore 2-chloro-4-nitrophenolate [173] at 405 nm in an Agilent Cary 60 UV-Vis spectrophotometer. Initial-rate kinetics were measured in the strictly linear range of the enzyme. Equations 1 and 2 were used to determine Ki and ki values by non-linear regression curve-fitting using OriginPro 2015 software as previously described [222].

$$V = V_0 \exp(-k_{\mathrm{app}} t) + y_{\mathrm{offset}} \qquad (1)$$

$$k_{\mathrm{app}} = \frac{k_i [I]}{K_i + [I]} \qquad (2)$$

Intact protein masses were determined on a Waters Xevo Q-TOF with a nanoACQUITY UPLC system, according to the method previously published [171], with 2.5 mM inhibitor and 4.52 µM of enzyme.
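Equations 1 and 2 can be illustrated with a short numerical sketch. The code below is not the OriginPro fitting procedure used in this work; it only evaluates the two equations, taking as inputs the Ki (1.78 mM) and ki (0.17 min-1) values reported for CjGH5D in Section 3.3.4. The function names, the doubling series of inhibitor concentrations, and the choice of V0 = 1 with a zero offset are assumptions made for illustration.

```python
import math

# Illustrative evaluation of Equations 1 and 2 (not the original fitting script).
# Names are hypothetical; V0 = 1 and y_offset = 0 are assumed for illustration.


def k_app(inhibitor_mM: float, k_inact_per_min: float, K_i_mM: float) -> float:
    """Equation 2: apparent pseudo-first-order inactivation rate constant."""
    return k_inact_per_min * inhibitor_mM / (K_i_mM + inhibitor_mM)


def residual_activity(t_min: float, kapp: float,
                      v0: float = 1.0, y_offset: float = 0.0) -> float:
    """Equation 1: residual activity after t_min minutes of incubation."""
    return v0 * math.exp(-kapp * t_min) + y_offset


K_I_MM = 1.78           # Ki reported for CjGH5D (Section 3.3.4)
K_INACT_PER_MIN = 0.17  # ki reported for CjGH5D (Section 3.3.4)

if __name__ == "__main__":
    # Example concentrations (mM) spanning the 0.5-16 mM range that was tested.
    for conc in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
        kapp = k_app(conc, K_INACT_PER_MIN, K_I_MM)
        half_life = math.log(2) / kapp  # minutes needed to lose half the activity
        print(f"[I] = {conc:5.1f} mM  kapp = {kapp:.3f} min^-1  t1/2 = {half_life:.1f} min")
```

At inhibitor concentrations well above Ki, kapp approaches ki, which is why a plot of kapp against [I] saturates; at [I] = Ki, kapp is exactly half of ki.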
3.2.10 Crystallization, X-ray crystallography and structure solution

CjGH5D was crystallized in sitting drops by the vapour diffusion method, using protein at 21 mg mL-1 in 50 mM sodium citrate pH 6.5, 10% glycerol over a well solution comprising 1.9 M ammonium sulfate, 0.1 M 2-(N-morpholino)ethanesulfonic acid (MES) pH 5.5. The drop consisted of 0.5 µL enzyme, 0.1 µL seed stock and 0.4 µL well solution, and the seed stock was prepared by vortexing crystals, grown in 1.4 M ammonium sulfate, 0.1 M MES pH 5.5, 1% (w/v) polyethylene glycol 1,000, in an Eppendorf tube with a polystyrene bead. Crystals were harvested into liquid nitrogen using nylon CryoLoops™ (Hampton Research). A non-ligand-complexed “apo” dataset was collected from a crystal after immersing it for a few minutes in a cryoprotectant solution comprising the mother liquor supplemented with 20% (v/v) glycerol. Data were collected at Diamond beamline I04, processed using DIALS [223], and scaled using AIMLESS [224] to 1.6 Å. The space group is P2₁2₁2₁ with unit cell dimensions of 55.0, 96.4, 159.0 Å, and there are 2 molecules in the asymmetric unit. The structure was solved by molecular replacement using Phaser [225], with residues 138 to 500 of PDB entry 3zmr as the search model [34]; these align to residues 106 to 464 of CjGH5D, with which they share 38% identity (using the program lalign from the FASTA package [226]). The structure was built automatically using Buccaneer [227] and refined by cycles of manual model rebuilding in Coot [228] followed by refinement with REFMAC [229], including cycles of anisotropic B-factor refinement. In addition to 2 protein chains there are 20 molecules of glycerol, 5 sulfate ions and 3 molecules of PEG (introduced from the seed stock solution).
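The stated content of the asymmetric unit can be sanity-checked with a Matthews coefficient estimate. This calculation is not reported in the text above; the sketch assumes that the crystallized protein has the calculated mass of the His6-GH5D construct (44222.2 Da, Section 3.3.2), uses the four symmetry operators of space group P2₁2₁2₁, and applies the conventional approximation that the solvent fraction is 1 - 1.23/VM. Function names are hypothetical.

```python
# Illustrative Matthews-coefficient estimate (not reported in the original text).
# Assumes an orthorhombic cell (V = a*b*c) and a protein mass of 44222.2 Da.


def matthews_coefficient(a: float, b: float, c: float,
                         symmetry_ops: int, copies_per_asu: int,
                         molar_mass_Da: float) -> float:
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    cell_volume = a * b * c  # valid for orthorhombic cells only
    return cell_volume / (symmetry_ops * copies_per_asu * molar_mass_Da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


# Unit cell and asymmetric-unit content as stated for the apo dataset.
VM = matthews_coefficient(55.0, 96.4, 159.0,
                          symmetry_ops=4, copies_per_asu=2,
                          molar_mass_Da=44222.2)
SOLVENT = solvent_fraction(VM)

if __name__ == "__main__":
    print(f"VM = {VM:.2f} A^3/Da, solvent fraction = {SOLVENT:.2f}")
```

With two molecules per asymmetric unit, the estimate comes to about 2.4 Å3 Da-1 (roughly 48% solvent), a value that is unremarkable for a protein crystal and therefore consistent with the stated asymmetric unit content.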
A crystal of CjGH5D, grown as above over a well solution comprising 2.3 M ammonium sulfate, 0.1 M MES pH 5.5, was soaked for 27.5 hours in 1.85 M ammonium sulfate, 0.1 M MES pH 5.5 with 4.5 mM XXXG-NHCOCH2Br, and fished directly into liquid nitrogen. Data were collected at Diamond beamline I03 and processed using DIALS [223]. After scaling with AIMLESS [224], the data were cut off at a resolution of 2.1 Å: although the X-ray images showed significant spot smearing, the Rmerge and CC1/2 values were good (6.2% overall, 54.7% in the outer shell and 0.998 overall, 0.933 in the outer shell, respectively).

Crystals grown under similar conditions, over a well containing 1.6 M ammonium sulfate, 0.1 M MES pH 5.5, were soaked in the presence of a mixture of Glc12-based XyGOs (produced as described in [129]) at a concentration of 5 mM for 5 hours, before fishing into liquid nitrogen via a cryoprotectant solution, as for the apo crystal. A dataset was collected at Diamond beamline I04, processed using DIALS and scaled using AIMLESS to 1.9 Å.

Both ligand structures were solved initially using the apo structure as a model for REFMAC, and the ligand was placed after the protein chain had been rebuilt (using cycles of Coot interspersed with refinement in REFMAC) and some water molecules added. All models were validated using MolProbity [230] and the sugar conformations of the ligand in the complex structures were checked using Privateer [181]. Problems with diffraction anisotropy in both ligand datasets limited the possibility of refining the structures to R/Rfree lower than 0.22/0.28 and 0.23/0.30 for the complexes with XXXG-NHCOCH2 and GXLG (produced after the Glc12 soak), respectively.

3.2.11 Construction of C. japonicus mutants and growth conditions

In-frame deletion mutants were made as previously described by Nelson & Gardner and confirmed via PCR [157].
Briefly, 500 bp regions up- and down-stream of the genes of interest were amplified by PCR (CJA_3010, CJA_3337, and CJA_2959) or synthesized by GeneWIZ (South Plainfield, New Jersey, USA) (CJA_2477) and assembled into pK18mobsacB by the method of Gibson [231]. Deletions were confirmed by PCR. For a complete list of primers used see Table 3.S2. Cultures were grown at 30 °C with 200 RPM shaking in MOPS minimal media containing 0.25% (w:v) glucose or 0.5% (w:v) tamarind seed xyloglucan (Megazyme) as the sole carbon source in 18 mm test tubes or in 96-well flat-bottom polystyrene plates (Corning). Growth was measured using a Spec20D+ spectrophotometer (Thermo Scientific) or a Tecan plate reader (Tecan, Switzerland). All experiments were performed in biological triplicate. Statistical analysis was performed using the GraphPad Prism 6 software package (La Jolla, CA) where appropriate.

3.3 Results and Discussion

3.3.1 Transcriptomic analysis reveals a potential keystone endo-xyloglucanase from glycoside hydrolase (GH) family 5, subfamily 4.

It was previously shown via quantitative PCR (qPCR) that the C. japonicus gene cluster containing xyl31A (CJA_2706), bgl35A (CJA_2707), afc95A (CJA_2710) and CJA_2709 (Figure 3.1B) was up-regulated during growth on xyloglucan-containing medium [33]. Biochemical characterization has confirmed that xyl31A, bgl35A, and afc95A encode a XyGO-specific GH31 α-xylosidase, GH35 β-galactosidase, and GH95 α-L-fucosidase, respectively, while CJA_2709 was predicted to encode a TonB-dependent transporter (TBDT), but has not been assigned a gene name [33, 145, 166]. To aid identification of potential C. japonicus endo-xyloglucanases acting upstream of these enzymes, a comprehensive expression analysis via RNAseq was performed in the present study.
Samples were collected from both exponentially growing and stationary phase cells grown on glucose or xyloglucan as the sole carbon source to allow for analyses of gene expression based on early-stage substrate detection (Figure 3.2), late-stage substrate detection (Fig 3.S1A), or growth rate (Figure 3.S1B). During exponential growth there were 27 CAZyme-encoding genes significantly up-regulated on XyG, including the four genes of the C. japonicus XyG cluster, which corroborated previous qPCR results (Table 3.S3). Indeed, CJA_2709 (encoding a predicted TBDT) and CJA_2706 (xyl31A, encoding a GH31 α-xylosidase) were the second and third most-upregulated genes, which were preceded only by CJA_3010, which encodes a GH5 subfamily 4 (GH5_4) member previously annotated as cel5D [156]. Among the large and functionally diverse GH5 family [58], subfamily 4 is the only subfamily known to contain predominant endo-xyloglucanases [35], which suggested a keystone role for this enzyme in xyloglucan utilization by C. japonicus.  Notably, CJA_2477 (previously annotated as gly74 [156]; Figure 3.1B) was not significantly up-regulated during growth on XyG, despite the encoded GH74 endo-xyloglucanase being previously shown to have high, specific activity for this polysaccharide [219]. Instead, CJA_2477 appeared to be constitutively expressed at a low level (RPKM levels in the 100-200 range), as were 14 other predicted CAZyme-encoding genes (Table 3.S4). The remaining CAZyme genes up-regulated during exponential growth are predicted to have roles in the degradation of a diverse set of polysaccharides, which suggests that there is complex cross-regulation of expression. As xyloglucan is unlikely to be encountered alone during the saprophytic growth habit of C. japonicus, these results are suggestive of xyloglucan degradation being one component of a sophisticated plant cell wall degradation response. 
Indeed, when comparing the exponential phase to the stationary phase for xyloglucan-grown cells, a growth-phase-dependent response, manifested as a significant shift in the suite of expressed CAZyme genes, was observed (Figure 3.S1B, Table 3.S5). Additionally, only two genes of the XyG cluster, bgl35A and afc95A, were still up-regulated during stationary phase, together with 33 additional predicted and confirmed hemicellulase- and pectinase-encoding genes (Figure 3.S1A, Table 3.S6). Similar growth-phase-dependent responses have been previously observed during cellulose utilization by C. japonicus [165] and Clostridium thermocellum (now Ruminiclostridium thermocellum) [232, 233].

Figure 3.2. Volcano plots summarizing the RNAseq data for a comparative analysis of C. japonicus cells grown on either glucose or xyloglucan. The volcano plots represent a comparison between exponentially growing cells (glucose vs xyloglucan). Each gray circle denotes a single gene, and the blue-filled circles indicate up-regulated CAZyme genes. The complete list of up-regulated CAZyme genes can be found in Table S1. Fold change in gene expression (log2 scale) is plotted on the x-axis and p-value (-log10 scale) is plotted on the y-axis. For orientation on the x-axis (fold change), positive values indicate genes that are up-regulated when grown using xyloglucan as the sole carbon source. The red dashed lines indicate significance cut-off values (2-fold for gene expression and p-value of 0.01).

3.3.2 Bioinformatic analysis and recombinant production of GH5_4 members from C. japonicus

Spurred by the implication of the GH5 subfamily 4 member encoded by CJA_3010 in xyloglucan utilization by C. japonicus, we searched the genome for potential homologs. C.
japonicus encodes 15 GH5 members, of which only three belong to subfamily 4 ([156]; see http://www.cazy.org/b776.html): the aforementioned CJA_3010 (GenBank ACE84905.1, previously annotated as cel5D [156]), CJA_3337 (GenBank ACE83841.1, previously annotated as cel5E [156]), and CJA_2959 (GenBank ACE86198.1, previously annotated as cel5F [156]). Protein sequence analysis revealed that each of these gene products had a unique, multi-modular architecture that suggested the possibility of distinct cellular localization and biological function (Figure 3.3). In light of the lack of demonstrable activity on cellulose and high activity on xyloglucan (vide infra), the corresponding encoded enzymes are referred to as CjGH5D, CjGH5E, and CjGH5F hereafter.

Figure 3.3. Modular architecture of the native CjGH5_4 enzymes with the different expression constructs used in the current study. A) The locus CJA_3010 (GenBank ACE84905.1) encodes a signal peptide, a serine-rich linker, and a GH5 catalytic domain (CjGH5D). B) CJA_3337 (GenBank ACE83841.1) encodes a signal peptide, two carbohydrate-binding modules (CBM2 and CBM10), and a GH5 catalytic domain (CjGH5E). C) CJA_2959 (GenBank ACE86198.1) encodes a signal peptide, an FN3 domain, and a GH5 catalytic domain (CjGH5F). All expression constructs were designed to produce a 6x His-tag at the N-terminus of the recombinant protein.

The highly up-regulated CJA_3010 encodes a signal peptidase II lipoprotein signal peptide (predicted by LipoP 1.0 [234]), followed by a serine-rich linker and a GH5_4 catalytic module, and was thus predicted to be anchored extracellularly in the outer membrane by N-terminal cysteine lipidation. CJA_3337 encodes an N-terminal signal peptide (predicted by SignalP 4.0 [167]) and two carbohydrate-binding modules (CBMs [72]; CBM2 and CBM10) in train with a GH5_4 catalytic module.
CJA_2959 encodes a signal peptide (predicted by SignalP 4.0 [167]), a fibronectin type III (FN3) domain, an undefined region, and a C-terminal GH5_4 catalytic module. The presence of signal peptides, and CBMs in the case of CjGH5E, is indicative of extracellular secretion of both CjGH5E and CjGH5F. Amino acid alignment of the catalytic modules of CjGH5D, CjGH5E, and CjGH5F with those of highly specific GH5_4 endo-xyloglucanases [34, 108, 109] demonstrates conservation of the two catalytic glutamate residues, but low to moderate overall sequence conservation (26 to 45% identity) (Figure 3.S2, Table 3.S7). GH5_4 is one of the largest GH5 subfamilies and contains, in addition to specific endo-xyloglucanases, promiscuous endo-β(1,4)-glucanases, strict cellulases, and mixed-linkage endo-β(1,3)/β(1,4)-glucanases (reviewed in [35, 58]). As such, we undertook the recombinant production and enzymological characterization of the three C. japonicus GH5_4 members to precisely define their catalytic activities in the context of potential biological function. Our initial attempts to produce the full-length, multi-modular proteins recombinantly in E. coli by replacement of the native signal peptides with an N-terminal hexahistidine (His6) purification tag were consistently unsuccessful: intact protein mass spectrometry revealed proteolytic instability of His6-SRL-GH5D, while His6-CBM2-CBM10-GH5E and His6-FN3-GH5F had very poor production yields (data not shown). In contrast, His6-GH5D (Figure 3.3A) was produced as a stable, intact, active protein (calculated mass, 44222.2 Da; observed by ESI-MS, 44222.6 Da) in excellent yield (150 mg L-1).
Likewise, our attempts to produce the individual catalytic modules of CjGH5E and CjGH5F as N-terminally His6-tagged constructs (Figure 3.3B & C) were met with success (His6-CjGH5E calculated mass, 41367.1 Da; observed by ESI-MS, 41370.1 Da; His6-CjGH5F calculated mass, 40253.8 Da; observed by ESI-MS, 40253.9 Da) with approximate production yields of 14 and 9 mg L-1, respectively.

3.3.3 CjGH5_4 enzymes are highly efficient, specific endo-xyloglucanases.

In light of the subfamily membership of the three GH5_4 members, we anticipated that these enzymes might exhibit significant endo-hydrolytic activity towards XyG. Hence, this polysaccharide was used to determine pH and temperature optima. CjGH5D, CjGH5E, and CjGH5F each exhibited bell-shaped pH profiles, with the highest activity achieved in 50 mM phosphate buffer (pH 7.5 in the case of CjGH5D and CjGH5E, and pH 7 in the case of CjGH5F) (Figure 3.S3). When the three enzymes were incubated with XyG at different temperatures over the course of 10 minutes, optimum temperatures were identified as 50 °C (CjGH5D and CjGH5F) and 55 °C (CjGH5E) (Figure 3.S3). To determine the substrate specificity of the three GH5_4 members, a panel of nine soluble polysaccharide substrates was screened under these optimal conditions. Indeed, CjGH5D, CjGH5E, and CjGH5F all displayed high specific activity toward XyG (Table 3.1). No activity was detectable toward barley mixed-linkage 1,3/1,4-β-glucan, guar galactomannan, konjac glucomannan, beechwood xylan, wheat flour arabinoxylan, or xanthan for any of the three enzymes. CjGH5D appeared to strictly require the branched XyG structure, while CjGH5E demonstrated trace activities against the artificial 1,4-β-glucans hydroxyethylcellulose (HEC) and carboxymethylcellulose (CMC) at the highest tested substrate concentration (2 mg mL-1); specific activities were approximately 200- and 1500-fold lower than for XyG, respectively (Table 3.1).
Similarly, CjGH5F was able to hydrolyze HEC with an 800-fold lower specific activity than XyG, while no activity towards CMC was detected. Michaelis-Menten analysis for XyG further underscored the high XyG specificity of the three enzymes: remarkably low Km values were observed, and high kcat values recapitulated those previously observed for predominant endo-xyloglucanases, including CjGH74 [34, 110, 219, 235] (Table 3.1, Figure 3.S4).

Table 3.1. Activity of CjGH5_4 enzymes against different polysaccharide substrates^a.

Enzyme (catalytic domain) | Substrate | Km (mg mL-1) | kcat (s-1) | Specific activity (µmol min-1 mg-1)
CjGH5D | XyG | <0.025 | 30.3 ± 0.4 | 43.3 ± 1.9
CjGH5E | XyG | 0.02 ± 0.002 | 10.3 ± 0.1 | 15.1 ± 0.2
CjGH5E | hydroxyethylcellulose (HEC) | ND^b | ND^b | 0.07 ± 0.004
CjGH5E | carboxymethylcellulose (CMC) | ND^b | ND^b | 0.01 ± 0.002
CjGH5F | XyG | 0.04 ± 0.003 | 52.4 ± 0.8 | 74.8 ± 4.1
CjGH5F | hydroxyethylcellulose (HEC) | ND^b | ND^b | 0.09 ± 0.003

a Assays conducted at pH 7.5 (CjGH5D and CjGH5E) or pH 7 (CjGH5F). Recombinant enzymes were incubated at 50 °C (CjGH5D and CjGH5F) or 55 °C (CjGH5E) with the different tested substrates.
b Not determined due to poor specific activity.

Time-course analyses of native XyG polysaccharide hydrolysis products by HPAEC-PAD revealed that all three GH5_4 enzymes generated products of intermediate retention time in the early stages of the reactions, with no significant generation of the Glc4-based XXXG, XLXG, XXLG, and XLLG limit-digest products (Figure 3.S5, Figure 3.S6; cf. Figure 3.1). These results indicate that the three enzymes hydrolyze XyG through a dissociative, rather than processive [131], mechanism, and are thus canonical endo-xyloglucanases (EC 3.2.1.151; cf. EC 3.2.1.150, EC 3.2.1.155). The limit-digestion products further revealed that all C. japonicus GH5_4 enzymes specifically catalyze hydrolysis at the anomeric position of the unbranched glucose residues of the (galacto)XyG polysaccharide chain (Figure 3.1).
This cleavage pattern is typical for many GH5 [34, 108, 235], GH9 [112, 127], GH12 [108, 121-124], GH16 [106] and GH74 [129, 130, 133-135] endo-xyloglucanases, although certain GH5 [109, 110], GH7 [138], GH44 [126, 127] and GH74 [132, 136, 137] members preferentially hydrolyze the XyG backbone between branched glucosyl residues. The canonical XXXG-type XyGOs produced by CjGH5D, CjGH5E, and CjGH5F are direct substrates for the exo-glycosidases of the XyG cluster [33].

In light of the cleavage specificity of the GH5_4 members, we determined kinetic parameters for the hydrolysis of a panel of chromogenic oligosaccharides to reveal the contribution of side chain substitution to substrate recognition and catalysis (Table 3.2). All three enzymes were only weakly active on 2-chloro-4-nitrophenyl cellotrioside (GGG-β-CNP) and 2-chloro-4-nitrophenyl cellotetraoside (GGGG-β-CNP), with meager increases in kcat/Km value arising from the addition of potential -4 subsite binding for the cellotetraoside (Figure 3.4, Table 3.2; GH subsite nomenclature according to [91]). Strikingly, the addition of three α(1→6)-xylopyranosyl residues to the glucan backbone resulted in significant increases in catalytic efficiency for all GH5_4 members, which was manifested as 65-, 700-, and 150-fold higher kcat/Km values for XXXG-β-CNP vis-à-vis GGGG-β-CNP with CjGH5D, CjGH5E, and CjGH5F, respectively (Table 3.2, Figure 3.4). These values correspond to 11, 18, and 13 kJ/mol, respectively, of additional transition state stabilization in the formation of the covalent glycosyl-enzyme in these anomeric-configuration-retaining GH5 enzymes [236]. With XLLG-β-CNP, the specificity constants (kcat/Km) were increased only 1.5- to 5-fold relative to XXXG-β-CNP for the three endo-xyloglucanases, thus indicating that the extending β(1→2)-galactopyranosyl residues (Figure 3.1) have little additional effect on catalysis (Table 3.2).

Table 3.2.
Kinetic parameters of CjGH5_4 enzymes for (xylo)gluco-oligosaccharide glycosides

Enzyme (catalytic domain) | Substrate | Km (mM) | kcat (min-1) | kcat/Km (min-1 mM-1)
CjGH5D | GGG-CNP | ND^a | ND^a | 2.21 ± 0.05
CjGH5D | GGGG-CNP | ND^a | ND^a | 5.36 ± 0.07
CjGH5D | XXXG-CNP | 0.81 ± 0.1 | 281 ± 12 | 347 ± 45
CjGH5D | XLLG-CNP | 0.18 ± 0.02 | 162 ± 4 | 900 ± 103
CjGH5E | GGG-CNP | 11.8 ± 0.6 | 191 ± 7 | 16.2 ± 1.0
CjGH5E | GGGG-CNP | 5.02 ± 0.35 | 180 ± 7 | 35.9 ± 2.8
CjGH5E | XXXG-CNP | 0.01 ± 0.001 | 254 ± 5 | (25.4 ± 2.6) x 10^3
CjGH5E | XLLG-CNP | 0.01 ± 0.001 | 332 ± 9 | (33.2 ± 3.4) x 10^3
CjGH5F | GGG-CNP | ND^a | ND^a | 6.45 ± 0.30
CjGH5F | GGGG-CNP | ND^a | ND^a | 16.5 ± 1.0
CjGH5F | XXXG-CNP | 0.07 ± 0.01 | 169 ± 4 | (2.41 ± 0.35) x 10^3
CjGH5F | XLLG-CNP | 0.03 ± 0.002 | 393 ± 8 | (13.1 ± 0.9) x 10^3

a Not determined due to limited availability of substrate.

Figure 3.4. Michaelis-Menten kinetics of CjGH5_4 enzymes on a panel of chromogenic (xylo)gluco-oligosaccharide glycosides. A-D) CjGH5D. E-H) CjGH5E. I-L) CjGH5F. Error bars represent standard errors of the mean for 2 replicates. Only one replicate was done on XLLG-β-CNP due to limited availability of the substrate.

3.3.4 Covalent labeling of CjGH5D with an active-site-directed inhibitor

Active-site affinity-based inhibitors are important tools for the detailed kinetic analysis of GH enzymes [237]. In particular, N-bromoacetylglycosylamine derivatives of xyloglucan oligosaccharides have been previously demonstrated to be specific active-site affinity labels for endo-xyloglucanases [110, 222]. A time- and concentration-dependent inactivation of the enzyme CjGH5D was observed upon incubation with XXXG-NHCOCH2Br, which followed pseudo-first-order kinetics (Figure 3.5). The dissociation constant Ki and the irreversible inactivation rate ki towards CjGH5D were 1.78 ± 0.17 mM and 0.17 ± 0.01 min-1, respectively, resulting in a ki/Ki value (9.3 x 10-2 mM-1 min-1) that was comparable to that previously observed for a Prevotella bryantii GH5_4 member [110].
Notably, intact protein mass spectrometry of CjGH5D following incubation with the inhibitor indicated covalent labelling with 1:1 stoichiometry and no over-labelling of the enzyme (Figure 3.S7).

Figure 3.5. Inhibition kinetics of CjGH5D with XXXG-NHCOCH2Br. A) Initial-rate enzyme activity over time (single determinations). B) Pseudo-first-order rate constants (kapp) obtained from the fitted curves shown in panel A. Bars represent errors in kapp values from curve-fitting. The 95% confidence interval is indicated (pink band) for the fitted curve (solid line).

3.3.5 CjGH5_4 crystallography

The tertiary structure of the catalytic domain of CjGH5D was determined at 1.6 Å resolution in uncomplexed “apo” form by X-ray crystallography (Table 3.S8). The overall structure of CjGH5D (residues Gly96 to Gln468) is a (β/α)8 barrel, as is typical for GH5 family members (Figure 3.6A). Despite sequence identities in the 25-40% range, the structure is similar to the catalytic domains of many GH5 enzymes, most annotated as xyloglucanases, glucanases and lichenases, with typical alignments of approximately 310 residues with an r.m.s.d. of 1.3 Å [198]. The structure used for molecular replacement, BoGH5A (PDB: 3ZMR, [34]), for example, overlaps with an r.m.s.d. of 1.1 Å over 332 equivalent Cα atoms with 40% identity. There are a few minor differences in loops at the ends of helices in the two structures. BoGH5A has an extra loop, Val170-Gly180 (residues equivalent to Ile137-Gly138 in CjGH5D), which enables the formation of a hydrogen bond to the -4′-xylosyl residue of ligand XXXG (between N Val182 and the sugar ring O atom), see below.

A 1.9 Å-resolution product complex of CjGH5D was obtained by soaking crystals with a mixture of Glc12-based XyGOs of variable sidechain galactosylation (Table 3.S8). Here, we anticipated that the substrate mixture would be hydrolyzed and that the enzyme would selectively bind the oligosaccharide for which it had the best affinity.
Commensurate with limit-digest analysis, we observed a Glc4-based oligosaccharide backbone spanning the -4 to -1 subsites for both molecules in the asymmetric unit: GXLG in molecule A (with glucose in the -4 subsite and the -3′-xylosyl group modelled at occupancies of 0.5 and 0.7, respectively) and GXXG in molecule B (here, there was insufficient electron density in the Fo-Fc difference map to allow unambiguous modelling of a galactose on the -2′-xylosyl unit). In the -1 subsite, the glucosyl residue interacts with the catalytic acid/base Glu255 (via O1) and the nucleophile Glu390 (via O2). In addition, O3 is hydrogen bonded to His208. In molecule B, the equivalent glucose also hydrogen bonds via O2 to Asn254 and His208 (Figure 3.6B & C).

Figure 3.6. Three-dimensional structure of CjGH5D in complex with XXXG-NHCOCH2Br and XyGOs. A) Cartoon representation of the secondary structure of CjGH5D colour ramped from the N-terminus (blue) to the C-terminus (red). The two ligands XXXG-NHCOCH2Br and GXLG are overlaid in the active site cleft and shown in green and magenta sticks, respectively. B) A close-up view of the active site cleft with the overlaid ligands XXXG-NHCOCH2Br in green and XXLG in magenta showing different amino acids interacting with the carbohydrate ligands. C) 2Fo-Fc (σA/maximum likelihood weighted) electron density contoured in blue around GXLG in the CjGH5D-XXLG complex (left panel) and the chemical structure of the corresponding ligand (right panel). Insufficient electron density was observed for the -4′-xylosyl residue to allow modelling, therefore it is shown in grey. D) 2Fo-Fc electron density at 1σ (approx. 0.2 e-/Å3) contoured in blue around the XXXG-NHCOCH2 moiety in the CjGH5D-XXXG-NHCOCH2Br complex (left panel) and chemical structure of the corresponding ligand (right panel). The bromide leaving group is shown in grey.
A second oligosaccharide complex was obtained at 2.1 Å resolution by soaking CjGH5D crystals with the N-bromoacetyl affinity label XXXG-NHCOCH2Br, in which the reagent had indeed reacted through nucleophilic attack of the enzyme's general acid/base sidechain to displace the bromide nucleofuge, see below (Table 3.S8). In molecule A of the asymmetric unit there is electron density for GXXG-NHCOCH2–CjGH5D, whilst in molecule B, XXXG-NHCOCH2–CjGH5D is modelled, but with the -3′- and -4′-xylosyl sugars modelled at half occupancy. The carboxyl oxygen of the N-acetyl moiety forms a hydrogen bond with His323. There are hydrogen bonds between the subsite -1 sugar and the catalytic nucleophile Glu390, and also to His208 and Asn254 (Figure 3.6B & D). These are similar to the interactions observed in the XXXG-NHCOCH2–PbGH5A complex structure, in which the ligand occupies the -4 to -1 subsites (PDB: 5D9P, [110]). Glucose in the -2 subsite is hydrogen bonded via O3 to ND2 Asn132 and via O2 to NE1 Trp432; this latter interaction is long at approximately 3.2 Å, which may reflect the positioning of the tryptophan as the -1 subsite stacking residue. The equivalent Asn/Trp interactions are also seen in related enzymes: Asn28 and Trp324 in XXXG-NHCOCH2–PbGH5A (PDB: 5D9P) and Asn165 and Trp472 in the BoGH5A-XXXG complex (PDB: 3ZMR). In addition to Trp432, Trp143 provides aromatic stacking interactions with glucose in the -3 subsite (as do Trp324 and Trp48 in PbGH5A, and Trp472 and Trp185 in BoGH5A) and Trp209 lies against the -2′-xylosyl residue (as does the equivalent Trp252 in BoGH5A). This pattern of conserved/highly invariant residues interacting with the xyloglucan chain presumably accounts for the fact that, despite sharing amino acid identity as low as 30%, these enzymes are all tailored for xyloglucan as a substrate (Figure 3.7). None of these three GH5 structures exhibits direct interactions of glucose in the -3 and -4 subsites with the protein.
The -3′-xylosyl unit is tethered by two hydrogen bonds between O3 and O4 and Asp438, which hold the sugar perpendicular to the orientation of the equivalent xylose in the XXXG-NHCOCH2–PbGH5A and BoGH5A:XXXG complexes (in the latter, the xylose lies parallel to the side chain of Tyr476). The covalent adduct formation through the reactivity of the N-bromoacetyl reagent is fascinating given that, in the structures observed here, the attack is made by the acid/base Glu255, as opposed to the nucleophile of the enzymatic reaction, Glu390. This latter residue is poised for nucleophilic attack at the anomeric carbon, C1, of the -1 subsite glucoside. However, Glu390 is too distant (6-7 Å) from the reactive carbon of the N-bromoacetyl moiety, and is subject to impossible geometry and steric hindrance, to permit nucleophilic interception. The reactive group, however, is located in the +1 subsite, some 3.8 Å from C1, and thus can be fortuitously attacked by the acid/base, which is in an almost ideal position for SN2 attack on the reactive carbon to displace the bromide. Such a reaction is facilitated, either prior or subsequent to attack, by rotation around the CB-CG bond (Figure 3.6D), which leaves the side-chain in a different rotamer after the reaction relative to its “normal” position in unreacted complexes.

Figure 3.7. Divergent (wall-eyed) stereo surface representation of CjGH5D-GXLG showing regions of sequence conservation. Surfaces of conserved and non-conserved residues, shown in purple at reduced opacity and sea-green, respectively, were calculated from an amino acid sequence alignment of GH5 domains of CjGH5D, CjGH5E, CjGH5F and five additional GH5 members showing E.C. 3.2.1.151 activity (made using ESPript 3.0 (Figure 3.S2)). Figure was generated using CCP4MG [199].

3.3.6 Mutational analysis of C. japonicus GH5_4 genes indicates a complex mode of action for the initial stages of xyloglucan degradation.
We have recently developed improved genetic techniques for C. japonicus, which enable in-frame deletions for the precise construction of mutant strains [157]. Hence, we embarked on a comprehensive reverse-genetic analysis in an attempt to delineate the biological functions of the individual GH5_4 and GH74 endo-xyloglucanases in C. japonicus in light of their broadly similar catalytic properties. In-frame deletion mutants were first generated in the XyG gene cluster encoding the three exo-glycosidases and the TBDT (Figure 3.1B) to provide benchmark controls for subsequent analysis of endo-xyloglucanase deletion mutants. Recapitulating our previous work using insertional mutants [33], an in-frame xyl31A (α-xylosidase) mutant was unable to grow on XyG due to an inability to remove non-reducing-terminal xylosyl residues as the first essential step in XyGO saccharification (Figure 3.8A cf. Figure 3.1). A CJA_2709 (TBDT) single mutant strain had a significant growth defect, presumably resulting from a decreased ability to take up extracellularly produced XyGOs into the periplasm. The deletion of bgl35A also attenuated growth, due to an inability of the strain to access the full complement of sidechain monosaccharides. As expected, growth of the afc95A (α-L-fucosidase) mutant on tamarind (galacto)xyloglucan was identical to that of the wild-type strain, because this readily available substrate lacks the terminal fucosyl residues typically found in dicot primary cell wall XyG (Figure 3.1A). Likewise, all XyG gene cluster mutant strains grew similarly to wild type in glucose-containing medium (Figure 3.S8). With these control experiments complete, we next analyzed the effect of deleting the individual GH5_4- and GH74-encoding genes. Despite original indications by RNAseq analysis of a potential lead role for CjGH5D in XyG utilization, in-frame deletion of CJA_3010 surprisingly did not elicit a statistically significant growth defect (Figure 3.8B).
Likewise, strains containing single in-frame deletions of CJA_3337, CJA_2959, and CJA_2477 grew identically to the wild-type strain. Moreover, comprehensive combinatorial mutagenesis did not yield a strain with a substantial growth defect for any combination of double, triple, or quadruple mutants (Figure 3.8C & D). CjGH5D is predicted to be attached to the exterior face of the outer membrane by N-terminal lipidation, while CjGH5E, CjGH5F, and CjGH74 are likely secreted enzymes (vide supra). As such, we hypothesized that other secreted enzymes, with either predominant or side hydrolytic activities toward XyG, may be enabling growth of the quadruple mutant lacking the four known, dedicated endo-xyloglucanases. Deletion of the Type Two Secretion System (T2SS) in the Δgsp mutant has been previously shown to abolish the ability of C. japonicus to secrete cellulases [159], and constitutes a powerful tool to restrict extracellular secretion of CAZymes in general. Interestingly, introduction of the CJA_3010 deletion into the Δgsp background failed to cause any additional growth attenuation on xyloglucan (Figure 3.9). With the T2SS extracellular secretion pathway disabled, the ability of the Δgsp ΔCJA_3010 strain to grow on XyG strongly suggests the presence of other membrane-bound XGases that effect XyG depolymerization in a physiologically relevant manner.

Figure 3.8. Growth analysis of GH5_4 and GH74 in-frame deletion mutant strains on xyloglucan. Cultures were grown for 24 hours at 30 °C with high aeration (200 RPM) in MOPS defined media supplemented with 0.5% (w:v) xyloglucan as the sole carbon source. Graphs represent the average of three biological replicates and error bars represent the standard deviation. A) Control experiment with XyGUL mutant strains, with these in-frame deletion mutants growing as previously described (see text).
B) Single, C) double, D) triple and quadruple deletion mutants were made with the GH5_4 and GH74 genes; CJA_3010 encodes CjGH5D, CJA_3337 encodes CjGH5E, CJA_2959 encodes CjGH5F, and CJA_2477 encodes CjGH74A. All strains grew like wild type in MOPS-glucose defined medium (Figure 3.S8).

Figure 3.9. Growth analysis of ΔCJA_3010 and Δgsp mutant strains when using glucose or xyloglucan. Cultures were grown in 18 mm test tubes at 30 °C with shaking at 200 RPM using MOPS minimal media supplemented with A) 0.25% (w:v) glucose or B) 0.5% (w:v) xyloglucan as the sole carbon source. Open circles represent wild type, closed squares represent Δgsp, open triangles represent ΔCJA_3010 (CJA_3010 encodes CjGH5D), and inverted closed triangles represent ΔCJA_3010 Δgsp. Graphs depict the average of biological triplicate experiments, and the error bars represent the standard deviation.

Predominant xyloglucanase activity has been demonstrated previously in members of CAZyme families GH5, GH7, GH9, GH12, GH16, GH44, and GH74, and may potentially constitute a side activity in other endo-(1,4)-β-glucanases (cellulases) [35, 150]. Examination of the C. japonicus genome indicates the presence of multiple GH5 (n = 15), GH9 (n = 3), and GH16 (n = 9) encoding genes, in addition to the single GH74 member ([156]; for a summary table, see http://www.cazy.org/b776.html). Further, Deboy, et al. [156] predicted that there are approximately 45 membrane-bound CAZymes. Although it constitutes a significant undertaking that is beyond the scope of the present study, our future investigations will focus on scrutinizing these additional CAZymes in the context of XyG utilization by C. japonicus.

3.4 Conclusion

We previously proposed a model of XyG utilization by C.
japonicus, in which an extracellular endo-xyloglucanase mediates degradation of the polysaccharide to XyGOs for uptake via the TBDT, followed by complete hydrolysis to monosaccharides in the periplasm by the exo-glycosidases encoded by the XyG gene cluster [33]. Our present study, combining biochemical and reverse-genetic analyses, reveals that the number of actors in the initial cleavage event is significantly greater than originally anticipated by bioinformatics. We propose that the existence of so many extracellular endo-xyloglucanases of apparently overlapping biochemical function can be explained by a physiological interplay of secreted reconnaissance enzymes and cell-surface-bound, proximal XyG degraders (Figure 3.1B). Thus, the secreted GH74 and two secreted GH5_4 enzymes may act as highly mobile, primary “unravellers” of the plant cell wall, liberating large XyG fragments from the lignocellulose matrix. Indeed, the concept of “sensing” polysaccharidases playing a lead role in generating inducers has been previously proposed [238]. As plant cell wall polysaccharide degradation advances, more intimate contact between the bacterial cell surface and the substrate may ensue, engaging the outer-membrane-bound CjGH5D and enabling a more efficient interplay between XyG backbone hydrolysis and direct TBDT-mediated uptake of the oligosaccharide products. The coordinated capture, hydrolysis, and uptake of partially hydrolyzed polysaccharides as a successful competitive strategy has considerable precedent in the Polysaccharide Utilization Loci of the Bacteroidetes [212]. Moreover, the need to initiate cell wall “unravelling” has been suggested to explain why saprophytes such as C. thermocellum, which are unable to utilize xyloglucan or xylan for growth, contain endo-xyloglucanases and endo-xylanases within their cellulosomes [112, 239].

3.5 Supporting information

3.5.1 Supporting tables

Table 3.S1. Primer sequences used for recombinant protein production.
Primer              Recombinant protein  Oligonucleotide sequenceᵃ
CjGH5D-Full-NheI-F  CjSRL-GH5D           5′-GACCGCTAGCATGTGTGGTAGCGCCGGTGGCGGCTC-3′
CjGH5D-XhoI-R       CjSRL-GH5D           5′-GGTCCTCGAGTTATTGTGCTCCTGCGCCCTCC-3′
CjGH5D-NheI-F       CjGH5D               5′-GACCGCTAGCGGGCTTTATCCCAGTTACAACACC-3′
CjGH5D-XhoI-R       CjGH5D               5′-GGTCCTCGAGTTATTGTGCTCCTGCGCCCTCC-3′
CjGH5E-Full-NheI-F  CjCBM2-CBM10-GH5E    5′-GACCGCTAGCATGCAAACAGCCAGTTGTAAGTATG-3′
CjGH5E-Full-XhoI-R  CjCBM2-CBM10-GH5E    5′-GGTCCTCGAGTCAGAAGGTTGCGTTTACAATCG-3′
CjGH5E-LIC-F        CjGH5E               5′-TACTTCCAATCCAATGCCATGCTGACCAGTGTGGAGTTAACGCGC-3′
CjGH5E-LIC-R        CjGH5E               5′-TTATCCACTTCCAATGTTATCAGAAGGTTGCGTTTACAATCGC-3′
CjGH5F-Full-NheI-F  CjFN3-GH5F           5′-GACCGCTAGCATGCAGAATTGCGGCAGCGGTGGCG-3′
CjGH5F-XhoI-R       CjFN3-GH5F           5′-GGTCCTCGAGTTATTGCGCCGCATTGATAATGG-3′
CjGH5F-NheI-F       CjGH5F               5′-GACCGCTAGCAGTGTGCAATTGGCCAGGTTGATG-3′
CjGH5F-XhoI-R       CjGH5F               5′-GGTCCTCGAGTTATTGCGCCGCATTGATAATGG-3′
ᵃ Underlined sequences are the restriction sites.

Table 3.S2. Primers used for generation of in-frame deletion mutants.

Primer nameᵃ        Source      Sequence
CJA_3010 UP (5′)    This study  GCTATGACATGATTACGAATTCCCAGCGCGATAAAGAGCAG
CJA_3010 UP (3′)    This study  GTTCATCATTTACACCGCTCCATTGCC
CJA_3010 DOWN (5′)  This study  CAGCGGTGTAAATGATGAACAACTCCATCGCCTATATC
CJA_3010 DOWN (3′)  This study  CGACGGCCAGTGCCAAGCTTACCCACCGTCACTGTTAAAACG
CJA_3010 INT (5′)   This study  GGTGGTGAATTCGCAGTTGGTCCAGC
CJA_3010 INT (3′)   This study  GGTGGTTCTAGACAGGTCCATCTGCT
CJA_3337 UP (5′)    This study  GCTATGACATGATTACGAATTCCGCGCTCTTGTGTTTGTAATCG
CJA_3337 UP (3′)    This study  ATGGGGTGTCACATTAGCGTTATTCTCCGTTGACAC
CJA_3337 DOWN (5′)  This study  CGCTAATGTGACAGGGCATTTGCCAGC
CJA_3337 DOWN (3′)  This study  CAGTGCCAAGCTTTGGTGCAGGTGCTGTTA
CJA_3337 INT (5′)   This study  GGTGGTGAATTCGGTGTCATCGTCCC
CJA_3337 INT (3′)   This study  GGTGGTTCTAGAACGCCGGGTCAAT
CJA_2959 UP (5′)    This study  GCTATGACATGATTACGAATTCAAACAAAATTACCCTGGTGC
CJA_2959 UP (3′)    This study  CACGATATTACATTTTATTATTTTCCTTTAGCTGATGGATGGA
CJA_2959 DOWN (5′)  This study  ATAATAAAATGTAATATCGTGCGGGAAAGCGTG
CJA_2959 DOWN (3′)  This study  GCCAGTGCCAAGCTTAGAGAAATTAATTTCCAC
CJA_2959 INT (5′)   This study  GGTGGTGAATTCCGACACCAGTGCG
CJA_2959 INT (3′)   This study  GGTGGTTCTAGACACGGGGATGCGG
CJA_2477 CONF (5′)  This study  AGGAAACCGGGTGTTAC
CJA_2477 CONF (3′)  This study  GTACACCCTTGGGTA
CJA_2477 INT (5′)   This study  TGAATGAATCCGTTGAGA
CJA_2477 INT (3′)   This study  GTAAGGCACTAATAGCCG
ᵃ CJA_3010 encodes CjGH5D, CJA_3337 encodes CjGH5E, CJA_2959 encodes CjGH5F, and CJA_2477 encodes CjGH74.

Table 3.S3. List of genes up-regulated during exponential growth on xyloglucan compared to glucoseᵃ.

Locus ID  Geneᵇ     Fold changeᶜ  p-valueᵈ  Predicted functionᵇ
CJA_3010  cel5D     4.9           5.0       cellulase
CJA_2709  CJA_2709  4.4           4.8       TBD-transporter
CJA_2706  xyl31A    4.2           4.0       α-xylosidaseᵉ
CJA_0491  gal53A-1  4.2           3.2       arabinogalactan endo-1,4-β-galactosidase
CJA_3008  xyl39A    4.1           4.9       β-xylosidase
CJA_3007  gly43H    4.0           5.0       β-xylosidase/α-L-arabinofuranosidaseᶠ
CJA_2707  bgl35A    3.6           3.0       β-galactosidaseᵍ
CJA_2769  abf51A    3.6           4.6       α-L-arabinofuranosidaseʰ
CJA_0496  bgl2A     3.5           3.5       β-galactosidase
CJA_1140  bgl3D     3.1           3.7       cellodextrinase
CJA_2710  afc95A    3.1           2.7       α-L-fucosidaseᵍ
CJA_0492  gal53B    2.8           3.9       arabinogalactan endo-1,4-β-galactosidase
CJA_0497  gal53A-2  2.5           2.9       arabinogalactan endo-1,4-β-galactosidaseⁱ
CJA_0246  aga27A    2.3           2.3       α-galactosidaseʲ
CJA_3018  gly43N    1.9           2.1       β-xylosidase/α-L-arabinofuranosidaseᶠ
CJA_0818  gly43D    1.8           2.8       β-xylosidase/α-L-arabinofuranosidaseᶠ
CJA_0181  pme8C     1.7           2.5       pectin methylesterase
CJA_0007  cbp2A     1.6           4.5       carbohydrate binding protein
CJA_0398  amy13F    1.6           2.1       α-amylase
CJA_3763  xyn11A    1.6           2.1       endo-1,4-β-xylanaseᵏ
CJA_0799  gly43E    1.5           2.5       β-xylosidase/α-L-arabinofuranosidaseᶠ
CJA_0817  agd97A    1.5           2.3       α-glucosidase
CJA_0223  bgl3C     1.4           2.0       β-glucosidase
CJA_1182  chi19A    1.3           2.1       chitinase
CJA_2469  cbp2F     1.3           2.3       carbohydrate binding protein
CJA_0819  abf43M    1.1           2.1       α-L-arabinofuranosidaseᶠ
CJA_3120  pel1G     1.1           2.1       pectate lyase
CJA_2869  cbp26A    1.1           3.4       carbohydrate binding protein
ᵃ RNASeq sampling performed in biological triplicate
ᵇ Gene names and predicted functions according to Deboy et al. [156]
ᶜ log2 scale
ᵈ –log10 conversion
ᵉ Function confirmed by Larsbrink, et al. [145]
ᶠ Function confirmed by Cartmell, et al. [240]
ᵍ Function confirmed by Larsbrink, et al. [33]
ʰ Function confirmed by Beylot, et al. [241]
ⁱ Function confirmed by Braithwaite, et al. [242]
ʲ Function confirmed by Halstead, et al. [243]
ᵏ Function confirmed by Millward-Sadler, et al. [244]

Table 3.S4. List of genes with low-level constitutive expression on xyloglucanᵃ.

Locus ID  Geneᵇ     RPKMᶜ (Exp)  RPKM (Sta)  Predicted functionᵇ
CJA_0805  arb43A    102          130         α-L-arabinofuranosidase
CJA_3139  lpmo10B   169          102         lytic polysaccharide mono-oxygenaseᵈ
CJA_0276  cbp6B     110          125         carbohydrate binding protein
CJA_3300  cbp6C     134          100         carbohydrate binding protein
CJA_0374  cel45A    157          153         cellulase
CJA_0619  cel5H     100          108         endo-1,4-β-glucanase
CJA_3286  ebg98     118          165         endo-β-galactosidase
CJA_3287  fee1A     116          104         feruloyl esterase
CJA_3282  fee1B     152          241         feruloyl esterase
CJA_2477  gly74A    126          123         endo-1,4-β-glucanase/xyloglucanase
CJA_0384  pel10C    139          115         pectate lyase
CJA_2413  pel3B     130          126         pectate lyase
CJA_0172  pga28A    119          177         polygalacturonase
CJA_0284  tre37B    178          136         trehalase
CJA_3762  xyn11B    139          140         endo-1,4-β-xylanase
ᵃ RNASeq sampling performed in biological triplicate
ᵇ Gene names and predicted functions according to Deboy et al. [156]
ᶜ Reads per kilobase per million mapped reads (average)
ᵈ Function confirmed by Gardner, et al. 2014 [165]

Table 3.S5.
List of genes up-regulated during stationary phase compared to exponential phase on xyloglucanᵃ.

Locus ID  Geneᵇ     Fold changeᶜ  p-valueᵈ  Predicted functionᵇ
CJA_1883  gly57A    4.5           2.5       glycoside hydrolase
CJA_1522  amy13B    2.0           3.2       α-amylase
CJA_1497  bgl3B     1.7           2.4       β-glucosidase
CJA_1923  acm73B    1.7           2.3       endo-β-N-acetylglucosaminidase
ᵃ RNASeq sampling performed in biological triplicate
ᵇ Gene names and predicted functions according to Deboy et al. [156]
ᶜ log2 scale
ᵈ –log10 conversion

Table 3.S6. List of genes up-regulated during stationary phase on xyloglucan compared to glucoseᵃ.

Locus ID  Gene      Fold changeᶜ  p-valueᵈ  Predicted functionᵇ
CJA_2618  amy13A    3.0           3.3       α-amylase
CJA_0806  abf43L    3.0           2.9       α-L-arabinofuranosidaseᵉ
CJA_2611  chi18D    2.8           2.8       endo-chitinase
CJA_3247  amy13H    2.7           2.3       α-amylase
CJA_2710  afc95A    2.6           2.3       α-L-fucosidaseᶠ
CJA_0020  cbp35A    2.5           3.0       carbohydrate binding protein
CJA_3248  agd31B    2.3           2.4       α-glucosidaseᵍ
CJA_2610  bgl2C     2.2           2.6       β-galactosidase
CJA_1497  bgl3B     2.2           2.3       β-glucosidase
CJA_0849  cgs94A    2.1           2.2       cyclic β-1,2-glucan synthetase
CJA_3263  cgt13B    2.1           2.4       cyclomaltodextrin transferase
CJA_0042  pel1D     2.1           2.5       pectate lyase
CJA_3281  abf62A    2.0           2.8       α-L-arabinofuranosidase
CJA_2769  abf51A    2.0           2.5       α-L-arabinofuranosidase
CJA_2770  man26A    1.9           4.2       endo-1,4-β-mannanase
CJA_2616  cbp2D     1.9           3.5       carbohydrate binding protein
CJA_0044  pel1C     1.9           2.7       pectate lyase
CJA_0816  gly43C    1.8           3.0       β-xylosidase/α-L-arabinofuranosidaseᵉ
CJA_3066  xyn10C    1.8           2.4       endo-1,4-β-xylanaseʰ
CJA_2040  pel10B    1.7           2.7       pectate lyase
CJA_1140  bgl3D     1.7           3.0       cellodextrinase
CJA_0450  axe2C     1.7           3.8       acetyl xylan esteraseⁱ
CJA_0224  glu16B    1.7           2.2       β-glucanase
CJA_2707  bgl35A    1.5           2.1       β-galactosidaseᶠ
CJA_2872  agd31A    1.5           2.3       α-glucosidase
CJA_2887  gla67A    1.4           2.9       α-glucuronidaseʲ
CJA_0350  hex20A    1.4           2.5       N-acetyl-β-hexosaminidase
CJA_0225  glu16A    1.3           2.1       β-glucanase
CJA_2993  chi18C    1.3           2.2       endo-chitinase
CJA_3279  xyn5A     1.3           2.3       endo-1,4-β-xylanase
CJA_3470  man5C     1.3           2.1       endo-1,4-β-mannanase
CJA_3280  xyn10B    1.3           3.2       endo-1,4-β-xylanaseᵏ
CJA_3070  gly43G    1.2           2.3       β-xylosidase/α-L-arabinofuranosidaseᵉ
CJA_0559  cbp35B    1.1           2.2       carbohydrate binding protein
ᵃ RNASeq sampling performed in biological triplicate
ᵇ Gene names and predicted functions according to Deboy et al. [156]
ᶜ log2 scale
ᵈ –log10 conversion
ᵉ Function confirmed by Cartmell, et al. [240]
ᶠ Function confirmed by Larsbrink, et al. [33]
ᵍ Function confirmed by Larsbrink, et al. [245]
ʰ Function confirmed by Millward-Sadler, et al. [244]
ⁱ Function confirmed by Montanier, et al. [246]
ʲ Function confirmed by Nagy, et al. [247]
ᵏ Function confirmed by Kellett, et al. [248]

Table 3.S7. Identity matrix between CjGH5_4 enzymes and other specific previously characterized endo-xyloglucanasesᵃ.

          CjGH5Dᵇ  CjGH5Eᶜ  CjGH5Fᵈ  BoGH5ᵉ   PpXG5ᶠ   XEG5Aᵍ
CjGH5Dᵇ   ----     41%      41%      35%      35%      26%
CjGH5Eᶜ   41%      ----     65%      41%      42%      28%
CjGH5Fᵈ   41%      65%      ----     45%      41%      29%
BoGH5ᵉ    35%      41%      45%      ----     33%      24%
PpXG5ᶠ    35%      42%      41%      33%      ----     29%
XEG5Aᵍ    26%      28%      29%      24%      29%      ----
ᵃ Only amino acid sequences of the catalytic domains were used in the analysis.
ᵇ CJA_3010 (ACE84905.1).
ᶜ CJA_3337 (ACE83841.1).
ᵈ CJA_2959 (ACE86198.1).
ᵉ Bacteroides ovatus GH5A (EDO11444.1).
ᶠ Paenibacillus pabuli XG5 (WP_017688986.1).
ᵍ XEG5A from rumen microflora metagenomics library (ACZ54907.1).

Table 3.S8. X-ray data collection and refinement statistics for CjGH5D.
                                  Apo-structure            XXXG-NHCOCH2             GXLG
Data collection
Space group                       P2₁2₁2₁                  P2₁2₁2₁                  P2₁2₁2₁
Cell dimensions
  a, b, c (Å)                     55.0, 96.4, 159.0        55.5, 97.6, 157.2        55.3, 95.8, 157.2
  α, β, γ (°)                     90.0, 90.0, 90.0         90.0, 90.0, 90.0         90.0, 90.0, 90.0
Resolution (Å)                    82.58-1.60 (1.63-1.60)   82.92-2.10 (2.16-2.10)   81.96-1.90 (1.95-1.90)
Rsym or Rmerge                    0.060 (1.717)            0.059 (0.547)            0.084 (1.465)
Rpim                              0.024 (0.695)            0.028 (0.256)            0.031 (0.592)
CC1/2                             0.999 (0.699)            0.998 (0.933)            0.998 (0.813)
I / σI                            11.8 (1.0)               12.8 (2.7)               8.5 (0.8)
Completeness (%)                  100.0 (100.0)            100.0 (100.0)            100.0 (100.0)
Redundancy                        7.8 (7.9)                6.2 (6.3)                7.7 (7.6)
Refinement
No. reflections                   106661                   48113                    63393
Rwork / Rfree                     0.14/0.20                0.22/0.28                0.23/0.30
No. atoms
  Protein                         5936                     5792                     5843
  Ligand/ion                      200                      178                      168
  Water                           587                      186                      98
B-factors (Å²)
  Protein                         34                       50                       56
  Ligand/ion                      54                       47                       58
  Water                           45                       47                       44
R.m.s. deviations
  Bond lengths (Å)                0.018                    0.017                    0.018
  Bond angles (°)                 1.7                      1.8                      1.9
Ramachandran plot residues
  In most favorable regions (%)   96.4                     96.7                     95.9
  In allowed regions (%)          3.6                      3.1                      3.8
PDB code                          5OYC                     5OYD                     5OYE

3.5.2 Supporting figures

Figure 3.S1. Volcano plots summarizing transcriptomic analysis of C. japonicus cells grown on either glucose or xyloglucan. The volcano plots represent comparisons between A) stationary phase cells (glucose vs xyloglucan), or B) xyloglucan-grown cells in either exponential or stationary phase. Each gray circle denotes a single gene, and the blue-filled circles indicate up-regulated CAZyme genes. The complete list of up-regulated CAZyme genes for panel A can be found in Table S3, and for panel B in Table S4. The fold change (log2 scale) is plotted on the x-axis and the p-value (-log10 scale) is plotted on the y-axis. For orientation on the x-axis, positive values indicate genes that are up-regulated when grown using xyloglucan as the sole carbon source for panel A, and genes up-regulated during stationary phase for panel B.
The red dashed lines indicate the significance cut-off values (2-fold for gene expression and p-value of 0.01).

Figure 3.S2. Amino acid sequence alignment showing regions of structural similarity between the catalytic domains of Cellvibrio japonicus GH5_4 enzymes (CjGH5D, CjGH5E, and CjGH5F) and other members exhibiting endo-xyloglucanase activity belonging to the same family. Cellvibrio japonicus GH5s (CjGH5D, Accession: ACE84905.1; CjGH5E, Accession: ACE83841.1; and CjGH5F, Accession: ACE86198.1); Bacteroides ovatus GH5 (BoGH5A, Accession: WP_004298445.1); Prevotella bryantii B14 GH5 (PbGH5A, Accession: EFI71705.1); Paenibacillus pabuli GH5 (PpGH5, PDB: 2JEP_A); ruminal metagenomic GH5s (XEG5A, Accession: ACZ54907.1; and XEG5B, Accession: ADB44000.1). Secondary structural elements are shown for CjGH5D, with η referring to 3₁₀-helices and α to α-helices (displayed as small and medium squiggles, respectively), β-strands shown as arrows, and TT and TTT representing strict β-turns and strict α-turns, respectively. Alignment was created using Clustal Omega and ESPript 3.0. Catalytic residues are marked with red asterisks; CjGH5D active site residues interacting with the ligands (GXLG and XXXG-NHCOCH2Br) via hydrogen bond formation and stacking interactions are marked with blue and green asterisks, respectively.

Figure 3.S3. pH and temperature profiles of CjGH5_4 enzymes with tamarind seed XyG as a substrate. A) CjGH5D. B) CjGH5E. C) CjGH5F. Left panels are pH rate profiles while right panels are temperature profiles. Black squares, citrate buffer; red circles, phosphate buffer; and blue triangles, glycine buffer. Error bars represent standard error of the mean for 3 replicates.

Figure 3.S4. Michaelis-Menten kinetics of CjGH5_4 enzymes on tamarind seed XyG. A) CjGH5D. B) CjGH5E. C) CjGH5F. Error bars represent standard error based on 3 replicates.

Figure 3.S5.
HPAEC-PAD analysis of the hydrolysis time course and limit-digest of CjGH5_4 enzyme-xyloglucan degradation products. A) CjGH5D. B) CjGH5E. C) CjGH5F.

Figure 3.S6. MALDI-TOF analysis of the limit digest products of CjGH5_4 enzymes upon incubation with tamarind seed XyG. A) CjGH5D. B) CjGH5E. C) CjGH5F. The observed molecular masses of the three major peaks were 1085.21, 1247.29 and 1409.37, which correspond to [M+Na]+ of XXXG (calculated: 1085.9), XLXG/XXLG (calculated: 1248.05), and XLLG (calculated: 1410.19), respectively.

Figure 3.S7. Intact protein mass spectrometry of CjGH5D with the XXXG-NHCOCH2Br inhibitor. A) CjGH5D negative control with no inhibitor. B) CjGH5D incubated with 2.5 mM of the inhibitor for 3 hours at 37 °C. The peak at 44226.5 Da corresponds to CjGH5D (calculated: 44222.2 Da) and the peak at 45330.5 Da to the CjGH5D-inhibitor covalent adduct (calculated: 45325.2 Da). The peak at 44403.1 Da (and correspondingly at 45509.5 Da) is attributed to post-translationally modified protein arising from N-gluconoylation of the His-tag of the recombinant protein [249].

Figure 3.S8. Growth analysis control experiments for in-frame deletion mutants of the GH5_4 and GH74 genes on glucose. Cultures were grown for 24 hours at 30 °C with high aeration (200 RPM) in MOPS defined media supplemented with 0.5% (w:v) glucose as the sole carbon source. Graphs represent the average of three biological replicates and error bars represent the standard deviation. A) XyGUL mutants. B) Single, double, triple, and quadruple deletion mutants of the GH5_4 and GH74 genes; CJA_3010 encodes CjGH5D, CJA_3337 encodes CjGH5E, CJA_2959 encodes CjGH5F, and CJA_2477 encodes CjGH74A.

Chapter 4: Identification of a novel family of xyloglucan binding modules

4.1 Introduction

Plant biomass is considered a useful renewable source of energy and biomaterials given the continuous depletion of fossil fuel reserves worldwide [5, 10, 164].
Nevertheless, plant biomass saccharification is hindered by the enormous complexity of the plant cell wall, which is attributed to the sophisticated assembly of, and various inter-linkages between, its polysaccharide components. Some microorganisms have evolved competent machineries that efficiently break down these polysaccharide structures via a broad-spectrum repertoire of carbohydrate active enzymes (CAZymes), including primarily glycoside hydrolases, but also polysaccharide lyases, carbohydrate esterases, and polysaccharide oxidases. These enzymes are often appended to non-catalytic carbohydrate binding modules (CBMs) (see [72] and [203] for review), which potentiate the activity of the partnered enzymes, possibly by bringing them into close proximity to their cognate substrates [69, 250]. Similar to CAZymes, CBMs are classified into families based on amino acid sequence identity (www.cazy.org [57]). Currently, there are 83 families of CBMs with more than 106,700 known sequences. The majority of the families identified to date contain members that are known to target plant cell wall components [203]. CBMs are also classified by carbohydrate binding mode into 3 categories: Type A CBMs bind to the crystalline surfaces of insoluble polysaccharide substrates, Type B CBMs interact internally with single glycan chains of soluble polysaccharides (endo-type), and Type C CBMs recognize and bind the termini of glycan chains or short oligosaccharide sequences (exo-type) [203]. Notably, the number of explored CBM families is continuously expanding, and hundreds of broad and sequence-related CBM families are yet to be investigated in order to reveal their potential biological roles [251]. Of these, we identified a non-catalytic module of unknown function (referred to as X181) in the CJA_2959 gene product from the soil saprophyte Cellvibrio japonicus, a Gram-negative bacterium that is able to utilize nearly all plant cell wall polysaccharides [158].
The X181 module is found in tandem with a strictly specific endo-xyloglucanase (CjGH5F) (see Chapter 3). Although the X181 amino acid sequence is phylogenetically unrelated to the known CBM families, it is predicted to exhibit carbohydrate-binding capacity and contribute to the catalytic activity of CjGH5F. In this work, I investigated the potential activity of the X181 module using a combined approach encompassing biochemical and biophysical characterization, to shed light on a new CBM family that selectively targets XyG and other galactose-containing polysaccharides.

4.2 Materials and Methods

4.2.1 Bioinformatic analysis

The modular architecture of the full-length protein encoded by ORF CJA_2959 (FN3-X181-CjGH5F) in the C. japonicus genome was obtained from BLASTP analysis and alignment with representative GH modules from the CAZy Database [57] using ClustalW [168] (see Chapter 3). Boundaries of the X181 module were generously provided by Prof. Bernard Henrissat (AFMB, France).

4.2.2 Plasmid construction

The ORFs of X181 and FN3-X181 were PCR amplified from C. japonicus genomic DNA using sequence-specific primers (Table 4.S1). Amplified products were double digested with NheI and XhoI, gel purified, and then ligated into the respective sites of pET28a, so that they are fused to an N-terminal 6x His-Tag. For GFP fusion, overlap extension PCR was used to fuse the sfGFP sequence to either the N- or C-terminus of the FN3-X181 and X181 domains with a TEV cleavage site in between. TEV-FN3-X181, FN3-X181-TEV, and X181-TEV were PCR amplified from C. japonicus genomic DNA, while sfGFP was amplified from a pBAD vector harboring the sfGFP gene (GenBank accession number AGT98536.1) (Table 4.S1). Purified sfGFP was mixed individually with the purified TEV-FN3-X181, FN3-X181-TEV, and X181-TEV, and the respective fusion products sfGFP-TEV-FN3-X181, FN3-X181-TEV-sfGFP, and X181-TEV-sfGFP were obtained by PCR amplification.
The purified sfGFP-TEV-FN3-X181, FN3-X181-TEV-sfGFP, and X181-TEV-sfGFP DNA fragments were double digested with NheI and XhoI and ligated into the respective sites of pET28a, as described above for FN3-X181. The amplified FN3-X181 product was also individually ligated into SspI-linearized pMCSG53, pMCSG69, and pMCSG-GST-TEV (a vector derived from pMCSG52) using a Ligation Independent Cloning (LIC) strategy [221], so that the FN3-X181 ORF is fused to a 6x His-Tag, 6x His-Tag-MBP, or 6x His-Tag-GST, respectively. Successful cloning was confirmed by PCR and plasmid DNA sequencing. Q5 high fidelity DNA polymerase was used for all PCR amplifications.

4.2.3 Gene expression and protein purification

To overproduce the recombinant proteins X181, FN3-X181, sfGFP-FN3-X181, FN3-X181-sfGFP, and X181-sfGFP in E. coli, constructs were individually transformed into chemically competent Rosetta DE3 cells. Colonies were grown on LB solid media containing kanamycin (50 µg mL⁻¹) and chloramphenicol (33 µg mL⁻¹). One colony of the transformed E. coli cells was inoculated into 5 mL of LB medium containing the same antibiotics and grown overnight at 37 °C (200 rpm). The whole overnight culture was used to inoculate 500 mL of TB liquid medium containing the same antibiotics. Cultures were grown at 37 °C (200 rpm) until A600 = 0.6. Gene overexpression was induced by adding IPTG to a final concentration of 0.1 mM. After induction, cultures were grown at 16 °C (200 rpm) for 20-22 hours. Cultures were then centrifuged and pellets were resuspended in 5 mL of E. coli lysis buffer (20 mM HEPES, pH 7.0, 500 mM NaCl, 40 mM imidazole, 5% glycerol, 1 mM DTT and 1 mM PMSF). Cells were then lysed by sonication and the clear supernatant was separated by centrifuging the sonicated cultures at 4 °C (12,400 × g for 60 minutes).
Recombinant proteins were purified from the clear soluble lysates on a Ni²⁺ affinity column in an FPLC system, using gradient elution with an elution buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 500 mM imidazole, and 5% glycerol. Purity of the recombinant proteins was determined by visualizing the protein contents of the fractions on SDS-PAGE. Pure fractions were pooled, concentrated, and buffer exchanged into 20 mM HEPES buffer (pH 7) containing 100 mM NaCl. Protein concentrations were then determined at 280 nm using an Epoch Micro-Volume Spectrophotometer System (BioTek®, USA).

When LIC vectors were used, individual constructs were transformed into chemically competent BL21 and Rosetta DE3 cells. An overexpression trial was conducted in 10 mL LB medium containing ampicillin (50 µg mL⁻¹) (BL21) or ampicillin (50 µg mL⁻¹) and chloramphenicol (33 µg mL⁻¹) (Rosetta) to identify constructs with successful soluble expression of the recombinant proteins. Briefly, cultures were grown at 37 °C (200 rpm) until A600 = 0.6 and then induced with IPTG at a final concentration of 0.2 mM. Cultures were then grown overnight at 16 °C (200 rpm) before aliquots of 1 mL were taken out and centrifuged at 4 °C in a bench-top centrifuge for 5 minutes (15,000 × g). Supernatants were discarded and cell pellets were resuspended in 700 µL lysis buffer containing 20 mM HEPES, pH 7.0, 500 mM NaCl, 40 mM imidazole, and 5% glycerol. Cells were sonicated and the clear supernatants were separated from pellets by centrifugation in a bench-top cooling centrifuge for 15 minutes (15,000 × g). Clear supernatants were mixed with 4x SDS-PAGE loading dye and boiled for 10 minutes before 10-15 µL were analyzed by SDS-PAGE. Pellets were resuspended in 150-300 µL of the same lysis buffer and mixed with 4x SDS-PAGE loading dye before 10 µL were analyzed by SDS-PAGE.
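The determination of protein concentration from absorbance at 280 nm, mentioned in the purification protocol above, rests on the Beer-Lambert law, c = A / (ε·l). The following is a minimal sketch of that conversion, not the instrument's own routine. The function name and every numerical input in the example (absorbance, extinction coefficient, path length, molecular mass) are hypothetical placeholders, not values from this work.

```python
def concentration_from_a280(a280, ext_coeff_per_m_cm, path_cm=1.0, mol_mass_da=None):
    """Beer-Lambert law: c = A / (epsilon * l).

    Returns (molar concentration in mol/L, mass concentration in mg/mL or None).
    All inputs are supplied by the caller; nothing here is instrument-specific.
    """
    if ext_coeff_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_per_m_cm * path_cm)
    # mol/L multiplied by g/mol gives g/L, which is numerically equal to mg/mL.
    mg_per_ml = molar * mol_mass_da if mol_mass_da is not None else None
    return molar, mg_per_ml


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    molar, mg_per_ml = concentration_from_a280(
        a280=0.50,
        ext_coeff_per_m_cm=50_000.0,  # assumed molar extinction coefficient
        path_cm=1.0,                  # assumed 1 cm-normalized reading
        mol_mass_da=45_000.0,         # assumed molecular mass
    )
    print(f"{molar * 1e6:.1f} uM, {mg_per_ml:.3f} mg/mL")
```

In practice the extinction coefficient would be estimated from the amino acid sequence of each construct, and micro-volume readings would first be normalized to the path length actually used.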
For the large-scale production of MBP-FN3-X181 in the Rosetta strain, the protocol described above for X181, FN3-X181, sfGFP-FN3-X181, FN3-X181-sfGFP, and X181-sfGFP was employed, except that ampicillin (50 µg mL⁻¹) was used instead of kanamycin.

4.2.4 Carbohydrate sources

Tamarind seed XyG, konjac glucomannan (KGM), barley β-glucan (BBG), beechwood xylan, and carob galactomannan (GalMan) were purchased from Megazyme® (Bray, Ireland). Hydroxyethylcellulose (HEC) was purchased from Amresco® (Solon, USA). Guar gum was purchased from Sigma Aldrich® (St. Louis, USA). Glc4 XyGOs were prepared from tamarind seed XyG as previously described [150].

4.2.5 Cellulose binding capacity

Qualitative analysis of cellulose binding capacity was performed following the previously described protocol [219]. Briefly, 100 µg of the recombinant protein FN3-X181-sfGFP was mixed with 10 mg Avicel type PH-101 in a 200 µL reaction volume containing 50 mM phosphate buffer (pH 7.0). The mixture was gently agitated on ice for 1 hour. The clear supernatant was recovered by centrifugation for 5 minutes (12,000 × g) and mixed with SDS-PAGE loading dye, and 6 µL were analyzed by SDS-PAGE. Avicel pellets were washed twice with 250 µL of 50 mM phosphate buffer (pH 7.0), then resuspended in 200 µL of the same buffer. 40 µL of 6x SDS-PAGE loading dye was added, and bound proteins were released by boiling for 10 minutes before 6 µL were analyzed by SDS-PAGE.

4.2.6 Affinity gel electrophoresis

Methodology for the native gel electrophoresis was adapted from the previously described protocol [182]. Briefly, 7 µg of the native recombinant FN3-X181-sfGFP were loaded on native 10% (w/v) polyacrylamide gels containing a 0.1% final concentration of the tested polysaccharides BBG, HEC, beechwood xylan, KGM, XyG, carob GalMan and guar gum.
Electrophoresis was conducted at room temperature for approximately 10 hours (90 volts) due to the very poor migration rate of the recombinant protein. Protein bands were visualized by fluorescence and Coomassie Blue staining.

4.2.7 Activity assays

To determine whether FN3-X181-sfGFP exhibited catalytic activity against XyG, 1 µg of the recombinant protein was mixed with XyG to a final concentration of 2 mg mL⁻¹ in a 200 µL reaction volume containing 50 mM phosphate buffer (pH 7) and 2 mM CaCl2. The reaction mixture was incubated at 40 °C for 30 minutes before the bicinchoninic acid (BCA) assay was employed to detect the possible formation of new reducing ends [252].

4.2.8 Isothermal titration calorimetry (ITC)

Isothermal titration calorimetry was performed using a MicroCal VP-ITC calorimeter following the previously described protocol [34]. Purified FN3-X181-sfGFP was buffer exchanged into 20 mM HEPES buffer (pH 7.0) containing 100 mM NaCl and 2 mM CaCl2, while XyG and XyGOs were dissolved separately in the same buffer. The recombinant protein (35-40 µM) was placed in the sample cell while the XyG (2.5 mg mL⁻¹) or XyGOs (10 mM) were loaded in the injection syringe. After the temperature equilibrated to 25 °C in the sample cell, a first injection of 2 µL of the ligand followed by 25 subsequent injections of 10 µL each were performed while stirring at 280 rpm. The resulting heat of reaction was recorded and data were analysed using the Origin software program.

4.3 Results and Discussion

4.3.1 Bioinformatic analysis

The gene locus CJA_2959 (GenBank ACE86198.1) encodes a multi-modular gene product that is predicted to be secreted extracellularly after cleavage of its signal peptide (Met1-Ala22) by signal peptidase I.
The unique modular architecture of the CJA_2959 gene product consists of a Fibronectin type III (FN3) domain (Gln23-Cys120), a catalytic module (CjGH5F) (Ser218-Gln556) and, in between, a module of unknown function referred to as the X181 module (Thr123-Val205) (Figure 4.1A). Although the CjGH5F catalytic domain was extensively studied and biochemically characterized as a strict GH5_4 endo-xyloglucanase (see Chapter 3), the functions of the FN3 and X181 domains are yet to be investigated. FN3 domains are ubiquitous in animal proteins, where they play a vital role in cellular adhesion and protein-protein interactions [253]. Moreover, FN3 domains are commonly observed in extracellular bacterial glycoside hydrolases (GHs); however, their function in that context is not clearly understood [254]. Interestingly, FN3-like repeats in Clostridium thermocellum have been found to aid enzymatic degradation of cellulose by altering its surface via the development of surface erosions [255]. It has also been suggested that FN3 domains might play a role in substrate binding and potentiate catalytic hydrolysis [256]. Nevertheless, extensive biochemical and structural characterization of bacterial FN3 domains is required to shed light on their potential biological function. X181 modules, on the other hand, have been observed in the modular architecture of GH5_4, GH16, GH74, and GH53 members, as well as in polysaccharide lyase family 1 (PL1) (observation by Prof. B. Henrissat, personal communication). Yet, the function of these modules remains to be explored.

4.3.2 Recombinant protein production and purification

Initial attempts to recombinantly produce and purify X181 and FN3-X181 independently in E. coli were hindered by extremely poor expression. To address this challenge, I employed the protein-fusion strategy that I successfully utilized in our recent characterization of two carbohydrate binding modules (CBMs) from C. japonicus [219].
Indeed, the green fluorescent protein (GFP) fusion constructs not only improve the production and purification but also facilitate the detection of the tagged proteins [188-192, 219]. 120   Figure  4.1. Modular architecture of the native C. japonicus CJA_2959 gene product and different constructs used in the study. A) The full length gene product is composed of a signal peptide, an FN3 domain, an X181 module, and a catalytic GH5_4 (CjGH5F) domain. B) Recombinant proteins produced for characterisation using the E. coli expression vector pET28a. Super folder GFP (sfGFP) is connected to X181 or FN3-X181 by a TEV cleavage site. C) Recombinant proteins produced using the ligation independent cloning (LIC) vectors pMCSG53, pMCSG69, and pMCSG-GST-TEV [221]. Similar to the pET28a constructs, MBP or GST tags are connected to FN3-X181 by a TEV cleavage site.    121  Therefore, FN3-X181 was initially fused to an N-terminal sfGFP domain (Figure 4.1B). However, sfGFP-FN3-X181 was proteolytically degraded during the purification process despite my efforts to minimize the proteolytic cleavage via incorporating protease inhibitors in the purification buffers and decreasing the expression and purification temperatures. Subsequently, fusion of a C-terminal sfGFP to FN3-X181 was sought (Figure 4.1B). Although most of the recombinant protein FN3-X181-sfGFP was produced in inclusion bodies, protein purification from the bacterial lysate was successful and the intact protein (calculated mass, 48673.7  Da; observed by ESI-MS, 48671.7 Da) was obtained, albeit in modest yields (typically 5.7 mg L-1, Figure 4.2 and 4.3). Because of our interest in the characterization of X181 module independent of the FN3 domain, I attempted to fuse X181 to a C-terminal sfGFP domain following my successful production trial of FN3-X181-sfGFP. Interestingly, the removal of the FN3 domain remarkably affected the production yield in E. coli and hence, X181-sfGFP could not be obtained and purified.  
Additionally, to optimize the production yield of FN3-X181, different ligation independent cloning (LIC) vectors were tested. These vectors provide built-in fusion tags that might increase the solubility of the produced proteins (Figure 4.1C). A small-scale expression trial revealed a promising enhancement of the production level of FN3-X181 when fused to the maltose binding protein (MBP) tag and expressed in the Rosetta strain. MBP-FN3-X181 was largely produced in the soluble fraction upon induction with IPTG (Figure 4.4). However, similar to the sfGFP-FN3-X181 construct, MBP-FN3-X181 underwent proteolytic degradation during the purification process, making it impossible to obtain in the intact form (data not shown). Therefore, based on comprehensive expression trials, FN3-X181-sfGFP was selected for all subsequent biochemical and biophysical characterization.

Figure 4.2. SDS-PAGE of the purified protein constructs. A) Purified recombinant sfGFP-FN3-X181. B) Purified recombinant FN3-X181-sfGFP. The calculated molecular mass of both recombinant proteins is 48673.7 Da.

Figure 4.3. Intact protein mass spectrometry of FN3-X181-sfGFP. The peak at 48671.7 Da corresponds to FN3-X181-sfGFP (calculated 48673.7 Da). The peak at 48849.5 Da is attributed to the N-gluconoylation of the His-tag of the recombinant protein [249].

Figure 4.4. Overexpression trial of FN3-X181 using the ligation independent cloning (LIC) vectors pMCSG53, pMCSG69, and pMCSG-GST-TEV in E. coli. A) BL21 expression strain. B) Rosetta strain. Only MBP-FN3-X181 was clearly observed (black box). The calculated molecular masses of the recombinant proteins are 21251.83 Da, 63355.67 Da, and 46749.48 Da for FN3-X181, MBP-FN3-X181, and GST-FN3-X181, respectively.
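The agreement between the calculated and observed masses quoted for FN3-X181-sfGFP, and the size of the extra peak in the intact mass spectrum, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the nominal +178 Da shift for gluconoylation is an assumption brought in from general knowledge of His-tag adducts rather than a value stated in this chapter.

```python
# Mass values quoted in the text for FN3-X181-sfGFP (Da).
CALCULATED_DA = 48673.7
OBSERVED_DA = 48671.7
ADDUCT_PEAK_DA = 48849.5

# Assumption (not from the source): nominal mass added by gluconoylation.
NOMINAL_GLUCONOYLATION_DA = 178.0


def mass_error(calculated_da, observed_da):
    """Return (observed - calculated in Da, the same difference in ppm)."""
    delta = observed_da - calculated_da
    return delta, delta / calculated_da * 1e6


def adduct_shift(parent_da, adduct_da):
    """Mass difference between an adduct peak and its parent peak (Da)."""
    return adduct_da - parent_da


if __name__ == "__main__":
    delta, ppm = mass_error(CALCULATED_DA, OBSERVED_DA)
    shift = adduct_shift(OBSERVED_DA, ADDUCT_PEAK_DA)
    print(f"observed - calculated: {delta:+.1f} Da ({ppm:+.0f} ppm)")
    print(f"adduct shift: {shift:.1f} Da "
          f"(nominal gluconoylation, assumed: {NOMINAL_GLUCONOYLATION_DA:.0f} Da)")
```

The same two helpers apply to any of the calculated/observed mass pairs reported for the other constructs.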
4.3.3 X181 recognizes the galactose-containing polysaccharides XyG and galactomannan
Since the biological function of the X181 modules has not been previously studied, a comprehensive biochemical and biophysical investigation was pursued to reveal the potential role of this family in the context of plant polysaccharide degradation. To investigate the cellulose binding capacity of FN3-X181-sfGFP, the recombinant protein was incubated with an aqueous suspension of Avicel before the bound and unbound protein fractions were analyzed by SDS-PAGE. Cellulose pull-down assays demonstrated no significant cellulose binding affinity of FN3-X181-sfGFP, suggesting that the X181 module is unlikely to be involved in cellulose utilization in C. japonicus (Figure 4.5).

Figure 4.5. Binding capacity of sfGFP, FN3-X181-sfGFP and sfGFP-CjCBM2 for Avicel. Recombinant sfGFP (27.7 kDa; lane 1, 2), FN3-X181-sfGFP (48.7 kDa; lane 3, 4) and sfGFP-CjCBM2 (40.7 kDa; lane 5, 6) were incubated with Avicel and unbound proteins were removed by centrifugation (lanes 1, 3, 5, and 7). Bound proteins were released from Avicel by boiling in SDS (lanes 2, 4, 6, and 8). sfGFP was used as a negative control while sfGFP-CjCBM2 [219] was used as a positive control.

Interestingly, X181 modules are found in tandem with catalytic domains belonging to the xyloglucan-active glycoside hydrolase families GH5_4, GH16 and GH74, suggesting a XyG binding potential for these modules. Therefore, we expanded the panel of tested ligands to include the soluble polysaccharides barley β-glucan (BBG), hydroxyethyl cellulose (HEC), beechwood xylan, konjac glucomannan (KGM), xyloglucan (XyG), carob galactomannan (GalMan) and guar gum.
Native affinity gel electrophoresis utilizing polyacrylamide gels individually incorporated with these polysaccharides revealed exclusive migration retardation of the native FN3-X181-sfGFP protein, compared to the negative control sfGFP, when XyG and different galactomannans were used as substrates. This observation highlights the high specificity of the X181 module towards the galactose-containing polysaccharides including XyG (Figure 4.6).  To exclude the catalytic potential of X181 module, enzyme assays utilizing the recombinant FN3-X181-sfGFP were performed via the incubation with XyG and quantitatively analyzing the potentially generated reducing ends. The recombinant protein displayed no catalytic activity against XyG suggesting that X181 is in fact a XyG binding module and not a catalytically active xyloglucanase. These results are indeed congruent with the association of the X181 module with the GH5_4 XyG-specific endo-xyloglucanase (CjGH5F) in the modular architecture of the CJA_2959 gene product.  The non-catalytic XyG binding modules have been previously observed in a few CBM families including CBM44 [257], CBM65 [258], CBM62 [259], in addition to the Type A cellulose binding families CBM2 and CBM3a [204]. It should be noted, however, that despite their strong affinity towards XyG, all the XyG binding modules characterized to date display noticeable flexibility in recognizing other polysaccharide substrates according to the presented mode of binding. For example, XyG-binding modules can recognize soluble β-1,4 glucan backbone-containing polysaccharides [257, 258], crystalline cellulose [204], and galactosyl ramifications in different plant polysaccharides [259]. Likewise, the X181 module represents an example of this flexible recognition. 
The ability of the X181 module to bind XyG and galactomannans, while completely lacking affinity towards β-1,4 glucan, glucomannan, and xylan backbones, clearly indicates strict recognition by X181 of the galactosyl side-chain decorations in different heterogeneous polysaccharide substrates.

Figure 4.6. Affinity gel electrophoresis for FN3-X181-sfGFP against different polysaccharide substrates. Electrophoresis conditions: 90 volts for 10 hours. Recombinant protein bands were visualized by A) Fluorescence and B) Coomassie Blue staining. The negative control sfGFP migrated through the entire gel over the course of the 10-hour run. C) An example of a shorter electrophoresis runtime experiment (5.5 hours) revealing no binding affinity of the negative control sfGFP towards XyG. Protein bands were visualized by fluorescence.

4.3.4 Biophysical characterization of the X181 module
To further illuminate the binding affinity of X181 towards XyG and xylogluco-oligosaccharides (XyGOs), as representative examples of galactosylated hemicellulosic components, isothermal titration calorimetry (ITC) was employed to determine all the binding thermodynamic constants (Figure 4.7, Table 4.1). Commensurate with the affinity gel electrophoresis, ITC revealed a strong binding affinity for FN3-X181-sfGFP towards XyG and Glc4-based XyGOs, as illustrated by the relatively high association constants (Ka): 7.17 × 10³ M-1 (based on the concentration of the Glc4-XyGO binding units) and 1.11 × 10³ M-1, respectively (Figure 4.7, Table 4.1). It is quite evident from the thermodynamic data that binding to the XyG-based substrates is enthalpically driven, which resembles the majority of CBMs studied to date [258-260]. Notably, the Ka value for XyG is about 7 times higher than that of the shorter oligosaccharides, suggesting the presence of structural features that favour the longer ligands.

Figure 4.7. Isothermal titration calorimetry of FN3-X181-sfGFP against different ligands.
A) XyG. B) XyGOs. Top graph in each pair shows the raw heat during titration, while the bottom graph shows the integrated heats after correction.

Table 4.1. Summary of the thermodynamic parameters for FN3-X181-sfGFP obtained by isothermal titration calorimetry

Carbohydrate   Ka (M-1)                ΔG° (kcal mol-1)   ΔH° (kcal mol-1)   TΔS° (kcal mol-1)
XyG (a)        (7.17 ± 0.52) × 10³     −5.3               −10 ± 0.4          −4.7
XyGOs          (1.11 ± 0.04) × 10³     −4.2               −6.5 ± 0.1         −2.3

(a) Binding thermodynamics of XyG is based on the concentration of the binding units Glc4-XyGOs.

4.4 Conclusions
CBMs are commonly found appended via flexible linkers to glycoside hydrolases in unique modular architectures. The number of characterized CBM families is constantly growing as a result of advances in protein production and characterization. In this report, we identified and functionally characterized a novel galactose binding module (X181) from the CJA_2959 gene product in C. japonicus. This module can efficiently and exclusively bind the galactose-containing plant cell wall polysaccharides XyG and galactomannan, allowing the appended xyloglucanase to encounter its cognate substrate. Therefore, I propose a new CBM family to encompass this newly characterized module. My analysis suggests the presence of unique structural features in the X181 module that are fundamental in the recognition of and binding to the galactosyl side-chain residues in XyG and galactomannan. Hence, our future investigation will address the 3D structure of the X181 module to shed light on the binding mechanism of this novel family at the molecular level.

4.5 Supporting information
4.5.1 Supporting tables
Table 4.S1. Primers used in the study.
Primer | Sequence | Recombinant protein

X181-NheI-F | 5ˊ-GACCGCTAGCATGACTGCGGTGACGCCCTATATCAAC-3ˊ | X181
X181-XhoI-R | 5ˊ-GGTCCTCGAGTTAGACCGTTACGCTGAAAACCTGGC-3ˊ | X181

FN3-NheI-F | 5ˊ-GACCGCTAGCATGCAGAATTGCGGCAGCGGTGGCG-3ˊ | FN3-X181
X181-XhoI-R | 5ˊ-GGTCCTCGAGTTAGACCGTTACGCTGAAAACCTGGC-3ˊ | FN3-X181

GFP-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGCAAAGGTGAAGAAC-3ˊ | GFP-TEVoh
GFP-TEVoh-R | 5ˊ-GAAAATAAAGATTCTCGCTGCCTTTATACAGTTC-3ˊ | GFP-TEVoh

GFPoh-TEV-FN3-F | 5ˊ-CTGTATAAAGGCAGCGAGAATCTTTATTTTCAGGGC-CAGAATTGCGGCAGCGGTGGCG-3ˊ | GFPoh-TEV-FN3-X181
X181-XhoI-R | 5ˊ-GGTCCTCGAGTTAGACCGTTACGCTGAAAACCTGGC-3ˊ | GFPoh-TEV-FN3-X181

GFP-NheI-F | 5ˊ-GACCGCTAGCATGGTTAGCAAAGGTGAAGAAC-3ˊ | GFP-FN3-X181
X181-XhoI-R | 5ˊ-GGTCCTCGAGTTAGACCGTTACGCTGAAAACCTGGC-3ˊ | GFP-FN3-X181

FN3-NheI-F | 5ˊ-GACCGCTAGCATGCAGAATTGCGGCAGCGGTGGCG-3ˊ | FN3-X181-TEV-GFPoh
X181-TEV-GFPoh-R | 5ˊ-CTTCACCTTTGCTAACGCCCTGAAAATAAAGATTCTC-GACCGTTACGCTGAAAACCTGG-3ˊ | FN3-X181-TEV-GFPoh

TEVoh-GFP-F | 5ˊ-CTTTATTTTCAGGGCGTTAGCAAAGGTGAAGAAC-3ˊ | TEVoh-GFP
GFP-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCTGCCTTTATACAGTTCATC-3ˊ | TEVoh-GFP

FN3-NheI-F | 5ˊ-GACCGCTAGCATGCAGAATTGCGGCAGCGGTGGCG-3ˊ | FN3-X181-GFP
GFP-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCTGCCTTTATACAGTTCATC-3ˊ | FN3-X181-GFP

X181-NheI-F | 5ˊ-GACCGCTAGCATGACTGCGGTGACGCCCTATATCAAC-3ˊ | X181-GFP
GFP-XhoI-R | 5ˊ-GGTCCTCGAGTTAGCTGCCTTTATACAGTTCATC-3ˊ | X181-GFP

FN3-X181-LIC-F | 5ˊ-TACTTCCAATCCAATGCCATGCAGAATTGCGGCAGC-GGTGGC-3ˊ | FN3-X181, GST-FN3-X181, MBP-FN3-X181
FN3-X181-LIC-R | 5ˊ-TTATCCACTTCCAATGTTATCAGACCGTTACGCTGAAAAC-CTGGC-3ˊ | FN3-X181, GST-FN3-X181, MBP-FN3-X181

Chapter 5: Functional analysis of a mixed-linkage β-glucanase/xyloglucanase belonging to the glycoside hydrolase family 9

5.1 Introduction
A great deal of attention has been paid recently to the development of a clean and renewable alternative source of energy that overcomes the challenge of the constantly depleting fossil fuel reserves all over the world. In response to this global challenge, plant-derived biomass has become a substantial focus of interest as an untapped source of polysaccharide precursors that can be ultimately converted into liquid fuels and value-added products [6, 163].
However, plant cell walls are extremely complex and recalcitrant to hydrolysis into the constituting simple components. Interestingly, different microbes evolved efficient machineries and sophisticated systems to degrade plant materials and provide the required energy for their growth in a key aspect of global carbon cycle [261]. Therefore, the use of microbial enzymes in the deconstruction of plant biomass has been envisioned as an extremely powerful tool that overcome the apparent recalcitrance of plant cell walls while generating minimal amounts of undesirable by-products [10]. Plant cell walls are essentially composed of polysaccharides, represented in the cellulosic and hemicellulosic fraction, comprising more than 70% of the cell wall dry weight. In addition to the polysaccharides, plant cell walls also contain the less dominant polyphenolic components (lignin), structural proteins and small organic and inorganic constituents [14]. The most abundant polysaccharide in plant cell wall is cellulose, a crystalline homopolymer of β(1→4)-D-glucan chains bundled in the form of microfibrils. The cellulosic microfibrils are coated and cross-linked with hemicellulose, structurally diverse glycan entities, and embedded in a matrix of charged polysaccharides (pectins) [11, 15, 23]. Among the hemicelluloses, the structurally complex xyloglucan (XyG) family represents up to 25% of total dry weight of terrestrial plant cell walls [11, 23]. Given its abundance, XyG has been considered as an attractive target for the discovery of efficient plant biomass degrading enzymes. From a structural perspective, XyG is composed of a linear β(1→4)-D-glucan backbone that is regularly and extensively branched with α(1→6)-xylopyranosyl residues in the two common backbone branching motifs XXXG and XXGG (nomenclature according to [30]). According to plant tissue and species [28] , other 131  monosaccharide units, e.g. 
galactopyranosyl, fucopyranosyl, and arabinofuranosyl residues can be added to the xylopyranosyl sidechain moieties resulting in a large number of complex structures that require a combination of endo- and exo-acting cleaving enzymes for efficient saccharification.  Recently, we have been investigating the XyG degrading machinery in Cellvibrio japonicus, a Gram-negative soil saprophyte that is capable of utilizing a wide array of plant polysaccharides [33, 152, 155, 156, 161]. The XyG utilization capacity of C. japonicus is attributed to the presence of a XyG-utilization locus (XyGUL), in addition to a suite of off-locus endo-xyloglucanase genes, within its genome. XyG degradation pathway in C. japonicus therefore involves interplay between extracellular backbone cleaving endo-xyloglucanases and the locus side-chain cleaving exo-glycosidases, which are confined to the periplasm [33, 35, 145, 219]. Thus far, four XyG-specific endo-glucanases [one glycoside hydrolase family 74 (GH74), and three GH5s (GH5D, GH5E and GH5F)] have been identified and characterized from C. japonicus (see Chapter 3) [219]. Astonishingly, reverse genetics revealed that C. japonicus ΔGH74ΔGH5DΔGH5EΔGH5F quadruple deletion mutant is still able to grow on XyG- based media thereby suggesting the presence of additional enzyme(s) with potential endo-xyloglucanase activity (see Chapter 3). An interesting recent study identified a specific GH9 endo-xyloglucanase for the first time from Ruminiclostridium cellulolyticum [112], thus expanding the list of target families to include this second largest cellulase family.  C. japonicus genome encodes three GH9 enzymes (CjGH9A, CjGH9B and CjGH9C). In this report, we investigated the potential role of CjGH9B in the XyG degradation pathway in C. japonicus via a combinatorial approach that involved bioinformatic analysis, molecular biology, and biochemical characterization. 
Our data clearly demonstrate that despite utilizing mixed linkage glucan as a main substrate, CjGH9B exhibits a notable side activity on XyG. These results might justify the lack of growth perturbations of the C. japonicus quadruple deletion mutant lacking the four specific endo-xyloglucanases when grown on XyG. Moreover, the presented data rationalize our future functional characterization attempts of the other GH9 family members in C. japonicus.      132  5.2 Materials and Methods 5.2.1 Bioinformatic analysis The presence of a signal peptide in the full-length proteins encoded by ORF CJA_2472 (CjGH9A), CJA_1633 (CjGH9B) and CJA_3804 (CjGH9C) of Cellvibrio japonicus Ueda107 was investigated using both SignalP (4.1) and LipoP (1.0) algorithms [167, 220]. The modular architecture of the encoded CjGH9A, CjGH9B, and CjGH9C proteins were determined by BlastP analysis and alignment with representative glycoside hydrolases and CBM modules from the CAZy Database [57] using ClustalW [168]. A protein homology/analogy recognition engine (Phyre2) was employed to generate structural models of the three proteins [169]. 5.2.2 Cloning of DNA encoding GH9 modules cDNA encoding the enzymes Ig-GH9A, Ig-GH9B and Ig-GH9C were PCR amplified from C. japonicus genomic DNA using Q5 high fidelity DNA polymerase; all constructs were designed such that the native predicted signal peptide was removed (PCR primers are listed in Table 5.S1). The amplified Ig-GH9A and Ig-GH9B products were double-digested with NheI and XhoI, while the amplified Ig-GH9C product was double-digested with NdeI and XhoI. Digested products were gel purified and ligated to the respective sites of pET28a to fuse an N-terminal 6x His-Tag. Successful cloning was confirmed by PCR and plasmid DNA sequencing. 5.2.3 Gene expression and protein purification Constructs were individually transformed into the chemically competent E. 
coli Rosetta DE3 cells and grown on LB solid media containing kanamycin (50 µg mL-1) and chloramphenicol (30 µg mL-1). One colony of the transformed E. coli cells was inoculated in 10 mL of LB medium containing the same antibiotics and grown overnight at 37 °C (200 rpm). The entire overnight culture was used to inoculate 1 L of TB liquid medium containing the proper antibiotics. Cultures were grown at 37 °C (200 rpm) until D600 = 0.6 before overexpression was induced by adding IPTG to a final concentration of 0.1 mM. After induction, cultures were grown overnight at 16 °C (200 rpm). Cultures were then centrifuged (4220 g at 16 °C) and pellets were resuspended in 10 mL of E. coli lysis buffer containing 20 mM HEPES, pH 7.0, 500 mM NaCl, 1 mM CaCl2, 40 mM imidazole, 5% glycerol, 1 mM DTT and 1 mM PMSF. Sonication was then used to disrupt the cells and the clear supernatants were separated by centrifugation at 4 °C (24000 g for 60 minutes). Recombinant proteins were purified from the clear soluble lysates using a Ni2+ affinity column in an FPLC system utilizing a gradient elution up to 100% elution buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 1 mM CaCl2, 500 mM imidazole, and 5% glycerol. The protein contents of the collected fractions were visualized on SDS-PAGE to determine the purity of the recombinant proteins. Pure fractions were pooled, concentrated, and buffer exchanged against 50 mM citrate buffer (pH 6.5) containing 1 mM CaCl2. Protein concentrations were then determined using an Epoch Micro-Volume Spectrophotometer System (BioTek®, USA) at 280 nm, and the identities of the expressed proteins were confirmed by intact protein mass spectrometry [171].

5.2.4 Carbohydrate sources
Barley β-glucan (BBG), tamarind seed xyloglucan (XyG), konjac glucomannan (KGM), yeast β-glucan, curdlan, wheat flour arabinoxylan and beechwood xylan were obtained from Megazyme (Bray, Ireland).
Carboxymethyl cellulose (CMC) was purchased from Acros Organics (Morris Plains, NJ, USA). Hydroxyethyl cellulose (HEC) was purchased from Amresco (Solon, OH, USA). Guar gum was purchased from Sigma Aldrich (St Louis, MO, USA). Xanthan gum was purchased from Spectrum (New Brunswick, NJ, USA). The oligosaccharide standards cellobiose (G4G), 1,3:1,4-β-Glucotriose A (G3G4G), 1,3:1,4-β-Glucotetraose B (G4G4G3G), and 1,3:1,4-β-Glucotetraose C (G4G3G4G) were purchased from Megazyme (Bray, Ireland). 1,3:1,4-β-Glucotetraose A (G3G4G4G) was purchased from Carbosynth® (Berkshire, UK).

5.2.5 Carbohydrate analytics
High Performance Anion-Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD) was performed on a Dionex ICS-5000 DC HPLC system (Dionex Corp., Sunnyvale, CA, USA) using a Dionex Carbopac PA200 column and operated by Chromeleon software version 7 (Dionex). Solvent A was double-distilled water, solvent B was 1 M sodium hydroxide and solvent C was 1 M sodium acetate. The gradient used was: 0–5 min, 10% solvent B and 3.5% solvent C; 5–12 min, 10% B and a linear gradient from 3.5–30% C; 12–12.1 min, washing step with 50% B and 50% C; 12.1–13 min, an exponential gradient of NaOH and NaOAc (sodium acetate) back to initial conditions; and 13–17 min, initial conditions. Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry was performed on a Bruker Daltonics Autoflex System (Billerica, USA). The matrix used was 2,5-dihydroxybenzoic acid, dissolved in 50% methanol in water to a final concentration of 10 mg mL-1 before it was mixed 1:1 (v/v) with the oligosaccharide samples. Two µL of this mixture was placed on a Bruker MTP 384 ground steel MALDI plate and left to air dry for two hours prior to analysis.

5.2.6 Enzyme kinetic analysis
All enzymatic activities towards polysaccharides were determined using a bicinchoninic acid (BCA) reducing sugar assay [252].
The pH profile for the purified recombinant Ig-GH9B was obtained by incubating 0.012 µg of the protein with BBG at a final concentration of 2 mg mL-1. Citrate (pH 3–6.5), phosphate (pH 6.5–8), HEPES (pH 7.5–8) and glycine (pH 8.5–9) buffers were each used at a final concentration of 50 mM in a total reaction volume of 200 µL containing 1 mM CaCl2. Reaction mixtures were incubated for 10 minutes at 40 °C prior to the BCA assay. To obtain the temperature profile of the recombinant Ig-GH9B, the same enzymatic reaction conditions were used in 50 mM phosphate buffer (pH 7), but the incubation temperature ranged from 25 °C to 80 °C. To obtain the specific activity values of Ig-GH9B towards BBG, XyG, HEC, CMC, KGM, wheat flour arabinoxylan, beechwood xylan, guar gum, curdlan, yeast β-glucan and xanthan gum, 0.012 µg of the purified recombinant enzyme was incubated with each polysaccharide at a final concentration of 2 mg mL-1 (except for curdlan, which was used at a final concentration of 1 mg mL-1) in 50 mM phosphate buffer (pH 7). Reaction mixtures were incubated at 40 °C for 10 minutes prior to the BCA assay.

To determine Michaelis-Menten kinetic parameters for BBG, eight different concentrations of BBG were used over the range 0.05 to 2.0 mg mL-1. Reactions were performed by incubating 0.012 µg of the recombinant protein with each BBG concentration at 40 °C for 10 min in a 200 µL total reaction mixture containing 50 mM phosphate buffer (pH 7) and 1 mM CaCl2. Following the same enzyme assay conditions, 0.25–2.0 mg mL-1 CMC, HEC, KGM, and XyG were individually incubated with 0.023, 0.023, 0.046, and 0.046 µg of the recombinant Ig-GH9B, respectively. Km and kcat values were determined by non-linear fitting of the Michaelis-Menten equation to the data in OriginPro 2015 software.
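The non-linear fit described above was carried out in OriginPro; the following is only a minimal Python sketch of what fitting the Michaelis-Menten equation to rate data involves, not a reproduction of that software's algorithm. It assumes a separable least-squares approach (for any trial Km the best Vmax has a closed form, so only Km is searched, here by golden-section search on log Km). All function names are hypothetical, and the rates are synthetic, noise-free values generated for illustration rather than measurements from this work. The helper `enzyme_molarity` shows how the enzyme amount per reaction converts to a molar concentration, which is needed to turn a molar Vmax into kcat.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def best_vmax(km, s_vals, v_vals):
    """For a fixed Km the model is linear in Vmax, so solve it in closed form."""
    f = [s / (km + s) for s in s_vals]
    return sum(v * fi for v, fi in zip(v_vals, f)) / sum(fi * fi for fi in f)


def sse(km, s_vals, v_vals):
    """Sum of squared residuals at this Km, using the best Vmax for it."""
    vmax = best_vmax(km, s_vals, v_vals)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_vals, v_vals))


def fit_michaelis_menten(s_vals, v_vals, km_lo=1e-3, km_hi=1e3, tol=1e-8):
    """Golden-section search over log(Km); returns (vmax, km)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(km_lo), math.log(km_hi)
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(math.exp(c), s_vals, v_vals) < sse(math.exp(d), s_vals, v_vals):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2.0)
    return best_vmax(km, s_vals, v_vals), km


def enzyme_molarity(mass_ug, molar_mass_da, volume_ul):
    """Molar enzyme concentration from micrograms of protein in microlitres."""
    return (mass_ug * 1e-6 / molar_mass_da) / (volume_ul * 1e-6)


if __name__ == "__main__":
    # Synthetic, noise-free rates (arbitrary units), for illustration only.
    substrate = [0.05, 0.1, 0.2, 0.4, 0.6, 1.0, 1.5, 2.0]  # mg mL-1
    rates = [mm_rate(s, 12.0, 0.4) for s in substrate]
    vmax, km = fit_michaelis_menten(substrate, rates)
    print(f"Vmax = {vmax:.3f} (arbitrary units), Km = {km:.3f} mg mL-1")
    # 0.012 ug of Ig-GH9B (calculated mass 62514.4 Da) in a 200 uL reaction.
    print(f"[E] = {enzyme_molarity(0.012, 62514.4, 200.0):.2e} M")
```

With Vmax expressed as a molar rate, kcat follows as Vmax divided by the enzyme molarity. The search assumes a single minimum in Km, which is reasonable for well-behaved saturation data but should be checked for noisy data sets.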
5.2.7 Enzyme product analysis To identify the mode of action of the enzyme against BBG, 0.05 µg of Ig-GH9B was incubated at 40 ⁰C with 0.25 mg mL-1 final concentration of BBG in 200 µL reaction volume containing 50 mM phosphate buffer (pH 7) and 1 mM CaCl2. The reaction was stopped at different time intervals by adding 100 µL of 1M NH4OH before the reaction tubes were put immediately on ice. Reaction mixtures were then diluted 8 times with water prior to product analysis by HPAEC-PAD. To determine the bond cleavage specificity of the enzyme, limit-digest profiles of Ig-GH9B against BBG and XyG were obtained. The purified recombinant enzyme (2 μg for BBG and 6 μg for XyG) was incubated with 0.25 mg· mL-1 of each polysaccharide for 24 hours (40 °C) in 200 μL total reaction mixtures containing 50 mM phosphate buffer (pH 7) and 1 mM CaCl2. Reaction mixtures were then diluted 8 and 5 times for BBG and XyG, respectively, before they were analyzed by HPAEC-PAD.     136  5.3 Results and Discussion 5.3.1 Bioinformatic analysis The XyG degradation pathway in C. japonicus has been a key point of investigation in our group for the last few years. Therefore, we have been searching the C. japonicus genome for potential endo-xyloglucanase candidates which cleave the β(1→4)-D-glucosidic linkages in the XyG backbone. This activity of interest has been previously demonstrated in GH5, GH9, GH12, GH16, GH44, and GH74 [112, 150, 151]. Out of these families, only GH5 (n = 15), GH9 (n = 3), GH16 (n = 9) and GH74 (n = 1) are represented in C. japonicus genome  (http://www.cazy.org/genomes.html) [219]. Our extensive biochemical and structural studies revealed the remarkable specificity of the CjGH74 (Chapter 2) [219] and the three GH5_4 enzymes (CjGH5D, CjGH5E, CjGH5F) (Chapter 3) towards the hemicellulosic component XyG. Interestingly, our reverse genetic study clearly suggested the presence of other endo-xyloglucanases in C. japonicus (Chapter 3). 
Therefore, to further complement and extend these findings, we focused our attention on the three GH9 enzymes which are encoded by loci CJA_2472 (GenBank ACE85757.1, annotated as Cel9A), CJA_1633 (GenBank ACE85719.1, annotated as Cel9B) and CJA_3804 (GenBank ACE83873.1, annotated as Cel9C). The GH9 family exhibits a relatively wide range of catalytic activities, which impedes functional prediction and demands the thorough biochemical characterization of its family members to reveal the biological roles associated with their production (http://www.cazy.org/-GH9_characterized.html). Since Cel9B demonstrated poor activity against the soluble cellulosic substrates compared to mixed-linkage β-glucan, the three C. japonicus enzymes are referred to as CjGH9A, CjGH9B, and CjGH9C hereafter (vide infra).      Primary structure analysis of the CJA_2472 gene product revealed a unique modular architecture consisting of an N-terminal immunoglobulin (Ig)-like domain, a catalytic GH9A domain, a carbohydrate binding module family 10 (CBM10) member, a polycystic kidney disease (PKD)-like domain, and a CBM2 member. The CJA_1633 and CJA_3804 gene products had a simpler modular architecture only consisting of an N-terminal Ig-like domain followed by either the catalytic GH9B or GH9C domain, respectively. Homology alignments by the Phyre2 server, in addition to extensive sequence analysis, successfully defined the modular boundaries 137  of the constituting domains of the three CjGH9 enzymes (Figure 5.1). Notably, serine-rich linkers were found connecting the various modules of the CjGH9 enzymes as commonly observed in the multi-modular CAZymes (Figure 5.1) [185, 186]. Nevertheless, it is worthwhile mentioning that the N-terminal Ig-like and catalytic domains are always connected by a short loop, which justifies our attempts to recombinantly produce them in the native fusion form (Figure 5.1) (vide infra).    
As predicted by LipoP 1.0 [234],  CJA_2472 gene product (CjGH9A) has an N-terminal signal peptide with a Signal Peptidase I (SPI) cleavage site, thus suggesting the extracellular secretion of the enzyme. The extracellular localization of the CjGH9A is further supported by the presence of the predicted cellulose-targeting CBM10 and CBM2 within the modular architecture of the full-length enzyme. On the other hand, LipoP 1.0 [234] predicted the presence of an SPII lipoprotein signal peptide in the CJA_1633 and CJA_3804 gene products (CjGH9B and CjGH9C, respectively), thus suggesting their attachment to the outer membrane by an N-terminal cysteine lipidation. Notably, C. japonicus GH9 enzymes share modest amino acid percentage identity (24-33%) with the other R. cellulolyticum XyG active GH9 enzymes [112], making it challenging to predict the potential xyloglucanase activity of CjGH9 enzymes solely via the bioinformatic analysis (Figure 5.2).  Figure  5.1. The modular architecture of the C. japonicus GH9 enzymes. Immunoglobulin-like domain, Ig-like; carbohydrate binding module, CBM; polycystic kidney disease-like domain, PKD; serine rich linker; SRL.   138   Figure  5.2. Amino acid sequence alignment of the C. japonicus GH9 catalytic modules with R. cellulolyticum XyG-active GH9 enzymes. The accession numbers for RcCel9U and RcCel9X are Ccel_0755 and Ccel_2621, respectively. Amino acid sequences include the N-terminal Ig-like domains. Red arrows indicate the conserved catalytic residues.                   
                 10        20        30        40        50        60        70        80
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  ASTSPGDYQQDSRIRLNSIGYLPEAEKKATIA-----------ASSS-EFIVVNSSGTAVLTS--RTTSA---YNTDTSE
RcCel9X  --------RAYPAIKVNQVGFGESSEKYAYVSGFED----ELKAEAGTQFQVKRVSDDQVVYSDELVLVKD--YDAESGE
CjGH9A   ---------EVGNPRVNQLGYIPNGDRIAVYK-----------ASNNSAQTWQLTHNGSLIASGQTIPKG---SDASSGD
CjGH9B   ---------NLGLIKLNQVGFLPAASKLAVVP-----------EVAATAFQLLDSDTHRVVYSGELTSAA---NWVPAQE
CjGH9C   -----------NFMVVDQFGYLPDAQKIAVIRDPQTGFDAQQSFTPGGNYQLVNLHNGQVVHTGTASAWKSGAVSAEAGD

                 90       100       110       120       130       140       150       160
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  QVNIADFSSVKTEGSYTLLV--PGIGKSVTFKIDKNIYANPFKTAMLGMYLWRCGTSVSATHNG----NVFSHETCHTKD
RcCel9X  RVFKAVFSDLKQPGEYYITVNEDGIEKSPRFKIGNDIFKPLLTDVARYFYFQRSGTDLTEEYCP----DYPRKDRTPQDT
CjGH9A   NIHHIDLSSVTATGSGFTLT--VGGDSSYPFSISSTTFNAAFYDALKYFYHNRSGIAIETPYTGGGRGSYASHSRWSRPA
CjGH9B   RVKLADFSGITEPGIYQLRV--EGVEDSHSFPIGTEVYRDLAAASIKAFYYNRAGTALLSQHAG----IYARQAGHPDTQ
CjGH9C   KVWWFDFSNVTATGNYAVVD-VERNVRSPGFRIAADVYKPVLKHAVRTFFYQRAGFAKQQPYAEAGWTDGASHLGSCQDT

                170       180       190       200       210       220       230       240
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  AYTDYINGQHSI-------KDG-----GKGWHDAGDYNKYVVNAGITVGSMFFAWEQFKDQIK------EISLTMPESNN
RcCel9X  AAIYDSNPSATR-------------DVSQGWFDAGDLGKYVSTGAMAAINILWSYEMFPEVYT------DNQFTIPESGN
CjGH9A   GHLNQGANKGDMNVPCWSGTCNYSLNVTKGWYDAGDHGKYVVNGGISVWTLLNLYERAQHITGNLAAVADGSMNIPESGN
CjGH9B   VYVHGSAASQAR-------PEGTVISSPKGWYDAGDYNKYIVNSGISTYTLLAAYEHFPLLFN------EQNLNIPESND
CjGH9C   QARLFKREGNTITG-----VSGTEKDLSGGWFDAGDYNKYTNWHADYLIALLHAYLENPSAWT-------DDFNIPESSN

                250       260       270       280       290       300       310       320
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  SMPDYLDELKYETDWLLTMQYPD---GSGKVSHKLSTKDFGGF-VLPEKETTDRFFTPWGSAATADFVAMMAMASRAFRP
RcCel9X  GIPDILDETKWQLDWILKMQDTS----SGGFYARVQSDDDGNITKRIIKDKEGDVANIRSTEDTACAAAALAHASIVYEK
CjGH9A   GVADILDEARWQMEFMLAMQVPQGQAKAGMAHHKIHDVGWTGLPLAPHEDPQQRALVPPSTAATLNLAATAAQAARIWKD
CjGH9B   DMPDLLDEILWNLEWMLTMQDPH---DGG-VYHKLTNKNFDGT-VMPHQATSQRYVVQKTTAAALDFAAVMATASRVFAA
CjGH9C   GIPDLIDEIKWGFDWLKKMQNND-----GSVLSILG----LAHASPPSAAMGCSYYGPASTSATLSSAAAFAFGAKVFAD

                330       340       350       360       370       380       390       400
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  YDAAY----ADKCIAAAKVSYAFLKANP-------------------WNTKPDQSGFTTGAY-DTTDTDDRLWAAAEMWE
RcCel9X  YDPAF----ALKCLNAAKSAWSYLEKNP-------------------SNIKSPD-----GPYSTADDSQSRFLAAATLYR
CjGH9A   IDAGF----AALCLTAAERAWNAAQANP-------------------NDIYSGNYDNGGGGYGDRFVADEFYWAAAELYI
CjGH9B   YEAQRPG-LAAQMREAAESAWAWAQTNP-------------------AVFYVQPADIRTGEYGDRSLADEFAWAAAELYI
CjGH9C   LGNELLASYAADLQTRAANAWTWAGNNPDVRVNNNQGIYQGLGAGDQEVCSAPGSDATAQQRCANALTEKKVQAAVYLFA

                410       420       430       440       450       460       470       480
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  TLGDSSY-LADFEASANTFTKKIDVDFDWGNVNNLGMFTYLLSERSGKNPALYNTIKSALISAADSIVAIADGHGYGRPL
RcCel9X  VTGEAKYNDYFLENYSKGKNSYENVSGDWVGSWNFAFFSYMKANN--RNGDAEKWFKDEFTIWLNNKIDRYKNNTWGNTI
CjGH9A   TTGDSRY-LPTINNYT-----LERTDFGWPDTELLGVMSLAVVPATHTN-SLRIAARNHIQTIASTHLTTQSASGYPAPL
CjGH9B   TTGDDSY-YTAMNAAG-----TENTVSSWGDVRGLAWISLAHHRDNLTDLADQDLIASRVTGLANSLHNTWQASAYRVSM
CjGH9C   LTGTDAY-KTLAENFINSNRLYWVSHWNTNRVNTFLYYASLPGATSSTANTIRSDYSNQITGSDYLAGVRNANDAYRVYL

                490       500       510       520       530       540       550       560
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  G--ATYYWGCNGTVARQTMILNIANKLS--PKSEYVNTSLDALNFLFGRNYYNRSFVT---GLGLNPPMNPHDRRSG---
RcCel9X  AN-GNYYWGSNSQILGMCMEALIGSKVLGISNDEINKMTFSSFNWLLGANAMRKSFVS---GYGDDCIKTIFSTFNN---
CjGH9A   SS-LEYYWGSNSVIANKLVLMGLAYDFS--GNQNFALGVSKGINYLFGSNVLSTSFIT---GLGTNTVAQPHHRFWAGAL
CjGH9B   QT-NHFVWGSNSVALNQGIVLVQAYRLS--GERRYLDAAQSMLDYVLGRNATNIAQVT---GFGTRSTLHPHHRPSE---
CjGH9C   EGAGGFSWGSNRSMSQRGTLFVHYAALSSQHAGEANNAALGYLNYLHGTNPLGMVYLSNMYSLGVHSSVNEFYHSWFADK

                570       580       590       600       610       620       630       640
         ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
RcCel9U  -------GDSLKDPWPGYLVGGGWPG-----------------------------AKDWTDNQDSYETN--EIAINWNGA
RcCel9X  --------DGKSGIPKGFMPGGINRYQG--------------------VGLSLFPSKCYLDSADEWSTN--EHTTGWNSI
CjGH9A   -------NSNYPWAPPGALSGGPNAGLEDSLSASRL------------SGCTSRPATCWLDSIDAWSTN--EITINWNAP
CjGH9B   -------ADGIEAPIPGFVAGGPNPGQQDRSDCP-V------------SYPSAVTARSYLDHYCSYASN--EIAINWNAP
CjGH9C   SANWDRVGQSVYGPAPGFLVGGPNPNYNWDSNCNTQNPHQDCGTAAPNPPTGQPAMKSYLDFNTSWPLNSWEVTENHNDY

                650
         ....|....|....|.
RcCel9U  LIYAL-----------
RcCel9X  LTFVA-----------
CjGH9A   LAWVLGFYNDFA----
CjGH9B   LVYLVAAVQALTPVNP
CjGH9C   QVAYIRLLSKFVSN--

5.3.2 Recombinant protein production and purification
To facilitate the study, a modular dissection approach was initially used to clone and recombinantly produce the catalytic GH9 modules in E. coli. Notably, it has been previously shown that GH9 activity can be completely abolished upon the full- or half-deletion of the N-terminal Ig-like domain despite both having independent folds [116, 120]. Moreover, the N-terminal Ig-like domain was found fundamental in the production of soluble GH9 enzymes and in the generation of a functional conformation of the active site [119].
Therefore, the Ig-like domain was retained in the expression constructs, and we initially attempted the recombinant production of the CjGH9 enzymes in E. coli as Ig-GH9 fusions.

Only Ig-GH9B was successfully purified from the bacterial lysate (typical yield ~11 mg L-1; calculated mass, 62514.4 Da; observed by ESI-MS, 62517.9 Da), although the majority of the recombinant protein was produced in inclusion bodies (Figure 5.3). In contrast, Ig-GH9A and Ig-GH9C could not be recovered from the soluble fraction of the lysed E. coli cells, presumably because the proteins folded improperly. GH9 enzymes have previously been shown to have two calcium-binding sites as a conserved structural feature that is essential for enzyme integrity and thermostability [116, 262-264]. Therefore, protein production and purification were reattempted after the addition of calcium chloride to a final concentration of 1 mM in the growth media and in the purification and storage buffers. The presence of calcium ions, however, did not improve the solubility of the recombinant proteins, and both Ig-GH9A and Ig-GH9C were still mainly produced in inclusion bodies. Likewise, low-temperature purification and switching to the BL21 E. coli expression strain did not improve protein solubility. Therefore, only Ig-GH9B was used in all subsequent biochemical investigations to decipher the potential role of these enzymes in XyG utilization in C. japonicus.

Figure 5.3. Intact protein mass spectrometry of Ig-GH9B. The peak at 62517.9 Da corresponds to Ig-GH9B (calculated 62514.4 Da).

5.3.3 CjGH9B substrate specificity
Prior to the functional characterization of Ig-GH9B, it was necessary to identify the pH and temperature optima of the enzymatic activity. For that purpose, Ig-GH9B was incubated with
BBG in a wide range of pHs and temperatures before the generated reducing ends were quantified. The pH profile was bell-shaped, with the highest enzymatic activity observed in phosphate buffer (pH 7) (Figure 5.4A). When Ig-GH9B was incubated with BBG in phosphate buffer (pH 7) at a range of temperatures for 10 minutes, a bell-shaped temperature profile with an optimum of 40 °C was obtained (Figure 5.4B).

To reveal the substrate specificity of Ig-GH9B, the recombinant enzyme was incubated with a panel of polysaccharide substrates, and specific activities were determined and compared. Ig-GH9B demonstrated a strong preference for BBG, as indicated by its high specific activity (141 ± 3.8 μmol·min-1·mg-1) (Table 5.1). When tested against the soluble cellulosic substrates CMC and HEC, Ig-GH9B showed ca. 3- and 6-fold lower specific activity, respectively, at the highest tested substrate concentration (2 mg mL-1) compared to the natural substrate BBG (CMC, 42 ± 1.1 μmol·min-1·mg-1; HEC, 24 ± 1.2 μmol·min-1·mg-1).

Figure 5.4. pH and temperature profiles of Ig-GH9B with barley β-glucan as a substrate. A) pH-rate profile of Ig-GH9B with BBG as a substrate. Black squares, citrate buffer; red circles, phosphate buffer; blue triangles, HEPES buffer; and pink triangles, glycine buffer. Lines were drawn to guide the eye and have no physical significance. B) Ig-GH9B temperature-activity profile with the substrate BBG. Error bars represent standard errors of the mean for 3 replicates.

Table 5.1. Specific activity and kinetic parameters of Ig-GH9B towards different polysaccharide substrates

Substrate                       kcat (s-1)   Km (mg·mL-1)   kcat/Km (mL·s-1·mg-1)   Specific activity¹ (µmol·min-1·mg-1)
barley β-glucan (BBG)           187 ± 4.8    0.5 ± 0.02     375 ± 6                 141 ± 4
carboxymethyl cellulose (CMC)   ND²          ND             22 ± 2                  42 ± 1
hydroxyethyl cellulose (HEC)    ND           ND             13 ± 1                  24 ± 1
konjac glucomannan (KGM)        ND           ND             13 ± 0.2                23 ± 1
xyloglucan (XyG)                ND           ND             5.5 ± 0.1               11 ± 0.1

¹ Determined at a substrate concentration of 2 mg mL-1.
² Not determined.

Surprisingly, Ig-GH9B showed a specific activity on KGM (23 ± 0.9 μmol·min-1·mg-1) similar to that observed for HEC at the same substrate concentration (2 mg mL-1). This observation could be explained by the presence of β-1,4 linked D-glucose residues within the KGM backbone (see section 1.2.2.4). On the other hand, when XyG was used as a substrate, Ig-GH9B demonstrated ca. 13-fold lower specific activity (11 ± 0.1 μmol·min-1·mg-1) than with BBG at the highest tested substrate concentration (2 mg mL-1), suggesting that the bulky XyG sidechains impede catalysis (Table 5.1). The ability of Ig-GH9B to accommodate BBG, CMC, HEC, KGM, and XyG in its active site illustrates the importance of β-1,4 linked D-glucose backbone residues in substrate recognition and catalysis. Although the recombinant Ig-GH9B displayed only weak activity towards XyG, this trace endo-xyloglucanase activity might be sufficient for the C. japonicus quadruple deletion mutant ΔCjGH5DΔCjGH5EΔCjGH5FΔCjGH74, which lacks the four specific endo-xyloglucanases, to grow on XyG under in vitro conditions (see Chapter 3). It should be noted that Ig-GH9B did not demonstrate any activity against xanthan gum despite its β-1,4-D-glucose backbone, suggesting that the enzyme cannot accommodate the trisaccharide sidechains of the xanthan structure. Furthermore, the recombinant Ig-GH9B exhibited neither endo-mannanase activity on guar galactomannan, nor endo-xylanase activity on beechwood xylan or wheat flour arabinoxylan, nor β-1,3 endo-glucanase activity on curdlan or yeast β-glucan, polysaccharides with a strict β-1,3 glucan backbone.
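The fold differences quoted above follow directly from the specific activities reported with Table 5.1. As a minimal sketch (the dictionary and function names are illustrative and are not part of the original analysis), the ratios relative to BBG can be recomputed as follows:

```python
# Specific activities of Ig-GH9B at 2 mg/mL substrate (µmol·min^-1·mg^-1),
# transcribed from the text accompanying Table 5.1.
SPECIFIC_ACTIVITY = {
    "BBG": 141.0,
    "CMC": 42.0,
    "HEC": 24.0,
    "KGM": 23.0,
    "XyG": 11.0,
}


def fold_lower_than(reference, activities):
    """Return how many times lower each activity is than that of the reference."""
    ref = activities[reference]
    return {name: ref / value for name, value in activities.items() if name != reference}


fold = fold_lower_than("BBG", SPECIFIC_ACTIVITY)
for substrate, ratio in fold.items():
    print(f"{substrate}: {ratio:.1f}-fold lower than BBG")
```

The computed ratios (about 3.4 for CMC, 5.9 for HEC, 6.1 for KGM, and 12.8 for XyG) agree with the approximate 3-, 6-, and 13-fold differences stated in the text.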
To further confirm the identified substrate specificity, we subjected Ig-GH9B to Michaelis-Menten kinetic analysis using BBG, CMC, HEC, KGM, and XyG as substrates (Table 5.1). On BBG, the recombinant enzyme had a Km value of 0.5 ± 0.02 mg·mL-1 and a kcat value of 187 ± 4.8 s-1. Compared to the previously characterized β-1,3-1,4-glucanases from the GH9 family [265-267], Ig-GH9B demonstrated a relatively low Km value for BBG, suggesting a strong affinity towards this polysaccharide substrate. Additionally, the high kcat value of Ig-GH9B on BBG shows that the substrate is turned over rapidly. The relatively high kcat/Km value for BBG (ca. 375 mL·mg-1·s-1) indicates the high catalytic efficiency of Ig-GH9B towards this substrate (Figure 5.5).

When CMC, HEC, KGM, and XyG were used, the tested substrate concentrations were much lower than the apparent Km values. Therefore, the individual kinetic constants kcat and Km could not be determined. Instead, because the Michaelis-Menten equation reduces to v0 = (kcat/Km)[E]0[S] when [S] << Km, the catalytic efficiencies of Ig-GH9B towards these substrates, expressed as kcat/Km values, were extracted from the slopes of the linear plots of enzyme rate versus substrate concentration (Table 5.1, Figure 5.5). The low hydrolytic ability of the recombinant Ig-GH9B towards CMC, HEC, KGM, and XyG was evidenced by the small kcat/Km values relative to BBG. For instance, the kcat/Km values indicate that Ig-GH9B has about 17-, 29-, 29-, and 68-fold higher specificity for BBG than for CMC, HEC, KGM, and XyG, respectively (Table 5.1). Therefore, Michaelis-Menten kinetics together with the specific activity determinations illustrate the substantial preference of Ig-GH9B for mixed-linkage β-glucan, and the weak ability of the enzyme to accommodate the artificial cellulose derivatives, CMC and HEC, as well as the other β-1,4 glucan backbone-containing substrates, KGM and XyG.

Figure 5.5. Michaelis-Menten kinetics of Ig-GH9B on different polysaccharide substrates. A) BBG.
B) CMC. C) HEC. D) KGM. E) XyG. Bars represent standard errors based on 3 replicates.

5.3.4 Mode of action and bond cleavage specificity
The time-course digestion profile of BBG by the recombinant Ig-GH9B indicates an endo-dissociative mode of action, in which the enzyme binds randomly along the polysaccharide chain, catalyzes a cleavage event, and then dissociates before repeating the process. Indeed, the endo-dissociative profile of Ig-GH9B is characterized by the significant amounts of mid-range molecular weight products generated at the initial stage of the reaction (Figure 5.6A). To reveal the bond cleavage specificity of Ig-GH9B, limit digests were carried out to the end-point of hydrolysis, and the degradation products were analyzed by HPAEC-PAD and MALDI-TOF. BBG is a mixed-linkage glucan consisting primarily of cellotriosyl and cellotetraosyl moieties (β-1,4 linked glucosyl residues) connected by single β-1,3 glycosidic bonds. Notably, the β-1,3 linkages can represent up to 30% of the total linkages within the polysaccharide [268].

The limit-digest profile of Ig-GH9B on BBG revealed 1,3:1,4-β-glucotriose A (G3G4G), 1,3:1,4-β-glucotetraose C (G4G3G4G), and cellobiose (G4G) as the major oligosaccharide end-products. This profile suggests a high selectivity of the enzyme for the β-1,4 linkages of 4-O-substituted glucosyl residues (Figure 5.6B, 5.6C). The identified cleavage specificity of Ig-GH9B is thus distinct from that of the bacterial GH16 licheninases, which selectively hydrolyze the β-1,4 glycosidic linkages of 3-O-substituted glucosyl residues [40].

Interestingly, the degradation product 1,3:1,4-β-glucotetraose A (G3G4G4G) was found to accumulate in the early stage of the reaction (Figure 5.6A).
However, unlike the Vitis vinifera EG16 enzyme (VvEG16, see section 1.5.1.1), which produces G3G4G4G as a limit-digest product from BBG [107], Ig-GH9B converted the accumulated G3G4G4G into the major product 1,3:1,4-β-glucotriose A (G3G4G) and glucose upon prolonged incubation with BBG (Figure 5.6B). The slow conversion of G3G4G4G might be explained by weak positive-subsite interactions when this substrate is accommodated in the active-site cleft. To the best of my knowledge, the bond cleavage specificity observed for Ig-GH9B towards BBG represents the first in-depth analysis of BBG linkage specificity in the GH9 family. These mechanistic investigations should be expanded to include other members of the family to develop a broader insight into the GH9 mode of action and linkage specificity.

Figure 5.6. Mode of action and bond cleavage specificity of Ig-GH9B towards BBG. A) HPAEC-PAD analysis of the hydrolysis time course. B) HPAEC-PAD analysis of the limit-digest product profile. C) MALDI-TOF analysis of the limit-digest products. The observed molecular masses of the major peaks were 527.3 and 689.4 Da, which correspond to [M+Na]+ of G3G4G (calculated: 527.4 Da) and G4G3G4G (calculated: 689.6 Da), respectively.

When XyG was used as a substrate, limit-digest analysis revealed that Ig-GH9B hydrolyzes the polysaccharide at the unbranched backbone glucosyl residues to produce the Glc4-based oligosaccharides XXXG, XLXG, XXLG, and XLLG (nomenclature according to [30]) (Figure 5.7). Notably, this bond cleavage specificity and the resulting degradation profile have been previously observed for the first reported GH9 endo-xyloglucanase, Cel9X from R. cellulolyticum [112, 127]. However, it is worth mentioning that the XyG degradation potential of the GH9 family has not been widely explored, and other modes of cleavage might be discovered by extending the analysis to other GH9 members.
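The calculated [M+Na]+ values quoted in the captions of Figures 5.6 and 5.7 can be reproduced from the monosaccharide composition of each oligosaccharide. The following is a minimal sketch, assuming average (not monoisotopic) masses derived from standard atomic weights and the loss of one water molecule per glycosidic bond; the constant and function names are illustrative only, and the thesis does not state how its calculated values were obtained.

```python
# Average molecular masses (Da) from standard atomic weights
# (C 12.011, H 1.008, O 15.999, Na 22.990). These values are an assumption
# of this sketch, not taken from the thesis.
MONOSACCHARIDE_MASS = {
    "Glc": 180.156,  # D-glucose, C6H12O6
    "Gal": 180.156,  # D-galactose, C6H12O6
    "Xyl": 150.130,  # D-xylose, C5H10O5
}
WATER = 18.015
SODIUM = 22.990


def sodium_adduct_mass(composition):
    """Average mass of [M+Na]+ for an oligosaccharide of the given composition.

    An oligosaccharide of n residues contains n - 1 glycosidic bonds, whether
    it is linear or branched, and each bond corresponds to one lost water.
    """
    n_residues = sum(composition.values())
    neutral = sum(MONOSACCHARIDE_MASS[r] * count for r, count in composition.items())
    neutral -= (n_residues - 1) * WATER
    return neutral + SODIUM


OLIGOSACCHARIDES = {
    "G3G4G": {"Glc": 3},
    "G4G3G4G": {"Glc": 4},
    "XXXG": {"Glc": 4, "Xyl": 3},
    "XLXG/XXLG": {"Glc": 4, "Xyl": 3, "Gal": 1},
    "XLLG": {"Glc": 4, "Xyl": 3, "Gal": 2},
}

for name, composition in OLIGOSACCHARIDES.items():
    print(f"{name}: {sodium_adduct_mass(composition):.1f} Da")
```

Under these assumptions the script prints 527.4, 689.6, 1085.9, 1248.1, and 1410.2 Da, in agreement with the calculated values given in the two figure captions.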
Figure 5.7. Ig-GH9B product analysis when incubated with the substrate XyG. A) HPAEC-PAD analysis of the limit-digest products. B) MALDI-TOF analysis of the limit-digest products. The observed molecular masses of the three major peaks were 1085.3, 1247.4, and 1409.4 Da, which correspond to [M+Na]+ of XXXG (calculated: 1085.9), XLXG/XXLG (calculated: 1248.1), and XLLG (calculated: 1410.2), respectively.

5.4 Conclusions
Cellvibrio japonicus has been considered a useful platform for CAZyme discovery due to its readily available repertoire of plant biomass-degrading enzymes. Our extensive biochemical investigations revealed that C. japonicus has evolved efficient and selective machinery for XyG deconstruction, which emphasizes the industrial potential of this soil saprophyte. In addition to the previously characterized specific endo-xyloglucanases from C. japonicus [219] (Chapters 2 and 3), we identified and functionally characterized a GH9 mixed-linkage glucanase (CjGH9B) with a weak side-activity against XyG. Indeed, our biochemical data suggest a fundamental contribution of CjGH9B to the utilization of mixed-linkage glucan. On the other hand, the enzyme might not be predominantly involved in the XyG degradation pathway, given its poor activity against this substrate compared to the four previously characterized, highly specific endo-xyloglucanases. However, this side-activity hampers the reverse genetic analysis that is required to pinpoint the biological contribution of enzymes of interest. The fact that the C. japonicus quadruple deletion mutant, in which the four selective endo-xyloglucanases were deleted from the genome, was able to grow on XyG could be explained simply by the presence of other enzymes, such as CjGH9B, which exhibit activity, albeit slight, against the readily available XyG substrate in the growth media.
It should be noted, however, that in vitro growth experiments might not represent well the environmental growth conditions, in which XyG is entangled with the cellulosic microfibrils in a complex network. In this case, the production of the specific endo-xyloglucanases might be necessary for growth and survival. This report provides the second example of a XyG-active GH9 enzyme (after the R. cellulolyticum Cel9X and Cel9U enzymes [112, 127]) and encourages the comprehensive screening and characterization of the GH9 family members against this substrate.

5.5 Supporting information
5.5.1 Supporting tables
Table 5.S1. Primers used in the study

Primer name       Sequence
Ig-GH9A-NheI-F    GACCGCTAGCGAGGTGGGTAACCCCCGTGTCAAC
Ig-GH9A-XhoI-R    GGTCCTCGAGTTACGCAAAATCGTTGTAGAAGCCCAG
Ig-GH9B-NheI-F    GACCGCTAGCAACCTGGGCCTGATCAAGCTGAAC
Ig-GH9B-XhoI-R    GGTCCTCGAGTTATGGATTGACTGGCGTCAGCGC
Ig-GH9C-NdeI-F    GACCCATATGAATTTTATGGTGGTCGATCAGTTC
Ig-GH9C-XhoI-R    GGTCCTCGAGTTAGTTACTGACAAACTTGGATAAC

Chapter 6: Comprehensive functional characterization of four glycoside hydrolase family 3 enzymes⁴

6.1 Introduction
Terrestrial biomass, specifically complex plant cell walls (lignocellulose), is a major reservoir in the global carbon cycle and a vast renewable resource for the production of food, chemicals, and materials. It is estimated that 10¹¹ tons of plant biomass are broken down annually by a diverse array of bacteria and fungi [269]. Microbial lignocellulose degradation plays a key role in nutrient acquisition in animal digestive tracts and is therefore central to the health of humans, livestock, and wildlife [270-273]. Additionally, there is sustained interest in the development of lignocellulose bioconversion technologies for the synthesis of value-added products that can displace petrochemicals [274-277].
The intrinsic recalcitrance of plant biomass, which stymies microbial degradation and biorefinery applications alike, arises from the chemically complex, structurally composite nature of the plant cell wall [164]. Terrestrial plant cell walls are constructed of paracrystalline cellulose fibers embedded in a matrix of amorphous polysaccharides (hemicellulose and pectins), cross-linked polyphenolics (lignins), and structural proteins. The composition of these components varies across plant species and tissues [11].

Historically, plant cell wall types have been delineated on the basis of hemicellulose composition [11, 23]. In dicots and non-commelinid monocots, the xyloglucans (XyGs) are the predominant hemicellulosic polysaccharides, representing up to 25% of total dry weight of the wall [11, 23, 26]. XyGs are characterized by a cellulose-like linear β(1→4)-D-glucan backbone, which is regularly decorated with α(1→6)-xylopyranosyl branches in two common motifs, XXGG and XXXG, where G represents unbranched D-Glcp-β(1→4) and X represents a [D-Xylp-α(1→6)]-D-Glcp-β(1→4) substitution [30]. Depending on the species and tissue, the xylosyl side-chains can be further ramified by diverse monosaccharides, often galactopyranosyl, fucopyranosyl, and arabinofuranosyl residues, and specific residues may also be acetylated (Figure 6.1A) [28].

⁴ Adapted from: Cassandra E. Nelson, Mohamed A. Attia, Artur Rogowski, Carl Morland, Harry Brumer, and Jeffrey G. Gardner (2017). Comprehensive functional characterization of the Glycoside Hydrolase Family 3 enzymes from Cellvibrio japonicus reveals unique metabolic roles in biomass saccharification. Environmental Microbiology. 19:5025-5039.

In contrast, the XyG content of the cell walls of the Poales (e.g. grasses and cereals of the Poaceae) and Pteridophytes (represented by the horsetails) is low. Instead, xylans and mixed-linkage β-glucans (MLGs) predominate [11, 26, 278].
These MLGs are thought to be comprised predominantly of cellotriosyl and cellotetraosyl moieties (β(1→4)-D-gluco-tri- and tetrasaccharides, respectively) that are connected by β(1→3) linkages to form a polysaccharide chain with an irregular sequence and three-dimensional structure. The relative ratios of the cellotriosyl and cellotetraosyl units vary depending upon the plant species, and the non-linear β(1→3) linkages typically represent ca. 30% of the total linkages within the polysaccharide (Figure 6.1B) [268].

In addition to cellulose, XyG, MLG, and callose (β(1→3)-glucan [279]) represent the major sources of glucose for primary metabolism by saprophytic microorganisms. Of these, the Gram-negative bacterium Cellvibrio japonicus (formerly Pseudomonas fluorescens subsp. cellulosa) has emerged as a powerful system for biomass enzyme discovery, due to its ability to degrade nearly all plant cell wall polysaccharides via the production of more than 150 Carbohydrate-Active enZymes (CAZymes) from 59 Glycoside Hydrolase (GH), Polysaccharide Lyase (PL), Carbohydrate Esterase (CE), and Auxiliary Activity (AA) families [57, 156, 158]. Moreover, robust reverse-genetic and transcriptomic tools for C. japonicus have been developed, which significantly enable systems biology and metabolic engineering for industrial applications [33, 157, 159, 160, 280]. Using a comprehensive in vitro, in vivo, and in silico approach, we have recently revealed the central contributions of key GH5 and GH6 cellulases [157], an AA10 lytic polysaccharide mono-oxygenase [33], and two GH3 β-glucosidases [281] to the efficient utilization of crystalline cellulose by C. japonicus. Specifically, in the latter case we were able to determine that two of the four GH3 enzymes were important for cellodextrin utilization, yet the physiological roles of the remaining two GH3 members were unresolved. In this report, we further delineate the specific physiological functions of all four C.
japonicus GH3 enzymes through a systematic combination of gene-deletion mutants, heterologous gene expression, recombinant enzyme production, and biochemical analysis. Our results reveal that these enzymes are not functionally redundant despite very high sequence similarity (Figure 6.S1), but instead have distinct roles in cleaving the diverse glucosidic linkages present in plant biomass in vivo.

Figure 6.1. Representative structures of xyloglucan (XyG) and mixed-linkage β-glucan (MLG). (A) Dicot (fucogalacto)xyloglucan indicating variable sidechain substitution. (B) General Poales MLG structure comprised of β(1→4)-linked cellotetraose and cellotriose units connected by β(1→3) linkages. Representations of monosaccharide residues are according to [282]: glucose, blue circles; xylose, orange stars; galactose, yellow circles; L-fucose, red triangles.

6.2 Materials and Methods
6.2.1 Recombinant protein production and purification
The expression vectors pET21a::GH3A, pET21a::GH3B, pET28a::GH3C, and pET28a::GH3D, produced as previously described without the predicted native signal peptide-encoding sequences [281], were generously provided by Prof. Harry Gilbert (Newcastle University, UK). These constructs included fusion of a C-terminal 6x His-Tag (GH3A and GH3B) or an N-terminal 6x His-Tag (GH3C and GH3D) to aid purification. Constructs were transformed into the chemically competent BL21 E. coli strain. Colonies were grown on LB solid media containing 50 µg ml-1 ampicillin (GH3A and GH3B) or 50 µg ml-1 kanamycin (GH3C and GH3D). One colony was selected from each plate, inoculated into 15 mL of LB liquid medium containing the same antibiotic, and grown overnight at 37 °C (200 RPM). The entire overnight cultures were used to inoculate 1 liter of LB medium containing the appropriate antibiotic, and the cultures were grown at 37 °C (200 RPM) until OD600 = 0.8.
To induce overexpression, IPTG was added to a final concentration of 1 mM before the flasks were transferred to a 16 °C cooling shaker for an overnight incubation (200 RPM). Cultures were then centrifuged and pellets were re-suspended in 20 mL of E. coli lysis buffer containing 20 mM HEPES, pH 7.0, 500 mM NaCl, 40 mM imidazole, 5% glycerol, 1 mM DTT and 1 mM PMSF. Sonication was used to disrupt the cells, and the clear supernatants were separated by centrifugation at 4 °C (4220 × g for 60 minutes). To purify recombinant proteins from the clear soluble lysates, a BioRad® FPLC system with a Ni2+ affinity column was used. The system utilized a gradient elution up to 100% elution buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 500 mM imidazole, and 5% glycerol. Pure fractions verified by SDS-PAGE were pooled, concentrated, and buffer exchanged against 50 mM phosphate buffer (pH 7.5). Protein concentrations were finally determined at 280 nm using an Epoch Micro-Volume Spectrophotometer System (BioTek®, USA). Typical production yields of 2.5, 5, 22, and 15 mg per liter of culture medium were obtained for Bgl3A, Bgl3B, Bgl3C, and Bgl3D, respectively.

6.2.2 Carbohydrate sources
Tamarind seed xyloglucan, sophorose, and laminaribiose were purchased from Megazyme® (Bray, Ireland). Gentiobiose was purchased from Carbosynth (Berkshire, UK).

6.2.3 Preparing the XyG-based oligosaccharide substrates
Both GXXG and GLLG were obtained using the XyG-degrading enzymes from C. japonicus. To make GXXG, one gram of tamarind seed XyG was dissolved in 100 mL of double-distilled water. To speed up dissolution, the mixture was stirred continuously at 60 °C for 2-3 hours. The XyG solution was then cooled down to room temperature, and phosphate buffer (pH 7.5) was added to a final concentration of 1 mM. One mg of the C. japonicus endo-xyloglucanase CjGH5E was then added, and the digestion reaction was incubated at 40 °C overnight while stirring.
The limit-digest reaction gave the XyGOs XXXG, XLXG, XXLG, and XLLG. After the completion of the reaction was verified using HPAEC-PAD and MALDI-TOF analyses, the reaction was stopped by boiling for 15 minutes. The XyGO mixture was then cooled down to room temperature, and 0.2 mg of the C. japonicus β-galactosidase Bgl35A [33] was added. The reaction was then incubated overnight while stirring at room temperature. After the reaction was completed, XXXG was identified as the sole product. To obtain GXXG, 0.5 mg of the C. japonicus α-xylosidase Xyl31A [145] was added to the XXXG solution, and the reaction mixture was incubated overnight while stirring at room temperature. After confirming the identity of the GXXG product, the solution was boiled to inactivate the enzymes, and then GXXG was freeze-dried after flash freezing in liquid nitrogen. To purify GXXG, ~230 mg of the freeze-dried powder was dissolved in ~2.5 mL of ultra-pure water and loaded on a 90-cm BioGel® P-2 Gel (Bio-Rad, California) column (XK 26/100, GE Healthcare). Fractions collected from the isocratic elution at 0.5 ml/min were screened for the presence of GXXG using HPAEC-PAD and MALDI-TOF analyses. Pure fractions were pooled, flash frozen, and freeze-dried.

To produce GLLG, the XyGO mixture of XXXG, XLXG, XXLG, and XLLG was obtained following the same protocol described above. The crude oligosaccharide mixture was peracetylated and separated according to the previously described protocol [173]. XLLG was obtained after deacetylation as described previously [283]. To obtain GLLG, 30-40 mg of XLLG was incubated overnight at room temperature with 0.5 mg of the C. japonicus α-xylosidase Xyl31A [145] in a 4 mL reaction volume containing 25 mM citrate buffer (pH 6). Product identity and reaction completion were confirmed before the mixture was loaded on a 90-cm BioGel® P-2 Gel (Bio-Rad, California) column (XK 26/100, GE Healthcare) for purification (vide supra).
6.2.4 Enzyme kinetics
All enzyme assays against the different disaccharides and XyG-based oligosaccharides were performed using a glucose detection kit (D-glucose-HK, Megazyme®). Enzyme assays were executed following the manufacturer's protocol with a minor modification: each recombinant β-glucosidase was added to a freshly prepared mixture of the tested substrate with the kit components (buffer pH 7, NADP+ + ATP, hexokinase + glucose-6-phosphate dehydrogenase). The rate of glucose release, corresponding to the rate of enzymatic activity, was then monitored in a continuous assay format at 340 nm using a Cary 50 UV-visible spectrophotometer (Varian). The enzyme concentration (0.004 to 0.81 µM) was chosen according to the activity on the tested substrate, such that less than 10% conversion of the substrate was achieved within the measurement time. For calculation, a molar extinction coefficient of 6220 M-1 cm-1 was used for NADPH at 340 nm.

6.2.5 Carbohydrate analytics
High Performance Anion-Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD) was performed on a Dionex ICS-5000 DC HPLC system operated by the Chromeleon software version 7 (Dionex) using a Dionex Carbopac PA200 column. The method of analysis followed the previously published protocol [219]. Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) analysis was performed on a Bruker Daltonics Autoflex System (Billerica, USA) using 2,5-dihydroxybenzoic acid as the matrix, as previously described [219].

6.2.6 Bgl3D product analysis
Six µg of the recombinant Bgl3D was incubated overnight at 37 °C with GXXG (0.25 mM final concentration) in a 400 µL reaction volume containing 50 mM phosphate buffer (pH 7.5). The enzyme was then inactivated by boiling for 15 minutes to stop the reaction. After the mixture was cooled down, 10 µg of the C. japonicus α-xylosidase Xyl31A was added and the reaction was incubated at 37 °C for 6 hours.
By repeating the aforementioned steps, products were subjected to sequential degradation using Bgl3D and Xyl31A with a heat-inactivation step in between. Reactions were always monitored using HPAEC-PAD, and the identity of the product from each step was verified using MALDI-TOF analysis as described above.

6.2.7 Growth conditions
Growth experiments with E. coli and C. japonicus strains used MOPS (3-(N-morpholino)propanesulfonic acid) defined media [284] with 0.25% (w/v) glucose, 0.25% (w/v) xylose, 0.5% (w/v) sophorose, 0.5% (w/v) laminaribiose, 0.5% (w/v) gentiobiose, 0.5% (w/v) CM-curdlan, 0.5% (w/v) barley β-glucan, 0.5% (w/v) xyloglucan, or 0.5% (w/v) xyloglucan oligosaccharides as carbon sources. Disaccharides, curdlan, and MLG were purchased from Megazyme (Bray, Ireland). Antibiotics were used at the following concentrations: kanamycin (50 µg ml-1) and gentamicin (15 µg ml-1). All growth experiment parameters, including inoculation, temperature, aeration level, and measurement of growth rate and maximum optical density, were identical to those previously described [157, 281]. All experiments were performed in biological triplicate, and standard deviations were calculated using the GraphPad Prism 6 software (CA, USA).

6.2.8 Genetic techniques
The construction of the 4G xylA quintuple mutant and subsequent verification by PCR followed methods previously described [157, 160, 281]. A complete list of strains, plasmids, and primers can be found in Table 6.S2.

6.3 Results
6.3.1 Linkage specificity of C. japonicus GH3 members
6.3.1.1 Substrate choice for in vitro enzymology
Our previous studies on the functions of the C. japonicus GH3 enzymes tested activity exclusively on β(1→4)-linked gluco-oligosaccharides (cellodextrins). Although all GH3 members were competent β(1→4)-glucosidases in vitro and were able to confer growth on cellobiose when expressed heterologously in E.
coli in vivo, individual activities on cello-oligosaccharides varied by orders-of-magnitude amongst the recombinant enzymes [281]. These results suggested that the true substrates for some of the GH3 enzymes might not be β(1→4)-linked gluco-oligosaccharides. To explore the wider capacity of the C. japonicus GH3 enzymes to cleave other -glucosidic linkages present in nature, we performed Michaelis-Menten kinetic analysis of β(1→2)-, β(1→3)-, and β(1→6)-linked disaccharides, as well as representative xylogluco-oligosaccharides (XyGOs) containing a β(1→4)-linked Glcp backbone (Table 6.1, Figure 6.S2).  Specifically, sophorose (Glcp(1→2)Glcp) is a well-known inducer of cellulase gene expression in bacteria and filamentous fungi [285, 286]. The cognate β(1→2) linkage is also found naturally in yeast sophorolipids [287, 288]. Additionally, cyclic β(1→2)-glucans are involved in infection or symbiosis with plants and animals [289-291]. Laminaribiose (Glcp(1→3)Glcp) is representative of exclusive (1→3) backbone linkages found in bacterial exopolysaccharides (e.g. curdlan), yeast cell wall glucan, and algal cell wall laminarin [292-295]. Within plant biomass, (1→3) linkages occur in callose ((1→3)-glucan) and in grass and cereal MLGs [268, 279]. The (1→6) linkages represented by gentiobiose (Glcp(1→6)Glcp) are comparatively rare, however (1→6)-glucans are known to be an integral component of yeast cell wall polysaccharides [295]. Finally, the xyloglucan oligosaccharides GXXG and GLLG represent known intermediates in the complete saccharification of plant cell wall XyG by C. japonicus [33, 145].   157  Table  6.1. Kinetic parameters of C. 
japonicus GH3 enzymes against different gluco-disaccharides, cellotetraose and xyloglucan-based oligosaccharides.ᵃ

Enzyme  Substrate        kcat (min⁻¹)           Km (mM)        kcat/Km (min⁻¹ mM⁻¹)
Bgl3A   sophorose        (2.14 ± 0.05) × 10³    3.36 ± 0.33    637 ± 65
        laminaribiose    (9.23 ± 0.33) × 10³    2.62 ± 0.33    (3.52 ± 0.46) × 10³
        cellobioseᵇ      117 ± 3                1.8 ± 0.1      65.0 ± 3.5
        cellotetraoseᵇ   -                      -              (5.31 ± 0.11) × 10³
        GXXG             37.6 ± 1.4             1.67 ± 0.13    22.5 ± 1.9
        GLLG             NDᶜ                    ND             ND
        gentiobiose      24.5 ± 1.0             11.3 ± 1.2     2.17 ± 0.25
Bgl3B   sophorose        (19.9 ± 0.5) × 10³     5.81 ± 0.46    (3.43 ± 0.28) × 10³
        laminaribiose    (5.65 ± 0.19) × 10³    2.39 ± 0.29    (2.35 ± 0.30) × 10³
        cellobioseᵇ      (5.39 ± 0.18) × 10³    1.5 ± 0.1      (3.59 ± 0.33) × 10³
        cellotetraoseᵇ   -                      -              522 ± 12
        GXXG             23.7 ± 3.5             5.15 ± 1.17    4.60 ± 1.25
        GLLG             ND                     ND             ND
        gentiobiose      386 ± 30               9.31 ± 1.96    41.5 ± 9.3
Bgl3C   sophorose        (3.46 ± 0.44) × 10³    4.60 ± 0.97    752 ± 185
        laminaribiose    (3.14 ± 0.11) × 10³    0.64 ± 0.07    (4.91 ± 0.56) × 10³
        cellobioseᵇ      168 ± 6                1.9 ± 0.1      88.4 ± 7.2
        cellotetraoseᵇ   -                      -              856 ± 22
        GXXG             27.0 ± 1.9             2.46 ± 0.33    11.3 ± 1.7
        GLLG             ND                     ND             ND
        gentiobiose      161 ± 2                4.88 ± 0.20    33.0 ± 1.4
Bgl3D   sophorose        (2.23 ± 0.13) × 10³    28.7 ± 3.0     77.8 ± 9.4
        laminaribiose    (2.85 ± 0.21) × 10³    6.41 ± 1.13    445 ± 85
        cellobioseᵇ      10 ± 1                 11.5 ± 1.5     0.87 ± 0.13
        cellotetraoseᵇ   -                      -              91.0 ± 2.5
        GXXG             (1.33 ± 0.08) × 10³    0.83 ± 0.13    (1.60 ± 0.27) × 10³
        GLLG             749 ± 46               1.21 ± 0.14    619 ± 81
        gentiobiose      39.7 ± 2.1             83.3 ± 5.9     0.48 ± 0.04

ᵃ Supporting initial-rate kinetic data are available in Figure 6.S2.
ᵇ Data from [281]. For cellotetraose, substrate saturation was not achieved and kcat/Km values were determined from the slope of linear velocity-[S] plots.
ᶜ ND, not determined due to insufficient activity.

As detailed below, all of the GH3 members exhibit β-glucosidase specificities that are neither necessarily restricted nor related to cellulose utilization.
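The specificity constants in Table 6.1 follow directly from the fitted kcat and Km values. As a minimal illustrative sketch (the function and variable names are hypothetical and are not part of the original analysis, which fitted the initial-rate data shown in Figure 6.S2), the Michaelis-Menten rate law and the kcat/Km ratio can be written as follows, here using the Bgl3A values for sophorose from Table 6.1:

```python
def mm_rate(s_mM, kcat_per_min, km_mM, enzyme=1.0):
    """Michaelis-Menten rate: v = kcat * [E] * [S] / (Km + [S])."""
    return kcat_per_min * enzyme * s_mM / (km_mM + s_mM)


def specificity_constant(kcat_per_min, km_mM):
    """Specificity constant kcat/Km in min^-1 mM^-1."""
    return kcat_per_min / km_mM


# Bgl3A on sophorose, central values from Table 6.1 (errors ignored here).
KCAT = 2.14e3  # min^-1
KM = 3.36      # mM

kcat_over_km = specificity_constant(KCAT, KM)
print(round(kcat_over_km))  # 637, matching the tabulated value

# At [S] = Km the rate is half of the maximal rate kcat * [E].
half_max = mm_rate(KM, KCAT, KM)

# At [S] << Km the velocity-[S] plot is nearly linear with slope kcat/Km.
low_s = 0.001  # mM, an arbitrary illustrative concentration
initial_slope = mm_rate(low_s, KCAT, KM) / low_s
```

The last two lines illustrate why footnote b can report kcat/Km for cellotetraose without reaching saturation: when [S] is far below Km, the slope of the velocity-[S] plot approximates (kcat/Km)[E].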
In recognition of the new biochemical and biological data presented herein, the individual enzymes will be henceforth denoted with the non-prescriptive identifiers Bgl3A, Bgl3B, Bgl3C, and Bgl3D (Bgl: β-glucosidase, cf. [296, 297]), rather than the original [156] Cel3A, Cel3B, Cel3C, and Cel3D, respectively (Cel: cellulase [298]).

6.3.1.1.1 Bgl3A exhibits β(1→3) and β(1→4) specificity in vitro

In our recent cellulose utilization study, Bgl3A exhibited the highest catalytic efficiency towards longer cello-oligosaccharides (degree-of-polymerization, DP: 3-5) among the four GH3 members, as indicated by high kcat/Km values [281]. Bgl3A exhibited a similarly high specificity for the β(1→3)-linked laminaribiose, with a kcat/Km value ca. 1.5-fold lower than that for cellotetraose as a benchmark, indicating a potential biological role of the enzyme in β(1→3)-glucan and MLG saccharification. Bgl3A displayed ca. 8-fold less specificity towards the β(1→2) linkage of sophorose than towards cellotetraose, and was catalytically ineffective toward gentiobiose and the XyGOs, all of which exhibited lower activities than cellobiose (Table 6.1, Figure 6.S2).

6.3.1.1.2 Bgl3B is agnostic toward β(1→2), β(1→3) and β(1→4) linkages in vitro

The highly constitutively expressed bgl3B has been shown to be fundamental in cellodextrin utilization by C. japonicus, and the gene product has a strong preference for cellobiose [281]. Commensurate with these observations, Michaelis-Menten analysis with alternate β-diglucosides demonstrates superior specificity for cellobiose over longer cello-oligosaccharides. The Bgl3B kcat/Km values also indicate high specificities for sophorose and laminaribiose; however, this enzyme was only weakly active on gentiobiose. Bgl3B cleaved the XyGO substrate GXXG only weakly, and was not active against the bis-galactosylated GLLG XyGO (Table 6.1, Figure 6.S2).
6.3.1.1.3 Bgl3C displays preferential β(1→3) specificity in vitro

Previous biochemical characterization and mutational analysis of Bgl3C indicated only modest activity toward β(1→4)-linked cello-oligosaccharides and suggested no physiological contribution to cellulose utilization [281]. Extending the biochemical analysis of Bgl3C to include alternate β-disaccharides revealed a predominant specificity for the β(1→3)-linked laminaribiose, with a kcat/Km value 55- and 5.7-fold higher than for cellobiose and cellotetraose, respectively. Sophorose was hydrolyzed with a comparable kcat value to that of laminaribiose, although the Km value for sophorose was significantly (7-fold) higher, resulting in a lower overall specificity constant. As also observed for Bgl3A and Bgl3B, Bgl3C was poorly active on gentiobiose and GXXG, and not active on GLLG (Table 6.1, Figure 6.S2).

6.3.1.1.4 Bgl3D is a XyGO-specific β(1→4) glucosidase in vitro

Of the four GH3 members from C. japonicus, Bgl3D was catalytically weak against all tested cello-oligosaccharides. Specifically, the kcat/Km value for cellobiose was ca. 1 mM⁻¹ min⁻¹, while the highest kcat/Km value observed was 90 mM⁻¹ min⁻¹ for cellotetraose [281]. Although Bgl3D can hydrolyze sophorose and laminaribiose with weak to moderate specificities due to comparatively high Km values, the data also indicate a uniquely high specificity for XyGOs (Table 6.1, Figure 6.S2). Indeed, Bgl3D was the only GH3 enzyme from C. japonicus that was able to hydrolyze both GXXG and GLLG, which are key intermediates in the complete saccharification of XyG [33, 145]. The kcat/Km values of Bgl3D for GXXG and GLLG were ca. 17- and 7-fold higher, respectively, than for cellotetraose, which highlights the significance of side-chain branching to substrate recognition and catalysis.
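The fold differences quoted above for Bgl3C and Bgl3D are simple ratios of the kcat/Km values in Table 6.1. The following sketch reproduces them from the central values only (uncertainties are ignored); the dictionary layout and the helper name `fold_preference` are illustrative assumptions, not part of the original analysis.

```python
# Central kcat/Km values (min^-1 mM^-1) copied from Table 6.1.
KCAT_OVER_KM = {
    ("Bgl3C", "laminaribiose"): 4.91e3,
    ("Bgl3C", "cellobiose"): 88.4,
    ("Bgl3C", "cellotetraose"): 856.0,
    ("Bgl3D", "GXXG"): 1.60e3,
    ("Bgl3D", "GLLG"): 619.0,
    ("Bgl3D", "cellotetraose"): 91.0,
}


def fold_preference(enzyme, substrate, reference):
    """Ratio of kcat/Km for `substrate` over `reference` for one enzyme."""
    return KCAT_OVER_KM[(enzyme, substrate)] / KCAT_OVER_KM[(enzyme, reference)]


comparisons = [
    ("Bgl3C", "laminaribiose", "cellobiose"),
    ("Bgl3C", "laminaribiose", "cellotetraose"),
    ("Bgl3D", "GXXG", "cellotetraose"),
    ("Bgl3D", "GLLG", "cellotetraose"),
]

for enzyme, substrate, reference in comparisons:
    ratio = fold_preference(enzyme, substrate, reference)
    print(f"{enzyme}: {substrate} vs {reference}: {ratio:.1f}-fold")
```

The computed ratios (roughly 55.5, 5.7, 17.6 and 6.8) agree with the approximate fold changes stated in the text.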
Enzyme product analysis revealed the specificity of Bgl3D for the non-reducing-terminal β(1→4)-Glcp residue of GXXG, as well as the ability of Bgl3D to work in concert with Xyl31A [33, 145] to effect the complete, stepwise hydrolysis of GXXG (Figure 6.S3).

6.3.1.2 GH3 members are not universally nor equally sufficient for disaccharide utilization

We demonstrated previously that heterologous expression of individual C. japonicus GH3 genes was sufficient to confer on the non-cellulolytic bacterium Escherichia coli the ability to utilize cellobiose as a sole carbon source [281]. To further support the in vitro enzyme specificity data and probe the potential sufficiency of the observed primary and side activities to support growth, we applied the same heterologous expression system using sophorose, laminaribiose, or gentiobiose as sole carbon sources (Figure 6.2).

Figure 6.2. Growth of E. coli strains expressing individual C. japonicus GH3 genes using mono- and disaccharides. (A) glucose, (B) sophorose, (C) gentiobiose, (D) laminaribiose. E. coli harboring the empty pBBRMCS-5 vector (pVOC) was included as a negative control. Experiments were performed in biological triplicate with the error bars representing the standard deviation. Growth rates and maximum optical density are summarized in Table 6.S1.

Analysis of potential signal peptides using LipoP 1.0 [220] to predict subcellular localization indicated that Bgl3A and Bgl3C possess a Signal Peptidase II cleavage site, and are therefore likely to be anchored facing the periplasm or extracellular environment via N-terminal lipidation on Cys26 and Cys21, respectively (Figure 6.S1) [234]. Indeed, previous analysis indicated that recombinant expression of full-length Bgl3A (then known as CelD), but not an N-terminally truncated version, in E. coli results in membrane association [299].
LipoP 1.0 also indicated that Bgl3D possesses a predicted Signal Peptidase I cleavage site and is therefore likely to be secreted to the periplasm [234]. In contrast, Bgl3B lacked a predicted signal peptide and is thus likely to be cytosolic.

Congruent with the kinetic analysis of recombinant enzymes, individual heterologous expression of the full ORFs of all four GH3 genes enabled E. coli to utilize laminaribiose more rapidly than the negative control strain (Figure 6.2D, Table 6.S1). We attribute the eventual growth observed in the negative control strain after a prolonged lag period to the three GH1 enzymes and two GH3 enzymes encoded by E. coli K12 [300], which may have sufficient side activity to support limited growth on laminaribiose.

Similarly, the heterologous expression of bgl3A, bgl3C, and bgl3D in E. coli conferred growth on both sophorose and gentiobiose (Figure 6.2B & 6.2C), although growth on gentiobiose was significantly slower in all cases (Table 6.S1). For these GH3 genes, the growth rates of the engineered E. coli strains generally correlated with the kinetic data, specifically that laminaribiose > sophorose > gentiobiose (Figure 6.2). The growth of these strains is indicative of appropriate trafficking to the periplasm, as predicted by the aforementioned signal peptide analysis, where the substrates are accessible.
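The growth rates compared here are reported in Table 6.S1 in generations per hour over a window between two time points (Ti and Tf). The exact calculation follows previously described methods [157, 281] and is not restated in this chapter; the sketch below therefore assumes the conventional definition, log2(OD_f/OD_i)/(Tf - Ti), and uses hypothetical OD600 readings purely for illustration (they are not data from this study).

```python
import math


def generations_per_hour(od_initial, od_final, t_initial_h, t_final_h):
    """Doublings per hour between two time points in exponential phase.

    Assumed definition: log2(OD_final / OD_initial) / (Tf - Ti).
    """
    if od_initial <= 0 or od_final <= 0:
        raise ValueError("optical densities must be positive")
    if t_final_h <= t_initial_h:
        raise ValueError("Tf must be later than Ti")
    return math.log2(od_final / od_initial) / (t_final_h - t_initial_h)


# Hypothetical readings: OD600 rises from 0.10 to 0.40 between 6 h and 12 h,
# i.e. two doublings in six hours.
rate = generations_per_hour(0.10, 0.40, 6, 12)
print(f"{rate:.2f} gen/hr")
```

Restricting the window to the exponential phase matters, since including lag or stationary-phase points would underestimate the rate; this is presumably why each strain in Table 6.S1 lists its own Ti and Tf.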
In contrast, the apparent failure of heterologous expression of the bgl3B gene to confer growth on gentiobiose or sophorose, despite the comparably high catalytic efficiency of the corresponding enzyme on both substrates, was striking but not unexpected given its predicted cytosolic location (vide supra). As such, Bgl3B would be unable to confer growth on E. coli in the absence of suitable inner membrane PTS transporters for sophorose and gentiobiose. Indeed, E. coli is currently only known to contain a β(1→4)-specific diglucoside transporter, and the import of other β-diglucosides has not been characterized [301].

6.3.2 Functional roles of GH3 members in alternate β-glucan utilization in C. japonicus

Possessing a broader understanding of the catalytic potential of its GH3 members, we then sought to ascertain the individual contributions of these β-glucosidases to the physiology of C. japonicus. Despite the apparent substrate promiscuity observed in assays in vitro, we anticipated that the controlled expression and localization of these enzymes in their native environment would strongly affect their individual contributions to the utilization of specific β-glucans. Building upon our previous reverse-genetic analysis of cellulose utilization, we employed a suite of individual (Δbgl3A, Δbgl3B, Δbgl3C, Δbgl3D) and combinatorial gene deletion mutants, including a GH3 quadruple mutant (Δ4βG) [281].

6.3.2.1 β(1→6)-glucosides

As described above, three of four engineered E. coli strains were able to leverage the expression of GH3 genes to grow using gentiobiose as a sole carbon source, despite weak kinetics on this disaccharide. However, in vivo analysis of genetic mutants is ultimately essential to reveal the actual physiologically relevant contributions of individual GH3 members to β(1→6)-glucoside utilization by C. japonicus. We observed that all of the single gene deletion mutants grew like wild type when gentiobiose was the sole carbon source.
Strikingly, all of the multiple mutants, including the Δ4βG strain, also had no growth defect on gentiobiose (Figure 6.S4), clearly indicating that one or more non-GH3 CAZymes are primarily responsible for cleaving the β(1→6) linkages of this disaccharide in C. japonicus.

6.3.2.2 β(1→2)-glucosides

Mutational analysis suggested that the bgl3A and bgl3C gene products contribute synergistically to sophorose utilization (Figure 6.3). Specifically, when grown on this β(1→2)-linked diglucoside the bgl3A single deletion strain had a reproducible growth defect. The growth rate and maximum growth obtained were mirrored in the bgl3A bgl3B and bgl3A bgl3D double mutant strains, indicating that neither Bgl3B nor Bgl3D works in concert with Bgl3A to hydrolyze sophorose. In contrast, the bgl3A bgl3C double mutant strain was unable to grow, despite the absence of a growth phenotype for the bgl3C single mutant. The importance of Bgl3C and Bgl3A for sophorose utilization may be explained by comparable kinetics against this disaccharide and identical predicted trafficking to the periplasm (Table 6.1). Notably, strains containing the bgl3B deletion grew like wild type using sophorose, indicating that the cytoplasmic GH3 does not play a role in sophorose utilization despite the high activity of Bgl3B against this substrate. In light of the demonstrable action of Bgl3A and Bgl3C, it is likely that all exogenous sophorose is completely hydrolyzed in the periplasm under the experimental conditions and transported into the cytoplasm as glucose. As sophorose is an inducer of cellulase gene expression in C. japonicus [286], rapid degradation of this disaccharide may be an example of a metabolic mechanism to regulate cellulase expression.

Figure 6.3. Growth of C. japonicus wild-type and GH3 gene deletion mutants on the β(1→2)-linked disaccharide sophorose. (A) single, (B) double, (C) triple and quadruple mutants versus wild type.
Experiments were performed in biological triplicate with the error bars representing the standard deviation. Growth rates and maximum optical density are summarized in Table 6.S1. All mutants grew as wild type on glucose, as shown previously [281].

6.3.2.3 β(1→3)-glucosides

When grown on the β(1→3)-linked diglucoside laminaribiose, the bgl3A single deletion had a slight but reproducible growth defect, which was essentially recapitulated in the bgl3A bgl3B and bgl3A bgl3D double mutants (Figure 6.4A). Similar to the trend observed with sophorose, a bgl3A bgl3C double mutant had a longer lag period and a decreased growth rate compared to wild type (Figure 6.4B, Table 6.S1). Commensurate with the primary role of Bgl3A in the utilization of β(1→3)-linked saccharides, the bgl3B bgl3C bgl3D triple mutant grew as wild type (Figure 6.4C), which was also true with the polysaccharides curdlan and MLG (vide infra). The bgl3A bgl3C bgl3D and bgl3A bgl3B bgl3C triple mutants displayed increased lag periods and decreased growth rates, and did not achieve wild type final cell density (Table 6.S1). Finally, the Δ4βG quadruple mutant was unable to grow using laminaribiose as a sole carbon source, which correlated with the sufficiency of all four GH3s toward this substrate (Figure 6.2D).

Figure 6.4. Growth of C. japonicus wild-type and GH3 gene deletion mutants on the β(1→3) linkage-containing substrates. (A-C) laminaribiose, (D-F) curdlan, (G-I) MLG. Experiments were performed in biological triplicate with the error bars representing the standard deviation. Growth rates and maximum optical density are summarized in Table 6.S1. All mutants grew as wild type on glucose, as shown previously [281].

During growth on curdlan as a representative all-β(1→3)-glucan, the bgl3C single mutant strain displayed a slight but reproducible growth defect (Figure 6.4D). This growth defect was further exacerbated in the bgl3A bgl3C double mutant (Figure 6.4E).
The bgl3A bgl3B bgl3C triple mutant had a decreased growth rate (Figure 6.4F), and behaved similarly to the bgl3A bgl3C double mutant (Table 6.S1). Finally, both the bgl3A bgl3C bgl3D triple mutant and 4G quadruple mutant had extended lag phases, decreased growth rates, and were not able to grow to wild type levels of final cell density (Figure 6.4F). Notably, there is an absence of a growth rate defect for the bgl3A single mutant on curdlan when one is observed on laminaribiose. However, a growth rate defect on curdlan only emerges with the bgl3A bgl3C double mutant. The bgl3C single mutant defect is one of maximum growth, not growth rate, which may be a consequence of the generally poor growth of all strains on curdlan (Table 6.S1). The decreased growth rates and overall lower cell densities achieved with curdlan compared to the other substrates is likely due to carboxymethylation of the commercial substrate to improve solubility [294]. Similar reductions in C. japonicus growth have been observed with the artificial β(1→4)-glucan carboxymethylcellulose [159]. As a representative of the matrix glycan abundant in the cell walls of grasses, cereals, and horsetails, we tested the growth of our suite of GH3 mutants on barley MLG. Similar to laminaribiose, the only single mutant that displayed a growth rate defect on MLG was bgl3A (Figure 6.4G). Although the bgl3B single mutant grew as wild-type on MLG, the bgl3A bgl3B double mutant exhibited a more exaggerated growth defect than the bgl3A single mutant, suggesting that these two GH3 enzymes work synergistically for MLG utilization (Figure 6.4H, Table 6.S1). Similar synergy was observed during growth on β(1→4)-linked cellodextrins [281]. 
Most of the GH3 triple mutants had growth defects of moderate severity, but interestingly the bgl3B bgl3C bgl3D triple mutant grew like wild type, which further suggested that the bgl3A gene product is the main driver of MLG oligosaccharide hydrolysis (Figure 6.4I, Table 6.S1). The Δ4βG quadruple mutant was still able to grow on MLG, albeit with a distinctly long lag phase (Figure 6.4I), suggesting the presence of additional β-glucosidases. Analogously, an identical growth profile was observed for this mutant on cellobiose [281].

6.3.2.4 Xyloglucan β(1→4)-glucosides

As described in the Introduction, XyG is an abundant cell wall matrix glycan built on an all-β(1→4)-linked glucan backbone, which is found in essentially all terrestrial plants [23, 26]. To explore the potential contribution of C. japonicus GH3 β-glucosidases to the utilization of this ubiquitous polysaccharide, we examined the growth of our suite of mutants on XyG and XyGOs. Strikingly, the bgl3D single mutant exhibited a reduced growth rate on XyG (Figure 6.5A), which was slightly exacerbated in the bgl3B bgl3D double mutant (Figure 6.5B). Neither the GH3 triple mutants nor the quadruple mutant displayed growth defects more severe than the bgl3B bgl3D double mutant (Figure 6.5C). The importance of the bgl3D and bgl3B gene products for XyG utilization was directly recapitulated during growth on XyGOs, with the additional observation that the subtle role of Bgl3B was more directly exposed in the growth profiles of the bgl3B single and bgl3B bgl3D double mutants (Figure 6.5D, Table 6.S1).

We hypothesized that the ability of the Δ4βG mutant to grow on XyG was due in part to xylose utilization arising from the functional α-xylosidase Xyl31A encoded by the xyloglucan utilization locus (XyGUL) of C. japonicus [33]. To test this hypothesis, we deleted the xylose isomerase gene xylA from the Δ4βG mutant strain, as it was previously shown that a C.
japonicus xylA mutant was unable to utilize xylose as a carbon source [302]. The quintuple mutant had a reduced growth rate compared to the 4G mutant and achieved three-fold lower maximum density (Figure 6.S5, Table 6.S1). The residual growth of the 4G xylA quintuple mutant is most likely due to the action of the XyGUL GH35 -galactosidase on the galactosyl sidechains, which substitute approximately 50% of the xylose residues on tamarind seed XyG [33], and possibly by the action of cryptic -glucosidases, as indicated by our results for this mutant when using cellobiose or MLG.   167   Figure  6.5. Growth of C. japonicus wild-type and GH3 gene deletion mutants on xyloglucan and xylogluco-oliogsaccharides. (A) single, (B) double, (C) triple and quadruple mutants versus wild type on XyG. (D) Growth of select strains on XyGOs.  Experiments were performed in biological triplicate with the error bars representing the standard deviation. Growth rates and maximum optical density are summarized in Table 6.S1. All mutants grew as wild type on glucose, as shown previously [281].    168  6.4 Discussion The high-throughput (meta)genomics of saprophytic microorganisms of ecological and biotechnological interest has generated a vast abundance of sequence data on the diverse suites of CAZymes and other proteins that drive biomass degradation [147, 303-305]. A common observation is that most saprophytes encode multiple homologs from individual CAZyme families within their genomes. In the absence of definitive biochemical and biological data, such multiplicity is sometimes blithely dismissed as resulting from “functional redundancy,” despite the widely accepted theory of functional evolution through paralogous gene duplication [306]. 
Indeed, even the most exacting biochemical analysis performed in vitro can fail to reveal enzyme performance in complex, biologically relevant situations [240, 307, 308], particularly as all physiological and regulatory context is removed [280, 281]. Therefore, a combinatorial approach that synthesizes in vitro and in vivo methods can be a powerful tool to achieve a deeper understanding of CAZyme function. As such, we have leveraged C. japonicus as a model saprophytic organism to delineate the individual contributions of four prima facie similar GH3 β-glucosidases to environmental polysaccharide utilization using an integrated systems biology approach.

6.4.1 Bgl3A has a major role in MLG and sophorose utilization and supports curdlan degradation

Our previous study indicated a supporting role of Bgl3A in cellulose (all-β(1→4)-glucan) utilization and a substrate preference for cello-oligosaccharides of DP >2 [281]. Our current biochemical data demonstrated that Bgl3A can also hydrolyze the β(1→3) linkage of laminaribiose with a comparably high specificity and that heterologous expression in E. coli was sufficient to confer growth on this disaccharide. Commensurately, the C. japonicus bgl3A single deletion mutant was the only one to exhibit slowed growth on laminaribiose and MLG as sole carbon sources, while no growth defect was observed for the triple mutant bgl3B bgl3C bgl3D (Figure 6.4). These data suggest the primacy of Bgl3A in the degradation of MLG oligosaccharides arising from Poales cell walls. The observed growth defects of the double mutants (Figure 6.4B, 6.4E, & 6.4H) revealed that the bgl3A and bgl3C gene products were the main drivers of curdlan utilization (all-β(1→3)-glucoside). The requirement for two GH3 enzymes for effective substrate utilization was also observed with Bgl3A and Bgl3B, which were primarily responsible for the consumption of MLG, which comprises both β(1→3) and β(1→4) linkages.
This latter observation is concordant with our previous study indicating that Bgl3B is the single greatest contributor to β(1→4)-linked cellobiose utilization [281], and indicates particular synergy in MLG oligosaccharide utilization. Growth analysis of single mutants indicates that the bgl3A gene product is also the primary enzyme responsible for sophorose utilization by C. japonicus. The loss of Bgl3A can only be compensated by Bgl3C, despite Bgl3B also possessing excellent activity toward this substrate (Bgl3D is hobbled by a Km value of ca. 30 mM). The abundance of β(1→2)-glucans in the natural environment (vide supra) and their corresponding importance to the growth of saprophytes is presently unclear. It is worth noting, however, that this disaccharide has been shown to induce cellulase production in C. japonicus [286], and Bgl3A may function together with Bgl3C (vide infra) in signal attenuation.

6.4.2 Bgl3B underpins cellodextrin degradation and supports MLG utilization

Biochemical and reverse genetic analyses have previously shown the essential contribution of Bgl3B to cellulose utilization, with an exquisite specificity towards cellobiose [281]. Present biochemical characterization and heterologous expression in E. coli also revealed comparable activity of the enzyme towards laminaribiose (Table 6.1, Figure 6.2), which suggested a potential physiological role in the utilization of β-glucans containing β(1→3) linkages. As discussed above, Bgl3B displayed a distinct supporting role to Bgl3A in MLG utilization, as evidenced by the enhanced growth defect of the double mutant bgl3A bgl3B (Figure 6.4H). This observation can be explained by the abundance of β(1→4) linkages in the MLG chain, together with the high activity of Bgl3B on cello-oligosaccharides [281]. Conversely, mutational analysis evidenced only a minor contribution of Bgl3B to the utilization of the strictly β(1→3)-linked substrates laminaribiose and curdlan.
Instead, such substrates are degraded by the more β(1→3)-specialized enzymes Bgl3A (vide supra) and Bgl3C (vide infra). Bgl3B has no contribution to sophorose utilization, as revealed by the lack of growth of the triple mutant bgl3A bgl3C bgl3D, in addition to the inability to confer growth on this substrate when heterologously expressed in E. coli (Figure 6.3). The cytosolic Bgl3B also does not appear to contribute to XyG utilization, presumably because XyGO degradation occurs in the periplasm [33].

6.4.3 Bgl3C drives β(1→3)-glucan utilization

The Bgl3C enzyme contributes minimally to cellulose utilization in C. japonicus and the bgl3C gene was not up-regulated during growth on cellobiose as the sole carbon source [281]. Although the enzyme is active against cello-oligosaccharides [281], further biochemical analysis revealed particular specificity toward β(1→3) linkages, as represented by laminaribiose, which had a notably low Km value (Table 6.1). Deletion of the bgl3C gene alone failed to produce a growth defect on laminaribiose, but did result in slightly impaired growth on curdlan. As discussed above, mutants also lacking bgl3A exhibited large growth defects, and it appears that Bgl3A and Bgl3C work synergistically on all-β(1→3)-glucan (Figure 6.4). All C. japonicus GH3 enzymes, however, have at least a partial role in β(1→3)-glucan utilization, as the Δ4βG quadruple mutant was unable to grow on laminaribiose and curdlan. Moreover, the promiscuous activity of Bgl3C against β(1→3) and β(1→4) linkages also makes it a significant contributor to MLG utilization. Finally, Bgl3C also plays a role in β(1→2)-glucoside degradation by C. japonicus, as it is the only GH3 member able to compensate for the loss of Bgl3A in sophorose utilization, and was also sufficient to support growth when expressed in E. coli.

6.4.4 Bgl3D is the crucial β-glucosidase for XyG utilization

The present model for XyG utilization by C.
japonicus involves a xyloglucan utilization locus (XyGUL) (Figure 6.6) that encodes three periplasmic, side-chain-cleaving GHs (a GH31 α-xylosidase, a GH35 β-galactosidase, and a GH95 α-L-fucosidase) and a predicted outer-membrane TonB-dependent transporter (TBDT) for periplasmic uptake and saccharification of XyGOs produced by extracellular endo-xyloglucanases [33, 35, 145, 219]. Despite extensive efforts illuminating the concerted action of these players, the identity of the β-glucosidase(s) necessary for the complete deconstruction of the β(1→4)-linked XyGO backbone was heretofore unknown. Enzyme kinetic data (Table 6.1) distinctly identified Bgl3D as the only GH3 member with high catalytic efficiency toward the XyGOs GXXG and GLLG, which are the products of the XyGUL-encoded periplasmic α-xylosidase, Xyl31A. Notably, Xyl31A is highly specific for the non-reducing terminal α(1→6)-Xylp residue of XyGOs and is unable to remove internal xylosyl sidechains from the backbone [33, 145, 166]. Thus, β(1→4)-glucosidase activity is essential for continued degradation of XyGOs in the C. japonicus periplasm.

Figure 6.6. Updated model of xyloglucan utilization by C. japonicus. XyG hydrolysis is initiated outside of the cell by endo-xyloglucanases, followed by XyGO transport into the periplasm by a TonB-dependent transporter (TBDT). The C. japonicus α-L-fucosidase Afc95A removes terminal fucosyl residues, enabling full access of the β-galactosidase Bgl35A to both pendant galactosyl residues [33]. Activity of the α-xylosidase Xyl31A is restricted to terminal non-reducing-end xylosyl residues [33], such that cycling between the α-xylosidase and the primary XyGO-specific β-glucosidase Bgl3D is required for complete saccharification to monosaccharides for primary metabolism (see Figure 6.S3).

In vivo, previous RNAseq data and mutational analysis indicated that bgl3D was not involved in cellobiose utilization [281].
Here, analysis of the single and multiple GH3 mutants directly implicated Bgl3D as the primary β-glucosidase responsible for growth on XyG and XyGOs, which could be assisted to a very limited extent by the predominant β(1→4)-glucosidase Bgl3B (Figure 6.5). Additional recombinant enzyme product analysis in vitro has also provided direct evidence that Bgl3D can fulfill this role (Table 6.1). Collectively, detailed biochemical and physiological data were both essential to resolve the key outstanding question regarding the identity of the β(1→4)-glucosidase required for XyG utilization. The sequestration of oligosaccharides by transport into the cell is a strategy used by many environmental bacteria [273, 309]. The co-localization of Bgl3D with the sidechain-cleaving exo-glycosidases of the XyGUL in the periplasm of C. japonicus gives evidence of a unified strategy for competitive oligosaccharide acquisition and utilization by this saprophytic bacterium. This elegant system is analogous to that used by the human gut symbiont Bacteroides ovatus, in which two periplasmic GH3s operate in concert with an α-xylosidase, a β-galactosidase, and two α-L-arabinofuranosidases to saccharify dietary (arabinogalacto)xyloglucans [34, 139].

6.5 Conclusions

Using a synthesis of biochemical and physiological approaches, we have determined that the four GH3 members of Cellvibrio japonicus play unique roles in targeting different glucosidic linkages in diverse polysaccharides. Our work further illuminates the mechanism by which this model saprophyte, which has served as a treasure trove for CAZyme discovery for decades [155, 158], utilizes the ubiquitous plant cell wall matrix glycans MLG, XyG, and callose. As such, the use of true systems biology approaches is proving essential to disentangle apparent redundancy in microbial genomes encoding multiple members of single CAZyme families. The current study not only sheds light on the metabolic capacity of C.
japonicus, but also expands the accessible CAZyme repertoire for future deployment in biotechnological applications. For example, in light of recent demonstrations [98] that α-xylosidase addition to enzyme cocktails can improve ultimate glucose release, possibly through addressing tightly-bound XyG in cellulose [27], we anticipate that additional combinations with a XyG-specific β-glucosidase (e.g. Bgl3D) may further improve saccharification. Not least, our functional analysis provides a gold-standard reference for informed bioinformatics on the genomes of other Cellvibrio and related species [310].

6.6 Supporting information

6.6.1 Supporting tables

Table 6.S1A. Growth statistics of E. coli GH3 heterologous expression strains grown in a defined glucose medium (corresponding to Figure 6.2A)ᵃ

Strain                    Growth Rate (gen hr⁻¹)   Lag Time (hrs)   Max OD600
Empty Vector Controlᵇᶜ    0.24±0.001               5                1.03±0.01

ᵃ Experiments were performed in biological triplicate
ᵇ Time points used to calculate growth rate were Ti=6 and Tf=12
ᶜ All heterologous expression strains grew as the empty vector control

Table 6.S1B. Growth statistics of E. coli GH3 heterologous expression strains grown in a defined sophorose medium (corresponding to Figure 6.2B)ᵃ

Strain                    Growth Rate (gen hr⁻¹)   Lag Time (hrs)   Max OD600
Empty Vector Control      NDᵇ                      ND               0.13±0.003
K12pBBRMCS-5/bgl3Aᶜ       0.17±0.003               21               0.95±0.02
K12pBBRMCS-5/bgl3B        ND                       ND               0.14±0.003
K12pBBRMCS-5/bgl3Cᵈ       0.13±0.02                33               0.80±0.02
K12pBBRMCS-5/bgl3Dᵉ       0.13±0.02                27               0.92±0.10

ᵃ Experiments were performed in biological triplicate
ᵇ Not determined due to lack of growth
ᶜ Time points used to calculate growth rate were Ti=22 and Tf=28
ᵈ Time points used to calculate growth rate were Ti=30 and Tf=40
ᵉ Time points used to calculate growth rate were Ti=28 and Tf=34

Table 6.S1C. Growth statistics of E.
coli GH3 heterologous expression strains grown in a defined gentiobiose medium (corresponding to Figure 6.2C)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Empty Vector Control NDb ND 0.12±0.004 K12pBBRMCS-5/bgl3Ac 0.06±0.002 10 0.70±0.01 K12pBBRMCS-5/bgl3B ND ND 0.14±0.004 K12pBBRMCS-5/bgl3Cd 0.08±0.003 8 0.72±0.02 K12pBBRMCS-5/bgl3De 0.07±0.0.001 8 0.77±0.01  a Experiments were performed in biological triplicate  b Not Determined due to lack of growth c Time points used to calculate growth rate were Ti=22 and Tf=28 d Time points used to calculate growth rate were Ti=30 and Tf=40 175  e Time points used to calculate growth rate were Ti=28 and Tf=34 Table 6.S1D. Growth statistics of E. coli GH3 heterologous expression strains grown in a defined laminaribiose medium (corresponding to Figure 6.2D)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Empty Vector Controlb 0.07±0.002 22 0.61±0.002 K12pBBRMCS-5/bgl3Ac 0.23±0.01 5 0.71±0.01 K12pBBRMCS-5/bgl3Bd 0.09±0.0002 15 0.86±0.02 K12pBBRMCS-5/bgl3Cc 0.12±0.01 5 0.68±0.06 K12pBBRMCS-5/bgl3Dc 0.22±0.004 5 0.71±0.004  a Experiments were performed in biological triplicate  b Time points used to calculate growth rate were Ti=32 and Tf=44 c Time points used to calculate growth rate were Ti=6 and Tf=12 d Time points used to calculate growth rate were Ti=16 and Tf=22  Table 6.S1E. Growth statistics of C. 
japonicus GH3 mutants grown in a defined sophorose medium(corresponding to Figure 6.3)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Wild Typeb 0.29±0.01 4 1.24±0.16 Δbgl3Ac 0.19±0.005 8 1.05±0.003 Δbgl3Bd 0.24±0.004 5 1.13±0.003 Δbgl3Cb 0.28±0.003 4 1.11±0.01 Δbgl3Db 0.30±0.004 4 1.13±0.003 Wild Typee 0.28±0.003 3 1.41±0.01 Δbgl3A Δbgl3Bf 0.24±0.003 9 1.18±0.01 Δbgl3A Δbgl3C NDg ND 0.14±0.01 Δbgl3A Δbgl3Dh 0.27±0.01 9 1.31±0.01 Δbgl3B Δbgl3Cb 0.21±0.01 4 1.31±0.02 Δbgl3B Δbgl3Di 0.22±0.01 4 1.31±0.01 Δbgl3C Δbgl3j 0.26±0.003 3 1.34±0.01 Wild Typek 0.20±0.02 3 1.20±0.02 Δbgl3A Δbgl3B Δbgl3C ND ND 0.13±0.01 Δbgl3A Δbgl3B Δbgl3D ND ND 0.10±0.01 Δbgl3A Δbgl3C Δbgl3D ND ND 0.10±0.01 Δbgl3B Δbgl3C Δbgl3Db 0.15±0.02 4 1.19±0.02 4G ND ND 0.10±0.004  a Experiments were performed in biological triplicate  b Time points used to calculate growth rate were Ti=5 and Tf=10 c Time points used to calculate growth rate were Ti=9 and Tf=19 d Time points used to calculate growth rate were Ti=6 and Tf=14 e Time points used to calculate growth rate were Ti=4 and Tf=10 176  f Time points used to calculate growth rate were Ti=10 and Tf=17 g Not determined due to lack of growth h Time points used to calculate growth rate were Ti=10 and Tf=14 i Time points used to calculate growth rate were Ti=5 and Tf=19 j Time points used to calculate growth rate were Ti=4 and Tf=9 k Time points used to calculate growth rate were Ti=4 and Tf=8  Table 6.S1F. Growth statistics of C. 
japonicus GH3 mutants grown in a defined laminaribiose medium (corresponding to Figure 6.4A-C)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Wild Typeb 0.29±0.01 3 1.14±0.02 Δbgl3Ac 0.21±0.002 5 1.05±0.01 Δbgl3Bd 0.40±0.02 5 1.14±0.01 Δbgl3Ce 0.34±0.01 3 1.19±0.03 Δbgl3Df 0.30±0.02 3 1.08±0.01 Wild Typeg 0.32±0.04 4 1.19±0.03 Δbgl3A Δbgl3Bh 0.23±0.003 4 1.08±0.01 Δbgl3A Δbgl3Ci 0.16±0.002 9 0.91±0.001 Δbgl3A Δbgl3Dj 0.24±0.03 6 1.04±-.01 Δbgl3B Δbgl3Ck 0.29±0.01 3 1.17±0.04 Δbgl3B Δbgl3Dk 0.27±0.02 3 1.14±0.01 Δbgl3C Δbgl3Dl 0.32±0.02 4 1.14±0.01 Wild Typeb 0.29±0.01 3 1.14±0.02 Δbgl3A Δbgl3B Δbgl3Cl 0.09±0.01 10 0.54±0.06 Δbgl3A Δbgl3B Δbgl3Dm 0.20±0.01 6 1.05±0.04 Δbgl3A Δbgl3C Δbgl3Dn 0.09±0.01 11 0.32±0.02 Δbgl3B Δbgl3C Δbgl3Do 0.33±0.11 5 1.05±0.23 4G NDp ND 0.10±0.001  a Experiments were performed in biological triplicate  b Time points used to calculate growth rate were Ti=4 and Tf=8 c Time points used to calculate growth rate were Ti=6 and Tf=12 d Time points used to calculate growth rate were Ti=6 and Tf=10 e Time points used to calculate growth rate were Ti=4 and Tf=7 f Time points used to calculate growth rate were Ti=4 and Tf=9 g Time points used to calculate growth rate were Ti=5 and Tf=10 h Time points used to calculate growth rate were Ti=5 and Tf=13 i Time points used to calculate growth rate were Ti=10 and Tf=17 j Time points used to calculate growth rate were Ti=7 and Tf=12 k Time points used to calculate growth rate were Ti=4 and Tf=11 l Time points used to calculate growth rate were Ti=11 and Tf=21 m Time points used to calculate growth rate were Ti=7 and Tf=16 n Time points used to calculate growth rate were Ti=12 and Tf=24 o Time points used to calculate growth rate were Ti=6 and Tf=9 p Not determined due to lack of growth  177  Table 6.S1G. Growth statistics of C. 
japonicus GH3 mutants grown in a defined curdlan medium (corresponding to Figure 6.4D-F)ab   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Wild Typec 0.14±0.04 3 0.19±0.002 Δbgl3Ad 0.13±0.01 3 0.18±0.003 Δbgl3Be 0.20±0.03 4 0.19±0.01 Δbgl3Cf 0.12±0.02 4 0.17±0.002 Δbgl3Df 0.16±0.01 4 0.19±0.002 Wild Typed 0.10±0.005 3 0.17±0.001 Δbgl3A Δbgl3Bd 0.10±0.01 3 0.17±0.003 Δbgl3A Δbgl3Cg 0.06±0.01 7 0.18±0.02 Δbgl3A Δbgl3Dd 0.11±0.01 3 0.18±0.001 Δbgl3B Δbgl3Cd 0.11±0.01 3 0.16±0.01 Δbgl3B Δbgl3De 0.17±0.03 4 0.18±0.01 Δbgl3C Δbgl3Dd 0.10±0.01 3 0.16±0.004 Wild Typec 0.14±0.04 3 0.19±0.002 Δbgl3A Δbgl3B Δbgl3Ch 0.05±0.004 8 0.16±0.005 Δbgl3A Δbgl3B Δbgl3De 0.20±0.001 4 0.19±0.01 Δbgl3A Δbgl3C Δbgl3Di 0.03±0.01 13 0.13±0.01 Δbgl3B Δbgl3C Δbgl3Df 0.15±0.01 4 0.18±0.01 4Gj 0.02±0.002 11 0.13±0.005  a Experiments were performed in biological triplicate  b Time points were taken every 15 minutes c Time points used to calculate growth rate were Ti=4 and Tf=7 d Time points used to calculate growth rate were Ti=4 and Tf=6 e Time points used to calculate growth rate were Ti=5 and Tf=6 f Time points used to calculate growth rate were Ti=5 and Tf=7 g Time points used to calculate growth rate were Ti=8 and Tf=13 h Time points used to calculate growth rate were Ti=9 and Tf=13 i Time points used to calculate growth rate were Ti=14 and Tf=19 j Time points used to calculate growth rate were Ti=12 and Tf=22    178  Table 6.S1H. Growth statistics of C. 
japonicus GH3 mutants grown in a defined mixed linkage glucan medium (corresponding to Figure 6.4G-I)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Wild Typeb 0.31±0.03 6 1.00±0.05 Δbgl3Ac 0.21±0.02 4 1.01±0.04 Δbgl3Bd 0.33±0.01 5 1.04±0.06 Δbgl3Ce 0.25±0.02 5 0.99±0.08 Δbgl3Df 0.26±0.04 4 0.97±0.07 Wild Typeg 0.36±0.02 6 1.02±0.01 Δbgl3A Δbgl3Bh 0.18±0.02 9 0.96±0.02 Δbgl3A Δbgl3Ci 0.23±0.02 9 1.05±0.001 Δbgl3A Δbgl3Dj 0.23±0.02 7 1.03±0.03 Δbgl3B Δbgl3Cd 0.29±0.02 5 0.94±0.01 Δbgl3B Δbgl3Dk 0.31±0.06 7 0.96.0.01 Δbgl3C Δbgl3Dc 0.23±0.01 4 1.05±0.11 Wild Typeb 0.31±0.03 6 1.00±0.05 Δbgl3A Δbgl3B Δbgl3Cl 0.17±0.02 14 0.58±0.04 Δbgl3A Δbgl3B Δbgl3Dm 0.20±0.05 11 1.01±0.03 Δbgl3A Δbgl3C Δbgl3Dn 0.21±0.01 8 0.92±0.03 Δbgl3B Δbgl3C Δbgl3Do 0.39±0.08 5 1.03±0.04 4Gp 0.10±0.02 15 0.34±0.02  a Experiments were performed in biological triplicate  b Time points used to calculate growth rate were Ti=7 and Tf=10 c Time points used to calculate growth rate were Ti=5 and Tf=13 d Time points used to calculate growth rate were Ti=6 and Tf=11 e Time points used to calculate growth rate were Ti=6 and Tf=12 f Time points used to calculate growth rate were Ti=5 and Tf=9 g Time points used to calculate growth rate were Ti=7 and Tf=11 h Time points used to calculate growth rate were Ti=10 and Tf=15 i Time points used to calculate growth rate were Ti=10 and Tf=14 j Time points used to calculate growth rate were Ti=8 and Tf=13 k Time points used to calculate growth rate were Ti=6 and Tf=11 l Time points used to calculate growth rate were Ti=8 and Tf=11 m Time points used to calculate growth rate were Ti=15 and Tf=20 n Time points used to calculate growth rate were Ti=12 and Tf=16 o Time points used to calculate growth rate were Ti=9 and Tf=12 p Time points used to calculate growth rate were Ti=6 and Tf=10    179  Table 6.S1I. Growth statistics of C. 
japonicus GH3 mutants grown in a defined xyloglucan medium (corresponding to Figure 6.5A-C)a   Strain Growth Rate (gen hr-1) Lag Time (hrs) Max OD600 Wild Typeb 0.30±0.04 3 1.03±0.15 Δbgl3Ab 0.27±0.05 3 1.01±0.14 Δbgl3Bc 0.34±0.02 3 1.00±0.08 Δbgl3Cd 0.36±0.01 3 1.26±0.07 Δbgl3De 0.17±0.003 3 1.12±0.03 Wild Typed 0.36±0.01 3 0.89±0.04 Δbgl3A Δbgl3Bb 0.27±0.02 3 0.91±0.003 Δbgl3A Δbgl3Cd 0.35±0.02 3 0.91±0.01 Δbgl3A Δbgl3Df 0.12±0.02 5 0.98±0.07 Δbgl3B Δbgl3Cb 0.29±0.03 3 0.5±0.03 Δbgl3B Δbgl3Dg 0.16±0.02 2 0.83±0.03 Δbgl3C Δbgl3Dh 0.13±0.03 6 0.98±0.06 Wild Typei 0.26±0.01 3 0.90±0.07 Δbgl3A Δbgl3B Δbgl3Cb 0.23±0.04 3 0.74±0.11 Δbgl3A Δbgl3B Δbgl3Dj 0.13±0.01 2 0.58±0.004 Δbgl3A Δbgl3C Δbgl3Dk 0.13±0.01 6 0.97±0.06 Δbgl3B Δbgl3C Δbgl3Dg 0.14±0.003 2 0.63±0.05 4Gl 0.18±0.02 5 0.86±0.002  a Experiments were performed in biological triplicate  b Time points used to calculate growth rate were Ti=4 and Tf=9 c Time points used to calculate growth rate were Ti=4 and Tf=7 d Time points used to calculate growth rate were Ti=4 and Tf=8 e Time points used to calculate growth rate were Ti=4 and Tf=13 f Time points used to calculate growth rate were Ti=6 and Tf=16 g Time points used to calculate growth rate were Ti=3 and Tf=9 h Time points used to calculate growth rate were Ti=7 and Tf=13 i Time points used to calculate growth rate were Ti=4 and Tf=10 j Time points used to calculate growth rate were Ti=3 and Tf=10 k Time points used to calculate growth rate were Ti=7 and Tf=15 l Time points used to calculate growth rate were Ti=6 and Tf=10    180  Table 6.S1J. Growth statistics of C. 
japonicus GH3 mutants grown in a defined xyloglucan oligosaccharide medium (corresponding to Figure 6.5D)^a

Strain                    Growth Rate (gen hr⁻¹)  Lag Time (hrs)  Max OD600
Wild Type^b               0.26±0.01               8               1.04±0.01
Δbgl3A^c                  0.27±0.01               8               1.02±0.01
Δbgl3B^d                  0.21±0.004              8               1.00±0.01
Δbgl3C^c                  0.16±0.01               8               1.03±0.01
Δbgl3D^e                  0.23±0.01               10              0.76±0.01
Δbgl3B Δbgl3D^f           0.14±0.004              8               0.71±0.01
4G^g                      0.15±0.01               8               0.67±0.01

a Experiments were performed in biological triplicate
Time points used to calculate growth rate (Ti, Tf): b 9, 13; c 9, 14; d 9, 15; e 11, 14; f 9, 18; g 9, 17

Table 6.S1K. Growth statistics of C. japonicus GH3 mutants grown in a defined gentiobiose medium (corresponding to Figure 6.S4)^a

Strain                    Growth Rate (gen hr⁻¹)  Lag Time (hrs)  Max OD600
Wild Type^b,c             0.36±0.004              3               1.16±0.004

a Experiments were performed in biological triplicate
b All mutant strains grew as wild type
c Time points used to calculate growth rate were Ti=6 and Tf=10

Table 6.S1L. Growth statistics of C. japonicus mutants grown in a defined xyloglucan medium (corresponding to Figure 6.S6)^a

Strain                    Growth Rate (gen hr⁻¹)  Lag Time (hrs)  Max OD600
Wild Type^b               0.24±0.01               2               1.05±0.04
ΔxylA^b                   0.25±0.04               2               0.96±0.06
Δ4G^c                     0.15±0.01               3               0.99±0.06
ΔxylA4G^d                 0.09±0.01               2               0.39±0.01
Δxyl31A                   ND^e                    ND              0.15±0.02

a Experiments were performed in biological triplicate
e Not determined due to lack of growth
Time points used to calculate growth rate (Ti, Tf): b 3, 9; c 4, 12; d 3, 11

Table 6.S2. Strains, plasmids, and primers used in this study

Strain, plasmid, or primer         Genotype or Sequence                        Source or Reference

Strains
E. coli DH5α                       λ- Φ80dlacZΔM15 Δ(lacZYA-argF)U169 recA1 endA1 hsdR17(rk-mk-) supE44 thi-1 gyrA relA1  Laboratory collection
E. coli S17 λpir                   Tp^r Sm^r recA thi pro hsdR hsdM+ RP4-2-TC::Mu::Km Tn7 λpir  Laboratory collection
E. coli K12                                                                    Laboratory collection
E. coli K12 / pBBRMCS-5            Gm^r                                        [281]
E. coli K12 / pBBRMCS-5-bgl3A      bgl3A+; Gm^r                                [281]
E. coli K12 / pBBRMCS-5-bgl3B      bgl3B+; Gm^r                                [281]
E. coli K12 / pBBRMCS-5-bgl3C      bgl3C+; Gm^r                                [281]
E. coli K12 / pBBRMCS-5-bgl3D      bgl3D+; Gm^r                                [281]
C. japonicus Ueda 107              Wild Type                                   Laboratory collection
C. japonicus Δbgl3A                Ueda 107 Δbgl3A^a                           [281]
C. japonicus Δbgl3B                Ueda 107 Δbgl3B^b                           [281]
C. japonicus Δbgl3C                Ueda 107 Δbgl3C^c                           [281]
C. japonicus Δbgl3D                Ueda 107 Δbgl3D^d                           [281]
C. japonicus Δbgl3A Δbgl3B         Ueda 107 Δbgl3A Δbgl3B                      [281]
C. japonicus Δbgl3A Δbgl3C         Ueda 107 Δbgl3A Δbgl3C                      [281]
C. japonicus Δbgl3A Δbgl3D         Ueda 107 Δbgl3A Δbgl3D                      [281]
C. japonicus Δbgl3B Δbgl3C         Ueda 107 Δbgl3B Δbgl3C                      [281]
C. japonicus Δbgl3B Δbgl3D         Ueda 107 Δbgl3B Δbgl3D                      [281]
C. japonicus Δbgl3C Δbgl3D         Ueda 107 Δbgl3C Δbgl3D                      [281]
C. japonicus Δbgl3A Δbgl3B Δbgl3C  Ueda 107 Δbgl3A Δbgl3B Δbgl3C               [281]
C. japonicus Δbgl3A Δbgl3B Δbgl3D  Ueda 107 Δbgl3A Δbgl3B Δbgl3D               [281]
C. japonicus Δbgl3A Δbgl3C Δbgl3D  Ueda 107 Δbgl3A Δbgl3C Δbgl3D               [281]
C. japonicus Δbgl3B Δbgl3C Δbgl3D  Ueda 107 Δbgl3B Δbgl3C Δbgl3D               [281]
C. japonicus 4G                    Ueda 107 Δbgl3A Δbgl3B Δbgl3C Δbgl3D        [281]
C. japonicus ΔxylA                 Ueda 107 ΔxylA^e                            [302]
C. japonicus 4G ΔxylA              Ueda 107 Δbgl3A Δbgl3B Δbgl3C Δbgl3D ΔxylA  This study
C. japonicus Δxyl31A               Ueda 107 Δxyl31A^f                          [33]

Plasmids
pRK2013                            ColE1 RK2-Mob+ RK2-Tra+; Km^r               [311]
pK18mobsacB                        pMB1 ori mob+ sacB+; Km^r                   [312]
pK18xylA                           Contains 500 bp upstream and downstream of xylA cloned into pK18mobsacB; Km^r  [302]

Primers
xylA CONF (5′)                     AGGTTTGTTCCATC                              [302]
xylA CONF (3′)                     GAACTTGAAASCTGCCTG                          [302]
xylA INT (5′)                      GAATTCGCATCGGCAAAA                          [302]
xylA INT (3′)                      TCTAGAACCGCCCCAGAA                          [302]

a Gene locus CJA_0204
b Gene locus CJA_1497
c Gene locus CJA_0223
d Gene locus CJA_1140
e Gene locus CJA_3061
f Gene locus CJA_2706

6.6.2 Supporting figures

                10        20        30        40        50        60        70        80
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  MKDDFPQRLLRVSAASLLVLAALAGCDSRAPSTTDSTTTVSADTVTTQATEKVEWPVLNSAIKKDPAVEARVDDLLARMT
CjGH3B  -----------------------------------------------------VWPKVTSKVKKDPILEAKIDQLMARMS
CjGH3C  -----MHLSCKTLMCSLAVLVALGGCSK----SSDEAPTPTAGEPVTETSGISLWPEVQSRIAKDPAIEAKVAELLAQMS
CjGH3D  -------------MKKRHPLATFG-------LIAAALLTGSVLVQAAPDTNIKLWPKPHSPIQDSAEFTARVDAILQKMT

                90       100       110       120       130       140       150       160
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  LEEKIGQLVQPEIRHVTPEDIKQYHVGSVLNGGGSTPGANKYASLEDWVKLADSFYYASVDKSDGRIGIPVIWGTDAVHG
CjGH3B  LEEKIGQMMQPEIRHLTPEDVKQYHVGSVLNGGGSVPNSNRYSKAADWLAMADAFYAASMDESDGKVAIPIMWGTDAVHG
CjGH3C  PEQKVGQLIQPELRQITPEEVTRYSVGSILNGGGSFPAENKYAKVEDWLALADSFYQASMSTEGGRVAIPVIWGTDAVHG
CjGH3D  LEEKVGQIMQAEIQTVTPEDVKKYHLGSVLNGGGSMPNRIENAKPKDWVEFYDALYDASMDTSDGGQAVPILWGTDAVHG

               170       180       190       200       210       220       230       240
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  LGNVIGATLFPHNIGLGATNNPELLKQIGWATAREIAATGLDWDFSPTVAVARDDRWGRTYESWSEDPQIVHAFAGKMVE
CjGH3B  VGNIVGATLFPHNIGLGATQNPELIKEIGKVTATEIAVTGLDWDFSPTVAVARDDRWGRTYESYSEDPAIVRLYAAEMVA
CjGH3C  HNNVIGATLFPHNIALGAMRNPELIRQIGAATAAEVAVTGIDWTFAPTLAVARDDRWGRTYESYAEDPEIVKAYGGMMVE
CjGH3D  HNNLTGATLFPHNIGLGATHNAELIRRIGAATAKEVRSTGIEWVFAPTLAVAQNDRWGRTYESYAEDPKVVATLATAMVE

               250       260       270       280       290       300       310       320
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  GLQGTGGSDRLFTHEHVIATAKHFIGDGGTLNGVDRGETQGDEKVLRDIHGAGYFSAIESGVQVVMASFTSWEGTRMHGH
CjGH3B  GLQGDADTASFLSITQVVATAKHFLGDGGTLNGIDRGDCSASESELLEIHAAGYYSAIEAGVQTVMASFNSWHGQHMHGH
CjGH3C  GLQGIPGTAELFDGTRVVATAKHFLADGGTEGGIDRGDAVISEADLVAIHNPGYLTALASGAQTVMASFSSWQGVKMHGH
CjGH3D  GLQGKVNTREFLTENHVIATAKHFLADGGTEAGDDQGNARINEKELIKIHNAGYVPAIEAGVQTIMASFSEWNGQKVHGS

               330       340       350       360       370       380       390       400
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  KYLLTDVLKDRMGFDGLVVGDWSGHSFIPGCTALNCPQSLMAGLDIYMVPEPDWKELYKNLLAQAKTGELPMARVDDAVR
CjGH3B  RYLLTDVLKEQMGFDGFIVGDWNGHGFVEGASVLNCPQAINAGLDMFMVPDPEWKTLYQNTLDQVRDGIIPLARVDDAVR
CjGH3C  TYLLTDALKKRMGFDGFVVGDWNGHAFVPGCTTTSCPQAINAGLDMFMAPDPNWKELYENTLAQVKSGAISQARLDDAVG
CjGH3D  HYLLTEVLKNRMGFDGFVVGDWNGHGQVPGCTNDSCAQAINAGIDLVMVT-YDWKDMITNTLAQVKSGEISQARLDDAVR

               410       420       430       440       450       460       470       480
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  RILRVKIRAGLFEKGAPSTRPLAGKKDVLGAPEHREVARQAVRESLVLLKNKNNLLPLARQQTVLVTGDGADNIGKQSGG
CjGH3B  RILRVKLRADLWGKGLPSSRPLAGRDELLGAAAHRAIARQAVRESLVMLKNKNNLLPLSPKSRVLVAGDGADNISKQTGG
CjGH3C  RILRVKLRAGLFEAGLPSTRPLAGQQALLGSAEHRAVARQAVRESLVLLKNNGSVLPANPAGKILVTGDGADNIGKQSGG
CjGH3D  RILRVKMRAGLWEK-KPSARANAADLAVVGSAEHRAIARQAVRESLVLLKNANKVLPINPRQTVLVAGDAADHIGKQAGG

               490       500       510       520       530       540       550       560
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  WSVSWQGTGNTNAD--FPGATSIYAGINAVVEQAGGKTLLSDDGSFSEKPDVAIVVFGEDPYAEMQGDVGNMAYKPRDTS
CjGH3B  WSVNWQGTGNTMED--FPGATTLWMGIKAAVTAAGGDAELSPDGTYSSRPDVALVIFGEDPYAEMQGDIQHQLLKSGDTA
CjGH3C  WTITWQGTGNVNSD--FPGATSIYQGIATAVNAAGGHVELSSDGSYQQKPDLAFVVFGENPYAEMQGDVNSLLYQ--NEQ
CjGH3D  WSVWWQGVADASENYRFPGATSIYAGIKQAVEHHGGKVVLSVDGSFTQKPDVAVVVFGENPYAEGSGDRATLEFEPAKKK

               570       580       590       600       610       620       630       640
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  DWELLKKLRSQGIPVVSLFISGRPLWVNREINASDAFVAVWLPGTEGQGIADVIFRNAQGEINYDVKGRLSFSWPKRPEQ
CjGH3B  DLDLLRRLKADGIPVVALFITGRPMWVNRELNAADAFVVIWQPGTEGAGVADVLFARAEGGVNYPMGGRLTFSWPKRPDQ
CjGH3C  DLALLKKLRAEGIKVVALFITGRPLWANSFINASDAFVVVWQPGTEANGIADVVLANADGSVNHDFKGQLSFSWPADPGQ
CjGH3D  SLALLKTLKAQGIPVVSVFISGRPLWVNPELNASDAFVAAWLPGSEGAGVADVVIAGADGKPRYDFTGRLSFSWPKSPLQ

               650       660       670       680       690       700       710       720
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  TPLNRGDANYDPLFPYGYGLSYGDKDTLGDNLSE-EGIQLAEALDVLDLFNRRPIEPWQLEIIGYQNDRVPMASSTVT-A
CjGH3B  GPLNVHDTNYDPLFPYGYGLRYGDKDILGDTLSE-EGISMPQSTRVLELFNRRPMGNYVIALEGNRNDRQLMNGNLAK-A
CjGH3C  SPLNVGQADYQPQFAYGYGLRYRAHKELANLSET-IKAPAATTSERLAIFNQRPQAPWQLVLRDHLNNSRTVTTSKDE-V
CjGH3D  DVLNPHHKGYQPLFKLGYGLHYKSGKAGPEKLPENMPGVASDKPQDIELYVRRPLEPWHIFIENYERQQILSGAFAALPK

               730       740       750       760       770       780       790       800
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  SSLKIQAVDRNEQEDARRVQWNGSGPGQVALSVGDRQDFIGYVKSDSALVFDIKVNAAPTVTTYLRLGCG-SYCASDIDL
CjGH3B  STLTVTVVDRDVQEDARRAVWNGEGEGLVALSTPNRQVLSDYYESDAALFFDIKVDQAPEQQAFVRIGCG-PSCHSDVDV
CjGH3C  STLSVSAVDNKVQEDARRAVWNGTGKGTLSFASIQRSDLSAYAANKAALVVDFKLNQAPASAVTIGMSCG-TDCETDIDV
CjGH3D  GDVKAITSDKDVQEDALTFTWKDTWRAGLTLEGGEPLDLTAHVKTG-ALSLDINIIELAKGGVSFKLECQRDGCERLVPY

               810       820       830       840       850       860       870       880
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  TEKLKGFAGQDWQTVTVPLHCYPNSGANFGVTQPPEEFWTQVLQPFSLLTSGTLDVTFAQVRVVKGAGKDVACP------
CjGH3B  TELLRSLEGKGWASIRVDLACYPEVETNFGLRRLPHELFALILEPFSLVANGKMDISFSRVYIEKNRAQHGTFGVVG---
CjGH3C  TAALKAATPDAWQTLAIPLSCYSNARIKM----------DMVVAPFVIGTDGALDITLYNLRIEQ-ADQTITCPE-----
CjGH3D  TLKAREMLGKGWHKVIVPLSCFVHEGDDFS----------AVTMPFALETGGAGQVEVANMQLLLNTPKDASLLSCPDYK

               890       900       910       920       930       940       950       960
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  --------------------------------------------------------------------------------
CjGH3B  --------------------------------------------------------------------------------
CjGH3C  --------------------------------------------------------------------------------
CjGH3D  TQSVTPDMLNEWWALEWWLPRHEQKLKDKQAILDKKGQVDLLFIGDSITQGWEKEGAEVWKKYYAKRNAFNLGFGGDRTE

               970       980       990      1000      1010      1020      1030      1040
        ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
CjGH3A  --------------------------------------------------------------------------------
CjGH3B  --------------------------------------------------------------------------------
CjGH3C  --------------------------------------------------------------------------------
CjGH3D  NVLWRLRHGAVDGLDPKLVVLMIGTNNTGHRKENPAGIAAGIQLLLSEIQQRLPNSRVLLLAIFPRDANADGQLRQNNEK

              1050      1060      1070      1080      1090      1100
        ....|....|....|....|....|....|....|....|....|....|....|....|..
CjGH3A  --------------------------------------------------------------
CjGH3B  --------------------------------------------------------------
CjGH3C  --------------------------------------------------------------
CjGH3D  TNALIAGLADKRKVFFRNINAQFLAKDGELPVDIMPDLLHPNEKGYAIWAKAIEQDIQSLMK

Figure 6.S1. Amino acid sequence alignment of the CjGH3 enzymes. Signal peptide cleavage sites are indicated by red arrows. The cleavable signal peptide is removed by the action of signal peptidase II in the case of Bgl3A and Bgl3C, and signal peptidase I in the case of Bgl3D.

Figure 6.S2. Kinetic parameters of GH3 enzymes against different gluco-disaccharides and xyloglucan-based oligosaccharides. (A) Bgl3A. (B) Bgl3B. (C) Bgl3C. (D) Bgl3D. These data are summarized in Table 6.1.

Figure 6.S3. Sequential digestion of GXXG by CjGH3D and CjXyl31A with a heat inactivation step between each treatment. (A) HPAEC-PAD analysis of GXXG and the product of each enzymatic step. (B) MALDI-TOF analysis of the same sample: GXXG ([M+Na]+ calculated: 953.8, observed: 953.29), XXG ([M+Na]+ calculated: 791.65, observed: 791.37), GXG ([M+Na]+ calculated: 659.54, observed: 659.18), and XG ([M+Na]+ calculated: 497.4, observed: 497.25).

Figure 6.S5. Growth analysis of GH3 mutants on gentiobiose. C.
japonicus wild type and GH3 single (A), double (B), and triple and quadruple (C) mutants were grown in defined media with 0.5% w/v gentiobiose as the sole carbon source for 24 hours at 30 °C with a high level of aeration. Growth was monitored as optical density (OD) at 600 nm. Experiments were performed in biological triplicate, with the error bars representing the standard deviation. Growth rates and maximum OD can be found in Table 6.S1.

Figure 6.S6. Importance of xylose consumption during xyloglucan degradation. C. japonicus wild type, the 4G quadruple, xylA, xyl31A, and the 4G xylA quintuple mutants were grown with 0.5% w/v glucose (A), 0.5% w/v xylose (B), or 0.5% w/v xyloglucan (C) for 24 hours at 30 °C with a high level of aeration. Growth was monitored as optical density (OD) at 600 nm. Experiments were performed in biological triplicate, with the error bars representing the standard deviation. Growth rates and maximum OD can be found in Table 6.S1.

Chapter 7: Conclusions

Plant biomass is a highly useful and clean source of renewable energy as well as of valuable consumer products. However, the recalcitrance of plant biomass to hydrolysis, which arises from its tremendous structural complexity, poses a major challenge that must be overcome before it can be incorporated into industrial applications. Environmental microbes have been a fertile source of inspiration given their capability to degrade and utilize plant biomass effectively. The soil saprophyte Cellvibrio japonicus has recently been studied extensively at the genomic and biochemical levels in order to decipher the molecular mechanisms by which it utilizes nearly all plant polysaccharides. I have been interested in the plant polysaccharide xyloglucan (XyG) given its high abundance in plant cell walls. This thesis addressed the XyG utilization pathway in C. japonicus in an attempt to unravel the first and last steps in the pathway. Prior to this research, it was well established that C. japonicus harnesses a combination of endo-xyloglucanases and exo-glycosidases in the deconstruction pathway. However, the identities of the endo-xyloglucanase(s) and the exo-glucosidase(s) involved in the first and last steps, respectively, remained unknown.

In Chapter 2, a specific multi-modular endo-xyloglucanase belonging to family GH74 was identified and extensively characterized at both the biochemical and structural levels. Interestingly, the GH74 catalytic module was found to be in train with two cellulose-binding modules belonging to the CBM10 and CBM2 families. This modular architecture can be rationalized by the complex nature of plant cell walls, in which cellulose is intimately associated with hemicellulose. Despite the high specificity of CjGH74 towards XyG, reverse genetics and transcriptomic analyses suggested the presence of additional candidate(s) that play a more fundamental role in the XyG saccharification pathway.

In Chapter 3, three specific GH5_4 endo-xyloglucanases (CjGH5D, CjGH5E and CjGH5F) with different modular compositions and subcellular localizations were identified and extensively characterized. One of the three candidates (CjGH5D) was found to be upregulated when the bacterium was grown on XyG, indicating a prominent contribution of that enzyme to the XyG utilization pathway in C. japonicus. The biochemical data were supported by the crystal structure of CjGH5D in complex with the inhibitor XXXG-NHCOCH2Br, which clearly revealed the structural features required for XyG binding and catalysis.

Looking into the modular architecture of the three CjGH5_4 endo-xyloglucanases, a unique domain of unknown function (X181) was found upstream of the CjGH5F catalytic module. Therefore, in Chapter 4, I attempted to elucidate the biological function of that domain in the context of plant polysaccharide degradation.
Interestingly, the X181 domain was found to bind specifically to galactose-containing plant polysaccharides such as XyG and galactomannans, suggesting that it represents a new CBM family. The appended XyG-binding module thus enables the XyG-active catalytic module to encounter its cognate substrate. Because the deletion mutant lacking the four endo-xyloglucanase genes in C. japonicus was still able to grow on XyG, I anticipated the presence of at least one additional enzyme with endo-xyloglucanase activity that is yet to be identified. Therefore, I expanded my analysis to include the GH9 family in the following chapter.

In Chapter 5, I overproduced and purified one enzyme (CjGH9B) out of the three GH9 candidates encoded by the C. japonicus genome. Our in-depth biochemical investigation clearly demonstrated the high catalytic efficiency of the enzyme towards mixed-linkage β-glucan (MLG). On the other hand, the enzyme displays slight endo-xyloglucanase activity, which might explain the lack of growth perturbations on XyG when the four specific endo-xyloglucanases CjGH74, CjGH5D, CjGH5E and CjGH5F were deleted from the genome. My data shed light on one of the challenges that can hinder reverse genetic analysis in such complex genomes: the presence of a large suite of enzymes that can exhibit non-specific side activities towards different substrates.

In Chapter 6, I attempted to elucidate the final step in the XyG saccharification pathway by identifying and characterizing the key XyG-specific β-glucosidase. The C. japonicus genome encodes four putative β-glucosidases belonging to the GH3 family (Bgl3A, Bgl3B, Bgl3C and Bgl3D). Our comprehensive analysis revealed that the four GH3 members are not functionally redundant. Instead, they uniquely target different glucosidic linkages in different polysaccharides.
Moreover, our analysis demonstrated that only the exo-glucosidase Bgl3D is able to accommodate the XyGO substrates and selectively cleave their β-1,4 glucosidic linkages. The identification of the XyG-specific exo-glucosidase that plays a fundamental role in XyG degradation in C. japonicus supplies the last missing piece of the XyG utilization pathway in this soil saprophyte. Therefore, the XyG utilization model was updated to include the endo-xyloglucanases and the exo-glucosidase catalyzing the first and last steps, respectively, in the pathway (Figure 7.1).

Figure 7.1. The updated xyloglucan utilization model in C. japonicus. XyG saccharification commences extracellularly via a suite of backbone-cleaving endo-xyloglucanases belonging to the GH74 and GH5 families. The resulting XyGO fragments are transported via a TonB-dependent transporter to the periplasmic space, where a consortium of exo-acting enzymes is localized, including the exo-fucosidase Afc95A, the exo-galactosidase Bgl35A, the exo-xylosidase Xyl31A, and the XyG-specific exo-glucosidase Bgl3D.

The work done in this thesis deciphers the molecular mechanisms of XyG utilization by the soil saprophyte C. japonicus and underpins the fundamental contribution of this Gram-negative bacterium to the global carbon cycle. Moreover, our work expands the repertoire of active CAZymes for future analytical and biotechnological applications. For instance, I have shown that XyG-active enzymes from C. japonicus can be effectively harnessed as a toolkit for producing substrates and inhibitors that facilitate the biochemical and structural investigation of xyloglucanases of interest. In this regard, I used a mixture of Glc4-based XyGOs as an analytical standard to elucidate the bond cleavage specificity of the different endo-xyloglucanase enzymes.
Furthermore, I successfully utilized XyG-based probes to solve the crystal structures of different endo-xyloglucanases and identify the structural determinants required for catalysis. I have also prepared XyGO substrates to screen for the XyG-specific β-1,4 exo-glucosidase activity employed in the last step of the XyG saccharification pathway in C. japonicus. The structural complexity and multiple branching patterns of XyG render synthetic strategies for the aforementioned substrates and probes impractical. In contrast, our biocatalysis-based approach, which takes advantage of the substrate and bond cleavage specificities of CAZymes, has proven effective when complex products must be obtained with high purity. Our current work, therefore, emphasizes the potential use of CAZymes as powerful tools to generate valuable products for enzyme discovery and characterization.

There is also widespread interest in CAZyme discovery for breaking down complex plant biomass into simple fermentable sugars, given the constantly decreasing reserves of fossil fuels. The ultimate goal is to develop and improve enzymatic cocktails that provide the maximum release of simple sugars from complex glycan matrices. Continuing advances in molecular biology, genomics, biochemistry, and structural biology will substantially accelerate worldwide progress in enzyme discovery and characterization and will provide more enzyme candidates for such industrial applications.

In future work, I will extend my study to include other polysaccharide utilization systems in C. japonicus. Notably, comprehensive studies from different research groups have been collectively targeting the cellulose, xyloglucan, xylan, mannan, arabinan, and pectin utilization machineries in this soil saprophyte (reviewed in [158]).
Hence, I will be aiming at illuminating the unexplored MLG utilization pathway in C. japonicus. Thus far, I have successfully identified and characterized the key mixed-linkage glucanase CjGH9B. However, it remains to be explored whether C. japonicus genome encodes other MLG-specific backbone cleaving enzymes. Moreover, biochemical analysis of the C. japonicus GH3 exo-glucosidases 193  using the MLGase limit-digest products as substrates will identify other fundamental enzymes in the pathway and complement our reverse genetic analysis.  Another interesting future aspect that might be investigated is the ability of C. japonicus to utilize strict β-1,3-glucan backbone-containing polysaccharides such as the algal laminarin and the bacterial curdlan. Growth experiments clearly suggested the presence of degrading systems that target such substrates. Moreover, biochemical and reverse genetic analyses identified the two key exo-glucosidase Bgl3A and Bgl3C that work synergistically on all (1→3)-glucan substrates. Therefore, my current analysis sets the stage for future investigation to identify the key β-1,3 endo-glucanase(s) catalyzing the initial degradation step of the pathway.                194  Bibliography 1.  Hook, M. & Tang, X. (2013) Depletion of fossil fuels and anthropogenic climate change-A review, Energy Policy. 52, 797-809. 2.  Capellan-Perez, I., Mediavilla, M., de Castro, C., Carpintero, O. & Miguel, L. J. (2014) Fossil fuel depletion and socio-economic scenarios: An integrated approach, Energy. 77, 641-666. 3.  Abas, N., Kalair, A. & Khan, N. (2015) Review of fossil fuels and future energy technologies, Futures. 69, 31-49. 4.  Sims, R. E. H., Mabee, W., Saddler, J. N. & Taylor, M. (2010) An overview of second generation biofuel technologies, Bioresource Technology. 101, 1570-1580. 5.  Ragauskas, A. J., Beckham, G. T., Biddy, M. J., Chandra, R., Chen, F., Davis, M. F., Davison, B. H., Dixon, R. A., Gilna, P., Keller, M., Langan, P., Naskar, A. 
K., Saddler, J. N., Tschaplinski, T. J., Tuskan, G. A. & Wyman, C. E. (2014) Lignin Valorization: Improving Lignin Processing in the Biorefinery, Science. 344, 709-719. 6.  Tuck, C. O., Perez, E., Horvath, I. T., Sheldon, R. A. & Poliakoff, M. (2012) Valorization of Biomass: Deriving More Value from Waste, Science. 337, 695-699. 7.  Perlack, R. D., Wright, L. L., Turhollow, A. F., Graham, R. L., Stokes, B. J. & Erbach, D. C. (2005) Biomass as feedstock for a bioenergy and bioproducts industry: the technical feasibility of a billion-ton annual supply. in Oak Ridge National Laboratory, Oak Ridge, TN (ORNL/TM-2005/66). 8.  Carroll, A. & Somerville, C. (2009) Cellulosic Biofuels, Annual Review of Plant Biology. 60, 165-182. 9.  Horn, S. J., Vaaje-Kolstad, G., Westereng, B. & Eijsink, V. G. H. (2012) Novel enzymes for the degradation of cellulose, Biotechnology for Biofuels. 5. 10.  Li, L. L., McCorkle, S. R., Monchy, S., Taghavi, S. & van der Lelie, D. (2009) Bioprospecting metagenomes: glycosyl hydrolases for converting biomass, Biotechnology for Biofuels. 2. 11.  Carpita, N. & McCann, M. (2000) The Cell Wall in Biochemistry and Molecular Biology of Plants (Buchanan, B. B., Gruissem, W. & Jones, R. L., eds) pp. 55-108, John Wiley & Sons, Somerset, NJ. 12.  Keegstra, K. (2010) Plant Cell Walls, Plant Physiology. 154, 483-486. 13.  Ellis, M., Egelund, J., Schultz, C. J. & Bacic, A. (2010) Arabinogalactan-Proteins: Key Regulators at the Cell Surface?, Plant Physiology. 153, 403-419. 14.  Albersheim, P., Darvill, A., Roberts, K., Sederoff, R. & Staehelin, A. (2010) Plant Cell Walls, 1 edn, Garland Science. 15.  Taiz, L. & Zeiger, E. (2010) Plant Physiology, Sinauer Associates, incorporated. 16.  Doblin, M. S., Pettolino, F. & Bacic, A. (2010) Plant cell walls: the skeleton of the plant world, Functional Plant Biology. 37, 357-381. 17.  McCann, M. & Roberts, K. 
(1991) Architecture of the primary cell wall in The Cytoskeletal Basis of Plant Growth and Form (Lloyd, C. W., ed) pp. 109-129, Academic Press, New York. 18.  Kumar, P., Barrett, D. M., Delwiche, M. J. & Stroeve, P. (2009) Methods for Pretreatment of Lignocellulosic Biomass for Efficient Hydrolysis and Biofuel Production, Industrial & Engineering Chemistry Research. 48, 3713-3729. 19.  Malherbe, S. & Cloete, T. E. (2002) Lignocellulose biodegradation: Fundamentals and applications, Reviews in Environmental Science & Biotechnology. 1, 105–114. 20.  Anwar, Z., Gulfraz, M. & Irshad, M. (2014) Agro-industrial lignocellulosic biomass a key to unlock the future bio-energy: A brief review, Journal of Radiation Research and Applied Sciences. 7, 163-173. 21.  Endler, A. & Persson, S. (2011) Cellulose Synthases and Synthesis in Arabidopsis, Molecular Plant. 4, 199-211. 22.  O'Dwyer, M. H. (1923) The hemicelluloses. III. The hemicellulose of American White Oak, Biochemical Journal. 17, 501-509. 23.  Scheller, H. V. & Ulvskov, P. (2010) Hemicelluloses, Annual Review of Plant Biology. 61, 263-289. 24.  Ebringerova, A., Hromadkova, Z. & Heinze, T. (2005) Hemicellulose, Polysaccharides 1: Structure, Characterization and Use. 186, 1-67. 25.  Levasseur, A., Drula, E., Lombard, V., Coutinho, P. M. & Henrissat, B. (2013) Expansion of the enzymatic repertoire of the CAZy database to integrate auxiliary redox enzymes, Biotechnology for Biofuels. 6. 26.  Vogel, J. (2008) Unique aspects of the grass cell wall, Current Opinion in Plant Biology. 11, 301-307. 27.  Pauly, M., Albersheim, P., Darvill, A. & York, W. S. (1999) Molecular domains of the cellulose/xyloglucan network in the cell walls of higher plants, Plant Journal. 20, 629-639. 28.  Schultink, A., Liu, L., Zhu, L. & Pauly, M. (2014) Structural diversity and function of xyloglucan sidechain substituents, Plants. 3, 526-542. 29.  Fry, S. C., York, W. S., Albersheim, P., Darvill, A., Hayashi, T., Joseleau, J. 
P., Kato, Y., Lorences, E. P., Maclachlan, G. A., McNeil, M., Mort, A. J., Reid, J. S. G., Seitz, H. U., Selvendran, R. R., Voragen, A. G. J. & White, A. R. (1993) An unambiguous nomenclature for xyloglucan-derived oligosaccharides, Physiologia Plantarum. 89, 1-3. 30.  Tuomivaara, S. T., Yaoi, K., O'Neill, M. A. & York, W. S. (2015) Generation and structural validation of a library of diverse xyloglucan-derived oligosaccharides, including an update on xyloglucan nomenclature, Carbohydrate Research. 402, 56-66. 31.  Vincken, J. P., York, W. S., Beldman, G. & Voragen, A. G. J. (1997) Two general branching patterns of xyloglucan, XXXG and XXGG, Plant Physiology. 114, 9-13. 32.  Hoffman, M., Jia, Z. H., Pena, M. J., Cash, M., Harper, A., Blackburn, A. R., Darvill, A. & York, W. S. (2005) Structural analysis of xyloglucans in the primary cell walls of plants in the subclass Asteridae, Carbohydrate Research. 340, 1826-1840. 33.  Larsbrink, J., Thompson, A. J., Lundqvist, M., Gardner, J. G., Davies, G. J. & Brumer, H. (2014) A complex gene locus enables xyloglucan utilization in the model saprophyte Cellvibrio japonicus, Mol Microbiol. 94, 418-433. 34.  Larsbrink, J., Rogers, T. E., Hemsworth, G. R., McKee, L. S., Tauzin, A. S., Spadiut, O., Klinter, S., Pudlo, N. A., Urs, K., Koropatkin, N. M., Creagh, A. L., Haynes, C. A., Kelly, A. G., Cederholm, S. N., Davies, G. J., Martens, E. C. & Brumer, H. (2014) A discrete genetic locus confers xyloglucan metabolism in select human gut Bacteroidetes, Nature. 506, 498-502. 35.  Attia, M. A. & Brumer, H. (2016) Recent structural insights into the enzymology of the ubiquitous plant cell wall glycan xyloglucan, Curr Opin Struct Biol. 40, 43-53. 36.  Popper, Z. A. & Fry, S. C. (2004) Primary cell wall composition of pteridophytes and spermatophytes, New Phytologist. 164, 165-174. 37.  Gibeaut, D. M., Pauly, M., Bacic, A. & Fincher, G. B. 
(2005) Changes in cell wall polysaccharides in developing barley (Hordeum vulgare) coleoptiles, Planta. 221, 729-738. 38.  Obel, N., Porchia, A. C. & Scheller, H. V. (2002) Dynamic changes in cell wall polysaccharides during wheat seedling development, Phytochemistry. 60, 603-610. 39.  Guillon, F., Bouchet, B., Jamme, F., Robert, P., Quemener, B., Barron, C., Larre, C., Dumas, P. & Saulnier, L. (2011) Brachypodium distachyon grain: characterization of endosperm cell walls, Journal of Experimental Botany. 62, 1001-1015. 40.  Planas, N. (2000) Bacterial 1,3-1,4-beta-glucanases: structure, function and protein engineering, Biochimica Et Biophysica Acta-Protein Structure and Molecular Enzymology. 1543, 361-382. 41.  Mellen, P. B., Walsh, T. F. & Herrington, D. M. (2008) Whole grain intake and cardiovascular disease: A meta-analysis, Nutrition Metabolism and Cardiovascular Diseases. 18, 283-290. 42.  Ye, E. Q., Chacko, S. A., Chou, E. L., Kugizaki, M. & Liu, S. M. (2012) Greater Whole-Grain Intake Is Associated with Lower Risk of Type 2 Diabetes, Cardiovascular Disease, and Weight Gain, Journal of Nutrition. 142, 1304-1313. 43.  Saha, B. C. (2003) Hemicellulose bioconversion, J Ind Microbiol Biotechnol. 30, 279-91. 44.  Carpita, N. C. (1996) Structure and biogenesis of the cell walls of grasses, Annual Review of Plant Physiology and Plant Molecular Biology. 47, 445-476. 45.  Popper, Z. A. & Fry, S. C. (2003) Primary cell wall composition of bryophytes and charophytes, Annals of Botany. 91, 1-12. 46.  Popper, Z. A. (2008) Evolution and diversity of green plant cell walls, Current Opinion in Plant Biology. 11, 286-292. 47.  Wang, H. L., Yeh, K. W., Chen, P. R., Chang, C. H., Chen, J. M. & Khoo, K. H. (2006) Isolation and characterization of a pure mannan from Oncidium (cv. Gower Ramsey) current pseudobulb during initial inflorescence development, Bioscience Biotechnology and Biochemistry. 70, 551-553. 48.  Buckeridge, M. S., dos Santos, H. P. & Tine, M. A. S. 
(2000) Mobilisation of storage cell wall polysaccharides in seeds, Plant Physiology and Biochemistry. 38, 141-156. 49.  Pereira, H., Graça, J. & Rodrigues, J. C. (2003) Wood chemistry in relation to quality in Wood Quality and Its Biological Basis (Barnett, J. R. & Jeronimidis, G., eds) pp. 53–86, Blackwell Publishing, Oxford. 50.  Alén, R. (2000) Structure and chemical composition of wood in Forest Products Chemistry (Stenius, P., ed) pp. 12–57, Fapet Oy, Helsinki. 51.  Willats, W. G. T., McCartney, L., Mackie, W. & Knox, J. P. (2001) Pectin: cell biology and prospects for functional analysis, Plant Molecular Biology. 47, 9-27. 52.  O'Neill, M. A., Ishii, T., Albersheim, P. & Darvill, A. G. (2004) Rhamnogalacturonan II: Structure and function of a borate cross-linked cell wall pectic polysaccharide, Annual Review of Plant Biology. 55, 109-139. 53.  Dixon, R. A. (2013) Microbiology: Break down the walls, Nature. 493, 36-37. 54.  Vanholme, R., Morreel, K., Darrah, C., Oyarce, P., Grabber, J. H., Ralph, J. & Boerjan, W. (2012) Metabolic engineering of novel lignin in biomass crops, New Phytologist. 196, 978-1000. 55.  Studer, M. H., DeMartini, J. D., Davis, M. F., Sykes, R. W., Davison, B., Keller, M., Tuskan, G. A. & Wyman, C. E. (2011) Lignin content in natural Populus variants affects sugar release, Proceedings of the National Academy of Sciences of the United States of America. 108, 6300-6305. 56.  Chapple, C., Ladisch, M. & Meilan, R. (2007) Loosening lignin's grip on biofuel production, Nature Biotechnology. 25, 746-748. 57.  Lombard, V., Ramulu, H. G., Drula, E., Coutinho, P. M. & Henrissat, B. (2014) The carbohydrate-active enzymes database (CAZy) in 2013, Nucleic Acids Research. 42, D490-D495. 58.  Aspeborg, H., Coutinho, P. M., Wang, Y., Brumer, H. & Henrissat, B. (2012) Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5), Bmc Evolutionary Biology. 12. 59.  Stam, M. R., Danchin, E. G. 
J., Rancurel, C., Coutinho, P. M. & Henrissat, B. (2006) Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of alpha-amylase-related proteins, Protein Engineering Design & Selection. 19, 555-562. 60.  St John, F. J., Gonzalez, J. M. & Pozharski, E. (2010) Consolidation of glycosyl hydrolase family 30: A dual domain 4/7 hydrolase family consisting of two structurally distinct groups, Febs Letters. 584, 4435-4441. 61.  Campbell, J. A., Davies, G. J., Bulone, V. & Henrissat, B. (1997) A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities, Biochemical Journal. 326, 929-939. 62.  Lairson, L. L., Henrissat, B., Davies, G. J. & Withers, S. G. (2008) Glycosyltransferases: Structures, functions, and mechanisms, Annual Review of Biochemistry. 77, 521-555. 63.  Coutinho, P. M., Deleury, E., Davies, G. J. & Henrissat, B. (2003) An evolving hierarchical family classification for glycosyltransferases, Journal of Molecular Biology. 328, 307-317. 64.  Garron, M. L. & Cygler, M. (2010) Structural and mechanistic classification of uronic acid-containing polysaccharide lyases, Glycobiology. 20, 1547-1573. 65.  Lombard, V., Bernard, T., Rancurel, C., Brumer, H., Coutinho, P. M. & Henrissat, B. (2010) A hierarchical classification of polysaccharide lyases for glycogenomics, Biochemical Journal. 432, 437-444. 66.  Garron, M. L. & Cygler, M. (2014) Uronic polysaccharide degrading enzymes, Current Opinion in Structural Biology. 28, 87-95. 67.  Dodd, D. & Cann, I. K. O. (2009) Enzymatic deconstruction of xylan for biofuel production, Global Change Biology Bioenergy. 1, 2-17. 68.  Taylor, E. J., Gloster, T. M., Turkenburg, J. P., Vincent, F., Brzozowski, A. M., Dupont, C., Shareck, F., Centeno, M. S. J., Prates, J. A. M., Puchart, V., Ferreira, L. M. A., Fontes, C., Biely, P. & Davies, G. J. 
(2006) Structure and activity of two metal ion-dependent acetylxylan esterases involved in plant cell wall degradation reveals a close similarity to peptidoglycan deacetylases, Journal of Biological Chemistry. 281, 10968-10975. 69.  Bolam, D. N., Ciruela, A., McQueen-Mason, S., Simpson, P., Williamson, M. P., Rixon, J. E., Boraston, A., Hazlewood, G. P. & Gilbert, H. J. (1998) Pseudomonas cellulose-binding domains mediate their effects by increasing enzyme substrate proximity, Biochemical Journal. 331, 775-781. 70.  Tomme, P., Vantilbeurgh, H., Pettersson, G., Vandamme, J., Vandekerckhove, J., Knowles, J., Teeri, T. & Claeyssens, M. (1988) Studies of the cellulolytic system of Trichoderma reesei QM 9414. Analysis of domain function in two cellobiohydrolases by limited proteolysis, European Journal of Biochemistry. 170, 575-581. 71.  Gilkes, N. R., Warren, R. A. J., Miller, R. C. & Kilburn, D. G. (1988) Precise excision of the cellulose binding domains from two Cellulomonas fimi cellulases by a homologous protease and the effect on catalysis, Journal of Biological Chemistry. 263, 10401-10407. 72.  Boraston, A. B., Bolam, D. N., Gilbert, H. J. & Davies, G. J. (2004) Carbohydrate-binding modules: fine-tuning polysaccharide recognition, Biochemical Journal. 382, 769-781. 73.  Naumoff, D. G. (2011) Hierarchical classification of glycoside hydrolases, Biochemistry-Moscow. 76, 622-635. 74.  Wolfenden, R., Lu, X. D. & Young, G. (1998) Spontaneous hydrolysis of glycosides, Journal of the American Chemical Society. 120, 6814-6815. 75.  Cantarel, B. L., Coutinho, P. M., Rancurel, C., Bernard, T., Lombard, V. & Henrissat, B. (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics, Nucleic Acids Research. 37, D233-D238. 76.  Koshland, D. E. 
(1953) Stereochemistry and the mechanism of enzymatic reactions, Biological Reviews of the Cambridge Philosophical Society. 28, 416-436. 77.  Davies, G. & Henrissat, B. (1995) Structures and mechanisms of glycosyl hydrolases, Structure. 3, 853-859. 78.  McCarter, J. D. & Withers, S. G. (1994) Mechanisms of enzymatic glycoside hydrolysis, Current Opinion in Structural Biology. 4, 885-892. 79.  Withers, S. & Williams, S. Glycoside Hydrolases, CAZypedia, http://www.cazypedia.org/index.php/Glycoside_hydrolases, data retrieved November 15th, 2017. 80.  Mark, B. L., Vocadlo, D. J., Knapp, S., Triggs-Raine, B. L., Withers, S. G. & James, M. N. G. (2001) Crystallographic evidence for substrate-assisted catalysis in a bacterial beta-hexosaminidase, Journal of Biological Chemistry. 276, 10330-10337. 81.  Knapp, S., Vocadlo, D., Gao, Z. N., Kirk, B., Lou, J. P. & Withers, S. G. (1996) NAG-thiazoline, an N-acetyl-beta-hexosaminidase inhibitor that implicates acetamido participation, Journal of the American Chemical Society. 118, 6804-6805. 82.  Vocadlo, D. J. & Withers, S. G. (2005) Detailed comparative analysis of the catalytic mechanisms of beta-N-acetylglucosaminidases from families 3 and 20 of glycoside hydrolases, Biochemistry. 44, 12809-12818. 83.  Burmeister, W. P., Cottaz, S., Rollin, P., Vasella, A. & Henrissat, B. (2000) High resolution x-ray crystallography shows that ascorbate is a cofactor for myrosinase and substitutes for the function of the catalytic base, Journal of Biological Chemistry. 275, 39385-39393. 84.  Amaya, M. F., Watts, A. G., Damager, I., Wehenkel, A., Nguyen, T., Buschiazzo, A., Paris, G., Frasch, A. C., Withers, S. G. & Alzari, P. M. (2004) Structural insights into the catalytic mechanism of Trypanosoma cruzi trans-sialidase, Structure. 12, 775-784. 85.  Watts, A. G., Damager, I., Amaya, M. L., Buschiazzo, A., Alzari, P., Frasch, A. C. & Withers, S. G. 
(2003) Trypanosoma cruzi trans-sialidase operates through a covalent sialyl-enzyme intermediate: Tyrosine is the catalytic nucleophile, Journal of the American Chemical Society. 125, 7532-7533. 86.  Ndeh, D., Rogowski, A., Cartmell, A., Luis, A. S., Basle, A., Gray, J., Venditto, I., Briggs, J., Zhang, X. Y., Labourel, A., Terrapon, N., Buffetto, F., Nepogodiev, S., Xiao, Y., Field, R. A., Zhu, Y. P., O'Neill, M. A., Urbanowicz, B. R., York, W. S., Davies, G. J., Abbott, D. W., Ralet, M. C., Martens, E. C., Henrissat, B. & Gilbert, H. J. (2017) Complex pectin metabolism by gut bacteria reveals novel catalytic functions, Nature. 544, 65-+. 87.  Rajan, S. S., Yang, X. J., Collart, F., Yip, V. L. Y., Withers, S. G., Varrot, A., Thompson, J., Davies, G. J. & Anderson, W. F. (2004) Novel catalytic mechanism of glycoside hydrolysis based on the structure of an NAD(+)/Mn(2+)-dependent phospho-alpha-glucosidase from Bacillus subtilis, Structure. 12, 1619-1629. 88.  Yip, V. L. Y., Varrot, A., Davies, G. J., Rajan, S. S., Yang, X. J., Thompson, J., Anderson, W. F. & Withers, S. G. (2004) An unusual mechanism of glycoside hydrolysis involving redox and elimination steps by a family 4 beta-glycosidase from Thermotoga maritima, Journal of the American Chemical Society. 126, 8354-8355. 89.  Fushinobu, S., Hidaka, M., Honda, Y., Wakagi, T., Shoun, H. & Kitaoka, M. (2005) Structural basis for the specificity of the reducing end xylose-releasing exo-oligoxylanase from Bacillus halodurans C-125, Journal of Biological Chemistry. 280, 17180-17186. 90.  Honda, Y. & Kitaoka, M. (2004) A family 8 glycoside hydrolase from Bacillus halodurans C-125 (BH2105) is a reducing end xylose-releasing exo-oligoxylanase, Journal of Biological Chemistry. 279, 55097-55103. 91.  Davies, G. J., Wilson, K. S. & Henrissat, B. (1997) Nomenclature for sugar-binding subsites in glycosyl hydrolases, Biochemical Journal. 321, 557-559. 92.  
Divne, C., Stahlberg, J., Reinikainen, T., Ruohonen, L., Pettersson, G., Knowles, J. K. C., Teeri, T. T. & Jones, T. A. (1994) The three-dimensional crystal structure of the catalytic core of cellobiohydrolase I from Trichoderma reesei, Science. 265, 524-528. 93.  Divne, C., Stahlberg, J., Teeri, T. T. & Jones, T. A. (1998) High-resolution crystal structures reveal how a cellulose chain is bound in the 50 angstrom long tunnel of cellobiohydrolase I from Trichoderma reesei, Journal of Molecular Biology. 275, 309-325. 94.  Rouvinen, J., Bergfors, T., Teeri, T., Knowles, J. K. C. & Jones, T. A. (1990) Three-dimensional structure of cellobiohydrolase II from Trichoderma reesei, Science. 249, 380-386. 95.  Grishutin, S. G., Gusakov, A. V., Markov, A. V., Ustinov, B. B., Semenova, M. V. & Sinitsyn, A. P. (2004) Specific xyloglucanases as a new class of polysaccharide-degrading enzymes, Biochimica Et Biophysica Acta-General Subjects. 1674, 268-281. 96.  Pauly, M. & Keegstra, K. (2016) Biosynthesis of the Plant Cell Wall Matrix Polysaccharide Xyloglucan, Annu Rev Plant Biol. 67, 235–259. 97.  Hu, J., Arantes, V., Pribowo, A. & Saddler, J. N. (2013) The synergistic action of accessory enzymes enhances the hydrolytic potential of a "cellulase mixture" but is highly substrate specific, Biotechnology for Biofuels. 6. 98.  Jabbour, D., Borrusch, M. S., Banerjee, G. & Walton, J. D. (2013) Enhancement of fermentable sugar yields by alpha-xylosidase supplementation of commercial cellulases, Biotechnology for Biofuels. 6, 58. 99.  Xu, C. L., Spadiut, O., Araujo, A. C., Nakhai, A. & Brumer, H. (2012) Chemo-enzymatic Assembly of Clickable Cellulose Surfaces via Multivalent Polysaccharides, Chemsuschem. 5, 661-665. 100.  
Teeri, T. T., Brumer, H., Daniel, G. & Gatenholm, P. (2007) Biomimetic engineering of cellulose-based materials, Trends in Biotechnology. 25, 299-306. 101.  Eklof, J. M. & Brumer, H. (2010) The XTH Gene Family: An Update on Enzyme Structure, Function, and Phylogeny in Xyloglucan Remodeling, Plant Physiology. 153, 456-466. 102.  Baumann, M. J., Eklof, J. M., Michel, G., Kallas, A. M., Teeri, T. T., Czjzek, M. & Brumer, H. (2007) Structural evidence for the evolution of xyloglucanase activity from xyloglucan endo-transglycosylases: Biological implications for cell wall metabolism, Plant Cell. 19, 1947-1963. 103.  Kaewthai, N., Gendre, D., Eklof, J. M., Ibatullin, F. M., Ezcurra, I., Bhalerao, R. P. & Brumer, H. (2013) Group III-A XTH Genes of Arabidopsis Encode Predominant Xyloglucan Endohydrolases That Are Dispensable for Normal Growth, Plant Physiology. 161, 440-454. 104.  Michel, G., Chantalat, L., Duee, E., Barbeyron, T., Henrissat, B., Kloareg, B. & Dideberg, O. (2001) The kappa-carrageenase of P. carrageenovora features a tunnel-shaped active site: A novel insight in the evolution of clan-B glycoside hydrolases, Structure. 9, 513-525. 105.  Johansson, P., Brumer, H., Baumann, M. J., Kallas, A. M., Henriksson, H., Denman, S. E., Teeri, T. T. & Jones, T. A. (2004) Crystal structures of a poplar xyloglucan endotransglycosylase reveal details of transglycosylation acceptor binding, Plant Cell. 16, 874-886. 106.  Eklof, J. M., Shojania, S., Okon, M., McIntosh, L. P. & Brumer, H. (2013) Structure-Function Analysis of a Broad Specificity Populus trichocarpa Endo-beta-glucanase Reveals an Evolutionary Link between Bacterial Licheninases and Plant XTH Gene Products, Journal of Biological Chemistry. 288, 15786-15799. 107.  McGregor, N., Yin, V., Tung, C. C., Van Petegem, F. & Brumer, H. (2017) Crystallographic insight into the evolutionary origins of xyloglucan endotransglycosylases and endohydrolases, Plant Journal. 89, 651-670. 108.  Gloster, T. M., Ibatullin, F. 
M., Macauley, K., Eklof, J. M., Roberts, S., Turkenburg, J. P., Bjornvad, M. E., Jorgensen, P. L., Danielsen, S., Johansen, K. S., Borchert, T. V., Wilson, K. S., Brumer, H. & Davies, G. J. (2007) Characterization and three-dimensional structures of two distinct bacterial xyloglucanases from families GH5 and GH12, Journal of Biological Chemistry. 282, 19177-19189. 109.  dos Santos, C. R., Cordeiro, R. L., Wong, D. W. S. & Murakami, M. T. (2015) Structural Basis for Xyloglucan Specificity and alpha-D-Xylp(1 -> 6)-D-Glcp Recognition at the -1 Subsite within the GH5 Family, Biochemistry. 54, 1930-1942. 110.  McGregor, N., Morar, M., Fenger, T. H., Stogios, P., Lenfant, N., Yin, V., Xu, X. H., Evdokimova, E., Cui, H., Henrissat, B., Savchenko, A. & Brumer, H. (2016) Structure-Function Analysis of a Mixed-linkage beta-Glucanase/Xyloglucanase from the Key Ruminal Bacteroidetes Prevotella bryantii B(1)4, Journal of Biological Chemistry. 291, 1175-1197. 111.  Naas, A. E., MacKenzie, A. K., Dalhus, B., Eijsink, V. G. H. & Pope, P. B. (2015) Structural Features of a Bacteroidetes-Affiliated Cellulase Linked with a Polysaccharide Utilization Locus, Scientific Reports. 5. 112.  Ravachol, J., Borne, R., Tardif, C., de Philip, P. & Fierobe, H.-P. (2014) Characterization of All Family-9 Glycoside Hydrolases Synthesized by the Cellulosome-producing Bacterium Clostridium cellulolyticum, Journal of Biological Chemistry. 289, 7335-7348. 113.  Pereira, J. H., Sapra, R., Volponi, J. V., Kozina, C. L., Simmons, B. & Adams, P. D. (2009) Structure of endoglucanase Cel9A from the thermoacidophilic Alicyclobacillus acidocaldarius, Acta Crystallographica Section D-Biological Crystallography. 65, 744-750. 114.  Mandelman, D., Belaich, A., Belaich, J. P., Aghajari, N., Driguez, H. & Haser, R. (2003) X-ray crystal structure of the multidomain endoglucanase Cel9G from Clostridium cellulolyticum complexed with natural and synthetic cello-oligosaccharides, Journal of Bacteriology. 
185, 4127-4135. 115.  Parsiegla, G., Belaïch, A., Belaïch, J. P. & Haser, R. (2002) Crystal structure of the cellulase Cel9M enlightens structure/function relationships of the variable catalytic modules in glycoside hydrolases, Biochemistry. 41, 11134-42. 116.  Schubot, F. D., Kataeva, I. A., Chang, J., Shah, A. K., Ljungdahl, L. G., Rose, J. P. & Wang, B. C. (2004) Structural basis for the exocellulase activity of the cellobiohydrolase CbhA from Clostridium thermocellum, Biochemistry. 43, 1163-1170. 117.  Kesavulu, M. M., Tsai, J. Y., Lee, H. L., Liang, P. H. & Hsiao, C. D. (2012) Structure of the catalytic domain of the Clostridium thermocellum cellulase CelT, Acta Crystallographica Section D-Biological Crystallography. 68, 310-320. 118.  Sakon, J., Irwin, D., Wilson, D. B. & Karplus, P. A. (1997) Structure and mechanism of endo/exocellulase E4 from Thermomonospora fusca, Nature Structural Biology. 4, 810-818. 119.  Okano, H., Kanaya, E., Ozaki, M., Angkawidjaja, C. & Kanaya, S. (2015) Structure, activity, and stability of metagenome-derived glycoside hydrolase family 9 endoglucanase with an N-terminal Ig-like domain, Protein Science. 24, 408-419. 120.  Beguin, P. & Alzari, P. M. (1998) The cellulosome of Clostridium thermocellum, Biochemical Society Transactions. 26, 178-185. 121.  Song, S., Tang, Y. B., Yang, S. Q., Yan, Q. J., Zhou, P. & Jiang, Z. Q. (2013) Characterization of two novel family 12 xyloglucanases from the thermophilic Rhizomucor miehei, Applied Microbiology and Biotechnology. 97, 10013-10024. 122.  Damasio, A. R. L., Ribeiro, L. F. C., Ribeiro, L. F., Furtado, G. P., Segato, F., Almeida, F. B. R., Crivellari, A. C., Buckeridge, M. S., Souza, T. A. C. B., Murakami, M. T., Ward, R. J., Prade, R. A. & Polizeli, M. L. T. M. (2012) Functional characterization and oligomerization of a recombinant xyloglucan-specific endo-beta-1,4-glucanase (GH12) from Aspergillus niveus, Biochimica Et Biophysica Acta-Proteins and Proteomics. 1824, 461-467. 123.  
Master, E. R., Zheng, Y., Storms, R., Tsang, A. & Powlowski, J. (2008) A xyloglucan-specific family 12 glycosyl hydrolase from Aspergillus niger: recombinant expression, purification and characterization, Biochemical Journal. 411, 161-170. 124.  Powlowski, J., Mahajan, S., Schapira, M. & Master, E. R. (2009) Substrate recognition and hydrolysis by a fungal xyloglucan-specific family 12 hydrolase, Carbohydrate Research. 344, 1175-1179. 125.  Yoshizawa, T., Shimizu, T., Hirano, H., Sato, M. & Hashimoto, H. (2012) Structural Basis for Inhibition of Xyloglucan-specific Endo-beta-1,4-glucanase (XEG) by XEG-Protein Inhibitor, Journal of Biological Chemistry. 287, 18710-18716. 126.  Ariza, A., Eklof, J. M., Spadiut, O., Offen, W. A., Roberts, S. M., Besenmatter, W., Friis, E. P., Skjot, M., Wilson, K. S., Brumer, H. & Davies, G. (2011) Structure and Activity of Paenibacillus polymyxa Xyloglucanase from Glycoside Hydrolase Family 44, Journal of Biological Chemistry. 286, 33890-33900. 127.  Ravachol, J., de Philip, P., Borne, R., Mansuelle, P., Maté, M. J., Perret, S. & Fierobe, H. P. (2016) Mechanisms involved in xyloglucan catabolism by the cellulosome-producing bacterium Ruminiclostridium cellulolyticum, Sci Rep. 6, 22770. 128.  Alahuhta, M., Adney, W. S., Himmel, M. E. & Lunin, V. V. (2013) Structure of Acidothermus cellulolyticus family 74 glycoside hydrolase at 1.82 angstrom resolution, Acta Crystallographica Section F-Structural Biology and Crystallization Communications. 69, 1335-1338. 129.  Martinez-Fleites, C., Guerreiro, C., Baumann, M. J., Taylor, E. J., Prates, J. A. M., Ferreira, L. M. A., Fontes, C., Brumer, H. & Davies, G. J. (2006) Crystal structures of Clostridium thermocellum xyloglucanase, XGH74A, reveal the structural basis for xyloglucan recognition and degradation, Journal of Biological Chemistry. 281, 24922-24933. 130.  Yaoi, K., Kondo, H., Hiyoshi, A., Noro, N., Sugimoto, H., Tsuda, S. & Miyazaki, K. 
(2009) The crystal structure of a xyloglucan-specific endo-beta-1,4-glucanase from Geotrichum sp. M128 xyloglucanase reveals a key amino acid residue for substrate specificity, Febs Journal. 276, 5094-5100. 131.  Matsuzawa, T., Saito, Y. & Yaoi, K. (2014) Key amino acid residues for the endo-processive activity of GH74 xyloglucanase, Febs Letters. 588, 1731-1738. 132.  Ichinose, H., Araki, Y., Michikawa, M., Harazono, K., Yaoi, K., Karita, S. & Kaneko, S. (2012) Characterization of an endo-processive-type xyloglucanase having a beta-1,4-glucan-Binding Module and an endo-type xyloglucanase from Streptomyces avermitilis, Applied and Environmental Microbiology. 78, 7939-7945. 133.  Yaoi, K., Nakai, T., Kameda, Y., Hiyoshi, A. & Mitsuishi, Y. (2005) Cloning and characterization of two xyloglucanases from Paenibacillus sp. strain KM21, Applied and Environmental Microbiology. 71, 7670-7678. 134.  Yaoi, K. & Mitsuishi, Y. (2004) Purification, characterization, cDNA cloning, and expression of a xyloglucan endoglucanase from Geotrichum sp. M128, Febs Letters. 560, 45-50. 135.  Enkhbaatar, B., Temuujin, U., Lim, J. H., Chi, W. J., Chang, Y. K. & Hong, S. K. (2012) Identification and Characterization of a Xyloglucan-Specific Family 74 Glycosyl Hydrolase from Streptomyces coelicolor A3(2), Applied and Environmental Microbiology. 78, 607-611. 136.  Feng, T., Yan, K.-P., Mikkelsen, M. D., Meyer, A. S., Schols, H. A., Westereng, B. & Mikkelsen, J. D. (2014) Characterisation of a novel endo-xyloglucanase (XcXGHA) from Xanthomonas that accommodates a xylosyl-substituted glucose at subsite -1, Applied Microbiology and Biotechnology. 98, 9667-9679. 137.  Yaoi, K. & Mitsuishi, Y. (2002) Purification, characterization, cloning, and expression of a novel xyloglucan-specific glycosidase, oligoxyloglucan reducing end-specific cellobiohydrolase, Journal of Biological Chemistry. 277, 48276-48281. 138.  Desmet, T., Cantaert, T., Gualfetti, P., Nerinckx, W., Gross, L., Mitchinson, C. 
& Piens, K. (2007) An investigation of the substrate specificity of the xyloglucanase Cel74A from Hypocrea jecorina, FEBS Journal. 274, 356-363. 139.  Hemsworth, G. R., Thompson, A. J., Stepper, J., Sobala, L. F., Coyle, T., Larsbrink, J., Spadiut, O., Goddard-Borger, E. D., Stubbs, K. A., Brumer, H. & Davies, G. J. (2016) Structural dissection of a complex Bacteroides ovatus gene locus conferring xyloglucan metabolism in the human gut, Open Biol. 6. 140.  Pozzo, T., Pasten, J. L., Karlsson, E. N. & Logan, D. T. (2010) Structural and Functional Analyses of beta-Glucosidase 3B from Thermotoga neapolitana: A Thermostable Three-Domain Representative of Glycoside Hydrolase 3, Journal of Molecular Biology. 397, 724-739. 141.  Karkehabadi, S., Helmich, K. E., Kaper, T., Hansson, H., Mikkelsen, N. E., Gudmundsson, M., Piens, K., Fujdala, M., Banerjee, G., Scott-Craig, J. S., Walton, J. D., Phillips, G. N. & Sandgren, M. (2014) Biochemical Characterization and Crystal Structures of a Fungal Family 3 beta-Glucosidase, Cel3A from Hypocrea jecorina, Journal of Biological Chemistry. 289, 31624-31637. 142.  Kato, Y., Matsushita, J., Kubodera, T. & Matsuda, K. (1985) A novel enzyme producing isoprimeverose from oligoxyloglucans of Aspergillus oryzae, Journal of Biochemistry. 97, 801-810. 143.  Matsuzawa, T., Mitsuishi, Y., Kameyama, A. & Yaoi, K. (2016) Identification of the Gene Encoding Isoprimeverose-producing Oligoxyloglucan Hydrolase in Aspergillus oryzae, J Biol Chem. 291, 5080-7. 144.  Yaoi, K. & Miyazaki, K. (2012) Cloning and Expression of Isoprimeverose-producing Oligoxyloglucan Hydrolase from Actinomycetes Species, Oerskovia sp. Y1, Journal of Applied Glycoscience. 59, 83-88. 145.  Larsbrink, J., Izumi, A., Ibatullin, F. M., Nakhai, A., Gilbert, H. J., Davies, G. J. & Brumer, H. 
(2011) Structural and enzymatic characterization of a glycoside hydrolase family 31 alpha-xylosidase from Cellvibrio japonicus involved in xyloglucan saccharification, Biochemical Journal. 436, 567-580. 146.  Mewis, K., Lenfant, N., Lombard, V. & Henrissat, B. (2016) Dividing the Large Glycoside Hydrolase Family 43 into Subfamilies: a Motivation for Detailed Enzyme Characterization, Applied and Environmental Microbiology. 82, 1686-1692. 147.  El Kaoutari, A., Armougom, F., Gordon, J. I., Raoult, D. & Henrissat, B. (2013) The abundance and variety of carbohydrate-active enzymes in the human gut microbiota, Nature Reviews Microbiology. 11, 497-504. 148.  Tauzin, A. S., Kwiatkowski, K. J., Orlovsky, N. I., Smith, C. J., Creagh, A. L., Haynes, C. A., Wawrzak, Z., Brumer, H. & Koropatkin, N. M. (2016) Molecular Dissection of Xyloglucan Recognition in a Prominent Human Gut Symbiont, MBio. 7. 149.  Fontes, C. & Gilbert, H. J. (2010) Cellulosomes: Highly Efficient Nanomachines Designed to Deconstruct Plant Cell Wall Complex Carbohydrates, Annual Review of Biochemistry, Vol 79. 79, 655-681. 150.  Eklof, J. M., Ruda, M. C. & Brumer, H. (2012) Distinguishing xyloglucanase activity in endo-beta(1 -> 4)glucanases, Methods in Enzymology. 510, 97-120. 151.  Gilbert, H. J., Stalbrand, H. & Brumer, H. (2008) How the walls come crumbling down: recent structural biochemistry of plant polysaccharide degradation, Current Opinion in Plant Biology. 11, 338-348. 152.  Ueda, K., Ishikawa, S., Itami, T. & Asai, T. (1952) Studies on the aerobic mesophilic cellulose-decomposing bacteria. Part 5-2. Taxonomical study on the genus Pseudomonas, Journal of the agricultural chemical society of Japan. 26, 35-41. 153.  Anzai, Y., Kim, H., Park, J. Y., Wakabayashi, H. & Oyaizu, H. (2000) Phylogenetic affiliation of the pseudomonads based on 16S rRNA sequence, International Journal of Systematic and Evolutionary Microbiology. 50, 1563-1589. 154.  Humphry, D. R., Black, G. W. 
& Cummings, S. P. (2003) Reclassification of 'Pseudomonas fluorescens subsp. cellulosa' NCIMB 10462 (Ueda et al. 1952) as Cellvibrio japonicus sp. nov. and revival of Cellvibrio vulgaris sp. nov., nom. rev. and Cellvibrio fulvus sp. nov., nom. rev, International Journal of Systematic and Evolutionary Microbiology. 53, 393-400. 155.  Hazlewood, G. P. & Gilbert, H. J. (1998) Structure and function analysis of Pseudomonas plant cell wall hydrolases, Biochemical Society Transactions. 26, 185-190. 156.  Deboy, R. T., Mongodin, E. F., Fouts, D. E., Tailford, L. E., Khouri, H., Emerson, J. B., Mohamoud, Y., Watkins, K., Henrissat, B., Gilbert, H. J. & Nelson, K. E. (2008) Insights into plant cell wall degradation from the genome sequence of the soil bacterium Cellvibrio japonicus, Journal of Bacteriology. 190, 5455-5463. 157.  Nelson, C. E. & Gardner, J. G. (2015) In-frame deletions allow functional characterization of complex cellulose degradation phenotypes in Cellvibrio japonicus, Applied and Environmental Microbiology. 81, 5968-5975. 158.  Gardner, J. G. (2016) Polysaccharide degradation systems of the saprophytic bacterium Cellvibrio japonicus, World Journal of Microbiology & Biotechnology. 32, 121. 159.  Gardner, J. G. & Keating, D. H. (2010) Requirement of the Type II Secretion System for Utilization of Cellulosic Substrates by Cellvibrio japonicus, Applied and Environmental Microbiology. 76, 5079-5087. 160.  Gardner, J. G. & Keating, D. H. (2012) Genetic and functional genomic approaches for the study of plant cell wall degradation in Cellvibrio japonicus, Methods in Enzymology. 510, 331-347. 161.  Hazlewood, G. P., Laurie, J. I., Ferreira, L. M. A. & Gilbert, H. J. (1992) Pseudomonas-fluorescens subsp. cellulosa: an alternative model for bacterial cellulase, Journal of Applied Bacteriology. 72, 244-251. 162.  Larsbrink, J. 
(2013) Strategies for the Discovery of Carbohydrate-Active Enzymes from Environmental Bacteria, KTH Royal Institute of Technology, Stockholm, Sweden. 163.  Parajuli, R., Dalgaard, T., Jorgensen, U., Adamsen, A. P. S., Knudsen, M. T., Birkved, M., Gylling, M. & Schjorring, J. K. (2015) Biorefining in the prevailing energy and materials crisis: a review of sustainable pathways for biorefinery value chains and sustainability assessment methodologies, Renewable & Sustainable Energy Reviews. 43, 244-263. 164.  Himmel, M. E., Ding, S. Y., Johnson, D. K., Adney, W. S., Nimlos, M. R., Brady, J. W. & Foust, T. D. (2007) Biomass recalcitrance: Engineering plants and enzymes for biofuels production, Science. 315, 804-807. 165.  Gardner, J. G., Crouch, L., Labourel, A., Forsberg, Z., Bukhman, Y. V., Vaaje-Kolstad, G., Gilbert, H. J. & Keating, D. H. (2014) Systems biology defines the biological significance of redox-active proteins during cellulose degradation in an aerobic bacterium, Molecular Microbiology. 94, 1121-1133. 166.  Silipo, A., Larsbrink, J., Marchetti, R., Lanzetta, R., Brumer, H. & Molinaro, A. (2012) NMR Spectroscopic analysis reveals extensive binding interactions of complex xyloglucan oligosaccharides with the Cellvibrio japonicus glycoside hydrolase family 31 α-xylosidase, Chemistry-A European Journal. 18, 13395-13404. 167.  Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions, Nature Methods. 8, 785-786. 168.  Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994) CLUSTAL-W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research. 22, 4673-4680. 169.  Kelley, L. A. & Sternberg, M. J. E. (2009) Protein structure prediction on the Web: a case study using the Phyre server, Nature Protocols. 4, 363-371. 170.  Pedelacq, J. 
D., Cabantous, S., Tran, T., Terwilliger, T. C. & Waldo, G. S. (2006) Engineering and characterization of a superfolder green fluorescent protein, Nature Biotechnology. 24, 79-88. 171.  Sundqvist, G., Stenvall, M., Berglund, H., Ottosson, J. & Brumer, H. (2007) A general, robust method for the quality control of intact proteins using LC-ESI-MS, Journal of Chromatography B-Analytical Technologies in the Biomedical and Life Sciences. 852, 188-194. 172.  Ibatullin, F. M., Banasiak, A., Baumann, M. J., Greffe, L., Takahashi, J., Mellerowicz, E. J. & Brumer, H. (2009) A Real-Time Fluorogenic Assay for the Visualization of Glycoside Hydrolase Activity in Planta, Plant Physiology. 151, 1741-1750. 173.  Ibatullin, F. M., Baumann, M. J., Greffe, L. & Brumer, H. (2008) Kinetic analyses of retaining endo-(xylo)glucanases from plant and microbial sources using new chromogenic xylogluco-oligosaccharide aryl glycosides, Biochemistry. 47, 7762-7769. 174.  McFeeters, R. F. (1980) A manual method for reducing sugar determinations with 2,2'-bicinchoninate reagent, Analytical Biochemistry. 103, 302-306. 175.  Winter, G. (2010) xia2: an expert system for macromolecular crystallography data reduction, Journal of Applied Crystallography. 43, 186-190. 176.  Kabsch, W. (1993) Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants, Journal of Applied Crystallography. 26, 795-800. 177.  Leslie, A. G. W. (2006) The integration of macromolecular diffraction data, Acta Crystallographica Section D-Biological Crystallography. 62, 48-57. 178.  Collaborative Computational Project, N. (1994) The CCP4 Suite: Programs for protein crystallography, Acta Crystallographica Section D Biological Crystallography. 50, 760-763. 179.  Vagin, A. & Teplyakov, A. (1997) MOLREP: an automated program for molecular replacement, Journal of Applied Crystallography. 30, 1022-1025. 180.  Emsley, P. & Cowtan, K. 
(2004) Coot: model-building tools for molecular graphics, Acta Crystallographica Section D-Biological Crystallography. 60, 2126-2132. 181.  Agirre, J., Iglesias-Fernandez, J., Rovira, C., Davies, G. J., Wilson, K. S. & Cowtan, K. D. (2015) Privateer: software for the conformational validation of carbohydrate structures, Nature Structural & Molecular Biology. 22, 833-834. 182.  Abbott, D. W. & Boraston, A. B. (2012) Quantitative approaches to the analysis of carbohydrate-binding module function, Cellulases. 510, 211-231. 183.  Simpson, P. J., Xie, H. F., Bolam, D. N., Gilbert, H. J. & Williamson, M. P. (2000) The structural basis for the ligand specificity of family 2 carbohydrate-binding modules, Journal of Biological Chemistry. 275, 41137-41142. 184.  Murashima, K., Kosugi, A. & Doi, R. H. (2005) Site-directed mutagenesis and expression of the soluble form of the family IIIa cellulose binding domain from the cellulosomal scaffolding protein of Clostridium cellulovorans, Journal of Bacteriology. 187, 7146-7149. 185.  Black, G. W., Rixon, J. E., Clarke, J. H., Hazlewood, G. P., Theodorou, M. K., Morris, P. & Gilbert, H. J. (1996) Evidence that linker sequences and cellulose-binding domains enhance the activity of hemicellulases against complex substrates, Biochemical Journal. 319, 515-520. 186.  Black, G. W., Rixon, J. E., Clarke, J. H., Hazlewood, G. P., Ferreira, L. M. A., Bolam, D. N. & Gilbert, H. J. (1997) Cellulose binding domains and linker sequences potentiate the activity of hemicellulases against complex substrates, Journal of Biotechnology. 57, 59-69. 187.  Oliveira, C., Carvalho, V., Domingues, L. & Gama, F. M. (2015) Recombinant CBM-fusion technology - Applications overview, Biotechnology Advances. 33, 358-369. 188.  Kavoosi, M., Lam, D., Bryan, J., Kilburn, D. G. & Haynes, C. A. (2007) Mechanically stable porous cellulose media for affinity purification of family 9 cellulose-binding module-tagged fusion proteins, Journal of Chromatography A. 
1175, 187-196. 189.  Kavoosi, M., Meijer, J., Kwan, E., Creagh, A. L., Kilburn, D. G. & Haynes, C. A. (2004) Inexpensive one-step purification of polypeptides expressed in Escherichia coli as fusions with the family 9 carbohydrate-binding module of xylanase 10A from T. maritima, Journal of Chromatography B-Analytical Technologies in the Biomedical and Life Sciences. 807, 87-94. 190.  Kavoosi, M., Sanaie, N., Dismer, F., Hubbuch, J., Kilburn, D. G. & Haynes, C. A. (2007) A novel two-zone protein uptake model for affinity chromatography and its application to the description of elution band profiles of proteins fused to a family 9 cellulose binding module affinity tag, Journal of Chromatography A. 1160, 137-149. 191.  Hong, J., Ye, X., Wang, Y. & Zhang, Y. H. P. (2008) Bioseparation of recombinant cellulose-binding module-proteins by affinity adsorption on an ultra-high-capacity cellulosic adsorbent, Analytica Chimica Acta. 621, 193-199. 192.  Lim, S., Chundawat, S. P. S. & Fox, B. G. (2014) Expression, purification and characterization of a functional carbohydrate-binding module from Streptomyces sp. SirexAA-E, Protein Expression and Purification. 98, 1-9. 193.  Gao, S., You, C., Renneckar, S., Bao, J. & Zhang, Y.-H. P. (2014) New insights into enzymatic hydrolysis of heterogeneous cellulose by using carbohydrate-binding module 3 containing GFP and carbohydrate-binding module 17 containing CFP, Biotechnology for Biofuels. 7. 194.  Hong, J., Ye, X. & Zhang, Y. H. P. (2007) Quantitative determination of cellulose accessibility to cellulase based on adsorption of a nonhydrolytic fusion protein containing CBM and GFP with its applications, Langmuir. 23, 12535-12540. 195.  Irwin, D. C., Cheng, M., Xiang, B. S., Rose, J. K. C. & Wilson, D. B. (2003) Cloning, expression and characterization of a family-74 xyloglucanase from Thermobifida fusca, European Journal of Biochemistry. 270, 3083-3091. 196.  Zverlov, V. V., Schantz, N., Schmitt-Kopplin, P. & Schwarz, W. H. 
(2005) Two new major subunits in the cellulosome of Clostridium thermocellum: xyloglucanase Xgh74A and endoxylanase Xyn10D, Microbiology-Sgm. 151, 3395-3401. 197.  Ishida, T., Yaoi, K., Hiyoshi, A., Igarashi, K. & Samejima, M. (2007) Substrate recognition by glycoside hydrolase family 74 xyloglucanase from the basidiomycete Phanerochaete chrysosporium, Febs Journal. 274, 5727-5736. 198.  Krissinel, E. & Henrick, K. (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions, Acta Crystallographica Section D-Biological Crystallography. 60, 2256-2268. 199.  McNicholas, S., Potterton, E., Wilson, K. S. & Noble, M. E. M. (2011) Presenting your structures: the CCP4mg molecular-graphics software, Acta Crystallographica Section D-Biological Crystallography. 67, 386-394. 200.  Shoseyov, O., Shani, Z. & Levy, I. (2006) Carbohydrate binding modules: Biochemical properties and novel applications, Microbiology and Molecular Biology Reviews. 70, 283-295. 201.  Hashimoto, H. (2006) Recent structural studies of carbohydrate-binding modules, Cellular and Molecular Life Sciences. 63, 2954-2967. 202.  Guillen, D., Sanchez, S. & Rodriguez-Sanoja, R. (2010) Carbohydrate-binding domains: multiplicity of biological roles, Applied Microbiology and Biotechnology. 85, 1241-1249. 203.  Gilbert, H. J., Knox, J. P. & Boraston, A. B. (2013) Advances in understanding the molecular basis of plant cell wall polysaccharide recognition by carbohydrate-binding modules, Current Opinion in Structural Biology. 23, 669-677. 204.  Hernandez-Gomez, M. C., Rydahl, M. G., Rogowski, A., Morland, C., Cartmell, A., Crouch, L., Labourel, A., Fontes, C. M. G. A., Willats, W. G. T., Gilbert, H. J. & Knox, J. P. (2015) Recognition of xyloglucan by the crystalline cellulose-binding site of a family 3a carbohydrate-binding module, Febs Letters. 589, 2297-2303. 205.  Gill, J., Rixon, J. E., Bolam, D. N., McQueen-Mason, S., Simpson, P. J., Williamson, M. 
P., Hazlewood, G. P. & Gilbert, H. J. (1999) The type II and X cellulose-binding domains of Pseudomonas xylanase A potentiate catalytic activity against complex substrates by a common mechanism, Biochemical Journal. 342, 473-480. 206.  Raghothama, S., Simpson, P. J., Szabo, L., Nagy, T., Gilbert, H. J. & Williamson, M. P. (2000) Solution structure of the CBM10 cellulose binding module from Pseudomonas xylanase A, Biochemistry. 39, 978-984. 207.  Nakamura, T., Mine, S., Hagihara, Y., Ishikawa, K., Ikegami, T. & Uegaki, K. (2008) Tertiary structure and carbohydrate recognition by the chitin-binding domain of a hyperthermophilic chitinase from Pyrococcus furiosus, Journal of Molecular Biology. 381, 670-680. 208.  Bolam, D. N., Xie, H. F., White, P., Simpson, P. J., Hancock, S. M., Williamson, M. P. & Gilbert, H. J. (2001) Evidence for synergy between family 2b carbohydrate binding modules in Cellulomonas fimi xylanase 11A, Biochemistry. 40, 2468-2477. 209.  Xu, G. Y., Ong, E., Gilkes, N. R., Kilburn, D. G., Muhandiram, D. R., Harrisbrandts, M., Carver, J. P., Kay, L. E. & Harvey, T. S. (1995) Solution structure of a cellulose-binding domain from Cellulomonas fimi by nuclear magnetic resonance spectroscopy, Biochemistry. 34, 6993-7009. 210.  Bianchetti, C. M., Brumm, P., Smith, R. W., Dyer, K., Hura, G. L., Rutkoski, T. J. & Phillips, G. N. (2013) Structure, Dynamics, and Specificity of Endoglucanase D from Clostridium cellulovorans, Journal of Molecular Biology. 425, 4267-4285. 211.  Ponyi, T., Szabo, L., Nagy, T., Orosz, L., Simpson, P. J., Williamson, M. P. & Gilbert, H. J. (2000) Trp22, Trp24, and Tyr8 play a pivotal role in the binding of the family 10 cellulose-binding module from Pseudomonas xylanase A to insoluble ligands, Biochemistry. 39, 985-991. 212.  Hemsworth, G. R., Dejean, G., Davies, G. J. & Brumer, H. (2016) Learning from microbial strategies for polysaccharide degradation, Biochemical Society Transactions. 44, 94-108. 213.  Hu, J., Arantes, V. 
& Saddler, J. N. (2011) The enhancement of enzymatic hydrolysis of lignocellulosic substrates by the addition of accessory enzymes such as xylanase: is it an additive or synergistic effect?, Biotechnology for Biofuels. 4. 214.  Harris, P. V., Welner, D., McFarland, K. C., Re, E., Poulsen, J.-C. N., Brown, K., Salbo, R., Ding, H., Vlasenko, E., Merino, S., Xu, F., Cherry, J., Larsen, S. & Lo Leggio, L. (2010) Stimulation of Lignocellulosic Biomass Hydrolysis by Proteins of Glycoside Hydrolase Family 61: Structure and Function of a Large, Enigmatic Family, Biochemistry. 49, 3305-3316. 215.  Banerjee, G., Car, S., Scott-Craig, J. S., Borrusch, M. S., Bongers, M. & Walton, J. D. (2010) Synthetic multi-component enzyme mixtures for deconstruction of lignocellulosic biomass, Bioresource Technology. 101, 9097-9105. 216.  Banerjee, G., Scott-Craig, J. S. & Walton, J. D. (2010) Improving Enzymes for Biomass Conversion: A Basic Research Perspective, Bioenergy Research. 3, 82-92. 217.  Fangel, J. U., Ulvskov, P., Knox, J. P., Mikkelsen, M. D., Harholt, J., Popper, Z. A. & Willats, W. G. T. (2012) Cell wall evolution and diversity, Frontiers in Plant Science. 3. 218.  Hsieh, Y. S. Y. & Harris, P. J. (2009) Xyloglucans of monocotyledons have diverse structures, Molecular Plant. 2, 943-965. 219.  Attia, M., Stepper, J., Davies, G. J. & Brumer, H. (2016) Functional and structural characterization of a potent GH74 endo-xyloglucanase from the soil saprophyte Cellvibrio japonicus unravels the first step of xyloglucan degradation, FEBS J. 283, 1701-1719. 220.  Rahman, O., Cummings, S. P., Harrington, D. J. & Sutcliffe, I. C. (2008) Methods for the bioinformatic identification of bacterial lipoproteins encoded in the genomes of Gram-positive bacteria, World Journal of Microbiology & Biotechnology. 24, 2377-2382. 221.  Eschenfeldt, W. H., Lucy, S., Millard, C. S., Joachimiak, A. & Mark, I. D. 
(2009) A family of LIC vectors for high-throughput cloning and purification of proteins, Methods Mol Biol. 498, 105-15. 222.  Fenger, T. H. & Brumer, H. (2015) Synthesis and Analysis of Specific Covalent Inhibitors of endo-Xyloglucanases, Chembiochem. 16, 575-583. 223.  Waterman, D. G., Winter, G., Gildea, R. J., Parkhurst, J. M., Brewster, A. S., Sauter, N. K. & Evans, G. (2016) Diffraction-geometry refinement in the DIALS framework, Acta Crystallographica Section D, Structural Biology. 72, 558-575. 224.  Evans, P. R. & Murshudov, G. N. (2013) How good are my data and what is the resolution?, Acta Crystallographica Section D. 69, 1204-1214. 225.  McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007) Phaser crystallographic software, Journal of Applied Crystallography. 40, 658-674. 226.  Pearson, W. R. (1999) Flexible Sequence Similarity Searching with the FASTA3 Program Package in Bioinformatics Methods and Protocols (Stephen Misener, S. A. K., ed) pp. 185-219, Humana Press, Totowa, NJ. 227.  Cowtan, K. (2006) The Buccaneer software for automated model building. 1. Tracing protein chains, Acta Crystallographica Section D-Biological Crystallography. 62, 1002-1011. 228.  Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010) Features and development of Coot, Acta Crystallographica Section D-Biological Crystallography. 66, 486-501. 229.  Murshudov, G. N., Skubak, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011) REFMAC5 for the refinement of macromolecular crystal structures, Acta Crystallographica Section D-Biological Crystallography. 67, 355-367. 230.  Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010) MolProbity: all-atom structure validation for macromolecular crystallography, Acta Crystallographica Section D-Biological Crystallography. 66, 12-21. 
231.  Gibson, D. G., Young, L., Chuang, R. Y., Venter, J. C., Hutchison, C. A. & Smith, H. O. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases, Nature Methods. 6, 343-U41. 232.  Raman, B., McKeown, C. K., Rodriguez, M., Brown, S. D. & Mielenz, J. R. (2011) Transcriptomic analysis of Clostridium thermocellum ATCC 27405 cellulose fermentation, Bmc Microbiology. 11. 233.  Riederer, A., Takasuka, T. E., Makino, S., Stevenson, D. M., Bukhman, Y. V., Elsen, N. L. & Fox, B. G. (2011) Global Gene Expression Patterns in Clostridium thermocellum as Determined by Microarray Analysis of Chemostat Cultures on Cellulose or Cellobiose, Applied and Environmental Microbiology. 77, 1243-1253. 234.  Paetzel, M., Karla, A., Strynadka, N. C. J. & Dalbey, R. E. (2002) Signal peptidases, Chemical Reviews. 102, 4549-4579. 235.  Wong, D., Chan, V. J., McCormack, A. A. & Batt, S. B. (2010) A novel xyloglucan-specific endo-beta-1,4-glucanase: biochemical properties and inhibition studies, Applied Microbiology and Biotechnology. 86, 1463-1471. 236.  Barras, F., Bortoligerman, I., Bauzan, M., Rouvier, J., Gey, C., Heyraud, A. & Henrissat, B. (1992) Stereochemistry of the hydrolysis reaction catalyzed by endoglucanase Z from Erwinia chrysanthemi, Febs Letters. 300, 145-148. 237.  Gloster, T. M. & Vocadlo, D. J. (2012) Developing inhibitors of glycan processing enzymes as tools for enabling glycobiology, Nature Chemical Biology. 8, 683-694. 238.  Zhao, Y. X., Chany, C. J., Sims, P. F. G. & Sinnott, M. L. (1997) Definition of the substrate specificity of the 'sensing' xylanase of Streptomyces cyaneus using xylooligosaccharide and cellooligosaccharide glycosides of 3,4-dinitrophenol, Journal of Biotechnology. 57, 181-190. 239.  Raman, B., Pan, C., Hurst, G. B., Rodriguez, M., McKeown, C. K., Lankford, P. K., Samatova, N. F. & Mielenz, J. R. 
(2009) Impact of Pretreated Switchgrass and Biomass Carbohydrates on Clostridium thermocellum ATCC 27405 Cellulosome Composition: A Quantitative Proteomic Analysis, Plos One. 4. 240.  Cartmell, A., McKee, L. S., Pena, M. J., Larsbrink, J., Brumer, H., Kaneko, S., Ichinose, H., Lewis, R. J., Vikso-Nielsen, A., Gilbert, H. J. & Marles-Wright, J. (2011) The structure and function of an arabinan-specific alpha-1,2-arabinofuranosidase identified from screening the activities of bacterial GH43 glycoside hydrolases, J Biol Chem. 286, 15483-95. 241.  Beylot, M. H., McKie, V. A., Voragen, A. G. J., Doeswijk-Voragen, C. H. L. & Gilbert, H. J. (2001) The Pseudomonas cellulosa glycoside hydrolase family 51 arabinofuranosidase exhibits wide substrate specificity, Biochemical Journal. 358, 607-614. 242.  Braithwaite, K. L., Barna, T., Spurway, T. D., Charnock, S. J., Black, G. W., Hughes, N., Lakey, J. H., Virden, R., Hazlewood, G. P., Henrissat, B. & Gilbert, H. J. (1997) Evidence that galactanase A from Pseudomonas fluorescens subspecies cellulosa is a retaining family 53 glycosyl hydrolase in which E161 and E270 are the catalytic residues, Biochemistry. 36, 15489-500. 243.  Halstead, J. R., Fransen, M. P., Eberhart, R. Y., Park, A. J., Gilbert, H. J. & Hazlewood, G. P. (2000) alpha-Galactosidase A from Pseudomonas fluorescens subsp. cellulosa: cloning, high level expression and its role in galactomannan hydrolysis, FEMS microbiology letters. 192, 197-203. 244.  Millward-Sadler, S. J., Davidson, K., Hazlewood, G. P., Black, G. W., Gilbert, H. J. & Clarke, J. H. (1995) Novel cellulose-binding domains, NodB homologues and conserved modular architecture in xylanases from the aerobic soil bacteria Pseudomonas fluorescens subsp. cellulosa and Cellvibrio mixtus, The Biochemical journal. 312 ( Pt 1), 39-48. 245.  Larsbrink, J., Izumi, A., Hemsworth, G. R., Davies, G. J. & Brumer, H. 
(2012) Structural Enzymology of Cellvibrio japonicus Agd31B Protein Reveals alpha-Transglucosylase Activity in Glycoside Hydrolase Family 31, Journal of Biological Chemistry. 287, 43288-43299. 246.  Montanier, C., Money, V. A., Pires, V. M., Flint, J. E., Pinheiro, B. A., Goyal, A., Prates, J. A., Izumi, A., Stalbrand, H., Morland, C., Cartmell, A., Kolenova, K., Topakas, E., Dodson, E. J., Bolam, D. N., Davies, G. J., Fontes, C. M. & Gilbert, H. J. (2009) The active site of a carbohydrate esterase displays divergent catalytic and noncatalytic binding functions, PLoS biology. 7, e71. 247.  Nagy, T., Nurizzo, D., Davies, G. J., Biely, P., Lakey, J. H., Bolam, D. N. & Gilbert, H. J. (2003) The alpha-glucuronidase, GlcA67A, of Cellvibrio japonicus utilizes the carboxylate and methyl groups of aldobiouronic acid as important substrate recognition determinants, The Journal of biological chemistry. 278, 20286-92. 248.  Kellett, L. E., Poole, D. M., Ferreira, L. M., Durrant, A. J., Hazlewood, G. P. & Gilbert, H. J. (1990) Xylanase B and an arabinofuranosidase from Pseudomonas fluorescens subsp. cellulosa contain identical cellulose-binding domains and are encoded by adjacent genes, The Biochemical journal. 272, 369-76. 249.  Geoghegan, K. F., Dixon, H. B. F., Rosner, P. J., Hoth, L. R., Lanzetti, A. J., Borzilleri, K. A., Marr, E. S., Pezzullo, L. H., Martin, L. B., LeMotte, P. K., McColl, A. S., Kamath, A. V. & Stroh, J. G. (1999) Spontaneous alpha-N-6-phosphogluconoylation of a "His tag" in Escherichia coli: The cause of extra mass of 258 or 178 Da in fusion proteins, Analytical Biochemistry. 267, 169-184. 250.  Herve, C., Rogowski, A., Blake, A. W., Marcus, S. E., Gilbert, H. J. & Knox, J. P. (2010) Carbohydrate-binding modules promote the enzymatic deconstruction of intact plant cell walls by targeting and proximity effects, Proceedings of the National Academy of Sciences of the United States of America. 107, 15293-15298. 251.  Davies, G. J. & Williams, S. J. 
(2016) Carbohydrate-active enzymes: sequences, shapes, contortions and cells, Biochemical Society Transactions. 44, 79-87. 252.  Arnal, G., Attia, M. A., Asohan, J. & Brumer, H. (2017) A Low-Volume, Parallel Copper-Bicinchoninic Acid (BCA) Assay for Glycoside Hydrolases, Methods Mol Biol. 1588, 3-14. 253.  Skerra, A. (2000) Engineered protein scaffolds for molecular recognition, Journal of Molecular Recognition. 13, 167-187. 254.  Abbott, D. W. & Boraston, A. B. (2007) The structural basis for exopolygalacturonase activity in a family 28 glycoside hydrolase, Journal of Molecular Biology. 368, 1215-1222. 255.  Kataeva, I. A., Seidel, R. D., Shah, A., West, L. T., Li, X. L. & Ljungdahl, L. G. (2002) The fibronectin type 3-like repeat from the Clostridium thermocellum cellobiohydrolase CbhA promotes hydrolysis of cellulose by modifying its surface, Applied and Environmental Microbiology. 68, 4292-4300. 256.  Kim, D. Y., Han, M. K., Park, D. S., Lee, J. S., Oh, H. W., Shin, D. H., Jeong, T. S., Kim, S. U., Bae, K. S., Son, K. H. & Park, H. Y. (2009) Novel GH10 Xylanase, with a Fibronectin Type 3 Domain, from Cellulosimicrobium sp Strain HY-13, a Bacterium in the Gut of Eisenia fetida, Applied and Environmental Microbiology. 75, 7275-7279. 257.  Najmudin, S., Guerreiro, C., Carvalho, A. L., Prates, J. A. M., Correia, M. A. S., Alves, V. D., Ferreira, L. M. A., Romao, M. J., Gilbert, H. J., Bolam, D. N. & Fontes, C. (2006) Xyloglucan is recognized by carbohydrate-binding modules that interact with beta-glucan chains, Journal of Biological Chemistry. 281, 8815-8828. 258.  Luis, A. S., Venditto, I., Temple, M. J., Rogowski, A., Basle, A., Xue, J., Knox, J. P., Prates, J. A. M., Ferreira, L. M. A., Fontes, C. M. G. A., Najmudin, S. & Gilbert, H. J. 
(2013) Understanding How Noncatalytic Carbohydrate Binding Modules Can Display Specificity for Xyloglucan, Journal of Biological Chemistry. 288, 4799-4809. 259.  Montanier, C. Y., Correia, M. A. S., Flint, J. E., Zhu, Y. P., Basle, A., McKee, L. S., Prates, J. A. M., Polizzi, S. J., Coutinho, P. M., Lewis, R. J., Henrissat, B., Fontes, C. & Gilbert, H. J. (2011) A Novel, Noncatalytic Carbohydrate-binding Module Displays Specificity for Galactose-containing Polysaccharides through Calcium-mediated Oligomerization, Journal of Biological Chemistry. 286, 22499-22509. 260.  Charnock, S. J., Bolam, D. N., Nurizzo, D., Szabo, L., McKie, V. A., Gilbert, H. J. & Davies, G. J. (2002) Promiscuity in ligand-binding: The three-dimensional structure of a Piromyces carbohydrate-binding module, CBM29-2, in complex with cello- and mannohexaose, Proceedings of the National Academy of Sciences of the United States of America. 99, 14077-14082. 261.  Zhao, M. S. & Running, S. W. (2010) Drought-Induced Reduction in Global Terrestrial Net Primary Production from 2000 Through 2009, Science. 329, 940-943. 262.  Chauvaux, S., Souchon, H., Alzari, P. M., Chariot, P. & Beguin, P. (1995) Structural and functional analysis of the metal-binding sites of Clostridium thermocellum endoglucanase CelD, Journal of Biological Chemistry. 270, 9757-9762. 263.  Wang, H. J., Hsiao, Y. Y., Chen, Y. P., Ma, T. Y. & Tseng, C. P. (2016) Polarity Alteration of a Calcium Site Induces a Hydrophobic Interaction Network and Enhances Cel9A Endoglucanase Thermostability, Applied and Environmental Microbiology. 82, 1662-1674. 264.  Eckert, K., Vigouroux, A., Lo Leggio, L. & Morera, S. (2009) Crystal Structures of A. acidocaldarius Endoglucanase Cel9A in Complex with Cello-Oligosaccharides: Strong -1 and -2 Subsites Mimic Cellobiohydrolase Activity, Journal of Molecular Biology. 394, 61-70. 265.  Tamoi, M., Kurotaki, H. & Fukamizo, T. 
(2007) beta-1,4-Glucanase-like protein from the cyanobacterium Synechocystis PCC6803 is a beta-1,3-1,4-glucanase and functions in salt stress tolerance, Biochemical Journal. 405, 139-146. 266.  Bai, Y. G., Wang, J. S., Zhang, Z. F., Shi, P. J., Luo, H. Y., Huang, H. Q., Luo, C. L. & Yao, B. (2010) A novel family 9 beta-1,3(4)-glucanase from thermoacidophilic Alicyclobacillus sp A4 with potential applications in the brewing industry, Applied Microbiology and Biotechnology. 87, 251-259. 267.  Planas, A., Juncosa, M., Cayetano, A. & Querol, E. (1992) Studies on Bacillus licheniformis endo-beta-1,3-1,4-D-glucanase: characterization and kinetic analysis, Applied Microbiology and Biotechnology. 37, 583-589. 268.  Fincher, G. B. (2009) Exploring the evolution of (1,3;1,4)-beta-D-glucans in plant cell walls: comparative genomics can help!, Current Opinion in Plant Biology. 12, 140-147. 269.  Lynd, L. R., Weimer, P. J., van Zyl, W. H. & Pretorius, I. S. (2002) Microbial cellulose utilization: fundamentals and biotechnology, Microbiol Mol Biol Rev. 66, 506-77. 270.  Krajmalnik-Brown, R., Ilhan, Z., Kang, D. & DiBaise, J. K. (2012) Effects of Gut Microbiomes on Nutrient Absorption and Energy Regulation, Nutr Clin Pract. 27, 201-214. 271.  Fernandes, J., Su, W., Rahat-Rozenbloom, S., Wolever, T. M. S. & Comelli, E. M. (2014) Adiposity, gut microbiota and faecal short chain fatty acids are linked in adult humans, Nutrition and Diabetes. 4, e121. 272.  Martens, E. C., Kelly, A. G., Tauzin, A. S. & Brumer, H. (2014) The devil lies in the details: how variations in polysaccharide fine-structure impact the physiology and evolution of gut microbes, J Mol Biol. 426, 3851-65. 273.  Grondin, J. M., Tamura, K., Déjean, G., Abbott, D. W. & Brumer, H. (2017) Polysaccharide Utilization Loci: Fuelling microbial communities, J Bacteriol. 274.  Bugg, T. 
D., Ahmad, M., Hardiman, E. M. & Singh, R. (2011) The emerging role for bacteria in lignin degradation and bioproduct formation, Current Opinion in Biotechnology. 22, 394-400. 275.  Jordan, D. B., Bowman, M. J., Braker, J. D., Dien, B. D., Hector, R. E. & Lee, C. C. (2012) Plant cell walls to ethanol, Biochem J. 424, 241-252. 276.  Bokinsky, G., Peralta-Yahya, P. P., George, A., Holmes, B. M., Steen, E. J., Dietrich, J., Lee, T. S., Tullman-Ercek, D., Voigt, C. A., Simmons, B. A. & Keasling, J. D. (2011) Synthesis of three advanced biofuels from ionic liquid-pretreated switchgrass using engineered Escherichia coli, PNAS. 108, 19949-19954. 277.  Chundawat, S. P. S., Beckham, G. T., Himmel, M. E. & Dale, B. E. (2011) Deconstruction of Lignocellulosic Biomass to Fuels and Chemicals, Annual Review of Chemical and Biomolecular Engineering, Vol 2. 2, 121-145. 278.  Popper, Z. A., Michel, G., Herve, C., Domozych, D. S., Willats, W. G. T., Tuohy, M. G., Kloareg, B. & Stengel, D. B. (2011) Evolution and Diversity of Plant Cell Walls: From Algae to Flowering Plants, Annual Review of Plant Biology, Vol 62. 62, 567-588. 279.  Ellinger, D. & Voigt, C. A. (2014) Callose biosynthesis in arabidopsis with a focus on pathogen response: what we have learned within the last decade, Annals of Botany. 114, 1349-1358. 280.  Forsberg, Z., Nelson, C. E., Dalhus, B., Mekasha, S., Loose, J. S., Crouch, L. I., Rohr, A. K., Gardner, J. G., Eijsink, V. G. & Vaaje-Kolstad, G. (2016) Structural and Functional Analysis of a Lytic Polysaccharide Monooxygenase Important for Efficient Utilization of Chitin in Cellvibrio japonicus, J Biol Chem. 291, 7300-12. 281.  Nelson, C. E., Rogowski, A., Morland, C., Wilhide, J. A., Gilbert, H. J. & Gardner, J. G. (2017) Systems analysis in Cellvibrio japonicus resolves predicted redundancy of beta-glucosidases and determines essential physiological functions, Molecular Microbiology. 104, 294-305. 282.  Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H. 
& (Eds), e. a. (2009) Essentials of Glycobiology, Cold Spring Harbor Laboratory Press, Plainview, NY. 283.  Greffe, L., Bessueille, L., Bulone, V. & Brumer, H. (2005) Synthesis, preliminary characterization, and application of novel surfactants from highly branched xyloglucan oligosaccharides, Glycobiology. 15, 437-445. 284.  Neidhardt, F. C., Bloch, P. L. & Smith, D. F. (1974) Culture medium for enterobacteria, J Bacteriol. 119, 736-47. 285.  Castro, L. D. S., Pedersoli, W. R., Antonio, A. C. C., Steindorff, A. S., Silva-Rocha, R., Martinez-Rossi, N. M., Rossi, A., Brown, N. A., Goldman, G. H., Faa, V. M., Persinoti, G. F. & Silva, R. N. (2014) Comparative metabolism of cellulose, sophorose and glucose in Trichoderma reesei using high-throughput genomic and proteomic analyses, Biotechnology for Biofuels. 7. 286.  Yamane, K., Suzuki, H., Hirotani, M., Ozawa, H. & Nisizawa, K. (1970) Effect of nature and supply of carbon sources on cellulase formation in Pseudomonas fluorescens var. cellulosa, J Biochem. 67, 9-18. 287.  Sugisawa, H. & Edo, H. (1966) Thermal Degradation of Sugars .I. Thermal Polymerization of Glucose, Journal of Food Science. 31, 561-&. 288.  Huang, T. T. & Wages, J. M. (2016) New-to-nature sophorose analog: a potent inducer for gene expression in Trichoderma reesei, Enzyme Microb Technol. 85, 44-50. 289.  Arellano-Reynoso, B., Lapaque, N., Salcedo, S., Briones, G., Ciocchini, A. E., Ugalde, R., Moreno, E., Moriyon, I. & Gorvel, J. P. (2005) Cyclic beta-1,2-glucan is a brucella virulence factor required for intracellular survival, Nature Immunology. 6, 618-625. 290.  Briones, G., De Iannino, N. I., Roset, M., Vigliocco, A., Paul, P. S. & Ugalde, R. A. (2001) Brucella abortus cyclic beta-1,2-glucan mutants have reduced virulence in mice and are defective in intracellular replication in HeLa cells, Infection and Immunity. 69, 4528-4535. 291.  Rigano, L. A., Payette, C., Brouillard, G., Marano, M. R., Abramowicz, L., Torres, P. 
S., Yun, M., Castagnaro, A. P., El Oirdi, M., Dufour, V., Malamud, F., Dow, J. M., Bouarab, K. & Vojnov, A. A. (2007) Bacterial cyclic beta-(1,2)-glucan acts in systemic suppression of plant immune responses, Plant Cell. 19, 2077-2089. 292.  Bull, A. T. & Chesters, C. G. (1966) The biochemistry of laminarin and the nature of laminarinase, Adv Enzymol Relat Areas Mol Biol. 28, 325-64. 293.  Zavaliev, R., Ueki, S., Epel, B. L. & Citovsky, V. (2011) Biology of callose (beta-1,3-glucan) turnover at plasmodesmata, Protoplasma. 248, 117-30. 294.  Zhang, R. & Edgar, K. J. (2014) Properties, chemistry, and applications of the bioactive polysaccharide curdlan, Biomacromolecules. 15, 1079-96. 295.  Kollar, R., Reinhold, B. B., Petrakova, E., Yeh, H. J. C., Ashwell, G., Drgonova, J., Kapteyn, J. C., Klis, F. M. & Cabib, E. (1997) Architecture of the yeast cell wall: beta(1-6)-glucan interconnects mannoprotein, beta(1-3)-glucan, and chitin, Journal of Biological Chemistry. 272, 17762-17775. 296.  Bhatia, Y., Mishra, S. & Bisaria, V. S. (2005) Purification and characterization of recombinant Escherichia coli-expressed Pichia etchellsii beta-glucosidase II with high hydrolytic activity on sophorose, Applied Microbiology and Biotechnology. 66, 527-535. 297.  Uchiyama, T., Yaoi, K. & Miyazaki, K. (2015) Glucose-tolerant beta-glucosidase retrieved from a Kusaya gravy metagenome, Frontiers in Microbiology. 6. 298.  Henrissat, B., Teeri, T. T. & Warren, R. A. J. (1998) A scheme for designating enzymes that hydrolyse the polysaccharides in the cell walls of plants, Febs Letters. 425, 352-354. 299.  Rixon, J. E., Ferreira, L. M., Durrant, A. J., Laurie, J. I., Hazlewood, G. P. & Gilbert, H. J. (1992) Characterization of the gene celD and its encoded product 1,4-beta-D-glucan glucohydrolase D from Pseudomonas fluorescens subsp. cellulosa, Biochem J. 285 ( Pt 3), 947-55. 300.  Blattner, F. R., Plunkett, G., 3rd, Bloch, C. A., Perna, N. 
T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., 223  Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B. & Shao, Y. (1997) The complete genome sequence of Escherichia coli K-12, Science. 277, 1453-62. 301.  Tchieu, J. H., Norris, V., Edwards, J. S. & Saier, M. H. (2001) The complete phosphotransferase system in Escherichia coli, Journal of Molecular Microbiology and Biotechnology. 3, 329-346. 302.  Nelson, C. E., Beri, N. R. & Gardner, J. G. (2016) Custom fabrication of biomass containment devices using 3-D printing enables bacterial growth analyses with complex insoluble substrates, J Microbiol Methods. 130, 136-143. 303.  Kunath, B. J., Bremges, A., Weimann, A., McHardy, A. C. & Pope, P. B. (2017) Metagenomics and CAZyme Discovery, Methods Mol Biol. 1588, 255-277. 304.  Medie, F. M., Davies, G. J., Drancourt, M. & Henrissat, B. (2012) Genome analyses highlight the different biological roles of cellulases, Nature Reviews Microbiology. 10, 227-U. 305.  Mukherjee, S., Seshadri, R., Varghese, N. J., Eloe-Fadrosh, E. A., Meier-Kolthoff, J. P., Goker, M., Coates, R. C., Hadjithomas, M., Pavlopoulos, G. A., Paez-Espino, D., Yoshikuni, Y., Visel, A., Whitman, W. B., Garrity, G. M., Eisen, J. A., Hugenholtz, P., Pati, A., Ivanova, N. N., Woyke, T., Klenk, H. P. & Kyrpides, N. C. (2017) 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life, Nature Biotechnology. 35, 676-+. 306.  Bergthorsson, U., Andersson, D. I. & Roth, J. R. (2007) Ohno's dilemma: Evolution of new genes under continuous selection, Proceedings of the National Academy of Sciences of the United States of America. 104, 17004-17009. 307.  Zhang, X., Rogowski, A., Zhao, L., M.G., H., Avci, U., Knox, J. P. & Gilber, H. J. (2002) Understanding How Complex Molecular Architecture of Mannan-degrading Hydrolysis Contributes to Plant Cell Wall Degradation, Journal of Biological Chemistry. 289, 2002-2012. 
308.  Naas, A. E., Mackenzie, A. K., Mravec, J., Schuckel, J., Willats, W. G., Eijsink, V. G. & Pope, P. B. (2014) Do rumen Bacteroidetes utilize an alternative mechanism for cellulose degradation?, MBio. 5, e01401-14. 309.  Reintjes, G., Arnosti, C., Fuchs, B. M. & Amann, R. (2017) An alternative polysaccharide uptake mechanism of marine bacteria, Isme Journal. 11, 1640-1650. 310.  Xie, Z. Z., Lin, W. T. & Luo, J. F. (2015) Genome sequence of Cellvibrio pealriver PR1, a xylanolytic and agarolytic bacterium isolated from freshwater, Journal of Biotechnology. 214, 57-58. 224  311.  Figurski, D. H. & Helinski, D. R. (1979) Replication of an origin-containing derivative of plasmid RK2 dependent on a plasmid function provided in trans, Proceeding of the National Academy of Sciences. 76, 1684-1652. 312.  Schafer, A., Tauch, A., Jager, W., Kalinowski, J., Theirbach, G. & Puhler, A. (1994) Small mobilization multi-purpose cloning vectors derived from the Escherichia coli plasmids pK18 and pK19: selection of defined deletions in the chromosome of Corynebacterium glutamicum, Gene. 145, 69-73.  
