Open Collections

UBC Theses and Dissertations

Characterization and modification of two carbohydrate-binding modules Boraston, Alisdair Bennett 2000

Characterization and Modification of Two Carbohydrate-Binding Modules

by

ALISDAIR BENNETT BORASTON

B.Sc. in Microbiology and Immunology, The University of British Columbia, 1993

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Department of Microbiology and Immunology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February 2000
© Alisdair Bennett Boraston, 2000

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Microbiology and Immunology
The University of British Columbia
Vancouver, Canada
Date: [illegible]

ABSTRACT

The C-terminal carbohydrate-binding module of xylanase 10A, CBM2a, from Cellulomonas fimi was produced and secreted by the methylotrophic yeast Pichia pastoris. The polypeptide was highly N-glycosylated. MALDI-TOF mass spectrometry combined with protease digests and site-directed mutation located the N-linked glycans to three of five potential N-linked glycosylation sites. The glycans were of the high mannose type, ranging in size from (GlcNAC)2-(Man)8 to (GlcNAC)2-(Man)14. A small proportion of the N-linked glycans had increased masses and negative charge consistent with the presence of phosphate groups. There was also a low level of O-glycosylation on the CBM. Construction of an N-glycosylation negative mutant allowed characterization of the O-glycosylation.
O-linked glycans were composed entirely of mannose in a ratio of one mole of mannose to four moles of protein. The overall distribution of mannose on the O-glycosylated CBM mutant ranged from one to nine mannose residues, with the oligosaccharide sizes ranging from (Man)1 to (Man)4. An extension of the fluorophore-assisted carbohydrate electrophoresis technique allowed the identification of α1-2, α1-3, and α1-6 linkages in the O-linked glycans. MALDI-TOF mass spectrometry mapping isolated the glycosylation to three regions of the polypeptide, each carrying a maximum of four mannose residues.

The glycans inhibit the binding of CBM2a to cellulose. Removal of glycosylation sites by mutation allowed production of CBM2a mutants N-glycosylated at single sites. Glycans on N87 drastically impaired (200-300 fold decrease in Ka) the binding of CBM2a to bacterial microcrystalline cellulose (BMCC). In contrast, glycans on N24 decreased the Ka for BMCC only ten-fold. A CBM2a mutant without N-glycosylation sites had only a 2-3 fold lower binding affinity than CBM2a produced by E. coli. Although N-glycosylation did not affect the thermal or chemical stability of CBM2a significantly, UV resonance enhanced Raman spectroscopy and fluorescence spectroscopy of the glycosylated CBM2a indicated an alteration in the environment of one or more internal tryptophan residues and a change in the hydrogen-bonding pattern of one or more surface tryptophans.

The C-terminal carbohydrate-binding module of xylanase 10A, CBM13, from Streptomyces lividans belongs to the family 13 carbohydrate-binding modules. CBM13 binds to the insoluble polysaccharides xylan, holo-cellulose, pachyman, and lichenan. It also binds soluble xylan, arabino-galactan, and laminarin. The association constant for binding to soluble xylan is ~6 × 10³ per mole of xylan polymer.
Site-directed mutation was used to demonstrate the presence of three functional sites involved in the binding of CBM13. These binding sites are similar in sequence and predicted to be similar in structural organization to the α, β, and γ sites in ricin toxin B-chain (RTB). Fluorescence spectrophotometric titrations were used to quantify the binding of saccharides to CBM13. The binding specificity was very low, being restricted only by the requirement for pyranose sugars. The association constants for binding to small sugars were also low (~1 × 10² M⁻¹ to 1 × 10³ M⁻¹). This is the first bacterial family 13 CBM to be characterized in detail and the first CBM shown to be multivalent.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
CHAPTER 1: INTRODUCTION
1.1 Carbohydrate-binding modules
1.1.1 Historical perspective
1.1.1.1 The early years
1.1.1.2 A new definition
1.1.2 Carbohydrate-binding module families
1.1.3 The "Original" CBMs and their properties
1.1.3.1 Family 1: the fungal CBMs
1.1.3.2 Family 2: the C-terminal CBM from Cellulomonas fimi xylanase 10A
1.1.3.3 Family 3 CBMs
1.1.3.4 Family 4: the N-terminal CBMs from Cellulomonas fimi cellulase 9B
1.1.4 The biotechnological applications of CBMs
1.2 Protein glycosylation
1.2.1 Yeast glycosylation
1.2.2 Glycoprotein-ligand interactions
1.3 Objectives
CHAPTER 2: GLYCOSYLATION OF A RECOMBINANT CARBOHYDRATE-BINDING MODULE SECRETED BY PICHIA PASTORIS
2.1 Summary
2.2 Introduction
2.3 Materials and Methods
2.3.1 Strains and vectors
2.3.2 DNA manipulations
2.3.3 Production and purification of CBM2a and CBM2a mutants
2.3.4 Western immunoblotting
2.3.5 Fluorophore-assisted carbohydrate electrophoresis
2.3.6 Phenol sulfuric acid assay for total carbohydrate
2.3.7 Protease digests
2.3.8 Enzymatic deglycosylation and preparation of glycans for MS
2.3.9 MALDI-TOF mass spectrometry
2.3.10 Protein concentration determination
2.4 Results
2.4.1 Production of CBM2a in Pichia pastoris
2.4.2 N-linked glycan composition and profiling
2.4.3 N-linked glycosylation site mapping
2.4.4 Identification, composition and profiling of O-linked glycans
2.4.5 Mannosylation site mapping
2.5 Discussion
2.5.1 N-linked glycosylation
2.5.2 O-linked glycosylation
2.5.3 Glycosylation and cellulose binding
CHAPTER 3: FUNCTIONAL PROPERTIES OF A GLYCOSYLATED RECOMBINANT CARBOHYDRATE-BINDING MODULE SECRETED BY PICHIA PASTORIS
3.1 Summary
3.2 Introduction
3.3 Materials and Methods
3.3.1 Strains and vectors
3.3.2 Protein production and purification
3.3.3 MALDI-TOF mass spectrometry
3.3.4 Binding assays
3.3.5 Fluorescence spectroscopy
3.3.6 Thermal and chemical melts
3.3.7 UV resonance enhanced Raman spectroscopy
3.3.8 Protein concentration determination
3.4 Results
3.4.1 Qualitative cellulose-binding
3.4.2 Quantitative cellulose binding
3.4.3 Environment of the tryptophans in glycosylated CBM2a
3.4.4 Structural stability of glycosylated CBM2a
3.4.5 N-linked glycans on PpCBM2a.8
3.5 Discussion
3.5.1 Effects of glycosylation on structure
3.5.2 Effects of glycosylation on stability
3.5.3 The effect of glycosylation on binding
3.5.4 O-glycosylation
3.5.5 A structural model for the adsorption of CBM2a to cellulose
3.5.6 Biotechnological improvements
CHAPTER 4: CHARACTERIZATION OF A NOVEL LECTIN-LIKE CARBOHYDRATE-BINDING MODULE FROM STREPTOMYCES LIVIDANS XYLANASE 10A
4.1 Summary
4.2 Introduction
4.3 Materials and Methods
4.3.1 Carbohydrates and polysaccharides
4.3.2 DNA amplification and cloning
4.3.3 Protein purification
4.3.4 Thermal protein melts
4.3.5 Protein concentration determination
4.3.6 Fluorescence analysis of protein-carbohydrate binding
4.3.7 Affinity electrophoresis
4.3.8 MALDI-TOF mass spectrometry
4.3.9 Adsorption assays on insoluble polysaccharides
4.3.10 Sugar analysis
4.4 Results
4.4.1 Amino-acid similarity with Ricin
4.4.2 Production and purification of CBM13
4.4.3 Binding to insoluble polysaccharides
4.4.4 Soluble polysaccharide binding characteristics
4.4.5 Multivalency of CBM13 investigated by site-directed mutagenesis
4.4.6 Stability of CBM13 and CBM13 variants
4.4.7 Qualitative monosaccharide binding
4.4.8 Spectro-fluorometric characterization of carbohydrate binding
4.5 Discussion
4.5.1 Multivalency of CBM13
4.5.2 Polysaccharide binding specificity
4.5.3 Specificity of small sugar binding
4.5.4 Orientations of bound sugars
4.5.5 Structural and evolutionary implications
4.5.5.1 The β-trefoil: a scaffold for carbohydrate recognition
4.5.5.2 Evidence for a carbohydrate recognition motif
4.5.5.3 Microbial family 13 carbohydrate-binding modules: an evolutionary link
CHAPTER 5: CARBOHYDRATE-BINDING MODULES: AN EMERGING PERSPECTIVE ON SUBSTRATE RECOGNITION
5.1 Carbohydrate-binding modules: "The big picture"
5.2 Type A carbohydrate-binding modules
5.3 Type B carbohydrate-binding modules
5.4 Type C carbohydrate-binding modules
5.5 Unifying properties of the CBM types
5.6 Summary
CHAPTER 6: BIBLIOGRAPHY
6.1 Bibliography

LIST OF TABLES

Table 1.1: CBM families
Table 2.1: Polymerase chain reactions
Table 2.2: Oligonucleotide primers used in the PCR generation of CBM2a mutants
Table 2.3: CBM2a mutants
Table 2.4: Masses of N-linked glycans determined by MALDI-TOF mass spectrometry
Table 2.5: MALDI-TOF mass spectrometry data for CBM2a after deglycosylation and digestion with protease
Table 3.1: Affinity of CBM2a mutants for bacterial microcrystalline cellulose
Table 3.2: Potassium iodide quenching values
Table 4.1: Specificity of CBM13 for polysaccharides
Table 4.2: Specificity and affinity of CBM13 for soluble sugars
Table 4.3: Abbreviations to figure 4.12

LIST OF FIGURES

Figure 1.1: Tertiary structure of bm1TrCel7A determined by solution NMR spectroscopy
Figure 1.2: Tertiary structure of bm2aCfXyn10A determined by solution NMR spectroscopy
Figure 1.3: Tertiary structure of bm3aCtCipC determined by X-ray crystallography
Figure 1.4: Tertiary structure of bm4CfCel9B determined by solution NMR spectroscopy
Figure 2.1: Construction of a CBM2a mutagenesis cassette in the cloning vector pZErO1.1
Figure 2.2: Primers used to introduce amino acid substitutions into CBM2a
Figure 2.3: N-glycosylation of CBM2a and its effect on cellulose binding
Figure 2.4: Sugar composition of glycans of CBM2a produced by Pichia pastoris
Figure 2.5: FACE analysis of N-linked glycans on CBM2a produced by Pichia pastoris
Figure 2.6: MALDI-TOF mass spectra of PpCBM2a glycans
Figure 2.7: Potential N-glycosylation sites of CBM2a
Figure 2.8: MALDI-TOF mass spectrometric mapping of N-glycosylation sites on CBM2a produced in Pichia pastoris
Figure 2.9: SDS-PAGE mobility of CBM2a mutants produced by Pichia pastoris and detected by western blotting
Figure 2.10: SDS-PAGE mobility of EndoF1 treated CBM2a mutants purified from Pichia pastoris
Figure 2.11: Mass spectra of CBM2a.5 produced by Pichia pastoris
Figure 2.12: FACE analysis of O-linked glycans on CBM2a.5 produced by Pichia pastoris
Figure 2.13: Mass spectra of O-glycosylated peptides generated by chymotrypsin cleavage of CBM2a.5 produced in Pichia pastoris
Figure 2.14: Map of O-glycosylation in CBM2a.5
Figure 3.1: Qualitative binding analysis of CBM2a mutants produced by Pichia pastoris
Figure 3.2: Binding isotherm of CBM2a produced by E. coli
Figure 3.3: Fluorescence emission scans of E. coli produced and Pichia pastoris produced CBM2a
Figure 3.4: Fluorescence emission scans of potassium iodide quenched CBM2a
Figure 3.5: Potassium iodide quenching of CBM2a
Figure 3.6: UV resonance Raman spectroscopic analysis of EcCBM2a.8
Figure 3.7: Chemical denaturation of CBM2a mutants
Figure 3.8: Thermal denaturation of CBM2a mutants
Figure 3.9: MALDI-TOF mass spectra of PpCBM2a.8 glycans
Figure 3.10: Placement of the N24 and N87 glycosylation sites in a bottom view and end-on view
Figure 3.11: Model of the adsorption of CBM2a to cellulose
Figure 3.12: Schematic representation of the cellulose chain organization in a cellulose microfibril
Figure 4.1: Amino acid alignment of the α, β, and γ repeats of selected family 13 carbohydrate-binding modules
Figure 4.2: Family 13 CBM structures
Figure 4.3: MALDI-TOF mass spectrum of IMAC purified CBM13
Figure 4.4: Insoluble polysaccharide binding characteristics of CBM13
Figure 4.5: Affinity gel electrophoresis of CBM13 on xylan
Figure 4.6: Quantitative affinity electrophoresis of CBM13 binding to xylan and arabino-galactan
Figure 4.7: Quantitative affinity electrophoresis of CBM13 mutants binding to xylan
Figure 4.8: Intrinsic fluorescence properties of native and chemically denatured CBM13
Figure 4.9: Thermal denaturation of CBM13
Figure 4.10: Competition affinity electrophoresis of CBM13 on birchwood xylan in the absence and presence of xylose
Figure 4.11: Fluorescence emission spectra of CBM13 in the presence of 50 mM xylose, 50 mM galactose, and absence of sugar
Figure 4.12: Stern-Volmer analysis of CBM13 in the absence of ligand, presence of 25 mM xylose, and presence of 25 mM galactose
Figure 4.13: Fluorescence titration of CBM13 with xylose
Figure 4.14: Cation and pH dependence of CBM13 binding to xylose
Figure 4.15: Amino acid alignment of the α domains of the family 13 "super-family"
Figure 5.1: Structures of type A carbohydrate-binding modules
Figure 5.2: Structures of type B carbohydrate-binding modules
Figure 5.3: Structures of type C carbohydrate-binding modules

LIST OF ABBREVIATIONS

amu  atomic mass units
An  absorbance at wavelength "n" nm
ANTS  8-aminonaphthalene-1,3,6-trisulfonic acid
AraD  D-arabinose
AraL  L-arabinose
BMCC  bacterial microcrystalline cellulose
BMGY  buffered minimal media with yeast extract
BMM  buffered minimal media
C-  C-terminus of a polypeptide
CBD  cellulose-binding domain
CBM  carbohydrate-binding module
CBP  carbohydrate-binding protein
CMC  carboxymethyl cellulose
ConA  concanavalin A
Da  Dalton
ΔCp  change in heat capacity
ΔG  change in Gibbs free energy
ΔH  change in enthalpy
DMSO  dimethylsulfoxide
DNA  deoxyribonucleic acid
dNTP  deoxyribonucleoside triphosphate
DP  degree of polymerization
ΔS  change in entropy
EDTA  ethylenediaminetetraacetic acid
EHEC  ethyl-hydroxyethyl cellulose
EndoF1  endoglycosidase F1
f  fraction of quencher accessible fluorescence
FACE  fluorophore-assisted carbohydrate electrophoresis
FruD  D-fructose
GalD  D-galactose
GlcD  D-glucose
GlcNAC  N-acetylglucosamine
GlcNACD  D-N-acetylglucosamine
HBAH  p-hydroxybenzoic acid hydrazide
HEC  hydroxyethyl cellulose
HRP  horse-radish peroxidase
IMAC  immobilized metal affinity chromatography
IPTG  isopropyl-β-D-thiogalactoside
Ka  association constant
Kd  dissociation constant
kDa  kiloDalton
Ksv  Stern-Volmer quenching constant
MALDI  matrix-assisted laser desorption/ionization
MALDI-TOF  matrix-assisted laser desorption/ionization time-of-flight
Man  mannose
ManD  D-mannose
MW  molecular weight
N-  N-terminus of a polypeptide
NAPS  UBC Nucleic Acid and Protein Services unit
ND  not determined
PASA  phosphoric acid swollen Avicel™
PCR  polymerase chain reaction
PNGaseF  peptide-N-glycosidase F
PVDF  polyvinylidene difluoride
rpm  revolutions per minute
RTB  ricin toxin B-chain
SDS-PAGE  sodium dodecyl sulfate polyacrylamide gel electrophoresis
TFA  trifluoroacetic acid
Tm  melting temperature
TYP  tryptone yeast-extract phosphate medium
UV  ultra-violet
UVRR  ultra-violet resonance Raman
UVRRS  ultra-violet resonance Raman spectroscopy
WT  wild-type
× g  centrifugal force relative to gravitational force
XylD  D-xylose
Zeo  Zeocin®
Zeoʳ  Zeocin® resistance

ACKNOWLEDGEMENTS

I owe a debt of gratitude to my supervisors, Doug Kilburn and Tony Warren. Their unique way of allowing students their freedom, but with gentle direction, brought me to this stage of my scientific career. Any scientific success I have had throughout this project or will have in the future is in large part due to them. Peter Tomme, my partner in crime some would say, introduced me to the field of carbohydrate-binding modules. His vigor and passion for scientific exploration were an inspiration to me. I was encouraged by him to probe a question as deeply as possible and, perhaps, approach it from more creative angles. This will stay with me regardless of whether or not my career stays on a scientific path. "Team CBD", which has included Brad, Jeff, Peter, and myself, has been a marvellous interaction of people with common goals. The discussions and Friday afternoon experiments have all been fruitful, despite some potentially infamous entries into the "Journal of Negative Results".

The Cellulase Lab has been home for several years. I have been as comfortable here as in my own living room. It is the people, both past and present, who have created this atmosphere. Emily Jr. has my gratitude for a great deal of assistance, but mostly for her friendship. My discussions with Brad have always been useful and, on many occasions, very amusing ("Barqs has bite!"). Jeff "Boom boom" Kormos could always be counted on to see the other side of an argument (and deliver a big serve!).
Dominik, the first to cohabit the south-west corner of NCE 324 with me, first earned my respect for putting up with my rookie questions and being an excellent skier. He subsequently earned my respect for being an excellent scientist who showed me that a failed project is only the beginning of the next triumph! Helen Smith deserves a spot in the Cellulase Lab Hall-of-Fame. For years she kept us organized and amazed us with her generosity. The many others who have come and gone, though too numerous to mention, have been no less instrumental in making Cellulase life great. Keg parties, coffee breaks, and Whistler trips are the good things in life.

I am eternally grateful to my parents for their constant support. The greatest thanks go to my wife, Cathy. She has provided me with love and support (emotional, physical, and financial!) and has exercised great patience during my thesis work. Above all, she has been there in my times of need. As I write this she is incubating the newest member of the Boraston family, to whom this document is dedicated.

Chapter 1

Introduction

1.1 Carbohydrate-binding modules

1.1.1 Historical perspective.

1.1.1.1 The early years.

The structural polysaccharides of plant cell walls make up the bulk of biopolymers on this planet. Numerous organisms have evolved diverse polysaccharolytic enzyme systems to exploit this abundant energy source. Because cellulose, a β-1,4-linked polymer of glucose, is the most abundant component of plant cell walls, it is not surprising that cellulose-hydrolyzing enzymes, cellulases, were the first enzymes of these systems to be studied in depth. The modular organization common to many glycosyl hydrolases was first observed for cellulases from the bacterium Cellulomonas fimi (Warren et al., 1986; Gilkes et al., 1989; Gilkes et al., 1988) and the fungus Trichoderma reesei (Tomme et al., 1988). These enzymes could be separated into large and small polypeptides by limited proteolysis.
The catalytic activities were confined to the larger polypeptides, which had no cellulose-binding activity. The smaller polypeptides bound tightly to cellulose and had no catalytic activity. The latter polypeptides were termed "cellulose-binding domains" or CBDs. Subsequently, the gene fragments encoding the CBD polypeptides have been expressed heterologously and, indeed, they are independently folding units that retain their function in isolation. These beginnings initiated a field of investigation into the use of CBDs as tags for the immobilization of industrially relevant proteins and for the purification of proteins. In turn, this has motivated the discovery and study of new CBDs with the intention of finding examples with novel and desirable binding properties. The number of putative CBDs reached 120 in less than ten years (Tomme et al., 1995) and continues to increase.

1.1.1.2 A new definition.

The definition of a "domain" or a "module" is ambiguous in enzymology. In the field of glycosyl hydrolases the two terms are used interchangeably to refer to a portion of an enzyme, such as the catalytic region or the substrate-binding region. For the purposes of consistency and clarity it has recently been proposed to refer to a contiguous amino acid sequence that has an independent fold as a "module". This is consistent with the frequent reference to the "modular organization" of glycosyl hydrolases (Henrissat et al., 1998). A "domain" will be considered simply as a region of a protein that is not necessarily composed of a contiguous amino acid sequence. Thus, a cellulose-binding domain is also a module. Because the substrate-binding regions are composed of contiguous amino acid sequences that appear to fold independently of the other enzyme components, the term "module" is becoming preferred and will be used throughout this document.
As the number of novel glycosyl hydrolases grows, so does the number of non-catalytic modules found within these enzymes. Many of these modules remain unclassified and unstudied, but of those that have been studied a large number are turning out to be substrate-binding modules. Furthermore, the substrate specificity of these binding modules is not limited to cellulose as it was with the CBDs. Binding modules have been found that bind to xylans (Simpson et al., 1999; Nordberg et al., 1997) and mannans (Stoll, 1998). Some binding modules bind to mono-, di-, and trisaccharide ligands (Boraston, McLean, and Tomme, unpublished). To reflect the expanded specificity of the substrate-binding modules found in glycosyl hydrolases, these modules will be referred to as carbohydrate-binding modules (CBMs) rather than cellulose-binding domains.

1.1.2 Carbohydrate-binding module families.

Before the widespread application and ease of gene cloning, CBMs were discovered by identifying the cellulose-binding polypeptides in a limited proteolytic digest of an intact enzyme (Gilkes et al., 1988; Tomme et al., 1988). Subsequently, advances in molecular biology have made it possible to delete regions of a sequenced gene, produce the polypeptide recombinantly, and assay the resulting polypeptide for binding activity. This is now the most popular method for determining the functional boundaries of a CBM. Many other putative CBMs have been identified by amino acid sequence similarity to proven CBMs. The amino acid sequence alignments of thirteen CBM families have been described previously (Tomme et al., 1995). The CBM family classification has expanded to at least 17 families (Table 1.1) (Boraston, McLean, and Tomme, unpublished). Such a family classification is useful in identifying putative CBMs and predicting their binding properties.
The large number of glycosyl hydrolases and the correspondingly large number of CBMs have resulted in problems with the nomenclature of catalytic and binding modules. For several years no naming convention existed. This led to a list of confusing proprietary glycosyl hydrolase and CBM designations. Recently a system of nomenclature was introduced to systematically name glycosyl hydrolases (Henrissat et al., 1998). A system consistent with this has been proposed for naming CBMs (Boraston et al., 1999). This document will adopt a refined version of this along with a convenient abbreviated nomenclature. The systematic name will be as follows: bm<CBM family number><two letter designation of the source organism><glycosyl hydrolase source enzyme name>. The "bm" prefix identifies the module as a binding module. For example, the family 2a CBM found at the C-terminus of xylanase 10A (Xyn10A) from Cellulomonas fimi, formerly called CBDCex, would be called bm2aCfXyn10A. In the case of tandem repeats from the same family the systematic name will be followed by a number that corresponds to the relative position of the CBM starting from the N-terminus. For example, the N-terminal family 4 CBM of the tandem repeats in cellulase 9B (Cel9B) from C. fimi, formerly called CBDN1, would be called bm4CfCel9B1; the second repeat would be called bm4CfCel9B2; and the tandem would be referred to as bm4CfCel9B1.2.

[Table 1.1: CBM families]

Although this nomenclature is precise when discussing CBMs from several families and sources, it is cumbersome when a single CBM is discussed repeatedly. A simple abbreviated nomenclature will be employed for this situation. It will be as follows: CBM<CBM family number>-<repeat 1>.<repeat 2>. For example, the family 2a CBM from Xyn10A would be called CBM2a. The tandem family 4 CBMs from Cel9B would be called CBM4-1.2 and the individual repeats called CBM4-1 and CBM4-2.

1.1.3 The "Original" CBMs and their properties.

1.1.3.1 Family 1: the fungal CBMs.

Family 1 is the second largest CBM family, containing approximately 45 entries from mainly fungal sources.
They are about 35 amino acids long and can be found at the N- or C-termini of the parent enzymes. The first examples of this family were found in two cellobiohydrolases, Cel7A and Cel6A, from Trichoderma reesei (Tomme et al., 1988). These CBMs have become the subject of intensive study, particularly the CBM from Cel7A, for which a solution structure has been determined. The family 1 CBMs from Cel7A (bm1TrCel7A, formerly called CbhICBD) and Cel6A (bm1TrCel6A, formerly called CbhIICBD) bind to insoluble cellulose and chitin with association constants in the range of 1 × 10⁵ M⁻¹ and bind to cello-hexaose with association constants of ~3 × 10³ M⁻¹. bm1TrCel7A binds to insoluble cellulose reversibly whereas bm1TrCel6A does not (Carrard et al., 1999; Linder et al., 1996). The apparent paradox is that the adsorption data of bm1TrCel6A on cellulose can be described by a Langmuir model, a model that assumes true equilibrium binding. This phenomenon is also observed with the family 2a CBMs. The tertiary structure of bm1TrCel7A was determined by NMR spectroscopy (Mattinen et al., 1998). The polypeptide has a "wedge" shape with three solvent-exposed tyrosines forming a flat face (Figure 1.1). These tyrosine residues are important in the binding of bm1TrCel7A to cellulose, as demonstrated by the reduced binding affinities of alanine mutants of these residues (Reinikainen et al., 1992; Linder et al., 1995b).

1.1.3.2 Family 2: the C-terminal CBM from Cellulomonas fimi xylanase 10A.

Family 2 is the largest CBM family, with most of the entries coming from bacterial sources. This family is split into two subfamilies, family 2a and family 2b (Tomme et al., 1995); each subfamily has distinctly different properties. Family 2a CBMs bind specifically to cellulose. Despite the relatively high sequence identity between family 2a and family 2b, the family 2b CBMs bind specifically to xylan.
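The Langmuir model referred to above can be written out as a short sketch, added here for illustration only. It computes the amount of CBM bound at equilibrium from the free CBM concentration, an association constant, and a binding capacity. The association constant uses the order of magnitude quoted above for the family 1 CBMs on insoluble cellulose; the binding capacity `N_MAX` and all function names are hypothetical and are not taken from the cited studies.

```python
import math


def langmuir_bound(free_conc, ka, n_max):
    """One-site Langmuir isotherm.

    free_conc: free CBM concentration at equilibrium (M)
    ka:        association constant (M^-1)
    n_max:     binding capacity of the cellulose (mol CBM per g); hypothetical here
    Returns the amount of CBM bound (mol per g cellulose).
    """
    return n_max * ka * free_conc / (1.0 + ka * free_conc)


def half_saturation_conc(ka):
    """Free concentration at which half of the sites are occupied (equals 1/Ka)."""
    return 1.0 / ka


KA = 1e5       # M^-1, order of magnitude quoted in the text for family 1 CBMs
N_MAX = 10e-6  # mol/g, assumed value for illustration only

for free in (1e-7, 1e-6, 1e-5, 1e-4):
    bound = langmuir_bound(free, KA, N_MAX)
    print(f"free = {free:.0e} M, bound = {bound:.2e} mol/g")
```

The model predicts half-saturation at a free concentration of 1/Ka, but only if bound and free CBM are in true equilibrium. A good fit to this equation therefore does not by itself show that binding is reversible, which is the paradox noted above.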
The family 2a CBMs were discovered before the family 2b CBMs and were the first CBMs to be extensively characterized. The N-terminal family 2a CBM from Cel6A of C. fimi (bm2aCfCel6A), separated proteolytically from the catalytic module, was the first CBM to have its adsorption to cellulose described quantitatively (Gilkes et al., 1992). However, the C-terminal CBM of xylanase 10A from C. fimi (bm2aCfXyn10A or CBM2a) has become the best-characterized example of the family 2a CBMs. The structure of CBM2a, determined by solution NMR spectroscopy, is described as a nine-stranded β-barrel (Xu et al., 1995). Three highly conserved, solvent-exposed tryptophans form a relatively hydrophobic ridge on the polypeptide (Figure 1.2). Biochemical studies indicate that this flat ridge of aromatics is the binding site of the family 2a binding modules.

Figure 1.1: Tertiary structure of bm1TrCel7A determined by solution NMR spectroscopy. The aromatic residues involved in binding are shown in red. The solvent surface is shown in transparent gray.

Figure 1.2: Tertiary structure of bm2aCfXyn10A determined by solution NMR spectroscopy. The aromatic residues involved in binding are shown in red. The solvent surface is shown in transparent gray.

Mutation of these tryptophans in bm2aCfCel6A and the N-terminal CBM of Xyn10A from Pseudomonas fluorescens (bm2aPfXyn10A) resulted in large decreases in the affinity of these CBMs for insoluble cellulose (Din et al., 1994b; Poole et al., 1993). Furthermore, the chemical shifts of the NMR peaks corresponding to these tryptophan residues in bm2aCfXyn10A and bm2aPfXyn10A changed upon titration with cello-oligosaccharides (Xu et al., 1995; Nagy et al., 1998). The thermodynamics of the adsorption of bm2aCfXyn10A to crystalline cellulose bear out the involvement of a hydrophobic surface on the polypeptide in the binding process (Creagh et al., 1996). The ΔH (enthalpy change) of binding was favorable but very small.
The TΔS (entropic term) of binding was also favorable and large, providing the bulk of the energy to drive binding. The ΔCp (heat capacity change) was large and negative. This was interpreted to result from the dehydration of hydrophobic surfaces upon binding, with the gain in entropy coming from the freed water molecules (Creagh et al., 1996). It seems that the tryptophan ridge provides the hydrophobic surface on the CBM, the dehydration of which upon adsorption to cellulose provides a great deal of the binding energy. Though the precise interaction of these tryptophans with the cellulose remains unknown, it is clear that they play a pivotal role in the adsorption of family 2a CBMs to crystalline cellulose.

CBM2a binds to crystalline cellulose (bacterial microcrystalline cellulose and Avicel), amorphous or regenerated cellulose (phosphoric acid swollen cellulose), and chitin with association constants of approximately 1 × 10⁶ M⁻¹, as do other members of family 2a (Boraston & McLean, unpublished; Bolam et al., 1998). The adsorption of CBM2a to crystalline cellulose is irreversible, despite an apparent Langmuir-type adsorption isotherm (Creagh et al., 1996). This paradox remains unresolved. Biologically, the lack of exchange with solution would mean that the enzyme, Xyn10A, is fixed to one location on the cellulose, a seemingly inefficient arrangement for the complete hydrolysis of the substrate. CBM2a compensates for its irreversible binding by being mobile on the surface of the cellulose, allowing the entire enzyme to be mobile (Jervis et al., 1997). The mechanism of movement of CBM2a on the surface of the cellulose is unknown.

1.1.3.3 Family 3 CBMs.

Cellulosomes are super-assemblies of polysaccharolytic enzymes commonly produced by cellulolytic, anaerobic bacteria (Bayer et al., 1998) and some anaerobic rumen fungi (Blum et al., 1999; Fujino et al., 1998). The scaffoldin acts as the backbone of the cellulosome.
It consists of one or more CBMs and multiple cohesin domains, which are responsible for interacting with dockerin domains on the enzyme components to form the super-assembly. These CBMs are frequently family 3 CBMs. In addition to their prevalence in the scaffoldins, family 3 CBMs can be found as modules in the enzyme components of the cellulosome or as modules in the enzymes of organisms with free cellulase systems. The CBM from cellulose-binding protein A of Clostridium cellulovorans (bm3CcCbpA, formerly called CBDclos) and the CBM from the scaffoldin unit CipC of Clostridium thermocellum (bm3CtCipC, formerly called CipC-CBD) are the two best-characterized members of this family. Both bind to crystalline cellulose, amorphous cellulose, and chitin with association constants of 1 × 10⁶ M⁻¹ to 1 × 10⁷ M⁻¹ (Morag et al., 1995; Goldstein et al., 1993). No binding to soluble cellulose has been reported. The structure of bm3CtCipC, determined by X-ray crystallography, has similar features to the structure of CBM2a (Tormo et al., 1996). The polypeptide has nine β-strands in a jelly roll topology with a bound calcium atom (Figure 1.3). A planar surface of aromatic residues is thought to be the binding site based on the similarity of the binding properties and structural properties to the family 1 CBMs and family 2a CBMs. The role of these residues in binding has not yet been demonstrated.

Figure 1.3: Tertiary structure of bm3CtCipC determined by X-ray crystallography. The aromatic residues involved in binding are shown in red. The solvent surface is shown in transparent gray.

Figure 1.4: Tertiary structure of bm4CfCel9B determined by solution NMR spectroscopy. The aromatic residues involved in binding are shown in red. The solvent surface is shown in transparent gray. The yellow region shows the hydrophobic base of the binding groove.

1.1.3.4 Family 4: the N-terminal CBMs from Cellulomonas fimi cellulase 9B.
The first family 4 CBMs were found as tandem modules at the N-terminus of Cel9B from C. fimi. This family has grown to include ten entries, which may eventually be separated into two subfamilies. The N-terminal module, bm4CfCel9B or CBM4-1, was the first reported CBM to bind specifically to amorphous and soluble cellulose but not to crystalline cellulose (Coutinho et al., 1992; Tomme et al., 1996). CBM4-1 remains unique among published CBMs in its ability to bind cello-oligosaccharides tightly. The tertiary structure of CBM4-1 was solved by solution NMR spectroscopy (Johnson et al., 1996a). Like all of the CBM structures solved to date, the fold of CBM4-1 is entirely β-sheet (Figure 1.4). The most interesting feature of this structure is the presence of a "groove" on one face of the polypeptide (Figure 1.4). Lining this groove is a stretch of hydrophobic amino acids. Polar residues capable of acting as hydrogen bond donor/acceptors flank the groove. This groove was an attractive candidate as the substrate binding site because its width and length could accommodate a cellulose chain five glucose units in length. Indeed, the NMR chemical shifts of residues in and around this groove were very sensitive to the presence of ligand. Mutation shows residues in this groove, particularly two tyrosine residues, to be very important in substrate binding, confirming that the groove is the substrate-binding site (Kormos, 1998).

The conformation of this binding site helps to explain the binding specificity of CBM4-1. The CBMs capable of binding crystalline cellulose all have flat binding sites. It is thought that this flat surface is necessary to be appropriately complementary to the flat surface presented by the cellulose. CBM4-1 contrasts by having a shallow groove as a binding site that is capable of accommodating individual sugar chains. Presumably, the individual cellulose chains in crystalline cellulose cannot gain access to this binding groove.
It was recently found that CBM4-1 has a calcium-binding site (Johnson et al., 1998). This binding site is distant from the sugar-binding site and does not have a role in sugar binding. The bound calcium appears to have a structural role because it improves the thermal stability of the polypeptide.

1.1.4 The biotechnological applications of CBMs.

The low cost of cellulose has made the cellulose-CBM system an attractive prospect for affinity applications. CBMs have been used to immobilize proteins that have been chemically fused to them or produced from gene fusions. Similarly, CBMs have been used as affinity tags to purify the products of CBM gene fusions.

CBM fusions designed to be immobilized have been of three general types: bioprocessing enzymes, capture agents, and growth factors. Bioprocessing enzyme-CBM fusions have varied from model systems, such as phosphatase A (Greenwood et al., 1989) and Agrobacterium β-glucosidase (Ong et al., 1991), to fusions with greater potential industrial applications, such as a CBM-heparinase fusion (Shpigel et al., 1999) and a factor X fusion (Assouline et al., 1993). In all cases, the fusion was active in solution and retained its activity when immobilized on cellulose. CBM-capture agent fusions have included a streptavidin-CBM fusion (Le et al., 1994) and a protein A-CBM fusion (Ramirez et al., 1993). These fusions could be used while immobilized on cellulose to bind the target ligand. Lastly, several CBM-growth factor fusions have been studied, the most notable of which is a stem cell factor-CBM2a fusion that could be used in an immobilized form to stimulate proliferation of a model cell line (Doheny et al., 1999).

CBMs from families 2a and 3a have been used with limited success as purification tags (see Tomme et al., 1998 for a review). Occasionally, fusions employing these CBMs can be eluted from cellulose with distilled water.
However, for the majority of them, harsh conditions, such as extremes of pH or chaotropic agents, are required to desorb the fusions. It has recently been shown that members of family 6 and family 9 can be eluted from cellulose with cellobiose (Winterhalter et al., 1995; Sakka et al., 1998). The ability to elute CBM fusions under these mild conditions will undoubtedly renew interest in the use of CBMs as purification tags.

1.2 Protein glycosylation.

The field of glycobiology, as it pertains to the glycosylation of proteins, is large and diverse. The mechanisms of glycosylation and the properties of the resulting glycoproteins have been studied in prokaryotes, eukaryotic microbes, and higher eukaryotes. The size and composition of the glycans are nearly as diverse as the population of organisms that perform this post-translational modification. The relative simplicity of yeast glycans has made these fungi attractive model organisms to study. Saccharomyces cerevisiae has received the bulk of the attention. However, the emerging popularity of Pichia pastoris as a host for the production of heterologous proteins has created interest in studying protein glycosylation in this organism.

1.2.1 Yeast glycosylation.

Yeast have two general glycosylation pathways. The first occurs in the endoplasmic reticulum and results in the attachment of glycans to asparagine residues, so-called N-linked glycosylation. The second pathway occurs in the Golgi apparatus where glycans can be attached to serine or threonine residues. This is called O-linked glycosylation.

In organisms that N-glycosylate, triantennary manno-oligosaccharides with a chitobiose core and three successive terminal glucose units are transferred to the polypeptide from a lipid-linked intermediate during protein translation (Lehle, 1992; Kukuruzinska et al., 1998). The acceptor sequence on the growing polypeptide is Asn-Xxx-Ser/Thr, where Xxx can be any amino acid except proline.
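The acceptor rule stated above lends itself to a simple pattern search. The following is a minimal sketch in Python, assuming the protein is supplied as a one-letter amino acid string; the function name find_sequons and the example peptide are hypothetical and are not taken from this work. A match marks only a potential site, since the presence of a sequon does not guarantee that it is occupied.

```python
import re

# Asn-Xxx-Ser/Thr sequon, where Xxx is any residue except proline.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (for example "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide for illustration only (not the CBM2a sequence).
example = "ANVTQNPSANGS"
print(find_sequons(example))
```

In the example peptide, the Asn followed by proline is skipped, while the two Asn residues followed by a non-proline residue and then Ser or Thr are reported.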
In most organisms, trimming of the oligosaccharide and the addition of new sugars elaborate this core structure in the Golgi apparatus. In S. cerevisiae the three glucose sugars and one mannose sugar are removed and the remaining glycan is most commonly modified by the addition of one to seven mannose sugars. Often, this will be further modified by the addition of up to 150 mannose residues, giving what has been called "hyper-glycosylation". S. cerevisiae N-linked glycans contain α1-2, α1-3, and α1-6 linkages with the majority of the terminal linkages being α1-3.

The structures of glycans found on glycoproteins produced by the methylotrophic yeast, Pichia pastoris, are slightly different from those found on S. cerevisiae glycoproteins. The main differences are the generally smaller size of the P. pastoris glycans and lack of terminal α1-3 mannose additions (Grinna et al., 1989). Furthermore, esterified phosphate groups have been observed in the N-glycans produced by P. pastoris. Phosphate is found in the glycans of vacuolar proteins but not in the smaller glycans of S. cerevisiae (Miele et al., 1997a). Researchers speculate that the different glycans in P. pastoris reflect an N-glycan processing mechanism that is different from that of S. cerevisiae.

O-glycosylation in mammalian cells is confined to the Golgi apparatus. In yeast cells the process of O-glycosylation begins in the endoplasmic reticulum with the donation of a mannose sugar from a lipid intermediate to a serine or threonine residue on the polypeptide (see Tanner et al., 1987 for a review). There is little information regarding a consensus acceptor sequence for O-linked glycans other than the requirement for a hydroxyamino acid. Up to three additional mannose sugars are transferred from GDP-mannose intermediates to the initial O-linked mannose during transit of the polypeptide through the Golgi apparatus.
The linkages between the first three sugars are α1-2 and the terminal sugar is α1-3 linked. The process of O-glycosylation in P. pastoris is less well studied. The glycans are generally of the same size as those in other yeast systems and are believed to be entirely composed of α1-2 linked mannose (Duman et al., 1998).

1.2.2 Glycoprotein-ligand interactions.

In some protein-ligand systems, glycosylation of proteins appears to provide a level of control over the strength of the interaction. In most cases, the greater the degree of glycosylation of a protein, the lower its affinity for the ligand. Human plasma antithrombin III was fractionated into two species: a highly glycosylated species and a species with little glycosylation (Peterson et al., 1985). The highly glycosylated species had a lower affinity for heparin than the less glycosylated antithrombin III. Deglycosylation of the polypeptide restored the heparin binding activity to that of the less glycosylated form. The same was shown for osteonectin from human and bovine sources binding to collagen (Xie et al., 1995) and for erythropoietin binding to the erythropoietin receptor (Nagao et al., 1993). The physical basis for these observations is unknown; however, it is postulated that the reduction in affinity results from partial occlusion of the ligand-binding site.

The biological relevance of this phenomenon is the potential for an organism to modulate the affinity of a protein-ligand system by controlling the composition, size and placement of glycans. This is currently supposition, but it is based on the knowledge that the glycosylation of certain proteins changes depending on the physiological state of the organism, for example in disease or embryonic development in mammals (Lis et al., 1993).

1.3 Objectives.

The first main objective of this thesis was to further elucidate the structural and functional features of a family 2a CBM by modifying its binding properties.
It was previously observed that glycosylation of a CBM2a fusion expressed in a mammalian host alters its binding properties (Assouline et al., 1993). The effect of glycosylation on a family 2a CBM will be investigated. The second main objective was to contrast the structural and functional properties of the family 2a CBMs with a novel CBM that represents a different class of CBM.

The properties of CBM2a, the C-terminal family 2a CBM from xylanase 10A of C. fimi, were modified through the site-specific control of glycosylation when this polypeptide is produced in the methylotrophic yeast, Pichia pastoris. The goals of this study were:
1. To characterize the glycans and identify the specific sites of glycosylation.
2. To control the sites of glycosylation by mutation and determine the effects of glycosylation on the structure and function of CBM2a.
3. To relate the observed structural and functional effects of glycosylation to what is currently known about the adsorption of CBM2a to crystalline cellulose.

In addition to this, the substrate binding properties of a novel family 13 CBM from xylanase 10A of Streptomyces lividans, CBM13, were investigated. The focus of this study was to:
1. Determine the substrate specificity and binding affinity of CBM13.
2. Investigate the stoichiometry of sugar binding.
3. Relate the functional aspects of CBM13 to probable structural features.

Chapter 2

Glycosylation of a Recombinant Carbohydrate-Binding Module secreted by Pichia pastoris.

2.1 Summary

The C-terminal carbohydrate-binding module of Xylanase 10A, CBM2a, from Cellulomonas fimi was produced and secreted by the methylotrophic yeast Pichia pastoris. The polypeptide was highly N-glycosylated. MALDI-TOF mass spectrometry combined with protease digests and site-directed mutation located the N-linked glycans to three of five potential N-linked glycosylation sites.
The glycans were of the high mannose type ranging in size from (GlcNAc)2-(Man)8 to (GlcNAc)2-(Man)14. A small proportion of the N-linked glycans had increased masses and negative charge consistent with the presence of phosphate groups. There was also a low level of O-glycosylation on the CBM. Construction of an N-glycosylation negative mutant allowed characterization of the O-glycosylation. O-linked glycans were composed entirely of mannose in a ratio of one mole of mannose to four moles of protein. The overall distribution of mannose on the O-glycosylated CBM mutant ranged from one to nine mannose residues with the oligosaccharide sizes ranging from (Man)1 to (Man)4. An extension of the fluorophore-assisted carbohydrate electrophoresis technique allowed the identification of α1-2, α1-3, and α1-6 linkages in the O-linked glycans. MALDI-TOF mass spectrometry mapping isolated the glycosylation to three regions of the polypeptide, with each region having a maximum of four mannose residues attached.

2.2 Introduction

Carbohydrate-binding modules (CBM) are found as discrete folded units within the modular structures of glycosyl hydrolases (Tomme et al., 1995). CBMs have found use as fusion partners with other protein modules for the purposes of immobilization (Greenwood et al., 1992; Tomme et al., 1998). In particular, the family 2a CBM from the family 10 mixed function xylanase/glucanase from Cellulomonas fimi, CBM2a, has been used as a fusion partner with several proteins of potential commercial importance (Assouline et al., 1995; Assouline et al., 1993). Examples of CBM fusions incorporating CBM2a have been successfully produced in P. pastoris (Guarna et al., 1996), which has become an attractive organism for the high-level production of recombinant proteins (Clare et al., 1991; Cregg et al., 1993; Higgins et al., 1998).
However, as with many recombinant proteins produced in eukaryotic hosts, glycosylation of these proteins, particularly the CBM, was an important issue to consider.

As a result of the importance of P. pastoris as a host for the production of recombinant proteins, the post-translational modification performed by this organism has become a matter of practical interest. P. pastoris is capable of both N- and O-linked glycosylation (Grinna et al., 1989; Duman et al., 1998; Miele et al., 1997b). The N-linked glycans are of the high mannose type typical of fungal systems. O-linked glycans may be present on recombinant proteins produced by P. pastoris (Duman et al., 1998), but they are not as well characterized as the N-linked glycans. Yeast cells typically assemble O-linked glycans on hydroxy-amino acids such as serine and threonine. These saccharides are usually short and consist mainly of α1-2 linked mannose (Tanner et al., 1987). Previous studies on O-glycosylation of proteins secreted by P. pastoris have found glycans with broadly similar properties to those found in other yeast systems. What is currently lacking is more detailed linkage information and information on the occupancy of the potential O-glycosylation sites.

This study investigates the site specificity of both the N- and O-linked glycosylation of a recombinant CBM expressed in P. pastoris. The size distribution and composition of both types of glycans are investigated, in addition to a novel method of determining sugar linkages in O-linked glycans.

2.3 Materials and Methods

2.3.1 Strains and vectors.

All subcloning steps were performed using pZErO1.1 (Invitrogen, San Diego, CA). The P. pastoris expression/shuttle vector was pPICZαA (Invitrogen, San Diego, CA). All DNA manipulations were performed in E. coli TOP10F'. Expression clones were obtained by electro-transformation of P. pastoris strain GS115.

2.3.2 DNA manipulations.
Agarose gel electrophoresis, small-scale plasmid isolation, and E. coli transformations were performed as described previously (Sambrook et al., 1989). P. pastoris transformations were performed as recommended by the supplier (Invitrogen, San Diego, CA). Large scale plasmid isolation for sequencing was done using Qiagen Tip-100 columns (Qiagen, Chatsworth, CA). Restriction enzymes were used as recommended by the manufacturers. All PCR products and DNA fragments resulting from restriction digests were purified from agarose gels after electrophoresis using Qiaex II (Qiagen, Chatsworth, CA) according to the manufacturer's protocol. DNA was sequenced using the AmpliTaq dye termination cycle sequencing protocol and an Applied Biosystems Model 377 sequencer by the NAPS Unit, Biotechnology Laboratory, UBC.

The gene fusion encoding the Saccharomyces cerevisiae α-factor leader peptide (from pPICZαA), a hexa-histidine tag, a Factor Xa site, and the synthetic CBM2a gene in pBS (Stratagene, La Jolla, CA) (pBSαFCBM2a, unpublished) was used as a PCR template to introduce an N24Q substitution and a NarI restriction endonuclease site to facilitate later cloning procedures (Figure 2.1). Standard cloning procedures were used to insert the PCR product (Product A, Table 2.1) into pBSαFCBM2a to produce pBSαFCBM2a-1, followed by subcloning of the α-factor-CBM2a region into the pZErO1.1 multiple cloning site to produce pOαFCBM2a-1 (Figure 2.1).

Figure 2.1: Construction of a CBM2a mutagenesis cassette in the cloning vector pZErO1.1. * represents the N24Q substitution introduced by primer [number illegible]. The NarI restriction endonuclease site for cloning was introduced by primer [number illegible].

Table 2.1: Polymerase chain reactions
PCR | Template | Primers | RE Sites | Acceptor Plasmid | Product Plasmid
A | pBSαFCBM2a | [illegible] | HpaI-NotI | pBSαFCBM2a | pBSαFCBM2a-1
B | pBSαFCBM2a-1 | [illegible] | EcoRV-Blunt | pZErO1.1 | pOB
C | pBSαFCBM2a-1 | [illegible] | EcoRV-Blunt | pZErO1.2 | pOC
D | pBSαFCBM2a | [illegible] | EcoRV-Blunt | pZErO1.3 | pOD

[Figure 2.2 map labels: restriction sites HindIII, AatII, HpaI, Psp1406I, BamHI, PstI, NarI, NotI; target residues N24, S31, S75, N87, T105.]

Figure 2.2: Primers used to introduce amino acid substitutions into CBM2a. Primers are shown as horizontal arrows and labeled corresponding to the sequences in Table 2.2. Restriction sites are shown as vertical lines. Codons corresponding to the target amino acids are shown as •; mutated codons are marked with a second symbol [illegible]. The underlined NarI restriction site is not in the native DNA sequence but was introduced early in the cloning procedures using primer 1.

Base pair substitutions resulting in specific amino acid substitutions were introduced with three PCR reactions using primers that overlap suitable restriction sites for inserting the PCR fragment into the original CBM2a gene (Figure 2.2 and Table 2.1). Silent restriction sites were incorporated into the mutagenic primer sequences to enable screening for clones that have the desired mutations (Table 2.2). Purified PCR fragments were ligated blunt ended into EcoRV digested pZErO1.1 to produce the plasmids outlined in Table 2.1. The subcloned PCR fragments in plasmids pOB, pOC, and pOD encoding the altered fragments of the CBM2a gene were subcloned and arranged in pOαFCBM2a-1 using the appropriate restriction enzymes shown in Figure 2.2 to produce the desired combinations of mutations (Table 2.3). Finally, the gene fusions encoding the α-factor leader peptide, hexa-histidine tag, Factor Xa cleavage site, and the altered CBM2a gene were inserted into the P. pastoris shuttle vector pPICZαA via HindIII and NotI restriction sites (Table 2.3).

2.3.3 Production and purification of CBM2a and CBM2a mutants.

Small scale expression in P. pastoris was done at 30 °C in 50 ml culture tubes containing 5 ml of buffered minimal glycerol medium, pH 6.0, containing yeast extract (BMGY) (Invitrogen, 1995).
Cultures were grown to an OD600 of 2.0 and induced by centrifuging the cultures for 5 min at 1500 rpm, removing the supernatant and replacing the medium with buffered minimal medium, pH 6.0, containing 0.5% methanol supplemented with 0.04% L-histidine (BMM) (Invitrogen, 1995). Cultures were incubated 48 hours at 30 °C. Supernatants were harvested by centrifugation at 4 °C for 10 min at 15000 rpm.

Large scale cultures were started by inoculation of 350 ml of BMGY in 2 L baffled flasks with 1 ml of a 5 ml overnight culture grown in BMGY at 30 °C. Cultures were grown to an OD600 of ~2.0 and induced by centrifuging the cultures for 5 min at 1500 rpm in sterile 350 ml centrifuge bottles, and replacing the supernatant with 350 ml of sterile BMM. The resuspended cells were returned to sterile 2 L baffled flasks. Cultures were incubated for 48 hours at 30 °C with additions of methanol to 0.5% every 12 hours. The supernatant was harvested by removal of the cells through centrifugation at 4 °C for 20 minutes at 8000 rpm. The pH of the supernatant was adjusted to 8.0 by the addition of NaOH. Vacuum filtration through a 0.7 µm glass fibre filter was used to remove precipitated material and other particulates.

Table 2.2: Oligonucleotide primers used in the PCR generation of CBM2a variants.
Columns: Primer | AA Substitution Introduced | Cloning RE Site | Screening RE Site | Sequence (5' to 3')
[Row alignment and primer numbers were lost in extraction; column contents are listed in the order recovered.]
AA substitutions introduced: N24Q; S31G; S31G with N24Q; S75N; N87Q; T105A.
Restriction sites (cloning and screening columns, interleaved): NotI; HpaI; HpaI or Psp1406I; BstEII; PstI; NarI; NotI; AatII or HpaI; NarI; BstEII; Bpu1102; Bpu1102; SspI; StyI; ApaI.
Sequences (as extracted): AACGCGGCCGCTTATTAACCAACGGTGCAAGG GGTACCGTTCAGAGAGAAAGCGGTTGGCGCC GCGTTGGTACCG GGTGTTAACCAGTGGAACACCGGTTTCACCGC TC A G G T T A C C G T T A A A GGTGTTAACCAGTGGAACACCGGTTTCACCGC TAACGTTACCGTTAAAAACACGG G C T C 4 GCT CCGGTTGAC GGTTTCACCGCTCAGGTTACCGTTAAAAACAC G G G C T G 4 GCTCCGGTTGAC TCCACCTGCAGO/1 ATA TT ACCGTTCC ACGG GGTTGGCGCCGCGTTGGTACCGGTGTGAGAAC CTTC G AAACCGAACTG AACGCGGCCGCTTATTAACCAACGGTGCAAGG G G C C C'CGTTCAGAGAGAA GACTGGTTCCAATTGACAAG
Underlined lettering indicates the cloning restriction endonuclease site sequence. Bold lettering indicates the screening/silent restriction endonuclease site sequence. Italic lettering denotes substituted bases to introduce the amino acid substitution and/or silent restriction endonuclease sites.

Table 2.3: CBM2a mutants.
Subclone | P. pastoris Plasmid | Amino Acid Substitutions | N-Glycosylation Site (a) | Polypeptide Abbreviation (b)
pOαFCBM2a.4 | pPICαFCBM2a.4 | N24Q S31G N87Q T105A | N73 | CBM2a.4
pOαFCBM2a.5 | pPICαFCBM2a.5 | N24Q S31G S75N N87Q T105A | NONE | CBM2a.5
pOαFCBM2a.6 | pPICαFCBM2a.6 | N24Q S75N N87Q T105A | N29 | CBM2a.6
pOαFCBM2a.7 | pPICαFCBM2a.7 | N24Q S31G S75N N87Q | N103 | CBM2a.7
pOαFCBM2a.8 | pPICαFCBM2a.8 | S31G S75N N87Q T105A | N24 | CBM2a.8
pOαFCBM2a.9 | pPICαFCBM2a.9 | N24Q S31G S75N T105A | N87 | CBM2a.9
pOαFCBM2a | pPICαFCBM2a | None | ALL | CBM2a
(a) Remaining N-glycosylation site.
(b) Abbreviations used for mutants produced in E. coli will be preceded by Ec (e.g. EcCBM2a). Abbreviations used for mutants produced in P. pastoris will be preceded by Pp.

Recombinant CBM2a and mutants were purified by immobilized metal affinity chromatography (IMAC). 40 ml of a 50% slurry (20 ml resin volume) of His-Bind resin (Novagen, Milwaukee, MI) were packed into a 1.5 cm ID column. This was washed with 8 column volumes of distilled water prior to charging the resin with 10 column volumes of 50 mM NiSO4. The column was equilibrated by washing with 10 column volumes of 20 mM Tris-HCl, pH 8.0, with 0.5 M NaCl (binding buffer). The charged resin was removed from the column and incubated with the processed supernatant in batch with stirring at 4 °C for 2 hours. The His-Bind resin, with adsorbed proteins, was recovered by vacuum filtration using a 0.7 µm glass fibre filter.
The resin was washed on the filter with 500 ml of binding buffer. Adsorbed proteins were eluted by washing the resin on the filter with 50 to 100 ml of binding buffer containing 500 mM imidazole. The eluted protein was assessed for purity by SDS-PAGE, then exchanged into 50 mM potassium phosphate, pH 7.0, or 25 mM Tris-HCl, pH 7.4, and concentrated in a stirred ultra-filtration unit (Amicon, Beverly, MA) using a 1K cutoff filter (Filtron, Northborough, MA).

Samples of glycosylated protein were further purified by affinity chromatography on ConA-Sepharose (Pharmacia, Uppsala, Sweden). All of the following operations were performed at a flow rate of 0.5 ml/min. A 1 ml column (0.5 cm diameter), packed bed volume, of ConA-Sepharose resin was washed with 5 column volumes of 500 mM methyl-α-D-mannoside (Sigma) in 20 mM Tris-HCl, pH 7.4 (Buffer B), followed by equilibration with 10 column volumes of 20 mM Tris-HCl, pH 7.4, containing 2 mM CaCl2 and 2 mM MgCl2 (Buffer A). 5 ml of glycosylated CBM2a at 4 mg/ml were loaded onto the column. The column was washed with 25 column volumes of Buffer A followed by elution with 20 column volumes of Buffer B with collection of 2 ml fractions. The fractions were assayed for protein by SDS-PAGE.

2.3.4 Western immunoblotting.

Proteins separated by electrophoresis through 12% or 16% polyacrylamide gels were electroblotted to polyvinylidene difluoride (PVDF) (Immobilon™, Millipore, Bedford, MA) membranes. Recombinant CBMs were detected using rabbit polyclonal anti-CBM2a antibodies at a dilution of 1/10000. Goat anti-rabbit IgG antibodies conjugated to horseradish peroxidase were used as secondary antibodies at a dilution of 1/10000. Glycoproteins were detected using concanavalin A (ConA) conjugated to horseradish peroxidase (Seikagaku, Tokyo, Japan) at a dilution of 1/2000. Antibodies or ConA-HRP were diluted in 10 ml of PBS containing 0.5% BSA and 0.05% Tween-20.
These solutions were incubated with the PVDF blot for 1 hour at room temperature. Blots were washed three times with 75 ml of PBS containing 0.05% Tween-20 after probing. Blots were developed using a chemi-luminescent horseradish peroxidase detection kit from Amersham.

2.3.5 Fluorophore-assisted carbohydrate electrophoresis.

Monosaccharide composition analysis and glycan size profiling were done with kits purchased from Glyko Inc. (Novato, CA). Monosaccharide composition analysis was performed using 50-200 µg of protein. N- and O-linked glycan profiling was done using 50-200 µg of protein. N-linked glycans were released from the protein using peptide N-glycosidase F (PNGaseF) (Boehringer Mannheim, Laval, Quebec). O-linked glycans were released by hydrazinolysis. All reactions and electrophoretic separations were performed according to the protocols supplied by the manufacturer (Glyko, 1995).

2.3.6 Phenol sulfuric acid assay for total carbohydrate.

The phenol sulfuric acid assay for quantification of sugars was performed as described previously (Chaplin, 1986). Solutions of mannose from 0-100 µM were used to generate a standard curve. The concentration of glycosylated protein used in the assay ranged from 3-40 µM.

2.3.7 Protease digests.

200 µl of purified protein (100-200 µM) was desalted and exchanged into distilled water by overnight drop dialysis using VS 0.025 µm filters (Millipore, Bedford, MA). The concentration of the exchanged protein was measured by absorbance at 280 nm. 3 nmoles of this protein were dried in a vacuum concentrator followed by dissolution in 5 µl 8 M urea containing 10 mM DTT. This was incubated for 30 minutes at 50 °C. 145 µl of 50 mM HCO3 was added slowly to the sample followed by the addition of 1 µl of chymotrypsin (1 mg/ml) or 1 µl of chymotrypsin and 1 µl of trypsin (both at 1 mg/ml). The protease digests were incubated at 37 °C for 4 hours.
Samples of the reactions were diluted 1 to 4 in 70% acetonitrile/0.1% trifluoroacetic acid (TFA) prior to preparation for mass spectrometry.

2.3.8 Enzymatic deglycosylation and preparation of glycans for MS.

All deglycosylation reactions were performed with recombinant endoglycosidase F1 (EndoF1) purchased from Boehringer Mannheim (Laval, Quebec). Deglycosylation reactions for SDS-PAGE were performed in 20 µl volumes containing 100 mM potassium phosphate buffer, pH 6.0, ~10 µg of protein, and 0.05 U EndoF1. Reactions were incubated for 2 hours at 37 °C. Reactions for the preparation of glycans for mass spectrometry were performed in 20-100 µl volumes containing 25 mM ammonium acetate buffer, pH 6.0, ~3 nmoles protein, and 0.5-1 U of EndoF1. Reactions were incubated for 4 hours at 37 °C. 2 volumes of 95% ethanol chilled to -20 °C were added and the reactions held at -20 °C for 30 minutes. Precipitated material was pelleted by centrifugation at 14000 rpm for 10 minutes at 4 °C. The supernatant fraction containing the released glycans was removed and evaporated to dryness in a vacuum concentrator. The dried material was dissolved in 5 µl of distilled water and samples were diluted 1 to 1, 1 to 4, and 1 to 9 in 70% acetonitrile/0.1% TFA prior to preparation for mass spectrometry.

2.3.9 MALDI-TOF mass spectrometry.

All matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) spectra were collected using a SELDI-MassPhoresis mass spectrometer (Ciphergen, Palo Alto, CA). Samples of intact CBM were prepared for mass spectrometry by diluting 2 µl of a 10 pmole/µl solution of CBM with an equal volume of 70% acetonitrile/0.1% TFA saturated with sinapinic acid (Sigma). Bovine superoxide dismutase (12230.9 daltons) prepared in the same matrix was used to calibrate the mass spectrometer. Samples of protease digests were prepared by diluting 2 µl of the pre-diluted reaction with an equal volume of 70% acetonitrile/0.1% TFA saturated with cinnamic acid (Sigma).
Glycan samples were prepared by diluting 2 µl of the pre-diluted glycans with an equal volume of 10 mg/ml dihydroxy benzoic acid (Sigma) in 70% acetonitrile/0.1% TFA. Angiotensin (1296.5 daltons) and fibrinopeptide B (1570.6 daltons) were used as external calibrants for the protease digested samples and the glycan samples. The same matrix was used for the calibrants as for the experimental samples. 1 µl of sample was spotted onto the MALDI target and allowed to dry at room temperature under atmospheric pressure. Spectra were acquired and analyzed using the SELDI-MassPhoresis software supplied with the mass spectrometer.

2.3.10 Protein concentration determination.

The concentration of purified CBM2a was determined from A280nm using a calculated molar extinction coefficient (Mach et al., 1992) of 27625 M⁻¹ cm⁻¹.

2.4 Results

2.4.1 Production of CBM2a in Pichia pastoris.

Expression of the CBM2a construct with the Saccharomyces cerevisiae α-factor leader sequence led to the secretion of the polypeptide into the culture supernatant. The secreted polypeptide was a mixture of four species, the smallest being similar in size to CBM2a produced by E. coli; it was not bound by the mannose specific lectin concanavalin A (ConA) (Figure 2.3). The three species of higher molecular weight were bound by ConA (Figure 2.3). The heterogeneity of mass and reactivity with ConA suggested glycosylation of the polypeptide. Treatment of the cultures at induction with tunicamycin, an inhibitor of N-linked glycosylation (Elbein, 1984), resulted in the secretion of a homogeneous population of polypeptide having the expected molecular weight of non-glycosylated CBM2a (12 kDa), confirming the presence of N-linked glycans (Figure 2.3). ConA did not bind this polypeptide. The glycosylated CBM2a bound poorly to cellulose; however, when produced in the presence of tunicamycin the polypeptide bound tightly to cellulose (Figure 2.3).
It thus appeared that N-glycosylation of CBM2a reduced its affinity for cellulose (see Chapter 3).

Figure 2.3: N-glycosylation of CBM2a and its effect on cellulose binding. Duplicate cultures of P. pastoris containing the CBM2a gene were grown 8 hours at 30 °C then induced with 0.5% methanol for 24 hours. At induction one of the duplicate cultures was treated with 30 µg/ml tunicamycin (Lanes 5-7 in both panels). Samples were run in an SDS-PA gel and blotted. The blot was probed first with anti-CBM2a antibodies (Panel A) then stripped and reprobed with ConA conjugated to HRP (Panel B). Lane 1: E. coli produced CBM2a. Lanes 2 and 5: P. pastoris culture supernatant. Lanes 3 and 6: P. pastoris culture supernatant bound to Avicel then desorbed by boiling with SDS-PAGE loading buffer. Lanes 4 and 7: P. pastoris culture supernatant pre-incubated with Avicel to remove cellulose binding proteins. Lanes labeled "T" are samples from tunicamycin treated cultures; samples labeled "U" are from untreated cultures.

2.4.2 N-linked glycan composition and profiling.
Mannose, glucose and N-acetylglucosamine were present in the monosaccharide analysis of glycosylated CBM2a (Figure 2.4). Glucose also appeared to be present in the monosaccharide analysis of an E. coli produced protein (i.e. non-glycosylated) and in the distilled water that was used in the reactions. It was concluded that glucose was a contaminant rather than a component of the glycans on CBM2a. RNaseB contains high-mannose N-linked glycans ranging in size from (GlcNAC)2-(Man)5 to (GlcNAC)2-(Man)9 (Fu et al., 1994). Comparison of the glycan profiles of CBM2a and RNaseB showed that the glycans on CBM2a were larger than those on RNaseB, presumably (GlcNAC)2-(Man)9 to (GlcNAC)2-(Man)14 (Figure 2.5). To confirm the size of neutral glycans present on CBM2a, positive ion MALDI-TOF spectra of glycans released by endoglycosidase F1 were obtained (Figure 2.6).
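Expected masses for potassium adducts of the released glycans can be estimated from residue masses. The following is a minimal sketch, not the calculation used in this work: it assumes average residue masses (standard values, not taken from the thesis), a single GlcNAC on each released glycan because endoglycosidase F1 cleaves within the chitobiose core, and a hypothetical function name `glycan_k_adduct_mass`.

```python
# Average masses in daltons. These are standard values assumed for this
# sketch; they are not taken from the thesis.
HEXOSE = 162.1406     # mannose residue (C6H10O5)
HEXNAC = 203.1925     # GlcNAc residue (C8H13NO5)
WATER = 18.0153       # added once for the free reducing end
POTASSIUM = 39.0983   # K+ adduct (electron mass neglected)


def glycan_k_adduct_mass(n_man, n_glcnac=1):
    """Average [M+K]+ mass of a free glycan with n_man mannose residues."""
    neutral = n_man * HEXOSE + n_glcnac * HEXNAC + WATER
    return neutral + POTASSIUM


# Print the series considered in the text: 8 to 14 mannose residues.
for n_man in range(8, 15):
    print(f"(GlcNAc)1-(Man){n_man}: {glycan_k_adduct_mass(n_man):.1f} Da")
```

Successive members of the series differ by one hexose residue, so a ladder of peaks spaced by about 162 Da is the expected signature of a high-mannose population.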
The masses of the glycans corresponded well with the masses of K+ adducts of glycans ranging in size from (GlcNAC)1-(Man)8 to (GlcNAC)1-(Man)14 (Table 2.4), sizes consistent with those estimated from FACE analysis. The FACE analysis also showed two bands of greater mobility than (GlcNAC)2-(Man)9 which did not correspond to any bands in the RNaseB standard. It was shown previously that the N-linked glycans produced by P. pastoris can contain esterified phosphate groups resulting in acidic glycans. FACE profiling depends upon the addition of a negatively charged fluorophore to allow the migration of neutral glycans when an electric current is applied. The presence of acidic groups would result in glycans with a higher charge to mass ratio, giving anomalous bands of greater migration. Glycans with increased masses consistent with the presence of phosphate groups were identified by negative ion MALDI-TOF of glycans released with endoglycosidase F1 (Figure 2.6 and Table 2.4).

Figure 2.4: Sugar composition of glycans of CBM2a produced by P. pastoris. Panel A: composition of non-amine sugars. Lane 1, standards; Lane 2, PpCBM2a; Lanes 3 and 4, PpCBM2a.5. Panel B: composition including amine sugars. Lane 1, standards; Lanes 2 and 3, PpCBM2a. Panel C: negative controls. Lane 1, standards; Lane 2, E. coli protein; Lane 3, distilled water. The standards are labeled as follows: a) N-acetyl-galactosamine; b) mannose; c) fucose; d) glucose; e) galactose; f) N-acetyl-glucosamine.

Figure 2.5: FACE analysis of N-linked glycans on CBM2a produced by P. pastoris. N-linked glycans were released from CBM2a or RNaseB with endoglycosidase F1 and labeled with ANTS by reductive amination. Glycans were electrophoresed through high percentage acrylamide gels and visualized under UV light (362 nm). Lane 1: gluco-oligosaccharide standard prepared by partially acid hydrolyzing wheat starch.
Labels indicate oligosaccharides of 3 glucose units (G3) to 12 glucose units (G12). Lane 2: PpCBM2a N-linked glycans. Lane 3: RNaseB N-linked glycans.

Figure 2.6: MALDI-TOF mass spectra of PpCBM2a glycans. CBM2a produced by P. pastoris was deglycosylated by treatment with endoglycosidase F1 for 8 hours at 37 °C. The glycans were partially purified by removing the remaining protein with an ethanol precipitation. The glycans remaining in solution were analyzed by MALDI-TOF mass spectrometry directly or after concentration by vacuum drying. Mass spectra were obtained in positive ion mode (solid line) or negative ion mode (dotted line). The masses of the labeled peaks are given in Table 2.4.

[Table 2.4, listing the masses of the labeled peaks in Figure 2.6, is not reproduced here.]

2.4.3 N-linked glycosylation site mapping.
Endoglycosidase F1 from Chryseobacterium meningosepticum cleaves N-linked glycans at the β1-4 linkage of the chitobiose core, leaving a single GlcNAC attached to the protein (Plummer et al., 1991; Tarentino et al., 1992). This property was exploited by using the remaining peptide-linked GlcNAC as a marker of N-linked glycosylation. The amino acid sequence of CBM2a has five Asn-Xxx-Ser/Thr potential acceptors of N-linked glycosylation that are separated by chymotrypsin or trypsin cleavage sites (Figure 2.7). Peptides from EndoF1 deglycosylated CBM2a and non-glycosylated CBM2a were obtained by treatment of denatured CBM2a with either chymotrypsin or a mixture of chymotrypsin and trypsin. Analysis of these mixtures by MALDI-TOF showed that peptides containing the N24 (Peak D), N73 (Peak E), and N87 residues (Peaks B and C), all potential sites of N-glycosylation, were increased by a mass consistent with the presence of a GlcNAC residue (Figure 2.8 and Table 2.5). Furthermore, chymotrypsin cleaved completely at W72 of non-glycosylated CBM2a (resulting in peak X) but not at W72 of glycosylated CBM2a. Presumably the presence of a GlcNAC at N73 inhibits cleavage at this site, a further indication of glycosylation on N73. A peak corresponding to the expected mass of peptide G85-F100 (Peaks A and C) increased by 162 daltons, the mass of a single hexose sugar, was also present.
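The assignments above rest on two fixed mass increments: the GlcNAC left on an N-linked site after EndoF1 treatment, and the 162 dalton step of a hexose. The sketch below shows how an observed peptide mass might be compared with its expected mass. It is illustrative only: the GlcNAC residue mass is a standard average value rather than a figure from this thesis, the 1.0 Da tolerance and the function name `classify_shift` are hypothetical, and the example peptide mass (1563.9 Da for G85-F100) is the expected value quoted later in this chapter.

```python
# Average residue masses in daltons. The 162 Da hexose step is stated in the
# text; the GlcNAc residue value is a standard figure assumed for this sketch.
HEXOSE = 162.14
GLCNAC = 203.19


def classify_shift(expected_mass, observed_mass, tol=1.0):
    """Label the difference between an observed and an expected peptide mass.

    tol is a hypothetical matching tolerance in daltons.
    """
    delta = observed_mass - expected_mass
    if abs(delta) <= tol:
        return "unmodified"
    if abs(delta - GLCNAC) <= tol:
        return "GlcNAc (marker of an N-linked site)"
    n_hex = round(delta / HEXOSE)
    if n_hex >= 1 and abs(delta - n_hex * HEXOSE) <= tol:
        return f"{n_hex} hexose (consistent with O-mannosylation)"
    return "unassigned"


# Expected mass of peptide G85-F100 as quoted in the text.
g85_f100 = 1563.9
print(classify_shift(g85_f100, g85_f100 + 203.2))
print(classify_shift(g85_f100, g85_f100 + 162.1))
```

Checking the GlcNAC increment before the hexose ladder keeps the two modifications from being confused, since 203 Da is not close to any multiple of 162 Da within the tolerance used here.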
Fungal O-glycosylation consists only of mannose linked directly to the polypeptide (i.e. no chitobiose core as in N-glycosylation) (Tanner et al., 1987; Lehle, 1992). The observation of a 162 dalton sugar linked directly to the polypeptide is consistent with O-linked glycosylation. The function of the five potential N-linked glycosylation sites as acceptors was also investigated by systematic removal of the glycosylation sites through site-directed mutation of key amino acids in the potential glycosylation site. These mutations and their designations are outlined in Table 2.3. All of the CBM2a mutants were secreted by P. pastoris. Both N24 and N87 could be glycosylated, as indicated by the increased molecular mass of CBM2a.8 and CBM2a.9 (Figure 2.9). Treatment of purified glycosylated CBM2a, CBM2a.8 and CBM2a.9 with EndoF1 decreased their masses to that of non-glycosylated CBM2a (Figure 2.10). The CBM2a mutant having the N73 glycosylation site intact was not glycosylated, which contradicted the results of the MALDI-TOF mapping.

Figure 2.7: Potential N-glycosylation sites of CBM2a. Panel A: the primary sequence of CBM2a. Boxed sequences indicate potential N-glycosylation sites. T indicates chymotrypsin cleavage sites. ^ denotes trypsin cleavage sites. The numbering is in accordance with the co-ordinate file of the structure of CBM2a. Panel B: placement of the potential glycosylation sites in the three dimensional structure of CBM2a. The gold tryptophan residues show the putative hydrophobic binding ridge. Asparagines, which may act as N-linked glycosylation sites, are shown in green with labels indicating their position in the primary amino acid sequence.

Figure 2.8: MALDI-TOF mass spectrometric mapping of N-glycosylation sites on CBM2a produced by P. pastoris. Samples of CBM2a purified from E. coli cultures (solid lines) or CBM2a purified from P. pastoris cultures (dotted line) were treated with endoglycosidase F1 followed by urea denaturation and protease treatment. Panel A: mass spectra of chymotrypsin digested samples. Panel B: mass spectra of samples digested with both trypsin and chymotrypsin. Both panels span m/z 700 to 2000 (amu). Peak labels are described in Table 2.5 and text.

[Table 2.5, listing the peptide peaks labeled in Figure 2.8, is not reproduced here.]

2.4.4 Identification, composition and profiling of O-linked glycans.
The MALDI-TOF mapping of the N-glycosylation sites of CBM2a suggested the presence of O-linked sugars. To confirm this, the mutant of CBM2a lacking all N-glycosylation sites, CBM2a.5, was studied further. The MALDI-TOF spectra of CBM2a.5 purified by IMAC alone, and by IMAC followed by chromatography on ConA-Sepharose, both showed multiple components, each differing by the mass of a hexose sugar (Figure 2.11). IMAC purified CBM2a.5 had a detectable maximum of five protein-linked sugars. The non-glycosylated form predominated, with glycoforms having increasing numbers of sugars decreasing in prevalence. CBM2a.5 purified by ConA chromatography had a maximum of nine sugars. Apart from glucose contamination, mannose was the only sugar found in the FACE monosaccharide analysis of CBM2a.5 (Figure 2.4). No amino sugars could be detected (data not shown). Therefore, the O-linked glycans comprised only mannose. A ratio of 0.25 moles of mannose to 1 mole of CBM2a.5 was determined by a phenol sulfuric acid assay for total sugar. This suggested that 25% of the protein was mannosylated; however, considering that many of the mannosylated polypeptides carry more than 1 mannose, the proportion of mannosylated CBM2a is probably significantly lower.
An estimate of 10% is more reasonable, assuming an average of >2 mannose residues per glycosylated peptide.

Figure 2.9: SDS-PAGE mobility of CBM2a mutants produced by P. pastoris and detected by western blotting. Cultures of P. pastoris containing the mutated CBM2a genes were grown 8 hours at 30 °C then induced with 0.5% methanol for 24 hours. Samples of cleared supernatant were analyzed by western blotting with anti-CBM2a antibodies. Lane 1: EcCBM2a, Lane 2: PpCBM2a.4, Lane 3: PpCBM2a.5, Lane 4: PpCBM2a.6, Lane 5: PpCBM2a.7, Lane 6: PpCBM2a.8, Lane 7: PpCBM2a.9, Lane 8: PpCBM2a (see Table 2.3 for abbreviations). Arrow A indicates masses corresponding to CBM2a produced in E. coli. Arrows B and C indicate glycoforms of higher molecular weight.

Figure 2.10: SDS-PAGE mobility of EndoF1 treated CBM2a mutants purified from P. pastoris. Lane 1: molecular weight standards. Lane 2: EcCBM2a. 5 µg of purified P. pastoris produced CBM2a mutants were incubated for 1 hour at 37 °C in 100 mM potassium phosphate buffer (pH 6.0) without endoglycosidase F1 (Lanes 3, 4, and 5) or with endoglycosidase F1 (Lanes 6, 7, and 8) in a 20 µl volume. The entire mixture was electrophoresed through a 16% SDS-PA gel under reducing conditions. Lanes 3 and 6: PpCBM2a, Lanes 4 and 7: PpCBM2a.8, Lanes 5 and 8: PpCBM2a.9 (see Table 2.3 for abbreviations). Arrow A indicates masses corresponding to CBM2a produced by E. coli. Arrows B and C indicate glycoforms of higher molecular weight.

Figure 2.11: Mass spectra of CBM2a.5 produced by P. pastoris. Panel A: IMAC purified. The masses of unglycosylated CBM2a.5 (12408.7 Da) and the masses of glycoforms are shown.
Panel B: IMAC purified followed by purification by ConA-Sepharose affinity chromatography.

The sizes of glycans released from CBM2a.5 by hydrazinolysis and analyzed by FACE were distributed over the range covered by the Glc1 to Glc4 standards (Figure 2.12). The mobilities of the glycan bands relative to malto-oligosaccharide standards were compared to the mobilities determined for α1-2, α1-3, α1-4, and α1-6 linked mannose standards (Figure 2.12). From this it was determined that single protein-linked mannose sugars predominate, with α1-2, α1-3, and α1-6 linked mannose also present. The linkages of the larger glycans could not be identified by this method, but they are likely trisaccharides and tetrasaccharides having mixed linkages of mannose.

2.4.5 Mannosylation site mapping.
Peptide mixtures of ConA purified CBM2a.5 were obtained by treatment of denatured polypeptide with either chymotrypsin or a mixture of chymotrypsin and trypsin. The peptide mixtures were analyzed by MALDI-TOF mass spectrometry in order to localize the mannosylation to regions of the CBM. Three peptides were detected with multiple accompanying peaks that were increased in mass by steps of 162 daltons (Figure 2.13). The peptides corresponding to the chymotrypsin digest products S55-W72 (expected 1848.5 daltons) and G85-F100 (expected 1563.9 daltons) had up to four attached hexose sugars. In contrast, only one glycosylated form of the chymotrypsin digest product N73-F84 (expected 1146.2 daltons), with a single mannose residue, could be detected. The placement of these glycosylation sites in the primary amino acid sequence of CBM2a is shown in Figure 2.14. The predominance of hydroxyamino acids in these peptides strongly suggests that this mannosylation is O-linked. Furthermore, O-linkage of mannosyl groups has been demonstrated previously (Duman et al., 1998).

Figure 2.12: FACE analysis of O-linked glycans on CBM2a.5 produced by P. pastoris. Panel A: PpCBM2a.5 glycans.
Lane 1: O-linked glycans; Lane 2: gluco-oligosaccharide standards. Panel B: mannose standards. Lane 1: α1-3,6 mannotriose; Lane 2: gluco-oligosaccharide standards; Lane 3: α1-6 mannobiose; Lane 4: α1-4 mannobiose; Lane 5: α1-3 mannobiose; Lane 6: α1-2 mannobiose. Panel C: PpCBM2a.5 glycans (lower concentration). Lane 1: O-linked glycans; Lane 2: gluco-oligosaccharide standards. Labels indicate gluco-oligosaccharides of 1 glucose unit (G1) to 6 glucose units (G6) produced by partial acid hydrolysis of wheat starch.

Figure 2.13: Mass spectra of O-glycosylated peptides generated by chymotrypsin cleavage of CBM2a.5. Panels A, B, and C correspond to the 1146.2, 1563.9, and 1848.5 Da peptides, respectively, shown in the sequence of CBM2a.5 (Figure 2.14). The horizontal axis of each panel is m/z (amu). The peaks are labeled with the measured masses and number of added sugar residues. The spectrum of the glycosylated 1379.5 Da peptide generated by combined trypsin and chymotrypsin cleavage is not shown, as it is a subfragment of the 1848.5 Da fragment.

Figure 2.14: Map of O-glycosylation in CBM2a.5 produced by P. pastoris. Grey/black text indicates fragments generated by chymotrypsin cleavage. Non-underlined/underlined text denotes fragments generated by trypsin cleavage. Bracketed fragments containing O-glycosylation sites are identified by the given masses. The residues acting as potential acceptors of O-glycosylation are shown in bold/italic text.
The sequence contains the N24Q, S31G, S75N, N87Q, and T105A mutations that removed all of the potential N-glycosylation sites. The residue treated as "1" in numbering the primary amino acid sequence according to the number in the co-ordinate file is labeled with 1 above it.

2.5 Discussion

2.5.1 N-linked glycosylation.
The size distribution, sugar content, and phosphate content of the N-glycans on CBM2a are nearly identical to those previously found on other N-glycosylated proteins produced by P. pastoris (Miele et al., 1997a; Miele et al., 1997b; Grinna et al., 1989). The consistency of the N-glycosylation performed by P. pastoris distinguishes it from S. cerevisiae. In general, S. cerevisiae glycans are more heterogeneous in size (Tanner et al., 1987; Lehle, 1992). Furthermore, S. cerevisiae has a tendency to hyperglycosylate, forming extremely large glycans containing greater than 50 mannose residues. This has been observed only infrequently in P. pastoris (Grinna et al., 1989), and it was not found for CBM2a. The relative homogeneity of the N-linked glycans attached to proteins secreted by P. pastoris has been proposed to be a biotechnological advantage of this expression system (Grinna et al., 1989; Miele et al., 1997b).

Of the five sites that could potentially act as N-linked glycan acceptors, only three were glycosylated. Several factors have been proposed to determine the suitability of a potential N-glycosylation site. Positive determinants are proximity to the N-terminus of the polypeptide and placement in a turn or loop; a negative determinant is a sidechain that is prone to burial during protein folding (Dwek, 1995). Of the five potential acceptors in CBM2a, four are on the surface: N24, N73, N87, and N103 (Figure 2.7). N29 is placed in a turn; however, it is inappropriate as an acceptor because it is buried. N103 is exposed but it is near the C-terminus of the polypeptide.
It is thought that sites at the C-terminus of a polypeptide have less exposure time to the glycosylation mechanism during co-translational modification than sites at the N-terminus, so they are glycosylated infrequently (Gavel et al., 1990). The remaining N-glycosylation sites on CBM2a appear to meet the criteria for N-linked glycosylation acceptor sites.

It is interesting that the N73 site functions as an acceptor of N-glycosylation in the polypeptide containing the other N-glycosylation sites but does not when all the other sites are absent. The occupancy of an adjacent N-glycosylation site can affect the occupancy of another potential site (Gavel et al., 1990). However, this usually occurs at sites separated by less than four amino acids in the primary sequence and is a negative effect: glycosylation at one site hinders glycosylation at the other. Though untested, it is possible that the lack of glycosylation at N24 or N87 in the CBM2a.4 mutant affects the rate of folding such that N73 is no longer accessible as a glycosylation site.

2.5.2 O-linked glycosylation.
The percentage of the total population of PpCBM2a.5 that had any degree of mannosylation was low (~10%). The maximum number of mannose units attached to PpCBM2a.5 was nine. Of the peptides that were mannosylated, the attached saccharides ranged in length from one mannose unit to four mannose units and decreased in frequency with increasing oligosaccharide length. This is consistent with previously reported results (Duman et al., 1998). The oligosaccharides were localized to three regions of the polypeptide (Figure 2.14). Amino acids 55-72 had one to four mannose residues, amino acids 73-84 had only one detectable mannose residue, and amino acids 85-100 had one to four mannose residues attached to them. This totals nine mannose residues at full glycosylation of each peptide. This corresponds well with the maximum of nine mannose residues per PpCBM2a.5 molecule detected by MALDI-TOF mass spectrometry.
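The peptide-level bookkeeping behind this comparison can be sketched in a few lines. The example below is illustrative only: it takes the expected peptide masses and maximum hexose counts reported in the Results, assumes an average hexose residue mass of 162.14 Da (the text quotes steps of 162 daltons), and uses hypothetical names (`PEPTIDES`, `glycoform_ladder`).

```python
# Average hexose residue mass in daltons (assumed; the text uses 162 Da steps).
HEXOSE = 162.14

# Expected peptide mass (Da) and maximum number of attached hexoses, as
# reported for the chymotrypsin digest of CBM2a.5.
PEPTIDES = {
    "S55-W72": (1848.5, 4),
    "N73-F84": (1146.2, 1),
    "G85-F100": (1563.9, 4),
}


def glycoform_ladder(base_mass, max_hexoses):
    """Expected masses for a peptide carrying 0..max_hexoses hexose residues."""
    return [round(base_mass + n * HEXOSE, 1) for n in range(max_hexoses + 1)]


for name, (mass, max_n) in PEPTIDES.items():
    print(name, glycoform_ladder(mass, max_n))

# Sum of the per-peptide maxima, for comparison with the intact-protein value.
total_max = sum(max_n for _, max_n in PEPTIDES.values())
print("maximum hexoses summed over peptides:", total_max)
```

The summed maximum equals the nine sugars seen on the intact ConA-purified polypeptide, which is the agreement noted above.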
Assuming that the mannosylation is O-linked, amino acids 73-84 include only one serine and, therefore, one glycosylation site. The other two sequences had a maximum of four mannose residues attached. This may be a result of variably sized oligosaccharides present in these peptide regions such that, when averaged over several serine or threonine acceptors, up to a maximum of four mannose residues per region is reached. The other simple alternative is that a single serine or threonine residue in each sequence acts as an acceptor. Based on the structure of CBM2a, there is nothing to distinguish the potential hydroxyamino acid acceptors of glycosylation from one another. All of the hydroxyamino acids in CBM2a are exposed on the surface of the polypeptide.

A novel extension of the FACE oligosaccharide profiling technique was used to determine the linkages present in the disaccharides of the O-linked glycans on PpCBM2a.5. α1-2, α1-3, α1-4, and α1-6 linked manno-disaccharide standards had different mobilities in the FACE gels (Figure 2.12). Comparison of these mobilities with those observed in sugars released from PpCBM2a.5 allowed the identification of α1-2, α1-3, and α1-6 linkages in the disaccharides released from PpCBM2a.5. This is in contrast to the linkages previously reported in O-linked glycans found on P. pastoris secreted proteins. These glycans were found to be composed of predominantly α1-2 linked mannose, as deduced by sensitivity to an α-mannosidase specific for α1-2 linkages (Duman et al., 1998). However, a small proportion of the glycans were resistant to this enzyme. P. pastoris lacks α1-3 linkages in N-linked glycans (Grinna et al., 1989) and there is no detectable (α1-3)-mannosyltransferase activity in membrane preparations of P. pastoris cells (Verostek et al., 1995). Based on this, it was presumed that the α-mannosidase insensitive fraction was not due to α1-3 linkages. It appears, however, that P.
pastoris can form α1-3 linkages and α1-6 linkages in its O-glycans, and this may explain the α1-2 mannosidase insensitive fraction. α1-3 linkages are common in yeast O-linked glycans; α1-6 linkages are not (Tanner et al., 1987; Lehle, 1992).

2.5.3 Glycosylation and cellulose binding.
Previous studies have implied that CBM2a fusions expressed in eukaryotic hosts have an impaired ability to bind cellulose. The preliminary binding results presented here indicate that this is most likely due to glycosylation of the CBM2a module. This presents a potential barrier to the use of CBM2a as a fusion partner with proteins expressed in eukaryotic hosts. The effects of glycosylation on the function of CBM2a are explored fully in the following chapter.

Chapter 3
Functional Properties of a Glycosylated Recombinant Carbohydrate-Binding Module Secreted by Pichia pastoris.

3.1 Summary
CBM2a, a carbohydrate-binding module at the C-terminus of xylanase 10A from Cellulomonas fimi, is highly N-glycosylated when produced and secreted by the methylotrophic yeast Pichia pastoris. The glycans inhibit the binding of CBM2a to cellulose. Removal of glycosylation sites by mutation allowed production of CBM2a mutants N-glycosylated at single sites. Glycans on N87 drastically impaired (200-300 fold decrease in Ka) the binding of CBM2a to bacterial microcrystalline cellulose (BMCC). In contrast, glycans on N24 decreased the Ka for BMCC only ten-fold. A CBM2a mutant without N-glycosylation sites had only a 2-3 fold lower binding affinity than CBM2a produced by E. coli.
Although N-glycosylation did not affect the thermal or chemical stability of CBM2a significantly, UV resonance enhanced Raman spectroscopy and fluorescence spectroscopy of the glycosylated CBM2a indicated an alteration in the environment of one or more internal tryptophan residues and a change in the hydrogen-bonding pattern of one or more surface tryptophans. The CBM2a mutant lacking N-glycosylation sites is suitable for the production of functional CBM2a-fusion proteins in cells that N-glycosylate.

3.2 Introduction
The first carbohydrate-binding module to be characterized in depth was the family 2a binding module found at the C-terminus of xylanase 10A from Cellulomonas fimi (Ong et al., 1993). It binds crystalline cellulose, amorphous cellulose, and chitin with micromolar affinity constants (Ong et al., 1993). CBM2a is a nine stranded all β-sheet molecule with a β-barrel topology (Xu et al., 1995). Three tryptophans, grouped to form a ridge on one surface of the molecule, are very important in substrate binding (Xu et al., 1995; Din et al., 1994a; Poole et al., 1993). The binding of CBM2a to bacterial microcrystalline cellulose (BMCC) is entropically driven with a large negative heat capacity (Creagh et al., 1996). Dehydration of the hydrophobic tryptophan ridge and the cellulose surface is proposed to drive the binding of CBM2a to the cellulose. Alteration of the binding properties of CBM2a by mutation of amino acids on its binding face is currently underway. However, post-translational modifications of CBM2a by eukaryotic hosts, such as glycosylation, provide a practical situation where the alteration of its physical properties may affect the application of CBM2a as an affinity tag.

The exact biological roles of glycosylation are often unclear. Protein-linked glycans have been shown to play roles in cell trafficking, protein trafficking and protein half-lives (Lis et al., 1993).
However, such biological roles are less attributable to glycans in fungal or bacterial systems. In such cases, glycans are thought to have importance in the physicochemical properties of the glycosylated protein. Glycosylation of a protein can affect its thermal stability and proteolytic sensitivity, folding kinetics, solubility, and tertiary structure (Lis et al., 1993). Changes in any of these properties may affect the biological function of a recombinant protein or a protein that is native to the expression host. The increasing popularity of eukaryotic cells and organisms as hosts for recombinant protein expression has made the effect of glycosylation on the properties of recombinant proteins a very relevant consideration. This study examines the properties of a glycosylated family 2a CBM produced and secreted by Pichia pastoris. The effect of glycosylation on the cellulose binding properties, structure, and stability is investigated.

3.3 Materials and Methods

3.3.1 Strains and vectors.
Mutants of CBM2a were prepared by site-directed PCR mutagenesis and P. pastoris (strain GS115) clones expressing CBM2a were obtained as described previously (Chapter 2). Bacterial expression plasmids were prepared by inserting the mutated CBM2a gene fragments into pTUGKH, a derivative of pTUGA (Graham et al., 1995), using 5' AatII and 3' HindIII restriction sites. E. coli strains DH5α and JM101 were used for cloning and expression, respectively. Cells were transformed by electroporation as described previously (Sambrook et al., 1989).

3.3.2 Protein production and purification.
CBMs were produced in P. pastoris and purified as described previously (Chapter 2). Production of CBM2a and CBM2a mutants in bacteria was done using E. coli JM101 strains harboring the pTUG plasmid with the appropriate CBM gene fragment insert. The cells were removed from one litre cultures by centrifugation at 7000 rpm for 30 min at 4 °C.
The supernatant was concentrated and buffer exchanged into 20 mM Tris pH 8.0 containing 0.5 M NaCl (Buffer A) to a final volume of 75 ml using cross-flow filtration with a 1K cutoff membrane (Filtron, Northborough, MA). The concentrated solution was loaded onto a 10 ml His-bind column (Novagen, Milwaukee, MI) at 2 ml/min. The column was then washed with 8 column volumes of Buffer A followed by step elution with Buffer A containing 25, 30, 40, 50, 75, 250, or 500 mM imidazole. Fractions containing CBM were identified by SDS-PAGE, pooled and buffer exchanged into 50 mM potassium phosphate, pH 7.0, using a stirred pressure cell (Amicon, Beverly, MA). Purity was assessed by SDS-PAGE and protein concentrations determined by absorbance at 280 nm.

3.3.3 MALDI-TOF mass spectrometry.
De-glycosylation reactions and MALDI-TOF mass spectrometry were performed as described previously (Chapter 2).

3.3.4 Binding assays.
Bound samples for western blotting were prepared by adding one milliliter of conditioned culture supernatant to 50 µl of Avicel™ (100 mg/ml in distilled water). The samples were rotated at 4 °C for 4 hours. The cellulose was pelleted by centrifugation at 15000 rpm for 5 min at 4 °C. The supernatant was removed, the cellulose washed in 1 ml of 50 mM potassium phosphate buffer, pH 7.0, and the cellulose re-pelleted by centrifugation. This washing was repeated twice. Bound polypeptide was desorbed by boiling the cellulose with 40 µl SDS-PAGE loading buffer (25 mM Tris.HCl, pH 8.8, 1% SDS and 50 mM β-mercaptoethanol). Samples of 20 µl were then analyzed for protein by western blotting (section 2.3.4).

Binding constants were determined from depletion binding isotherms. Triplicate samples of CBM at concentrations ranging from 0.5 to 15 µM were incubated in 1.5 ml Eppendorf tubes with 1 mg of BMCC in a final volume of 1 ml. Control tubes contained no cellulose. All samples were buffered with 50 mM potassium phosphate, pH 7.0.
Samples were rotated for 4 hours at 4 °C. The BMCC was removed by centrifugation at 13000 rpm for 20 min at 4 °C in a drum rotor. Supernatants containing unbound material were transferred to clean tubes and absorbance measurements were taken at 280 nm and 350 nm. The absorbance at 350 nm was taken as an approximation of the light scattering due to cellulose particles remaining in solution. This value was subtracted from the A280 to give a net A280 resulting from the protein. The net A280 was used to calculate the free CBM concentration. Bound protein concentration was then calculated by subtracting the free protein from the total protein determined from the control samples. The equilibrium association constants (Ka) were determined by non-linear regression of the depletion isotherm data as described previously (Bolam et al., 1998).

3.3.5 Fluorescence spectroscopy. All fluorescence measurements were made with a Perkin Elmer LS-50 luminescence spectrometer (Perkin Elmer, Norwalk, CT). Emission scans were performed using 2.5 µM CBM in 50 mM potassium phosphate buffer, pH 7.0. The excitation wavelength was 280 nm. Emission intensities were collected over the wavelength range of 300 nm to 400 nm. The excitation and emission slit widths were 10 nm. Five scans were averaged.

Solute quenching experiments were performed using potassium iodide (KI) and 2.5 µM CBM. Scans were collected as described above. Ionic strength was kept constant by the addition of NaCl to samples such that the final concentration of salt (total KI and NaCl concentration) was 1 M. Samples were buffered with 25 mM Tris.HCl buffer, pH 7.5. The excitation wavelength was 295 nm. Emission intensities were collected at 340 nm with integration times of 5 seconds. The excitation and emission slit widths were 10 nm. Triplicate readings were averaged.
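The data reduction for the depletion isotherms of section 3.3.4 can be sketched in Python as follows. This is an illustration only, not the analysis used for the thesis: the 1 cm path length, the function names, and the one-site Langmuir form of the model curve are assumptions (the text defers to Bolam et al., 1998 for the regression), while the extinction coefficient is the calculated value quoted in section 3.3.8.

```python
# Sketch of the depletion-isotherm data reduction (section 3.3.4).
# Assumed, not stated in the text: 1 cm path length, helper names,
# and the one-site Langmuir form used for the model curve.

EXTINCTION = 27625.0  # M^-1 cm^-1, calculated coefficient from section 3.3.8
PATH_CM = 1.0         # assumed cuvette path length


def protein_uM(a280, a350):
    """Protein concentration (uM) from A280 corrected for scattering at 350 nm."""
    net_a280 = a280 - a350
    return net_a280 / (EXTINCTION * PATH_CM) * 1e6


def depletion_point(control, sample):
    """Return (free, bound) in uM from two (A280, A350) readings.

    control: tube without cellulose (total protein)
    sample:  supernatant after removal of BMCC (free protein)
    """
    total = protein_uM(*control)
    free = protein_uM(*sample)
    return free, total - free


def bound_per_gram(bound_uM, volume_ml=1.0, cellulose_mg=1.0):
    """Convert depleted concentration to umol CBM per g cellulose.

    uM x ml = nmol, and nmol per mg equals umol per g.
    """
    return bound_uM * volume_ml / cellulose_mg


def langmuir(free_uM, ka_per_uM, n0):
    """Assumed one-site model: bound = N0 * Ka * F / (1 + Ka * F)."""
    return n0 * ka_per_uM * free_uM / (1.0 + ka_per_uM * free_uM)


# Hypothetical readings, chosen only to exercise the functions.
free, bound = depletion_point(control=(0.27625, 0.0), sample=(0.148125, 0.01))
print(round(free, 3), round(bound_per_gram(bound), 3))
```

In these units an association constant of 2.9 × 10⁶ M⁻¹ is entered as 2.9 µM⁻¹, and the model reaches half of N0 when the free concentration equals 1/Ka.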
Data were plotted as F0/(F0 − F) versus 1/[KI], where F0 is the fluorescence intensity in the absence of quencher and F is the fluorescence of the sample (Eftink et al., 1981).

3.3.6 Thermal and chemical melts. All thermal melt experiments were performed with a Cary 100e UV-Vis spectrophotometer (Varian, Melbourne, Australia). UV difference spectra were taken by collecting a baseline from 230 nm to 330 nm at 25 °C using 800 µl of polypeptide at 15 µM in 25 mM Tris.HCl, pH 7.4. The temperature of the cuvette block was ramped to 80 °C and the sample was allowed to equilibrate for 20 minutes. The sample was rescanned over the same wavelength range.

Thermal denaturation experiments were performed using 800 µl samples of polypeptide at 15 µM in 25 mM Tris.HCl, pH 7.4. The temperature of the cuvette block was ramped at 2 °C per minute from 25 °C to 75 °C. Data were collected at a wavelength of 290 nm at 0.2 °C intervals with the baseline value taken at 25 °C. Linear regression was used to fit trend lines to the linear pre- and post-transition portions of the raw melting curve. The results of the linear regression were used to predict absorbance values for fully native and fully unfolded polypeptide over the complete temperature interval. The fraction of native protein was determined using the equation:

$$f_n = \frac{r - r_u}{r_n - r_u}$$

where r is the measured absorbance value, and r_n and r_u correspond to the predicted absorbance values of fully native and fully unfolded polypeptide, respectively, at a given temperature. Tm values, corresponding to the temperature at which 50% of the polypeptides were unfolded, were determined graphically.

All chemical denaturation experiments were performed with a Perkin Elmer LS-50 luminescence spectrometer (Perkin Elmer, Norwalk, CT). Fluorescence emission spectra were taken as described above in the presence and absence of 6 M guanidine.HCl and 25 mM Tris.HCl, pH 7.4. The protein concentration was 2.5 µM.
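The baseline-corrected fraction-native calculation described for the thermal melts can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the baseline windows, and the synthetic two-state data are hypothetical, and the Tm is found here by linear interpolation of the 0.5 crossing, which stands in for the graphical determination used in the thesis.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fraction_native(temps, signal, native_window, unfolded_window):
    """Apply f_n = (r - r_u) / (r_n - r_u) at every temperature.

    r_n and r_u are extrapolated from lines fitted to the pre- and
    post-transition windows, each given as (low, high) in deg C.
    """
    def baseline(window):
        pts = [(t, r) for t, r in zip(temps, signal)
               if window[0] <= t <= window[1]]
        return fit_line([p[0] for p in pts], [p[1] for p in pts])

    slope_n, icpt_n = baseline(native_window)
    slope_u, icpt_u = baseline(unfolded_window)
    fractions = []
    for t, r in zip(temps, signal):
        r_n = slope_n * t + icpt_n
        r_u = slope_u * t + icpt_u
        fractions.append((r - r_u) / (r_n - r_u))
    return fractions


def melting_temperature(temps, fractions):
    """Temperature at which f_n crosses 0.5 (linear interpolation)."""
    pairs = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(pairs, pairs[1:]):
        if (f0 - 0.5) * (f1 - 0.5) <= 0 and f0 != f1:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    return None


# Synthetic two-state melt with sloping baselines (illustrative values only).
temps = [25 + 0.5 * i for i in range(101)]  # 25 to 75 deg C
true_fn = [1 / (1 + math.exp((t - 66.5) / 0.5)) for t in temps]
signal = [f * (1.0 + 0.001 * t) + (1 - f) * (0.5 + 0.002 * t)
          for t, f in zip(temps, true_fn)]
fn = fraction_native(temps, signal, (25, 50), (72, 75))
tm = melting_temperature(temps, fn)
```

The same two functions apply to the guanidine.HCl melts if denaturant concentration is substituted for temperature, giving the midpoint concentration instead of Tm.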
Unfolding experiments were performed by incubating 2.5 µM CBM with increasing concentrations of guanidine.HCl in 25 mM Tris.HCl, pH 7.4, for 30 minutes. Fluorescence intensity values were collected using an excitation wavelength of 295 nm and an emission wavelength of 340 nm. The emission and excitation slit widths were 5.0 nm and the integration time was 5 seconds. Triplicate readings were collected and averaged. The fraction of native polypeptide was calculated as described for the thermal unfolding experiments.

3.3.7 UV resonance enhanced Raman spectroscopy. UVRR spectra were collected at 227 nm using 400 µW average power and a 30 minute integration time using equipment described previously (Greek, 1998). Spectra were collected at room temperature. All protein samples were at a concentration of 20 µM. Spectra were calibrated using ethanol spectra obtained before and after each experiment. For spectra collected at 227 nm, 80 mM Na2SO4 was used as a standard to provide an internal intensity reference at 984 cm⁻¹.

3.3.8 Protein concentration determination. The concentration of purified protein was determined by A280 using a calculated molar extinction coefficient (Mach et al., 1992) of 27625 M⁻¹ cm⁻¹.

3.4 Results

3.4.1 Qualitative cellulose-binding. CBM2a contains five potential N-glycosylation sites, three of which function as acceptors of N-linked glycans when the polypeptide is produced by P. pastoris (Chapter 2). A series of CBM2a mutants lacking four out of the five potential N-glycosylation sites were constructed previously (Table 2.3 in Chapter 2). The binding of these mutants to cellulose was assessed qualitatively by incubating conditioned P. pastoris supernatants with Avicel™ and desorbing the bound material by boiling the Avicel™ with SDS-PAGE loading buffer. The desorbed CBM was detected by western blotting with anti-CBM2a antibodies (Figure 3.1). PpCBM2a.9 (glycosylated at N87) did not bind well to Avicel™.
Only a small amount of PpCBM2a.9 with a molecular weight similar to that of unglycosylated CBM was bound. This indicates that glycosylation at N87 virtually abolishes the ability of PpCBM2a.9 to bind cellulose. Much of the PpCBM2a sample bound to Avicel™; however, binding appeared limited to the lower molecular weight bands. N-glycosylated PpCBM2a.8 bound relatively well to Avicel™, as did the remaining CBM mutants that were not significantly N-glycosylated when expressed in P. pastoris.

Figure 3.1: Qualitative binding analysis of CBM2a mutants produced by P. pastoris. Cultures of P. pastoris containing the modified CBM2a genes were grown for 8 hours at 30 °C then induced with 0.5% methanol for 24 hours. Samples of cleared supernatant were analyzed by western blotting with anti-CBM2a antibodies directly (Panel A) or after binding and elution from cellulose with SDS-PAGE loading buffer (Panel B). Lanes 1-8: EcCBM2a, PpCBM2a.4, PpCBM2a.5, PpCBM2a.6, PpCBM2a.7, PpCBM2a.8, PpCBM2a.9, and PpCBM2a, respectively.

3.4.2 Quantitative cellulose binding. Association constants for all CBM2a mutants were determined from depletion isotherms (see Figure 3.2 for a representative isotherm). The parameters for CBM2a and CBM2a mutants produced in E. coli binding to BMCC are shown in Table 3.1. The mutations may have modified the binding of the CBM to cellulose, but the magnitudes of the changes were within the limits of error and do not merit interpretation. The capacity values (N0) were also unchanged.

Table 3.1: Affinity of CBM2a mutants for bacterial microcrystalline cellulose.

                 E. coliᵇ                              P. pastorisᶜ
CBD Variantᵃ     Ka × 10⁻⁶ (M⁻¹)   N0 (µmoles/g BMCC)   Ka × 10⁻⁶ (M⁻¹)   N0 (µmoles/g BMCC)
CBM2a.4          1.8 ± 0.4         10.7 ± 0.7           NDᵈ               ND
CBM2a.5          1.8 ± 0.6         10.7 ± 1.1           0.6 ± 0.1         13.8 ± 0.5
CBM2a.6          3.0 ± 0.7         10.4 ± 0.6           ND                ND
CBM2a.7          2.1 ± 0.5         10.5 ± 0.7           ND                ND
CBM2a.8          2.8 ± 0.4         10.2 ± 0.3           0.3 ± 0.0         10.0 ± 0.7
CBM2a.9          3.5 ± 1.1         12.4 ± 1.1           0.01 ± 0.00       11.5ᵉ
CBM2a            2.9 ± 0.7         12.5 ± 0.9           ND                ND

ᵃ CBM abbreviations correspond to those outlined in Chapter 2.
ᵇ Binding parameters for CBM variants produced in E. coli.
ᶜ Binding parameters for CBM variants produced in P. pastoris.
ᵈ ND, not determined.
ᵉ Isotherm did not reach saturation, so the capacity was set as a constant based on the average capacity of BMCC.
Error values represent standard errors determined from the non-linear regression. Ka and N0 values represent the association constants and capacity values, respectively.

Figure 3.2: Binding isotherm of CBM2a produced in E. coli. Samples of CBM2a were incubated with 1 mg of BMCC in a 1 ml volume of 50 mM potassium phosphate buffer, pH 7.0, for 4 hours at 4 °C. The BMCC was removed by centrifugation and the concentrations of free and bound CBM were determined as described in the Materials and Methods. Solid circles represent measured values. Solid line shows the non-linear least squares fit to the equation described in the Materials and Methods.

The binding parameters were determined for three CBM mutants produced by P. pastoris: two singly glycosylated CBMs (PpCBM2a.8 and PpCBM2a.9) and the mutant lacking all N-glycosylation sites (PpCBM2a.5) (Table 3.1). PpCBM2a.8 and PpCBM2a.9 were purified by IMAC followed by a ConA purification step to obtain populations of CBM lacking unglycosylated material. Glycosylation at N24 in PpCBM2a.8 decreased the association constant 10-fold relative to the same mutant produced in E. coli. Glycosylation of N87 (PpCBM2a.9) decreased binding 200-300 fold. This represented very little binding and was consistent with the qualitative binding experiments. Approximately 10-15% of PpCBM2a.5 was O-glycosylated (Chapter 2). PpCBM2a.5, purified by IMAC, was further fractionated into O-glycosylated (PpCBM2a.5g+) and unglycosylated (PpCBM2a.5g-) populations with a ConA chromatography step.
The association constant for PpCBM2a.5g- was 3-5 fold lower than that for CBM2a produced in E. coli. The association constant for PpCBM2a.5g+ was similar to that for CBM2a produced in E. coli.

3.4.3 Environment of the tryptophans in glycosylated CBM2a. Changes in the emission intensities were observed for the glycosylated variants when compared with their non-glycosylated counterparts (Figure 3.3); however, all of the non-glycosylated variants had identical emission spectra (results not shown). Reductions in fluorescence intensity were observed for glycosylated PpCBM2a and PpCBM2a.9. In contrast, an increase in fluorescence intensity was observed for glycosylated PpCBM2a.8. To determine if the change in emission intensity of PpCBM2a.8 was from internal or external tryptophans, emission spectra for glycosylated and non-glycosylated PpCBM2a.8 were collected in the absence and in the presence of potassium iodide, a molecule that selectively quenches solvent exposed tryptophans (Figure 3.4) (Eftink et al., 1981). The KI-quenched spectra of EcCBM2a.8 and PpCBM2a.8 were very similar in intensity, indicating that the changes in emission intensity were due to the solvent exposed tryptophans.

Figure 3.3: Fluorescence emission scans of E. coli produced (solid lines) and P. pastoris produced (dotted lines) CBM2a. Panels A to D show CBM2a, CBM2a.8, CBM2a.9, and CBM2aN87A, respectively. The solid line in panel D is the emission spectrum of EcCBM2a. Vertical dashed lines show the maximum emission wavelength of the E. coli proteins.

Figure 3.4: Fluorescence emission scans of potassium iodide quenched CBM2a. Panel A: scans of PpCBM2a.8 (dot-dashed line), EcCBM2a.8 (solid line), PpCBM2a.8 with 1 M KI (dotted line), and EcCBM2a.8 with 1 M KI (dashed line). Panel B: difference scans of PpCBM2a.8 (dot-dashed line) and EcCBM2a.8 (solid line) obtained by subtracting the quenched scans from the unquenched scans.

Changes in fluorescence intensity, either quenching or dequenching, can result from changes in the collisional contacts of the tryptophan sidechains with solvent molecules or other amino acid sidechains (Eftink, 1991). Alternatively, changes in fluorescence intensity can result from changes in the electronic state of the indole nitrogen. Donation of an electron, via a hydrogen bond, from the indole nitrogen to ammonium or carboxylate groups can result in quenching of the tryptophan fluorescence (Eftink, 1991). It appears that when N24 is glycosylated an external tryptophan has decreased collisional contacts or loses a hydrogen bond. CBM2aN87A shows a dequenching of fluorescence relative to CBM2a similar to that of PpCBM2a.8. This suggests the direct or indirect involvement of N87 in an interaction with a surface exposed tryptophan.

The fluorescence emission spectrum of E. coli produced CBM2a has an emission maximum of 340 nm (Figure 3.3). This is characteristic of relatively exposed tryptophans and is consistent with the presence of three out of five tryptophans in CBM2a being surface exposed (Xu et al., 1995). PpCBM2a and PpCBM2a.8 have emission maxima at 342 nm, a 2 nm shift to the red relative to E. coli produced CBM2a. PpCBM2a.9 showed no shift in the emission maximum relative to unglycosylated CBM2a. The quenched spectrum of glycosylated PpCBM2a.8, resulting mainly from internal tryptophans, was red shifted when compared to unglycosylated PpCBM2a.8 (Figure 3.4). This suggests that the shift in the emission maximum of glycosylated PpCBM2a.8, and likely PpCBM2a, results from alterations in the environment of the buried tryptophans.

The shifts in emission maxima described above implied that glycosylation at N24 resulted in environmental perturbations of one or more internal tryptophans.
Common environmental perturbations result from alterations in the static polarity of the indole ring microenvironment or changes in the ability of solvent molecules to undergo dipolar relaxation. The latter possibility requires the indole nitrogen of the tryptophan to be solvent exposed and, therefore, must result from changes in the exposed tryptophans or increased exposure of internal tryptophans. To investigate the potential for increased solvent exposure of the internal tryptophans in PpCBM2a and PpCBM2a.8, quantitative fluorescence quenching was employed (Figure 3.5 and Table 3.2) (Eftink et al., 1981). Neither the quenching constants (Ksv) nor the fraction (f) of tryptophans accessible to solute were significantly different between the glycosylated and unglycosylated CBMs. A red-shifted emission spectrum of tryptophans is generally considered to be indicative of an increase in the hydrophilicity of the tryptophan environment (Eftink, 1991), which, in this instance, results from changes in the static polarity of the internal tryptophans and not increased solvent exposure.

Figure 3.5: Potassium iodide fluorescence quenching of CBM2a tryptophan fluorescence. Panel A: EcCBM2a. Panel B: PpCBM2a. Panel C: EcCBM2a.8. Panel D: PpCBM2a.8. The dotted lines show the trend lines determined by linear regression. The slope and intercept of these lines give the inverse of the average quenching constant (Ksv) and the inverse of the fraction of accessible fluorophore (f). These values are shown in Table 3.2.

Table 3.2: Potassium iodide quenching values.

                   Ksv(eff)ᵃ (M⁻¹)   f(eff)ᵇ
E. coli WT         3.46 ± 0.40       0.51 ± 0.02
P. pastoris WT     3.62 ± 0.09       0.51 ± 0.01
E. coli N24        3.84 ± 0.07       0.54 ± 0.01
P. pastoris N24    3.73 ± 0.10       0.51 ± 0.01

ᵃ Effective Stern-Volmer quenching constant determined by the inverse values of the slope of the quenching data plotted according to the method of Lehrer (Figure 3.5).
ᵇ Effective fraction of fluorophore exposed determined by the inverse values of the y-intercept of the quenching data plotted according to the method of Lehrer (Figure 3.5).

The tertiary structure of PpCBM2a.8 with respect to the tryptophan residues was studied by UV resonance Raman spectroscopy (UVRRS). The UVRR spectra of EcCBM2a.8 and PpCBM2a.8 at 227 nm are shown in Figure 3.6. The sulfate internal standard line at ca. 984 cm⁻¹ is clearly visible. This signal was used to normalize the intensity of the spectra. The tryptophan modes W18 (ca. 760 cm⁻¹), W17 (ca. 880 cm⁻¹), W16 (ca. 1010 cm⁻¹), W10 (ca. 1240 cm⁻¹), W7 (ca. 1350 cm⁻¹), and W3 (ca. 1555 cm⁻¹) are also clearly seen. The most obvious feature of the spectra is the reduction in intensity of the tryptophan modes of PpCBM2a.8 relative to the internal standard. Such reductions in tryptophan signal intensity are correlated with a decrease in the hydrophobicity of tryptophan environments and/or the breaking of a hydrogen bond involving a tryptophan residue (Liu et al., 1989; Austin et al., 1993). A large shift from 870 cm⁻¹ to 884 cm⁻¹ is seen in the W17 line (Figure 3.6). A large shift to higher wavenumbers in W17 is evidence that a moderate to strong hydrogen bond is completely broken upon glycosylation (Miura et al., 1988; Austin et al., 1993). The intensity of the 1345 cm⁻¹ component of the tryptophan Fermi doublet relative to the ca. 1365 cm⁻¹ component is an indicator of tryptophan hydrogen bonding and environmental polarity (Miura et al., 1988; Austin et al., 1993). Large ratios indicate strong hydrogen bonds and/or hydrophilic environments. In EcCBM2a.8 relative to PpCBM2a.8 this ratio decreases (Figure 3.6). However, the I1365/I1340 ratio in PpCBM2a.8 is approximately 1, which is considered to indicate a very hydrophilic environment. Because, in this case, the decreased ratio likely does not indicate changes in tryptophan hydrophilicity, the loss of a hydrogen bond is further supported.

Figure 3.6: UV resonance Raman spectroscopic analysis of EcCBM2a.8 (dotted line) and PpCBM2a.8 (solid line). Panel A: full spectra taken at 227 nm. Spectra were normalized to the intensity of the internal sulfate standard (SO4²⁻). Panel B: expanded 830 to 930 cm⁻¹ region to show the detail of the W17 peaks. Panel C: expanded 1300 to 1400 cm⁻¹ region to show the detail of the W7 peaks.

3.4.4 Structural stability of glycosylated CBM2a. The unfolding of CBM2a in guanidine.HCl was accompanied by a large increase in fluorescence intensity and a large red-shift in the emission maximum (Figure 3.7). The change in fluorescence intensity at an emission wavelength of 350 nm was used to monitor the unfolding transition of CBM2a and glycosylated CBM2a mutants over a range of guanidine.HCl concentrations (Figure 3.7). The unfolding transition of EcCBM2a and EcCBM2a.8 occurred at the same concentration of denaturant (2.2 M). The unfolding transition of PpCBM2a and PpCBM2a.8 occurred at slightly lower concentrations of denaturant (1.9 M and 2.1 M, respectively).

The thermal denaturation of CBM2a could be detected by changes in the UV absorbance spectrum (Figure 3.8). The change in UV absorbance at a wavelength of 290 nm was used to monitor the unfolding transition of CBM2a and glycosylated CBM2a variants over a temperature range of 25 °C to 75 °C (Figure 3.8). The melting temperatures (Tm) were taken as the points at which 50% of the protein is unfolded. The unfolding of EcCBM2a and EcCBM2a.8 occurred at the same temperature (66.5 °C).
The unfolding of PpCBM2a and PpCBM2a.8 occurred at temperatures of 65.0 °C and 66.5 °C, respectively.

Figure 3.7: Chemical denaturation of CBM2a mutants. Panel A: fluorescence emission scans of EcCBM2a.8 (solid line), PpCBM2a.8 (dot-dashed line), PpCBM2a.8 in 6 M guanidine.HCl (dashed line), and EcCBM2a.8 in 6 M guanidine.HCl (dotted line). Panel B: guanidine melts of PpCBM2a (closed squares), PpCBM2a.8 (open circles), EcCBM2a.8 and EcCBM2a (open diamonds and closed triangles, respectively).

Figure 3.8: Thermal denaturation of CBM2a mutants. Panel A: UV difference scan of EcCBM2a at 80 °C. Panel B: melting profiles of PpCBM2a (dashed line), PpCBM2a.8 (dotted line), EcCBM2a and EcCBM2a.8 (overlapping solid line).

These results indicate that neither the amino acid substitutions nor the presence of glycans grossly perturbs the stability of CBM2a or CBM2a.8 produced in E. coli and P. pastoris. The slightly lower chemical and thermal stability of glycosylated CBMs may result from very small structural perturbations. Stability studies could not be carried out on PpCBM2a.9 due to the extremely low yields of this mutant.

3.4.5 N-linked glycans on PpCBM2a.8. To confirm the size of neutral glycans present on PpCBM2a.8, positive ion MALDI-TOF spectra of glycans released by endoglycosidase F1 were obtained (Figure 3.9). The masses of the released sugars corresponded well with the masses of K⁺ adducts of glycans ranging in size from (GlcNAc)1-(Man)g to (GlcNAc)1-(Man)16.
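The expected positions of such adduct peaks can be estimated with a short Python calculation. This is a sketch only: it uses standard monoisotopic residue masses, whereas the text does not say whether monoisotopic or average masses were compared, and the function name and the mannose counts in the loop are illustrative rather than taken from the thesis.

```python
# Estimated m/z of singly charged alkali-metal adducts of released
# high-mannose glycans. Monoisotopic masses in Da (standard values);
# the choice of monoisotopic over average mass is an assumption.

HEXOSE = 162.0528    # mannose residue, C6H10O5
HEXNAC = 203.0794    # N-acetylglucosamine residue, C8H13NO5
WATER = 18.0106      # added once for the free reducing-end glycan
ADDUCT = {"Na": 22.9892, "K": 38.9632}  # cation masses (electron removed)


def glycan_mz(n_man, n_glcnac=1, adduct="K"):
    """m/z of [glycan + cation]+ for (GlcNAc)n-(Man)m.

    n_glcnac defaults to 1 because endoglycosidase F1 cleaves within
    the chitobiose core, leaving one GlcNAc on the released glycan.
    """
    neutral = n_glcnac * HEXNAC + n_man * HEXOSE + WATER
    return neutral + ADDUCT[adduct]


# Illustrative range of mannose counts.
for n in range(8, 17):
    print(n, round(glycan_mz(n, adduct="Na"), 2), round(glycan_mz(n, adduct="K"), 2))
```

Under these assumptions successive glycans are spaced by one hexose (about 162 Da), and the Na⁺ and K⁺ adducts of the same glycan differ by about 16 Da.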
Glycans with increased masses consistent with the presence of phosphate groups were identified by negative ion MALDI-TOF of the same glycan preparation (Figure 3.9). Acidic glycans were found to have 8 to 14 mannose residues. This glycan profile agreed very well with the population of glycans found on PpCBM2a (Chapter 2).

Figure 3.9: MALDI-TOF mass spectra of PpCBM2a.8 glycans. PpCBM2a.8 was deglycosylated by treatment with endoglycosidase F1 for 8 hours at 37 °C. The glycans were partially purified by removing the remaining protein with an ethanol precipitation. The glycans remaining in solution were analyzed by MALDI-TOF mass spectrometry directly or after concentration by vacuum drying. Mass spectra were obtained in positive ion mode (solid line) or negative ion mode (dotted line). Doublet peaks obtained in positive ion mode are Na⁺ and K⁺ adducts of the glycans.

3.5 Discussion

3.5.1 Effects of glycosylation on structure. The N-glycosylation of CBM2a when it is produced by P. pastoris causes subtle changes in the structure of the CBM. In particular, glycosylation at N24 appears to result in the loss of a hydrogen bond. Furthermore, this seems to involve one or more surface exposed tryptophans. N87 plays a role in this hydrogen bond but it is not clear how. One possibility is that N87 directly hydrogen bonds with the indole nitrogen of one of the surface tryptophans, likely W17 based on its proximity to N87. The presence of a glycan at N24 may provide a better alternative for N87 to hydrogen bond with the glycan hydroxyl groups than with the tryptophan, resulting in the loss of the tryptophan hydrogen bond. Alternatively, N24 and N87 may participate in a more complicated network of hydrogen bonds, which when disrupted by glycosylation or substitution of N87 results in the loss of a hydrogen bond by a surface tryptophan.
The CBM2aN87A variant has an unchanged affinity for cellulose (B. McLean, unpublished), indicating that the hydrogen bond formed by the tryptophan is unimportant in binding.

Glycosylation at N24 also alters the environment of one or more internal tryptophans. This appears not to be due to increased solvent exposure of these residues but is likely due to a small rearrangement in packing. N24 is on the "b" strand of the β-sheet structure of CBM2a (Xu et al., 1995). W12, which is buried in the hydrophobic core of the molecule, is on the "a" strand and immediately adjacent to N24. Perturbation of the "b" strand due to the motion of a large moiety such as a glycan would likely be reflected by changes in the environment of W12. In particular, V25 on the "b" strand forms contacts with W12; movement of this hydrophobic residue away from W12 may explain the increased hydrophilicity of the internal tryptophan environment.

3.5.2 Effects of glycosylation on stability. Glycosylation does affect the stability and structure of proteins. Despite the indication of structural rearrangements due to glycosylation, the stability of glycosylated CBM2a was relatively unaffected, indicating a lack of gross structural changes resulting from glycosylation at any of the sites.

3.5.3 The effect of glycosylation on binding. The lack of changes in the stability of glycosylated CBM suggests the greatly reduced binding ability of PpCBM2a, and by extension PpCBM2a.9, was not due to large structural changes. The most probable explanation is that N87 is very near to the surface tryptophans that form part of the binding site of CBM2a (Figure 3.10). The presence of a large glycan on this residue in both PpCBM2a and PpCBM2a.9 would surely impose steric hindrances sufficient to abolish binding.

Figure 3.10: Placement of the N24 (green) and N87 (red) glycosylation sites in a bottom view (left) and end-on view (right) of CBM2a. The tryptophan residues involved in binding to crystalline cellulose are shown in yellow.

The relationship between glycosylation at N24 and the reduced binding of PpCBM2a.8 is less clear. The observed perturbations in the structure may be responsible for the 10-fold drop in the association constant. However, N24 is relatively close to the binding tryptophans (Figure 3.10), making the most plausible explanation steric effects resulting from the large sugar hindering the interaction of the binding face of the CBM with the cellulose surface. This effect is not complete, as indicated by the retention of a considerable ability to bind cellulose.

The N-glycosylation of CBM2a produced in P. pastoris is not homogeneous, even when only a single glycosylation site acts as an acceptor. It is clear that glycans of variable size can be attached to a single glycosylation site. What has been considered is the average effect of the variable glycans attached to the polypeptides. It is not clear whether variable glycan sizes have variable effects on the structure and binding of CBM2a.

3.5.4 O-glycosylation. About 10% of a population of PpCBM2a.5 (i.e. in the absence of N-glycosylation) is O-glycosylated. O-glycosylation has been detected in the presence of N-glycosylation for PpCBM2a (Chapter 2). It is probable that in all of the N-glycosylated mutants there is a similar 10% background of O-glycosylation. However, the results indicate that O-glycosylation has little, if any, effect on the binding properties of CBM2a. Furthermore, in comparing the N-glycosylated variants, it seems safe to assume that the degree of O-glycosylation background is similar. Because the stabilities of the variants did not change, it is probable that the O-glycosylation did not affect stability or that its level was insufficient to play a role. The fluorescence spectra of PpCBM2a.8 and PpCBM2a.9 relative to unglycosylated CBM shared no common changes.
O-glycosylation did not seem to produce an effect on the protein that could be measured by fluorescence. Alternatively, the level of O-glycosylation was insufficient to produce a measurable effect relative to that resulting from the N-glycosylation. Though the background of O-glycosylation should not be ignored, it appears that in this case it is not prevalent enough to justify serious attention relative to the large amount of N-glycosylation.

3.5.5 A structural model for the adsorption of CBM2a to cellulose. A large body of evidence shows three solvent exposed tryptophan residues to be important in the interaction of family 2a CBMs with cellulose (Chapter 1). However, a satisfactory structural model of how the CBM interacts with cellulose has not been presented. The current model views cellulose as a flat surface of cellulose chains sitting side-by-side with the entire pyranose ring of each glucose residue exposed to the solvent. The CBM is proposed to orient itself such that the ridge of binding tryptophans is parallel with a single cellulose chain and the "wedge" formed by the binding ridge points directly into the chain (Figure 3.11) (Tormo et al., 1996). In this model, a limited number of hydrogen bond donor-acceptor residues on the CBM interact with the cellulose chains flanking the chain that interacts with the tryptophan residues.

This model has a serious structural flaw: the structure of cellulose is misrepresented. Crystalline cellulose consists of cellulose chains that are stacked and staggered such that they appear as a pyramid when viewed end on (Figure 3.12) (Blackwell, 1982; Atalla, 1993; Sarko, 1986). The sides of the crystal are much like a staircase, with the steps formed by the edges of the pyranose rings and the hydroxyl substituents. Only the single chains at the peaks of the pyramids have their entire pyranose rings exposed to solvent.
Because the chains at the peaks of the pyramids represent an exceedingly low proportion of the entire surface area of the cellulose crystals, it seems more likely that the CBM interacts directly with the "staircase" surfaces of the crystals. Indeed, CBM2aCfCel6A does saturate BMCC with a monolayer of CBM (Gilkes et al., 1992).

Figure 3.11: Model of the adsorption of CBM2a to cellulose. Panel A: Tormo et al. (1996) model of CBM2a adsorbed to cellulose. Panel B: new model of CBM2a adsorbed to cellulose. Cellulose chains are shown in purple. CBM2a tryptophan residues are shown in a green ball-and-stick representation. Asparagine 87 is shown in a yellow ball-and-stick representation. Asparagine 24 and the modeled (GlcNAc)2-(Man)7 glycan are shown in a red ball-and-stick representation. Solvent surfaces are shown in transparent grey.

Figure 3.12: Schematic representation of the cellulose chain organization in a cellulose microfibril.

The ability of CBM2a with a large glycan at N24 (PpCBM2a.8) to effectively bind to cellulose introduces a new consideration. In the Tormo et al. (1996) model the glycan would likely result in relatively serious steric hindrance to the interaction of the CBM with cellulose. A more acceptable scenario is one where the CBM is tilted onto the "staircase" surface such that the glycan moves away from the cellulose surface (Figure 3.11). This new model is attractive because it incorporates a more accurate representation of the cellulose surface and it provides a situation where more extensive interactions, and thus a more stable overall interaction, could be formed between the CBM and the cellulose. Furthermore, it aids in explaining how a branched glycan of up to 16 sugar residues at N24 may be situated such that it has a minimal impact on the binding affinity of the CBM. Likewise, it also shows how a glycan at N87 could be sterically lethal to binding.
The weakness of this model is that it makes no attempt to represent the specific role of the tryptophans. It is still assumed that the tryptophans interact with the pyranose rings of the glucose residues, as is observed in many protein-carbohydrate interactions. However, the greater portion of the pyranose rings is occluded by the organization of the cellulose chains in the cellulose crystal and, therefore, in the absence of detailed structural information on the CBM2a-cellulose complex the specific role of the tryptophans remains unclear.

3.5.6 Biotechnological improvements. The most significant practical finding was the construction of a CBM2a mutant that lacks N-glycosylation sites and retains its affinity for cellulose when produced in P. pastoris. The impeded binding of N-glycosylated CBM presents a barrier to the use of these modules as fusion partners for the immobilization of proteins produced in hosts that perform N-linked glycosylation. The use of the CBM2a.5 variant will allow the construction of CBM gene fusions for expression in eukaryotic hosts that will bind cellulose with a high affinity.

Chapter 4

Characterization of a Novel Lectin-like Carbohydrate-Binding Module from Streptomyces lividans Xylanase 10A.

4.1 Summary

The C-terminal carbohydrate-binding module of xylanase 10A, CBM13, from Streptomyces lividans belongs to the family 13 carbohydrate-binding modules. CBM13 binds to the insoluble polysaccharides xylan, holo-cellulose, pachyman, and lichenan. It also binds soluble xylan, arabino-galactan, and laminarin. The association constant for binding to soluble xylan is ~6 × 10³ per mole of xylan polymer. Site-directed mutation was used to demonstrate the presence of three functional sites involved in the binding of CBM13.
These binding sites are similar in sequence and predicted to be similar in structural organization to the α, β, and γ sites in ricin toxin B-chain (RTB). Fluorescence spectrophotometric titrations were used to quantify the binding of saccharides to CBM13. The binding specificity was very low, being restricted only by the requirement for pyranose sugars. The association constants for binding to small sugars were also low (~1 x 10^2 M^-1 to 1 x 10^3 M^-1). This is the first bacterial family 13 CBM to be characterized in detail and the first CBM shown to be multivalent.

4.2 Introduction
Streptomyces lividans produces a number of enzymes with polysaccharolytic activity. One of these enzymes is a 60 kDa modular endo-beta-1,4 xylanase, xylanase 10A or Xyn10A. This enzyme has an N-terminal catalytic module and a 130 amino acid C-terminal module. The presence of the C-terminal module has been shown qualitatively to improve the association of the whole enzyme with insoluble xylan and, thus, it was proposed to define a new family of carbohydrate-binding module (Dupont et al., 1998). In fact, this module shows strong sequence similarity to the family 13 carbohydrate-binding modules (Tomme et al., 1995) and will be referred to as CBM13. Family 13 is the most diverse of the CBM families, having sequence entries from plants, animals, and microbes. Notable members of this family are the β-trefoil galactose-specific lectins from Ricinus communis (B-chains from ricin agglutinin and toxin, RCA and RTB, respectively), Abrus precatorius, Sambucus nigra, and Viscum album (Tomme et al., 1995). However, the ligand specificity of this family is not completely reflected by the conspicuous inclusion of these lectins; other members of family 13 have been shown to bind mannose and sialic acid. The general promiscuity of the family 13 CBMs and their ability to bind small sugars suggest that CBM13 may recognize saccharides other than insoluble xylan.
Furthermore, the presumed biological function of Xyn10A is to participate in the degradation of xylan associated with plant biomass (Dupont et al., 1998; Vincent et al., 1997). Plant cell walls contain predominantly cellulose with other structural polysaccharides of variable composition (hemi-cellulose) interspersed. The main-chain polysaccharide and the substitutions of hemi-celluloses are assembled from assortments of monosaccharides including xylose, arabinose, galactose, glucose, rhamnose, mannose, and acetamido galactose or glucose (Heredia et al., 1995). The diversity of sugars present in the biological substrate of Xyn10A further implies the potential for CBM13 to bind more assorted ligands. This hypothesis is tested in the following study, which investigates the specificity of CBM13 for polysaccharides, oligosaccharides and the monosaccharide components of hemicellulose. The potential for multiple binding sites on CBM13 is also investigated, and the possible role of polysaccharide conformation in determining binding specificity is discussed.

4.3 Materials and Methods

4.3.1 Carbohydrates and polysaccharides.
Microcrystalline cellulose (Avicel™ PH101) was obtained from FMC International (Little Island, County Cork, Ireland). Bacterial micro-crystalline cellulose (BMCC) was prepared from cultures of Acetobacter xylinum (ATCC 23769) as described previously (Gilkes et al., 1992). Regenerated cellulose (PASA) was obtained by phosphoric acid treatment of Avicel PH101 as reported previously (Coutinho et al., 1992). D-glucose, 1-O-methyl-α-glucose, 1-O-methyl-β-glucose, 2-deoxy-glucose, N-acetyl D-glucosamine, glucuronic acid, D-galactose, 6-deoxy-L-galactose (L-fucose), D-mannose, 6-deoxy-L-mannose (L-rhamnose), L-arabinose, D-ribose, D-xylose, cellobiose, lactose, gentiobiose, raffinose, chitin (shrimp shells), lichenan, yeast mannan, oat spelt xylan, larchwood xylan, soluble starch (potato) and arabinogalactan were purchased from Sigma Chemical Company.
Hydroxyethyl-cellulose (HEC, viscosity ~0.08-0.15 Pa (2 % (w/v) solution)) was purchased from Aldrich Chemical Company. Maltose was obtained from Baker Chemical Corp. Barley β-glucan (viscosity 20-30 cSt), pachyman (Lot MPA80801), xylo- and arabino-oligosaccharides (> 95 % pure), arabinan (sugar beet), linear L-arabinan, and pectic galactan (potato) were purchased from MegaZyme Ltd. (North Rocks, N.S.W., Australia). Birchwood xylan (Roth 7500; MW ~25000) was obtained from Carl Roth KG (Karlsruhe, Germany). The xylans were fractionated into a water-soluble and a water-insoluble fraction according to the procedures described by Blake et al. (1971) and Selvendran et al. (1987). Holocellulose (from pine) was a kind gift from Dr. Jack Saddler, University of British Columbia.

4.3.2 DNA amplification and cloning.
Genomic DNA from Streptomyces lividans was prepared as described elsewhere (Betzler et al., 1987). The gene fragment encoding the 132 residues of the C-terminal substrate-binding domain CBM13 from Streptomyces lividans Xylanase 10A (GenBank M64551) (Vincent et al., 1997) was obtained and amplified by PCR. Appropriate restriction sites were also introduced at the 5' and 3' ends of the CBM13 gene for cloning in the pTug expression vectors (Graham et al., 1995; Tomme et al., 1996). Each PCR mixture (50 µL total) contained 25-50 ng genomic DNA, 25-50 pmole primers, 10% DMSO, 0.4 mM 2'-deoxynucleoside 5'-triphosphates, and 1 U Pwo DNA polymerase in buffer (Boehringer Mannheim, Laval, Quebec). A protocol of twenty successive cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 30 s, and primer extension at 72 °C for 1.5 min was followed. An NheI site (GCTAGC) was introduced at the 5' end of the CBM13 gene fragment, using the oligonucleotide 5'-TGACTTGACGTCCGCTAGCGAGCCCCCCGCGGACGGG-3' as primer.
A HindIII restriction site (AAGCTT) was introduced at the 3' end of the CBM13 sequence using the oligonucleotide 5'-TGACGAGCGGCCGCAAGCTTATCAGGTGCGGGTCCAGCGTTG-3' as primer. The resulting 0.43 kb PCR fragment was digested with NheI and HindIII and cloned in frame with the sequence encoding the Cex leader peptide and the hexa-histidine tail in pTugKH previously digested with the same restriction enzymes, to give pTugCBM13. DNA was sequenced using the AmpliTaq dye termination cycle sequencing protocol and an Applied Biosystems Model 377 sequencer by the NAPS Unit of the Biotechnology Laboratory, UBC.

Site-directed mutations were introduced by "mega-primer" PCR. The first PCR step to introduce the mutation was performed using the mutagenic primer and the appropriate 5' or 3' flanking primer that was used to amplify the entire fragment. The conditions outlined above were used for amplification. Amplified products were purified using Qiaex II (Qiagen, Chatsworth, CA) after electrophoresis through a 1% agarose gel. This amplified DNA was then employed as a "mega-primer" in a second PCR reaction utilizing the appropriate 5' or 3' flanking primer to amplify the entire desired gene fragment. The gene fragments encoding the mutated CBM13 genes were then inserted into pTugKH as described above. Restriction digests targeting silent restriction sites inserted into the mutagenic primer sequences were used to screen for positive clones containing the mutations. Constructs were sequenced as described above.

4.3.3 Protein purification.
Overnight cultures of E. coli strain JM101/pTugCBM13 were diluted 500-fold in tryptone-yeast extract-phosphate medium (TYP) (Sambrook et al., 1989) supplemented with 100 µg kanamycin/mL, and grown at 30 °C to a cell density (A600nm) of 0.3.
Isopropyl-1-thio-β-D-galactopyranoside (IPTG) was added to a final concentration of 0.2 mM to induce transcription of the gene fragment encoding CBM13 and incubation was continued for a further 36 h at 30 °C. The cells were harvested by centrifugation (8500 x g) for 10 min at 4 °C and resuspended to about 1/50 of the original culture volume by gentle mixing in 20 mM Tris.HCl buffer, pH 7.9 containing 10 mM imidazole and 0.5 M NaCl. Cells were ruptured by two passages through a French pressure cell (21000 lb/in^2) and cell debris was removed by centrifugation for 20 min at 27000 x g and 4 °C.

CBM13 was purified from the clarified cell extract by immobilized metal affinity chromatography (IMAC) as described below. A small column (1.5 x 10 cm) was packed with 10 mL of His.Bind resin (50 % slurry) (Novagen, Milwaukee, MI) to give a final bed volume of ~5 mL. All subsequent operations were done at room temperature and at flow-rates of 2 mL/min using a Pharmacia P50 peristaltic pump. The column was washed with 25 mL distilled water, charged with 50 mL charge solution (50 mM NiSO4) and equilibrated with 50 mL binding buffer (20 mM Tris.HCl buffer, pH 7.9 containing 10 mM imidazole and 0.5 M NaCl). 100 mL of clarified cell extract, diluted 1:2 in binding buffer, was loaded on the column. Unbound protein was eluted by washing the column with 10 column volumes of binding buffer. The adsorbed proteins were recovered by stepwise elution with 10 mL of 25, 30, 35, 40, 50, 75, 100 and 250 mM imidazole in 20 mM Tris.HCl buffer, pH 7.9 containing 0.5 M NaCl. The column was then regenerated by stripping the Ni^2+ from the column with 5 column volumes of 100 mM EDTA in 20 mM Tris.HCl buffer, pH 7.9, 0.5 M NaCl followed by 5 column volumes of 6 M guanidinium hydrochloride in the same buffer. The column was then recycled as described above. Protein fractions were analysed for purity on 20 % Phast gels (Pharmacia, Uppsala, Sweden).
Pure CBM13 fractions were pooled, de-salted, exchanged into the appropriate buffer and concentrated in a stirred ultra-filtration unit (Amicon, Beverly, MA) on a 1K cutoff filter (Filtron, Northborough, MA).

4.3.4 Thermal protein melts.
All thermal melt experiments were performed on a Cary 100e UV-Vis spectrophotometer (Varian, Melbourne, Australia). UV difference spectra were taken by collecting a baseline from 230 nm to 330 nm at 25 °C using 800 µL of polypeptide at 15 µM in 25 mM Tris.HCl, pH 7.4. The temperature of the cuvette block was ramped to 80 °C and the sample allowed to equilibrate for 20 minutes. The sample was rescanned over the same wavelength range.

Melting experiments were performed using 800 µL samples of polypeptide at 15 µM in 25 mM Tris.HCl, pH 7.4. The temperature of the cuvette block was ramped at 2 °C per minute from 25 °C to 75 °C. Data were collected at a wavelength of 272 nm at 0.2 nm intervals with the baseline value taken at 25 °C. Linear regression was used to fit trend lines to the linear pre- and post-transition portions of the raw melting curve. The results of the linear regression were used to predict absorbance values for fully native and fully unfolded polypeptide over the complete temperature interval. The fraction of native protein was determined by using the equation:

$$ f_n = \frac{r - r_u}{r_n - r_u} $$

where r is the measured absorbance value, and r_n and r_u correspond to the predicted absorbance values of fully native and fully unfolded polypeptide, respectively, at a given temperature. Tm values were determined graphically by determining the temperature at which 50% of the polypeptide was unfolded.

4.3.5 Protein concentration determination.
The concentration of purified protein was determined by A280nm using a calculated molar extinction coefficient (Mach et al., 1992) of 32342 M^-1 cm^-1.

4.3.6 Fluorescence analysis of protein-carbohydrate binding.
All fluorescence measurements were performed on a Perkin Elmer LS-50 luminescence spectrometer (Perkin Elmer, Norwalk, CT). Emission scans were performed using 5 µM CBM13 in the presence or absence of 25 mM ligand in 25 mM Tris.HCl buffer, pH 7.5. The excitation wavelength was 280 nm. Emission intensities were collected over the wavelength range of 300 nm to 400 nm. The excitation and emission slit widths were 5 nm. Five scans were averaged.

Solute quenching experiments were performed using potassium iodide (KI) and 5 µM CBM13. Ionic strength was kept constant by the addition of NaCl to samples such that the final concentration of salt (total KI and NaCl concentration) was 1 M. Samples were buffered with 25 mM Tris.HCl buffer, pH 7.5. The excitation wavelength was 295 nm. Emission intensities were collected at 350 nm with integration times of 2 seconds. The excitation and emission slit widths were 5 nm. Four readings were averaged. Data were plotted as relative fluorescence intensity (F0/F, where F0 is the fluorescence intensity in the absence of KI; F is the fluorescence intensity with KI) vs. KI concentration (Eftink et al., 1981).

Quantitative binding experiments were performed by adding small aliquots of the appropriate carbohydrate in 25 mM Tris.HCl buffer, pH 7.5 to 800 µL of CBM13 (5 µM in the same buffer) under continuous stirring. After each ligand addition, the incubation mixture was stirred for 2 min to allow the complex to reach equilibrium and the fluorescence intensity was measured using excitation wavelengths of 275, 285, and 295 nm, emission wavelengths of 350 or 330 nm, and a slit width of 5 or 10 nm. Three to five 30 s integration periods were averaged for each data point. The fluorescence yield was constant over this time period. The emission spectra of all solutions were corrected for background fluorescence caused by buffer and carbohydrates and for dilution and inner filter effects (Eftink, 1997) as required.
Plots of relative fluorescence (F/F0, where F0 is the initial fluorescence; F is the fluorescence in the presence of carbohydrate) versus carbohydrate concentration were constructed, and association constants Ka (M^-1) and the maximal fluorescence change (Fmax) in the protein upon full complexation with sugar ligand were derived by a non-linear least squares fit of the corrected data to a one-site binding model using Origin v.5.0 (Microcal, Northampton, MA) software. To evaluate the requirement for metal ions for binding, CaCl2 or MgCl2 (0-1 M) was added to both carbohydrate (xylose) and protein solutions. The pH dependence for binding of xylose on CBM13 was investigated in 25 mM citrate-phosphate-borate buffer, pH 3.0-10.5 as described above.

4.3.7 Affinity electrophoresis.
Qualitative and quantitative binding of CBM13 to soluble polysaccharides was evaluated by affinity electrophoresis (Takeo, 1985; Johnson et al., 1996b) in 10% polyacrylamide gels polymerized in the absence or presence of various amounts of polysaccharide (usually 0-1 % w/v). Soybean trypsin inhibitor (10 µg) was used as an internal standard. To prevent it from running off the gel, the standard was loaded after 45 min of electrophoresis. Electrophoresis was for 1.5 h at 4 °C, pH 8.8 and 150 V in a Mini PROTEAN II system (BioRad, Mississauga, Ontario). After electrophoresis the gels were stained with Coomassie blue. The migration distances of CBM13 and reference proteins were measured directly on the gels and used to determine the association constants (Ka) as described below.
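Returning briefly to the fluorescence titrations of section 4.3.6, the one-site fit can be illustrated with a short sketch. This is not the Origin procedure used in the thesis. It assumes the common parameterization F/F0 = 1 + dFmax * Ka*C / (1 + Ka*C), where `dfmax` is the fractional fluorescence change at saturation, and it fits Ka by a coarse scan followed by a golden-section refinement, solving the amplitude in closed form at each trial Ka. The titration values are synthetic, generated from an assumed Ka of 500 M^-1, and all function names are hypothetical.

```python
import math

def one_site(c, ka, dfmax):
    """Relative fluorescence F/F0 at ligand concentration c (M), one-site model."""
    return 1.0 + dfmax * ka * c / (1.0 + ka * c)

def _profile(ka, conc, rel_f):
    """For a fixed Ka, solve the best amplitude linearly; return (SSE, dfmax)."""
    x = [ka * c / (1.0 + ka * c) for c in conc]   # fractional saturation
    y = [f - 1.0 for f in rel_f]                  # observed change in F/F0
    dfmax = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    sse = sum((b - dfmax * a) ** 2 for a, b in zip(x, y))
    return sse, dfmax

def fit_one_site(conc, rel_f, log_lo=0.0, log_hi=6.0, step=0.05):
    """Least-squares estimate of (Ka, dfmax), searching over log10(Ka)."""
    sse = lambda g: _profile(10.0 ** g, conc, rel_f)[0]
    # Coarse scan over log10(Ka) to bracket the minimum.
    n = int(round((log_hi - log_lo) / step))
    best = min((log_lo + i * step for i in range(n + 1)), key=sse)
    a, b = best - step, best + step
    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c1 = b - inv_phi * (b - a)
        c2 = a + inv_phi * (b - a)
        if sse(c1) < sse(c2):
            b = c2
        else:
            a = c1
    ka = 10.0 ** ((a + b) / 2.0)
    return ka, _profile(ka, conc, rel_f)[1]

# Synthetic, noise-free titration (assumed values, not thesis data).
conc = [0.0, 0.5e-3, 1e-3, 2e-3, 4e-3, 8e-3, 16e-3, 32e-3]   # M
rel_f = [one_site(c, 500.0, -0.30) for c in conc]

ka_fit, dfmax_fit = fit_one_site(conc, rel_f)
print(f"Ka = {ka_fit:.1f} M^-1, dFmax = {dfmax_fit:.3f}")
```

Because the amplitude enters the model linearly, only Ka needs a numerical search; with real, noisy data a library optimizer and parameter error estimates would be the more usual choice.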
Dissociation constants (Kd) were obtained from the abscissa intercepts of the straight lines in the 1/r versus C plots according to the transformed affinity equation (Takeo, 1985)

$$ \frac{1}{r} = \frac{1}{R_0}\left(1 + \frac{C}{K_d}\right) $$

where r is the relative migration distance of the CBM in the presence of affinity ligand in the gel, R0 is the relative migration distance of the free CBM in the absence of affinity ligand, C is the molar concentration of affinity ligand in the gel and Kd is the dissociation constant of CBM for the macromolecular affinity ligand. Ka values were calculated as the reciprocals of the Kd values. All migration distances of CBM were measured relative to the migration of the reference proteins.

For complexes between CBM13 and neutral polysaccharides of intermediate molecular weight, departure from linearity in the 1/r versus C plots is observed at higher sugar concentrations due to limited mobility of the complex caused by incomplete molecular sieving effects of the polyacrylamide (Takeo, 1985). More accurate dissociation constants were derived after re-graphing the data in a 1/(R0-r) versus 1/C plot (Takeo, 1985) according to the equation:

$$ \frac{1}{R_0 - r} = \frac{1}{R_0 - R_c}\left(1 + \frac{K_d}{C}\right) $$

where r, R0, Kd, and C are defined as above and Rc is the relative migration distance of the complex at high excess of affinity ligand where all CBM molecules are fully complexed.

Interaction of CBM13 with low molecular weight affinity ligands was assessed by competitive affinity electrophoresis (Takeo, 1985) in gels containing both soluble birchwood xylan (0.43 mM or 2.5 g/L) and a competing mono- or disaccharide (0.1 M).

4.3.8 MALDI-TOF mass spectrometry.
Purified CBM13, drop dialyzed overnight into distilled water using a 0.025 µm VS membrane (Millipore, Bedford, MA), was diluted in distilled water to 40 picomoles per µL and mixed 1:1 with a saturated matrix solution of sinapinic acid (Sigma) in 70 % acetonitrile and 0.1 % trifluoroacetic acid.
1 µL was spotted onto the MALDI target and allowed to dry. The mass was obtained by positive ion MALDI-TOF mass spectrometry on a SELDI-MassPhoresis system (Ciphergen, Palo Alto, CA). Bovine super-oxide dismutase (12230.6 Da) was used to calibrate the machine.

4.3.9 Adsorption assays on insoluble polysaccharides.
For qualitative adsorption experiments, 75 µg purified CBM13 was mixed (4 °C) end over end with BMCC (2 mg), Avicel (10 mg), PASA (2 mg), pine holocellulose (20 mg), oat spelt or birchwood xylan (10 mg) or linear arabinan (5 mg) in a final volume of 1 mL potassium phosphate buffer (50 mM, pH 7.0). After 1 h, polysaccharides were collected by centrifugation (13000 x g, 4 °C) and washed with 1 mL potassium phosphate buffer (50 mM, pH 7.0). The washing step was repeated three times. After the last washing step, insoluble polysaccharides were collected by centrifugation and boiled for 5 min after addition of 40 µL SDS loading buffer. Fractions of 20 µL were then analyzed for protein by SDS-PAGE (13 %).

For semi-quantitative binding analysis (percent CBM13 bound for a fixed amount of each polymer), 75 µg CBM13 was incubated with 1 mg of each polysaccharide as described above. After 1 h incubation, polysaccharides were removed by centrifugation (15 min at 13000 x g, 4 °C) and unbound protein left in the supernatants was measured (A280nm) and used to calculate the amount of CBM bound to each polysaccharide as described previously (Johnson et al., 1996b; Bolam et al., 1998).

4.3.10 Sugar analysis.
The total sugar concentration of mono- and disaccharide solutions and of the soluble polysaccharides was determined by the phenol-sulfuric acid method (Chaplin, 1986). Reducing sugar was measured with hydroxybenzoic acid hydrazine (HBAH) (Lever, 1973). Appropriate monosaccharide standards were used for both assays.

4.4 Results

4.4.1 Amino-acid similarity with Ricin.
The polypeptide encoded by the cloned CBM13 gene fragment is approximately 130 amino acids in length. Within this polypeptide are three repeated sequences (domains) of 40 amino acids having 30%-55% sequence identity and greater than 55% sequence similarity (Figure 4.1). The family 13 CBM from S. lividans arabino-furanosidase B (ABFB) has the same pattern of repeats, highly similar to one another and to those in CBM13 (35%-65% identity and greater than 50% similarity with the repeats in CBM13). The repeats from both of these modules show significant sequence identity (20%-30%) and significant sequence similarity (25%-70%) with the duplicated modules in ricin toxin B-chain (RTB). Perhaps more important than the overall sequence identity is the conservation between the repeats of particular residues that have a demonstrated structural or functional role in RTB, as discussed below.

[Figure 4.1: alignment of the sequences CBM13 α, CBM13 β, CBM13 γ, ABFB α, ABFB β, ABFB γ, RTB 1α, RTB 1β, RTB 1γ, RTB 2α, RTB 2β and RTB 2γ; the aligned residues are not recoverable from the extracted text.]

Figure 4.1 Amino acid alignment of the α, β, and γ repeats of selected family 13 carbohydrate-binding modules.
CBM13 is the binding module from xylanase A from Streptomyces lividans; ABFB, the binding module from arabino-furanosidase B from S. lividans; RTB 1, module 1 of ricin toxin B chain; RTB 2, module 2 of ricin toxin B chain. α, β, and γ refer to the repeated sequences as in ricin toxin B chain. Marker symbols in the alignment indicate the residues involved in substrate binding in RTB, the aromatic residues involved in substrate binding, the residues forming the hydrophobic core of RTB, and the cysteine residues forming disulfide bonds (no disulfide bonds are present in the γ domains of RTB). Alignment was prepared using ClustalW (Thompson et al., 1994).

The 1α and 2γ domain binding sites of RTB have been well characterized by crystallography and equilibrium binding studies (Rutenber et al., 1991; Houston et al., 1982; Zentz et al., 1978). Recent mutagenic studies established that the 1β domain is also capable of binding galactose (Frankel et al., 1996; Frankel et al., 1996). Four residues in the 1α site participate in substrate binding by hydrogen bonding either directly with substrate or with adjacent residues to orient and stabilize them. The conformation of the 1α binding site of RTB is shown in Figure 4.2. A key aspartic acid residue (D22) forms strong hydrogen bonds with C3 and C4 of galactose and is proposed to mediate the primary interaction with the carbohydrate. A tryptophan residue in the binding site stacks against the B-face of the galactose pyranose ring, a type of interaction that is ubiquitous in protein-carbohydrate interactions (Quiocho, 1986; Quiocho, 1988; Drickamer, 1997; Weis et al., 1996). The 2γ site of RTB forms nearly identical interactions with galactose except for the substitution of a tyrosine for the tryptophan and the substitution of an isoleucine for a glutamine. In the latter case, the isoleucine forms a hydrophobic interaction with C6 of galactose instead of the hydrogen bond formed between C6 and a glutamine in the 1α site.
These binding residues, with the exception of the isoleucine in the 2γ domain, are all highly conserved in the three domains of CBM13.

Both of the modules in RTB have disulfide bonds within the α and β domains; the γ domains lack the requisite cysteine pairs. The cysteine residues participating in these bonds are conserved in all three domains of CBM13. Furthermore, there are no free sulfhydryl groups in CBM13, indicating that all of the cysteine residues are involved in disulfide bonds (Dupont et al., 1998). CBM13 likely has the same disulfide bonding pattern as RTB with an additional disulfide bond in the γ domain.

There are five residues in each of the repeated sequences in RTB that are important in forming the hydrophobic core of the protein. These amino acids represent the most highly conserved class of residues between RTB, ABFB, and CBM13. Murzin et al. (1992) predicted a β-trefoil fold for the RTB modules based on the repeated domain motifs, secondary structure predictions, and the conservation of hydrophobic core residues with functionally divergent proteins having β-trefoil folds (see section 4.5.5.1 for a discussion of β-trefoil folds). This structure prediction was later proven correct (Rutenber et al., 1991). The predictable nature of this fold allowed the molecular modeling of CBM13 (Guex et al., 1999) based on the presence of these structurally relevant residues and its overall sequence similarity with RTB and abrin toxin B-chain (AbrB), family 13 CBMs whose structures have been determined by X-ray crystallography (Rutenber et al., 1991; Tahirov et al., 1995) (Figure 4.2). The lack of disulfide bonds in the γ domains of the template structures resulted in a model also lacking a disulfide bond in the γ domain. This bond was added in the CBM13 model for consistency with the biochemical data. The pseudo threefold axis of symmetry in the predicted β-trefoil fold reflects the repeated sequences of the α, β, and γ domains.
The putative binding sites are predicted to be approximately 24 Å from one another and spatially located on adjacent faces of the protein. Similarities in the conformations of the three putative binding sites are evident.

The conservation of structural and functional amino acid residues between CBM13 and RTB, with the concomitant similarities in predicted structure, supports the supposition that CBM13 binds carbohydrates other than xylan. From conservation of the binding residues, an ability to bind galactose, and perhaps other monosaccharides, would be predicted. In addition, CBM13 should be multivalent, as indicated by the presence of three extremely similar potential binding sites.

Figure 4.2 Family 13 CBM structures. Panel A: conformation of the 1α binding site of RTB. The bound lactose molecule is shown in green. The tryptophan residue responsible for stacking against the B-face of the galactose moiety of lactose is shown in red. Hydrogen bonding residues are shown in blue. The aspartic acid residue at the base of the binding pocket that forms two strong hydrogen bonds with the galactose moiety is shown in yellow. The solvent surface of RTB is shown as a grey transparent surface. Panel B: structural model of CBM13. Dashed arrows indicate the threefold axis of symmetry. The domains are labeled according to their order in the primary amino acid sequence according to the convention established for RTB. Putative binding residues and the solvent surface are color coded as in Panel A.

4.4.2 Production and purification of CBM13.
CBM13 was produced and purified as described in the Materials and Methods. The yield of purified polypeptide was 60-80 mg/L. The purity and mass of the polypeptide were assessed by MALDI-TOF mass spectrometry (Figure 4.3) and SDS-PAGE (results not shown). Purity was estimated to be greater than 95%. The mass was determined to be 15240.9 Da. The expected mass was 15250.5 Da, placing the measured mass within the 0.1% error of the mass spectrometer.

Figure 4.3 MALDI-TOF mass spectrum of IMAC purified CBM13 (x-axis: m/z, 3000 to 17000; peaks labeled +1 and +2). The +1 peak corresponds to a 15240.9 Da singly charged species. The +2 peak corresponds to a 15240.5 Da doubly charged species. See Materials and Methods for details.

4.4.3 Binding to insoluble polysaccharides.
Qualitative binding experiments indicated that CBM13 bound β1-4 linked xylose polymers (birchwood xylan and oat spelt xylan); however, it did not bind insoluble polymers consisting of only β1-4 linked glucose (Figure 4.4). Because xylan is a structural polysaccharide found in plant cell walls, it is found associated with other polysaccharides, mainly cellulose (Heredia et al., 1995). Pine holo-cellulose, delignified wood pulp consisting of cellulose and hemicellulose, was used as a substrate that would more closely resemble the natural substrate of xylanase 10A. The preparation used consisted of approximately 15% xylan with the remaining fraction being mostly cellulose (Dr. Peter Tomme, personal communication).

Figure 4.4 Insoluble polysaccharide binding characteristics of CBM13. Panel A: SDS-PAGE analysis of bound samples desorbed by boiling with SDS-PAGE loading buffer (see Materials and Methods for sample preparation). Lane 1: molecular weight standards (45.0, 31.0, 21.5, 14.4 and 6.5 kDa); Lane 2: BMCC; Lane 3: Avicel; Lane 4: PASA; Lane 5: 500 ng purified CBM13; Lane 6: holocellulose; Lane 7: birchwood xylan; Lane 8: oat spelt xylan; Lane 9: linear arabinan. Panel B: depletion analysis of CBM13 adsorption to insoluble polysaccharides (the xylan used was birchwood xylan; the mannan was ivory nut mannan).
CBM13 bound to holo-cellulose, indicating that the association of the xylan with cellulose does not alter the presentation of the xylan sufficiently to prevent it being a substrate for binding (Figure 4.4). Less but significant binding was observed on pachyman, a (1,3)-β-D-glucan, and lichenan, a (1,3)(1,4)-β-D-glucan (Figure 4.4). The heterogeneous nature and semi-solubility of many of these polysaccharides preclude quantitative characterization of binding.

4.4.4 Soluble polysaccharide binding characteristics.
Affinity gel electrophoresis was used to determine that CBM13 bound relatively tightly to soluble birchwood xylan and arabino-galactan (β1-3 linked D-galactose with α1-3,5 linked L-arabinose substitutions linked α1-3 to the galactan backbone) and weakly to arabinan (α1-5 linked L-arabinose) and laminarin (β1-3 linked D-glucose) (Figure 4.5). The relative migration of CBM13 was not affected in gels containing EHEC, HEC, yeast mannan, agarose, dextran, oat β-glucan, barley β-glucan, starch, or pectic galactan. A summary of the specificity of CBM13 for polysaccharides is shown in Table 4.1.

Figure 4.5 Affinity gel electrophoresis of CBM13 on xylan. Panel A: native polyacrylamide gel; Panel B: native polyacrylamide gel containing 0.25% w/v (0.43 mM) birchwood xylan. Lane 1: soy-bean trypsin inhibitor (10 µg); Lane 2: CBM13 (10 µg). PAGE was performed as described under Materials and Methods. The two bands in lane 2 represent unlabeled (upper band) and Cascade Blue® labeled species (lower band), respectively.

Table 4.1: Specificity of CBM13 for polysaccharides

Substrate | Composition | Solubility | Relative binding (a) | Method (b)
BMCC | β1-4 GlcD | insoluble | - | G/D
Avicel | β1-4 GlcD | insoluble | - | G/D
PASA | β1-4 GlcD | insoluble | - | G/D
Chitin | β1-4 GlcNAcD | insoluble | - | D
Birchwood xylan | β1-4 XylD | insoluble | ++ | G/D
Oat spelt xylan | β1-4 XylD, branched | insoluble | ++ | G
Linear arabinan | α1-5 AraL | insoluble | - | G
Holocellulose | mixed | insoluble | ++ | G
Lichenan | β1-3,4 GlcD | insoluble | + | D
Pachyman | β1-3 GlcD | insoluble | + | D
Galactan (larch) | β1-4 GalD | insoluble | - | D
Ivory nut mannan | β1-4 ManD | insoluble | - | D
Birchwood xylan | β1-4 XylD | soluble | ++ | AE
Arabino-galactan | β1-3 GalD backbone, α1-3,5 AraL substitutions | soluble | + | AE
Arabinan (beet) | α1-5 AraD | soluble | +/- | AE
Laminarin | β1-3 GlcD | soluble | +/- | AE
EHEC | β1-4 GlcD, hydroxyethyl substitutions | soluble | - | AE
HEC | β1-4 GlcD, ethyl-hydroxyethyl substitutions | soluble | - | AE
Yeast mannan | β1-4 ManD, branched | soluble | - | AE
Agarose | β1-3,4 GalD | soluble | - | AE
Dextran | α1-4 GlcD backbone, α1-6 GlcD branches | soluble | - | AE
Oat β-glucan | β1-3,4 GlcD | soluble | - | AE
Barley β-glucan | β1-3,4 GlcD | soluble | - | AE
Starch | α1-4 GlcD | soluble | - | AE
Pectic galactan | β1-4 GalD | soluble | - | AE

(a) no binding (-); possible binding (+/-); weak binding (+); relatively good binding (++)
(b) G: SDS-PAGE; D: depletion; AE: affinity electrophoresis

In order to determine the binding affinity of CBM13 for xylan and arabino-galactan, quantitative affinity electrophoresis was used (Takeo, 1985). The relative mobilities of CBM13 were compared in gels containing 0%, 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.65%, and 0.8% xylan and 0%, 0.4%, 0.6%, 0.84%, and 1.08% larch arabino-galactan. The association constants for CBM13 binding to xylan and arabino-galactan were 1.13 L/g and 0.19 L/g, respectively (Figure 4.6). These values only express partitioning. In order to determine the molar association constants, the average molecular mass of the polysaccharide must be known. This was calculated for xylan by determining the average degree of polymerization (DP) from a comparison of total versus reducing sugars. The average DP of the xylan preparation was 44, giving an average molecular mass of approximately 5826 g/mole. Conversion of the calculated partition value to a molar value results in an association constant of 6.6 x 10^3 M^-1.
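The conversion from the partition value to the molar association constant is a single multiplication, shown here as a minimal sketch using only the two figures quoted in the text (1.13 L/g and 5826 g/mole). The function name is hypothetical and introduced for illustration only.

```python
def molar_association_constant(k_partition_l_per_g, molar_mass_g_per_mol):
    """Convert a partition-type constant (L/g) to a molar constant (L/mol = M^-1)."""
    return k_partition_l_per_g * molar_mass_g_per_mol

# Values quoted in the text for soluble birchwood xylan.
ka_xylan = molar_association_constant(1.13, 5826.0)
print(f"Ka (xylan) = {ka_xylan:.0f} M^-1")   # rounds to 6.6 x 10^3 M^-1
```

The same conversion is not applied to arabino-galactan because, as the text goes on to explain, its degree of polymerization cannot be estimated reliably from reducing ends.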
A similar computation cannot be done for arabino-galactan because of its high degree of substitution. The large number of branches having free reducing ends would lead to an underestimation of the DP, resulting in an erroneous association constant. Birchwood xylan is a relatively linear polymer (Bouveng et al., 1958; Glaudemans et al., 1958; Timell et al., 1959) and, therefore, the method used to estimate the DP is valid.

Figure 4.6 Quantitative affinity electrophoresis of CBM13 binding to xylan and arabino-galactan. 10 µg samples of CBM13 were run on affinity gels with increasing concentrations of either xylan (•) or arabino-galactan (○). Relative mobility values (r) were calculated for CBM13 relative to a non-binding standard (10 µg soy-bean trypsin inhibitor). R0 corresponds to the relative mobility of CBM13 in the absence of polysaccharide. 1/(R0 - r) values were plotted against the reciprocal of the polysaccharide concentration and the association constants determined as the absolute value of the intercept on the abscissa.

4.4.5 Multivalency of CBM13 investigated by site-directed mutagenesis.

Aspartic acid residues at the base of the binding sites in RTB, which are conserved in CBM13, form strong hydrogen bonds with the substrate. Substitution of these aspartic acid residues in RTB resulted in loss of binding at the modified binding site (Frankel et al., 1996; Frankel et al., 1996). These residues were targeted for alanine substitution in CBM13 in order to investigate the possible presence of functional α, β, and γ domain carbohydrate-binding sites. Single mutants at D51 and D92 were made in addition to a double mutant with substitutions at D9 and D92. The binding of these mutants was analyzed by quantitative affinity electrophoresis on xylan (Figure 4.7). Each single mutation resulted in an approximately three-fold decrease in binding affinity, from 6.6 × 10³ M⁻¹ of xylan polymer to 1.8 × 10³ M⁻¹ and 2.6 × 10³ M⁻¹ of xylan polymer for the D51 and D92 substitutions, respectively. Additional substitution of D9 in the D92A mutant effected a further five-fold decrease in binding affinity, from 2.6 × 10³ M⁻¹ to 0.5 × 10³ M⁻¹ of xylan polymer. These results confirm the presence of three separate binding sites, one in each of the repeated domains of CBM13.

Figure 4.7 Quantitative affinity electrophoresis of CBM13 mutants binding to xylan. 10 µg samples of CBM13 WT, D51A, D92A, and D9A/D92A were run on affinity gels with increasing concentrations of xylan. Relative mobility values (r) were calculated relative to a non-binding standard (10 µg soy-bean trypsin inhibitor). R0 corresponds to the relative mobility in the absence of polysaccharide. 1/(R0 - r) values were plotted against the reciprocal of the polysaccharide concentration (1/[Xylan]) and the association constants determined as described in the Materials and Methods.

4.4.6 Stability of CBM13 and CBM13 variants.

Fluorescence emission scans of chemically denatured CBM13 and thermal melts of CBM13 monitored by UV difference spectroscopy were used to assess the impact of the amino acid substitutions on the stability of CBM13. Denaturation of CBM13 with 6 M guanidine HCl resulted in a decrease in fluorescence emission intensity and a large red shift of 10-12 nm in the wavelength of maximum emission (Figure 4.8). Treatment of all of the CBM13 mutants with 6 M guanidine HCl also resulted in quenching of fluorescence emission and a red shift of 10-12 nm (Figure 4.8). The thermal unfolding of CBM13 could be monitored by changes in absorbance at 272 nm (Figure 4.9). Unfolding of CBM13 and CBM13 variants followed what appeared to be a simple two-state transition (Figure 4.9).
The melting temperatures were 56.5, 54.5, 57.5, and 56.5 °C for the wild-type, D51A, D92A, and D9A/D92A variants, respectively. All of the Tm values were similar except for that of the D51A variant, which differed by approximately -2.5 °C.

Figure 4.8 Intrinsic fluorescence properties of native and chemically denatured CBM13. Panel A: wild-type CBM13. Panel B: alanine substitution at aspartic acid 51 (D51A). Panel C: alanine substitution at aspartic acid 92 (D92A). Panel D: alanine substitutions at aspartic acids 9 and 92 (D9A/D92A). Scans of protein in 50 mM potassium phosphate buffer (K-phos) are shown as solid lines. Scans of protein in 50 mM K-phos with 6 M guanidinium hydrochloride are shown as dotted lines. The abscissa of each panel is the emission wavelength (300-400 nm).

Figure 4.9 Thermal denaturation of CBM13. Panel A: UV difference spectrum of CBM13 at 75 °C (250-320 nm). The baseline spectrum of CBM13 was taken at 25 °C. The minimum at 272 nm was used to monitor unfolding. Panel B: thermal denaturation profile of CBM13. The fraction of native polypeptide was calculated as described in the Materials and Methods. Dotted lines indicate the point taken as the melting temperature (Tm).

4.4.7 Qualitative monosaccharide binding.

The ability of CBM13 to bind mono- and disaccharides was assessed by competition affinity electrophoresis. D-xylose, L-arabinose, D-galactose, D-mannose, D-glucose, D-ribose, and cellobiose, but not fructose, prevented the retardation of the migration of CBM13 by xylan (Figure 4.10). These results indicate that all of the pyranose sugars tested impede the binding of CBM13 to xylan, probably by direct competition for binding sites.

4.4.8 Spectro-fluorometric characterization of carbohydrate binding.
The sensitivity of the aromatic residues in CBM13 to substrate binding was tested by fluorescence emission scans of CBM13 in the absence and presence of ligand (Figure 4.11). Binding to sugars could be detected in all cases by a 5 nm blue shift from a maximum of 335 nm in the absence of ligand to 330 nm in the presence of ligand. In most cases a decrease in the emission intensity was also observed; the exceptions were spectra collected in the presence of galactose (Figure 4.11) and lactose (not shown), where the emission intensity increased. The changes in emission spectra upon sugar binding are consistent with the movement of one or more tryptophan residues into a more hydrophobic environment. Shielding of a tryptophan from solvent due to stacking against a sugar residue may explain this observation.

Solute quenching studies of CBM13 were performed to further assess changes in the exposure of tryptophan residues upon ligand binding. Experiments were done using an excitation wavelength of 295 nm to selectively excite tryptophan residues. Stern-Volmer plots (Eftink et al., 1981) of CBM13 relative fluorescence at different concentrations of KI were concave, possibly indicating heterogeneously emitting classes of tryptophans (Figure 4.12). The initial slopes of data obtained in the presence of xylose and galactose were smaller and the data less curved than when the sugar was absent. Similar results were obtained with arabinose and ribose (results not shown). Because of its large size when hydrated, KI is limited to quenching exposed aromatic residues (Eftink et al., 1981). This indicated decreased solvent accessibility of exposed tryptophan residues in the presence of the tested sugars.

Figure 4.10 Competition affinity electrophoresis of CBM13 on birchwood xylan in the absence and presence of xylose. Panel A: native gel without xylan or xylose. Panel B: electrophoresis in the presence of 0.25% w/v (0.43 mM) birchwood xylan. Panel C: electrophoresis in the presence of 0.25% w/v (0.43 mM) birchwood xylan and 0.1 M xylose as competing ligand. Lane 1: soy-bean trypsin inhibitor (10 µg); Lane 2: CBM13 (10 µg). PAGE was performed as described under Materials and Methods. The two bands in lane 2 represent unlabeled (upper band) and Cascade Blue® labeled species (lower band), respectively.

Figure 4.11 Fluorescence emission spectra (300-400 nm) of CBM13 in the presence of 50 mM xylose (dotted line), 50 mM galactose (dashed line), and absence of sugar (solid line). The excitation wavelength used was 295 nm.

Figure 4.12 Stern-Volmer analysis of CBM13 in the absence of ligand (▼), presence of 25 mM xylose (▲), and presence of 25 mM galactose (•), plotted against [KI] (M). F0 is the initial fluorescence of the sample in the absence of KI. F is the fluorescence in the presence of quencher. See Materials and Methods for details.

The molar ratio of tryptophan to tyrosine in CBM13 is 5:3, so the fluorescence emission of tryptophan will be prominent irrespective of the choice of excitation wavelength. In addition, fluorescence resonance energy transfer often results in the emitted energy of excited tyrosines, which is relatively insensitive to environment, being absorbed and re-emitted by tryptophan (Eftink, 1991). This argues that the observed changes in the fluorescence emission of CBM13 in the presence of ligand result predominantly from changes in tryptophan fluorescence. The tryptophan residue in the 1α binding site of RTB is somewhat exposed to solvent (Rutenber et al., 1991). The blue shifts in the emission maximum and changes in KI quenching of tryptophan upon addition of ligand to CBM13 were in agreement with the movement of tryptophan into a more apolar environment that would result from ligand shielding the indole ring from solvent.
For this reason, in the analysis of quantitative fluorometric data, changes in the fluorescence emission upon addition of ligand were considered to be representative of binding at the single site containing a tryptophan residue, the α site.

The functions of the characterized lectin members of family 13 are all divalent cation independent. To confirm that the binding activity of CBM13 does not require the presence of metal ions, fluorescence titrations using xylose as a ligand (see Figure 4.13 for a representative titration) were performed in the presence of 0, 1.0, 10.0, and 100.0 mM CaCl₂ and MgCl₂. The association constants for binding to xylose decreased slightly with increasing CaCl₂ and MgCl₂ concentrations (Figure 4.14). The pH optimum for binding was 7.5 (Figure 4.14).

Figure 4.13 Fluorescence titration of CBM13 with xylose. Data obtained with an excitation wavelength of 295 nm and an emission wavelength of 350 nm are shown. The solid line is the best-fit line to the relative fluorescence values determined by non-linear regression to a one binding site model (see Materials and Methods).

Figure 4.14 Cation and pH dependence of CBM13 binding to xylose. Panel A: pH dependence of binding. The association constant at pH 7.5 was taken as 100% binding for the calculation of the relative affinities. Panel B: CaCl₂ (open bars) and MgCl₂ (gray bars) dependence of binding. The association constants at 0 mM ion concentration were taken as 100% binding for the calculation of the relative affinities. See Materials and Methods for experimental details.

Based on the cation independence and the optimal pH for binding being around physiological pH, Tris buffer at pH 7.5 was chosen for further binding studies. Fluorescence titrations were used to determine association constants for binding to several monosaccharides, disaccharides and oligosaccharides (Table 4.2).
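The binding energies listed alongside the association constants in Table 4.2 are related to them by ΔG = -RT ln Ka. A minimal Python sketch of that conversion is given here for reference. The function name is hypothetical, the 293.15 K reference temperature follows the table footnote (20 °C), and the comparison assumes that the tabulated energies are reported as magnitudes.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_free_energy_kj(ka_per_molar, temp_k=293.15):
    """Standard free energy of binding in kJ/mol from an association constant in M^-1."""
    return -R * temp_k * math.log(ka_per_molar) / 1000.0


# Association constants taken from Table 4.2 (M^-1).
for sugar, ka in [("D-xylose", 1.4e2), ("D-galactose", 6.1e2), ("birchwood xylan", 54.5e2)]:
    print(f"{sugar}: {binding_free_energy_kj(ka):.1f} kJ/mol")
```

For these three ligands the magnitudes agree with the tabulated values of 12.0, 15.6, and 21.0 kJ/mole.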
The affinities for monosaccharides varied from less than 1 × 10² M⁻¹ to 1 × 10³ M⁻¹, and of the monosaccharides investigated only fructose did not bind.

Table 4.2 Specificity and affinity of CBM13 for soluble sugars.

Sugar              DP (a)  Composition (b)              Ka (×10⁻² M⁻¹) (c)   ΔG (kJ/mole) (d)
D-mannose          1                                    0.6 ± 0.0            9.9
2-deoxy-glucose    1                                    0.7 ± 0.1            10.5
L-fucose           1                                    1.3 ± 0.1            11.9
D-xylose           1                                    1.4 ± 0.0            12.0
L-arabinose        1                                    1.5 ± 0.1            12.3
D-glucose          1                                    1.7 ± 0.0            12.4
Me-β-D-glucose     1                                    3.0 ± 0.1            13.9
D-GlcNAc           1                                    3.7 ± 0.1            14.4
D-galactose        1                                    6.1 ± 0.8            15.6
D-ribose           1                                    6.4 ± 0.2            15.8
Me-α-D-mannose     1                                    6.6 ± 0.2            15.8
Me-α-D-glucose     1                                    9.4 ± 0.1            16.7
L-rhamnose         1                                    10.4 ± 0.1           16.9
Cellobiose         2       GlcD(β1-4)GlcD               0.3 ± 0.0            8.4
Maltose            2       GlcD(α1-4)GlcD               1.5 ± 0.0            12.2
Xylobiose          2       XylD(β1-4)XylD               1.8 ± 0.1            12.7
Gentiobiose        2       GlcD(β1-6)GlcD               4.6 ± 0.1            14.9
Lactose            2       GalD(β1-4)GlcD               9.7 ± 0.4            16.8
Arabinotriose      3       AraL(α1-5)AraL               1.4 ± 0.1            12.0
Raffinose          3       GalD(α1-6)GlcD(α1-2)FruD     1.5 ± 0.1            12.1
Arabinohexaose     6       (α1-5)AraL                   3.6 ± 0.2            14.3
Xylotetraose       4       (β1-4)XylD                   6.2 ± 0.4            15.7
Birchwood xylan    44      (β1-4)XylD                   54.5 ± 5.1           21.0

a degree of polymerization; number of monosaccharide units in the ligand.
b sugar linkage and composition.
c errors represent standard errors determined from the non-linear least squares fit to a one-site binding model.
d 20 °C (293 K) was taken as the reference temperature for the calculation of binding energies.

4.5 Discussion

Xylanase A from S. lividans has endoglucanolytic activity on xylan, a polymer of β-1,4 linked xylose. Thus, based on the frequency of binding modules found in other modular glucohydrolases, it is not surprising that xylanase A contains a binding module with xylan binding activity. This binding module also demonstrated an ability to bind arabino-galactan and laminarin, in keeping with the presence of homologous amino acid sequences found in an arabino-furanosidase from S. lividans and β1-3 glucanases from other bacterial and eukaryotic sources (Tomme et al., 1995). The presence of such polysaccharide binding modules in glucohydrolases is common; however, CBM13 is currently unique in its multivalency and ability to bind a broad range of small sugars.

4.5.1 Multivalency of CBM13.

CBM13 is multivalent. Aspartic acid residues predicted to be critical to ligand binding were substituted with alanine residues in each of the three potential binding sites, which in all cases resulted in reduced binding to xylan. The spectral properties of chemically denatured mutants and the melting temperatures of the mutants were relatively unchanged compared to the wild-type CBM13, indicating that the structures of the mutants were not greatly perturbed. Therefore, the changes in affinity were due to the participation of three separate binding sites in the interaction with substrate.

The individual binding sites of CBM13 did not appear to show any differences in specificity for particular sugars. The fluorescence technique employed to monitor binding measured binding only at the α site (section 4.4.8); therefore, all of the sugars studied by this technique must bind to this site. This technique was biased towards interactions at the α site; however, the binding of CBM13 to xylan in affinity electrophoresis, which is not biased towards individual sites, could be effectively prevented by high concentrations of the same monosaccharides studied by fluorescence (section 4.4.7). Furthermore, the qualitatively assessed effectiveness of the monosaccharides as competitors was roughly proportional to the measured affinities (results not shown). For this to occur, monosaccharides must bind to all three binding sites (α, β, and γ) to effectively inhibit interaction with the polysaccharide.

Lectins often contain contiguous modules with multiple binding sites or form multimeric complexes consisting of two or more non-covalently linked modules.
The association constants of individual binding sites are frequently low (~10³-10⁵ M⁻¹). However, extremely high association constants (~10⁸-10⁹ M⁻¹) can result from the presence of multiple binding sites able to form multivalent interactions with substrates. CBM13 binds 10 to 20 times better to xylan than to xylo-oligosaccharides with a degree of polymerization of four or less. With binding sites in each of the α, β and γ domains, it is tempting to consider this increased affinity for a polyvalent ligand a consequence of avidity resulting from the interaction of multiple binding sites on CBM13 with multiple binding sites on the same xylan molecule. This is questionable. The quantification of binding to xylan does not take into account the stoichiometry of binding to a potentially polyvalent ligand. Also, due to the complexity of the analysis, the binding model employed did not take into account the identity and interaction of multiple binding sites on the protein.

Some estimation of the effects of these factors can be made. The structural model of CBM13 allows its diameter to be approximated at 30 Å and the separation of the binding sites to be approximated at 25 Å. The individual xylose monomers in xylan span approximately 5 Å. This gives an estimated molecular footprint of not less than 6 xylose monomers spanned by CBM13 and a minimum of 5 xylose monomers to span two binding sites. Given an average DP of 44 for xylan, no more than 7 molecules of CBM13 can bind per molecule of xylan. Realistically, this value would likely be lower. Using the estimated stoichiometry of 7 (n_xylan), a more representative association value of 0.8 × 10³ M⁻¹ (5.5 × 10³ M⁻¹ divided by 7) can be estimated.
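The footprint argument above reduces to a few lines of arithmetic, reproduced here in Python as a check. The variable names are illustrative, and the calculation simply restates the approximations given in the text (30 Å module diameter, 25 Å site separation, 5 Å per xylose residue, DP of 44); it is not a structural model.

```python
import math

MODULE_DIAMETER_A = 30.0   # approximate diameter of CBM13 (angstroms)
SITE_SEPARATION_A = 25.0   # approximate distance between binding sites (angstroms)
XYLOSE_SPAN_A = 5.0        # approximate length of one xylose residue in xylan (angstroms)
XYLAN_DP = 44              # average degree of polymerization of the xylan
KA_XYLAN = 5.5e3           # apparent association constant for xylan (M^-1)

# Number of xylose residues covered by one module, and needed to bridge two sites.
footprint = math.ceil(MODULE_DIAMETER_A / XYLOSE_SPAN_A)
residues_between_sites = math.ceil(SITE_SEPARATION_A / XYLOSE_SPAN_A)

# Upper bound on the number of modules that fit on one xylan chain.
max_modules_per_chain = XYLAN_DP // footprint

# Association constant corrected for that stoichiometry (M^-1).
ka_per_xylan_site = KA_XYLAN / max_modules_per_chain

print(footprint, residues_between_sites, max_modules_per_chain, round(ka_per_xylan_site))
```

With these inputs the footprint is 6 residues, 5 residues are needed to span two sites, at most 7 modules fit on a chain, and the corrected constant is approximately 0.8 × 10³ M⁻¹.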
This is still larger than the values measured for xylo-oligosaccharides, but the advantage in binding energy is minimal (ΔΔG ≈ -4 kJ/mole at 20 °C), hardly consistent with the large increases in binding energies resulting from multivalency seen in lectin systems (ΔΔG ≈ -20 kJ/mole at 20 °C).

A more plausible explanation stems from the assumption that only one of the binding sites in CBM13 can interact with the same xylan molecule at a given moment. In this instance, the apparent macroscopic association constant (K_app) for xylan becomes the sum of the individual microscopic association constants (K_app = K_α + K_β + K_γ) rather than the product. This results from the increased probability that one of three sites will "meet" ligand compared with when only one site is present. Taking this into account, together with the stoichiometry of xylan, the average microscopic association constant of a single binding site for xylan can be estimated to be approximately 0.3 × 10³ M⁻¹ (5.5 × 10³ M⁻¹ divided by 7 [n_xylan], further divided by 3 [n_CBM13]). This value falls between the association constants for xylobiose and xylotetraose, which, because their length cannot span two binding sites, must bind at single sites. Based on this explanation, in all probability, only one of the α, β, or γ binding sites can interact with a single binding site on a single xylan polymer. This does not exclude the possibility that multiple sites on a single CBM13 molecule can interact with multiple molecules of xylan; however, CBM13 did not show any ability to agglutinate xylan (results not shown), as would be expected if CBM13 crosslinked xylan molecules. Based on these arguments, it would appear that the only advantage of CBM13 having multiple competent binding sites is a statistical one: the macroscopic association constant is increased to the sum of the individual microscopic association constants.

Carbohydrate-binding modules frequently occur as tandems.
Examples are found in families 4, 6, 16 and, almost without exception, in family 9 (Coutinho et al., 1992; Sakka et al., 1996; Cann et al., 1999; Winterhalter et al., 1995). Despite their frequency, no advantage to binding has yet been demonstrated for the tandem modules; the biological relevance of repeated CBMs in glycosyl hydrolases remains unknown. CBM13 is the first example of a single binding module found in a glycosyl hydrolase that has multiple binding sites. Yet CBM13 demonstrates another instance where the multivalency of a binding module does not appear to provide a significant advantage in binding to polysaccharides.

4.5.2 Polysaccharide binding specificity.

CBM13 has a greater specificity for polysaccharides than it does for monosaccharides. For example, CBM13 binds galactose but not when it is linked β1-3 or β1-4 in galactan or agarose. Although it does bind to arabino-galactan, this seems to be the exception, given the lack of binding to other galactans. It is probable that CBM13 is binding to arabino-galactan via the arabinose substituents branched from the galactan backbone. Glucose is bound by CBM13 as a monosaccharide or as an α1-4 or β1-6 linked disaccharide, yet longer α1-4 polymers of glucose (starch) or polymers of glucose that are predominantly β1-4 linked (cellulose, derivatized cellulose, and the β-glucans) are not bound by CBM13. Curiously, glucose polymers that are mostly β1-3 linked (laminarin, lichenan, and pachyman) are bound by CBM13, but weakly.

The structural model of CBM13 (Figure 4.2) predicts shallow binding sites able to accommodate only monosaccharide residues. This is true of RTB, which, though it binds the disaccharide lactose, binds only the galactose moiety (Rutenber et al., 1991). It also seems consistent with the observation that CBM13 demonstrates no advantage in binding xylobiose over xylose. CBM13 probably recognizes the same features in monosaccharides as it does in the monomer units of polysaccharides, features that, at least with glucose and galactose, are unavailable in the polymers of these sugars. This is clearly not the case with xylose and xylan, which are both bound effectively by CBM13. Two explanations are plausible: 1) xylose is bound differently than glucose and galactose monomers, and perhaps all of the six carbon sugars, and the presence of the linkage in the polymers is not amenable to binding, or 2) intramolecular interactions, such as hydrogen bonding between adjacent sugar residues, in glucose polymers and galactose polymers result in a polysaccharide conformation that is not appropriate as a ligand for CBM13. As discussed in section 4.5.4, multiple binding orientations of the monosaccharide sugars are probable. Accepting the possibility of various monosaccharides being bound in different orientations, the nature of the glycosidic linkage in a polysaccharide will be important in determining the acceptability of the polysaccharide as a ligand.

The poor binding to cellobiose provides some evidence that polysaccharide conformation may be important. Because CBM13 binds five carbon and six carbon pyranose sugars without preference, it is evident that the C6 hydroxymethyl group is unimportant in defining a monosaccharide as a ligand. Xylobiose and cellobiose differ in chemical composition by the lack of the C6 groups in xylobiose, but the glycosidic linkage and the orientation of the hydroxyl groups are identical; yet cellobiose is a poor ligand for CBM13 whereas xylobiose is a relatively good ligand. The conformation of these two disaccharides differs in that the glucose units of cellobiose are rotated around the glycosidic bond 180° relative to one another and held in this conformation by intramolecular hydrogen bonds between the C3 hydroxyl group and the pyranose oxygen of the adjacent sugar (Atalla, 1993).
This conformation must mask what would be recognized in xylobiose. It is probable that both the orientation in which a monosaccharide residue is bound by CBM13 and the conformation of the polysaccharide determine its suitability as a ligand for CBM13.

4.5.3 Specificity of small sugar binding.

The amino acid sequence similarity of CBM13 and RTB is relatively high. Based on this, it was hypothesized that CBM13 may have a binding affinity for monosaccharides. This was indeed the case. The affinities were very low, but not unusual for many lectins (Lis et al., 1998). Binding was cation independent, as it is for all of the characterized lectin members of the family 13 CBMs, such as RTB, abrin B chain, Sambucus nigra lectin and mistletoe lectin. CBM13 showed very little specificity for monosaccharides, which is unusual for carbohydrate binding proteins but not unprecedented, as demonstrated by Momordica charantia (bitter gourd) lectin (Das et al., 1981).

Rearrangement of solvent molecules caused by the proximity of complementary polyamphiphilic surfaces presented by acceptor and ligand has been strongly implicated in providing the energy to drive the association of proteins and carbohydrates (Lemieux et al., 1991; Lemieux, 1996). One important process is the desolvation of polar groups caused by hydrogen bond formation. Hydrogen bonds are highly directional and their formation has been assumed to be largely responsible for determining sugar-binding specificity. Because CBM13 shows little or no preference for monosaccharides based on the positioning of the sugar hydroxyl groups, it appears that polar interactions play a minor role in the recognition of carbohydrates by CBM13. This is contradicted by the aspartic acid mutations decreasing the affinity of CBM13 for xylan. Homologous aspartic acid residues in RTB form strong hydrogen bonds with galactose.
In CBM13, if these aspartic acid residues form polar interactions, this may be specific to binding xylan, or it may reflect an indirect role of these residues, such as maintaining a specific conformation of the binding site. Alternatively, the size and shape, rather than the polarity, of the aspartic acid residues may provide the appropriate binding interactions.

The only feature common to all of the ligands for CBM13 is the pyranose ring. Interaction of aromatic residues with the relatively hydrophobic surface presented by the carbons of the pyranose ring has been observed in most lectin-carbohydrate interactions (Weis et al., 1996). Desolvation of these surfaces has been proposed to provide favorable binding entropy (Lemieux, 1996). It is conceivable that this type of interaction provides a significant fraction of the net energy and a corresponding lack of specificity in substrate binding by CBM13.

CBM13 binds five carbon and six carbon pyranose sugars without preference. This argues that the C6 hydroxymethyl group of the six carbon sugars is unimportant in the binding of ligand to CBM13. This is in contrast to RTB, where two different interactions with the C6 group are important in binding galactose. In the 1α site the hydroxyl group on C6 forms a hydrogen bond with glutamine 35. In the 2γ site a hydrophobic contact between isoleucine 246 and the C6 group replaces the hydrogen bond seen in the 1α site (Rutenber et al., 1991). Though CBM13 has glutamine residues homologous to glutamine 35 and, therefore, capable of forming hydrogen bonds with the ligand, this interaction does not appear to be a determining factor in ligand binding.

4.5.4 Orientations of bound sugars.

The bulk of the energy for the binding of ligand to CBM13 probably comes from the stacking interaction of an aromatic residue against the pyranose ring and other non-polar interactions.
In some mannose binding lectins, such relatively promiscuous interactions allow L-fucose (6-deoxy-α-L-galactose) to bind in a flipped and rotated orientation such that different portions of the pyranose ring present a structure similar to that of mannose (Lis et al., 1998). Such a phenomenon may explain the lack of specificity of CBM13, but it also raises another possibility. In the absence of strong hydrogen bonds, there may be orientations of different pyranose sugars such that other portions of the pyranose ring can form favorable interactions with the aromatic residue, or there may even be multiple orientations of a single pyranose sugar. The observation that CBM13 can bind several L-series monosaccharides suggests that multiple binding orientations must be possible.

Some spectroscopic evidence implies the same. The binding of galactose and lactose to CBM13 resulted in enhanced fluorescence emission, in contrast to the quenching of fluorescence emission observed for the binding of all non-galactose containing ligands. This suggests that galactose and lactose bind by different modes than the other monosaccharides. RTB binds galactose with the C4 hydroxyl of galactose hydrogen bonding with an aspartic acid residue and oriented towards the interior of the module (Rutenber et al., 1991). The fluorescence spectra of CBM13 upon the binding of galactose and lactose were similar, suggesting that lactose is bound via the galactose moiety rather than the glucose moiety. Because CBM13 does not bind β1-4 or β1-3 linked galactose but does bind lactose (galactose β1-4 linked to glucose), the preferred binding orientation in this case would be similar to that of RTB. The galactose at the non-reducing end of lactose is oriented such that the C4 and C3 hydroxyls are pointing into the binding site and the pyranose ring interacts with the aromatic residue.
In fact, the affinity of CBM13 for lactose (~1 × 10³ M⁻¹) is not dissimilar to the affinity of the 1α site of RTB for lactose (~3 × 10³ M⁻¹). It is very possible that CBM13 binds the non-reducing ends of galactose polymers; however, due to the relatively low concentration of these ends, the low solubility of the polysaccharides, and the low affinities, binding would be difficult to detect.

CBM13 binds preferentially to linear β-1,4 linked xylose polymers, in which a similar positioning of the C4 hydroxyl is impossible because it is tied up in the glycosidic bond. For xylose, the bound orientation is likely one that leaves the C1 and C4 hydroxyls relatively free from the binding site with the C2 and C3 hydroxyls pointing into the binding site. Such an arrangement should allow binding to xylose polymers and is consistent with the reduced binding of 2-deoxy-glucose, assuming glucose binds in a similar orientation, and may explain the inability to bind cellobiose, where the C3 hydroxyl is oriented to participate in an intramolecular hydrogen bond.

Multiple modes of substrate binding are not unprecedented for carbohydrate binding proteins. E. coli maltose binding protein (MBP) binds β-cyclodextrin and maltodextrins in slightly different orientations but in the same binding site (Hall et al., 1997a; Hall et al., 1997b). This can be detected by differences in the UV difference spectra of complexed MBP.

4.5.5 Structural and evolutionary implications.

4.5.5.1 The β-trefoil: a scaffold for carbohydrate recognition.

The β-trefoil fold was first observed in the structure of soybean trypsin inhibitor (Sweet et al., 1974). It has subsequently been found in several proteins of diverse functions. The fold contains 12 strands of β-sheet, forming six hairpin turns. A β-barrel structure is formed by six of the strands, attendant with three hairpin turns. The other three hairpin turns form a triangular cap on one end of the β-barrel called the "hairpin triplet".
The subunit of this fold is a contiguous amino acid sequence with a four β-strand, two-hairpin structure having a trefoil shape. Each subunit contributes one hairpin (two β-strands) to the β-barrel and one hairpin to the hairpin triplet. The fold of the resulting molecule has a pseudo threefold axis (Figure 4.2; see Murzin et al., 1992 for a review of β-trefoils).

It is generally accepted that similar amino acid sequences result in similar folds but that sequence similarity is not necessary for similar folds. This appears true for β-trefoil folds, which can have primary structures that are dissimilar. However, the hydrophobic amino acid residues that form the hydrophobic contacts in the barrel and hairpins appear well conserved in polypeptides adopting this fold. Much greater sequence variability is seen in the β-sheet regions not involved in forming the core of the protein (Murzin et al., 1992). As a result, functional sites can be found in many positions on the surface of β-trefoil proteins. In the case of the family 13 plant lectins, the substrate binding sites are small pockets formed between two hairpins, one in the hairpin triplet of a single trefoil subunit. An advantage of the symmetric nature of the β-trefoil fold is the potential for up to three similar functional sites on a single molecule. This is seen with CBM13 and the family 13 plant lectins but is exploited only by the latter, as evidenced by multivalent binding interactions. The frequency of CBMs with sequences amenable to this fold, as judged by the relatively large number of family 13 CBMs (Figure 4.15), shows that the β-trefoil fold is an efficient scaffold for carbohydrate binding sites.

4.5.5.2 Evidence for a carbohydrate recognition motif.

Family 13 contains about 21 modules from 19 different putative CBMs (A. Boraston & P. Tomme, unpublished). These modules show no less than 20% identity with one another.
A less stringent comparison using a Hidden-Markov Model (Krogh et al, 1994) allowed the compilation of a large and diverse CBM super-family containing approximately 49 modules from up to 35 different proteins (Figure 4.15). These proteins share the repeated domain motif and conserved hydrophobic residues that are indicative of a β-trefoil fold. It has been proposed that Gly-X-X-X-Gln-X-Trp motifs mark a consensus for this super-family of carbohydrate binding modules (Hirabayashi et al, 1998). In fact, this glycine residue is poorly conserved and, based on the studies of RTB, does not participate in ligand binding. Furthermore, this motif does not have a complement of residues sufficient to bind ligand. Taking into account the structural studies of the β-trefoil plant lectins and alignment of the "super-family 13", a more accurate motif to predict the carbohydrate binding capacity of β-trefoil proteins can be suggested. The potential binding residues can be generalized as such: Pb-(X)₆₋₁₂-Gln-X-Ab-(X)₆₋₁₀-Asn-Gln, where Pb represents polar groups capable of forming hydrogen bonds, most frequently aspartic acid, glutamine, glutamic acid; X, any amino acid (subscript numbers denote the possible number of residues); Ab, aromatic residue involved in binding, always tryptophan or tyrosine. As discussed in section 4.5.5.1 this binding motif uses a β-trefoil fold as a scaffold. Placed within this context, structural residues can be included in the motif as such: Hc-Pb-(X)₅₋₁₁-Hc-Gln-Hc-Ab-(X)₆₋₁₀-Asn-Gln-(X)₁₋₂-Ac, where Hc denotes structural hydrophobic residues, usually aliphatic, forming the core of the protein and Ac represents a structural aromatic residue, most frequently tryptophan, in the core of the protein. Though the cysteine residues that form disulfide bonds in RTB are quite well conserved in the super-family there is no evidence that they are structurally important or important in ligand binding; the disulfide bond is lacking in the 2γ domain of RTB, which is an effective ligand-binding site. The three domains (α, β, and γ) of CBM13 all conform to this proposed carbohydrate binding motif, and indeed, each domain has a binding site. Understanding this binding motif may provide the means to engineer carbohydrate-binding sites with desired specificities into β-trefoil folds.

[Figure 4.15: Amino acid sequence alignment of the α domains of the family 13 "super-family". All entries have additional β and γ domains which were omitted for brevity. Abbreviations for sequence entries are given in table 4.3. Numeral extensions following a decimal indicate entries from module 1 or module 2 of repeated modules. The alignment itself is not reproduced in this text version.]

Table 4.3 Abbreviations to figure 4.15.

Abbreviation    Identification
SLXA            Streptomyces lividans xylanase 10A
SLABFB          Streptomyces lividans arabinofuranosidase B
SCABF           Streptomyces coelicolor arabinofuranosidase B
ARTSPGH         Arthrobacter sp. β1-3 glucanase
OXGH            Oerskovia xanthineolytica β1-3 glucanase
RFRPI           Rarobacter faecitabidus serine protease
TTCFG           Tachypleus tridentatus factor G α-subunit
RCRTB           Ricinus communis toxin B-chain
RCRICINE        Ricinus communis Ricin E
APABRIND        Abrus precatorius Abrin D
APABRINA        Abrus precatorius Abrin A
VAPPML          Viscum album mistletoe lectin
VAML            Viscum album pre-promistletoe lectin
SNAGGL          Sambucus nigra agglutinin
SNRIP1          Sambucus nigra ribosome inactivating protein
SNRIP2          Sambucus nigra ribosome inactivating protein
SNRIP3          Sambucus nigra ribosome inactivating protein
SNRIP4          Sambucus nigra ribosome inactivating protein
SNRIP5          Sambucus nigra ribosome inactivating protein
CBHA1           Clostridium botulinum hemagglutinin
CBHA2           Clostridium botulinum hemagglutinin
CBHA3           Clostridium botulinum hemagglutinin
CBHA4           Clostridium botulinum hemagglutinin
BSMT            Bacillus sphaericus mosquitocidal toxin
CEGLY3          Caenorhabditis elegans
CEGLY5C         Caenorhabditis elegans
CEGLY7          Caenorhabditis elegans
VVCTYOLYSI      Vibrio vulnificus cytolysin precursor
ANAGAL          Aspergillus niger α-galactosidase
MTH16.9         Mycobacterium tuberculosis 16.9 kDa hypothetical protein
HSMANR          Homo sapiens mannose receptor
HSPAGT1         Homo sapiens polypeptide GalNAc transferase
HSPAGT2         Homo sapiens polypeptide GalNAc transferase
HSPAGT3         Homo sapiens polypeptide GalNAc transferase
MMPAGT          Mus musculus polypeptide GalNAc transferase

4.5.5.3 Microbial family 13 carbohydrate-binding modules: an evolutionary link.

Rutenber and Robertus hypothesized that the ancestor of the galactose binding β-trefoil lectins was a small galactose-binding polypeptide, similar to the 1α domain of RTB, that was capable of self-assembly into a trimer with similar architecture to a β-trefoil fold (Rutenber et al, 1991; Rutenber et al, 1987). They surmised that gene triplication and fusion resulted in a contiguous polypeptide with a β-trefoil fold. Further evolutionary processes resulted in the duplicated modules and evolved binding site configuration of RTB. Though speculative, this explanation appears plausible and can likely be extended to all of the lectin members of family 13 CBMs. CBM13 provides a useful link in this evolutionary scheme.
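As an illustration of how the binding motif proposed in section 4.5.5.2 might be used to screen sequences, the sketch below writes Pb-(X)6-12-Gln-X-Ab-(X)6-10-Asn-Gln as a regular expression over one-letter amino acid codes. It is a minimal sketch under stated assumptions, not the procedure used in this work: Pb is restricted to Asp, Glu, and Gln (the text says only that these are the most frequent), Ab is Trp or Tyr, and the function name and the test sequence are hypothetical.

```python
import re

# Assumption: Pb is limited to D, E, Q ("most frequently" in the text).
# Ab is W or Y ("always tryptophan or tyrosine"); X is any residue.
BINDING_MOTIF = re.compile(r"[DEQ].{6,12}Q.[WY].{6,10}NQ")


def find_binding_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (start index, matched segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in BINDING_MOTIF.finditer(sequence.upper())]


# Synthetic sequence built to satisfy the motif; it is not a real protein.
example = "GG" + "D" + "A" * 8 + "Q" + "A" + "W" + "A" * 7 + "NQ" + "GG"
hits = find_binding_motifs(example)
print(hits)
```

A hit only shows that a sequence is compatible with the motif. Whether a site actually binds sugar also depends on the β-trefoil scaffold discussed in section 4.5.5.1, which a pattern search of this kind does not test.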
CBM13, and all of the bacterial members of family 13, represent the stage in this process immediately following the step of gene triplication where the three repeated domains must be very similar in sequence. These domains do most closely resemble the 1α domain of RTB (section 4.4.1). One possible flaw in this proposed evolutionary scheme is the assumption of a galactose specific ancestor. Though galactose and N-acetyl-galactosamine appear to be among the most biologically relevant sugars in higher eukaryotic systems it is unlikely that specificity for either sugar developed first. It seems more likely that specificities for more "primitive" sugars common to lower organisms, such as glucose and mannose, would develop first. If the assumption that CBM13 represents an evolutionary predecessor to RTB is correct, then it is unlikely that the predecessor to modules like CBM13 would have greater specificity than CBM13. Confirmation of low ligand specificity in other microbial family 13 CBMs would support this. A preferable evolutionary ancestor to the family 13 CBMs might be a self-trimerizing 40 amino acid polypeptide with broad sugar binding specificity or specificity for sugars such as glucose and mannose. Overall, such a scenario implies the lateral transfer of microbial family 13 CBMs to plants and animals. It also raises the question of why binding modules with low specificity and low affinity, in other words poor binding modules, have persisted throughout evolution. Either evolution is a "forgetful" and imperfect process or these binding modules serve functions not yet ascertained.

Chapter 5

Carbohydrate-Binding Modules: An emerging perspective on substrate recognition

5.1 Carbohydrate-binding modules: "The big picture".

Proteins that bind carbohydrates have been classified into two groups (Quiocho, 1986; Quiocho, 1988).
Group I carbohydrate-binding proteins (CBPs) bind ligands with association constants of 10⁶ M⁻¹ or greater and have binding sites that enclose the substrate. Many sugar transport proteins and glycosyl hydrolase catalytic modules have these properties. The group II CBPs bind ligands at open binding sites with association constants of 10⁶ M⁻¹ or less. Lectins are the archetypal example of this CBP group. All of the characterized CBMs are group II CBPs. The affinities of CBMs for various ligands are generally 10³-10⁶ M⁻¹. Furthermore, the CBMs for which structures have been determined have shallow or even flat ligand-binding sites. This classification is useful in placing CBMs in the context of the large number of diverse proteins capable of binding carbohydrates; however, a more useful organization of CBMs is emerging that enables comparison of their structures and functions.

The classification of CBMs into families is based on the amino acid sequence similarity of the modules. Members of a family have similar structures and binding properties. However, the family classification does not correlate with structural and functional similarities that are independent of sequence. Structural and functional similarities between CBM families are evident and have prompted the organization of CBMs into three types: A, B, and C.

5.2 Type A carbohydrate-binding modules.

Type A CBMs are functionally distinguished by their binding specificity for crystalline polysaccharides, such as cellulose and chitin, with little or no affinity for soluble sugars. Adsorption to these ligands is frequently observed to be irreversible, although this is not an obligate property of the type A CBMs. The binding sites are flat with several (usually three) exposed aromatic amino acid residues, most frequently tryptophans and tyrosines. CBM families 1, 2a, 3, and 5 are type A CBMs. The CBMs from families 1, 2a, 3 and 5 are all β-sheet proteins with very similar binding sites (Figure 5.1).
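For orientation, the association constants quoted in section 5.1 can be related to standard binding free energies through ΔG° = -RT ln Ka. The sketch below is a worked illustration only: the temperature of 298.15 K and the 1 M standard state are assumptions made here, not conditions reported for the measurements, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy_kj(ka_per_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from an association constant in M^-1.

    Assumes a 1 M standard state; the default temperature (25 degrees C) is an assumption.
    """
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return -R * temperature_k * math.log(ka_per_molar) / 1000.0


# Lower and upper ends of the affinity range given for CBMs (10^3 to 10^6 M^-1).
for ka in (1e3, 1e6):
    print(f"Ka = {ka:.0e} M^-1 -> dG = {binding_free_energy_kj(ka):.1f} kJ/mol")
```

Under these assumptions the 10³-10⁶ M⁻¹ range corresponds to roughly -17 to -34 kJ/mol, so the thousand-fold spread in affinity between the weakest and tightest CBM interactions is about a two-fold spread in free energy.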
The aromatic residues that generate the relatively hydrophobic character of these binding sites have been shown by site directed mutation to be critically important in substrate binding (Din et al, 1994b; Poole et al, 1993; Linder et al, 1995a; Reinikainen et al, 1992). The flat surface of the binding sites formed by these aromatic residues seems ideally suited to bind to the relatively flat surfaces of cellulose crystals. The surface of a cellulose crystal is made up of the hydroxyl substituents of the cellulose chains and only a small portion of the edge of the pyranose rings exposed in a staircase structure (Sarko, 1986; see chapter 3). Because of this, it is unlikely that individual cellulose chains are important in binding, but the surface created by the edges of many cellulose chains is.

The type A CBMs have very low or negligible affinity for cello-oligosaccharides (Bolam et al, 1998; Mattinen et al, 1997). This appears inconsistent with the binding of type A CBMs to amorphous cellulose, which is considered to have very little or no crystalline content. Considering the lack of affinity for cello-oligosaccharides, it is unlikely that these CBMs bind to the free cellulose chains in amorphous cellulose. It is more likely that the process of generating amorphous cellulose through the acid swelling of insoluble cellulose or complete acid-dissolution and regeneration of insoluble cellulose results in regions of micro-crystallinity with a high surface area but low contribution to the bulk crystallinity. Thus, it can be argued that type A CBMs are truly specific for crystalline substrates.

The calorimetric studies with CBM2a indicated that binding to crystalline cellulose was entropically driven, characteristic of dehydration of the complex resulting from the hydrophobic nature of the binding site (Creagh et al, 1996; Chapter 1). As yet other type A CBMs have not been tested. However, the family 1, 3 and 5 CBM binding sites are similar in structure and composition to the binding site of CBM2a suggesting that similar binding thermodynamics will eventually be considered a feature of the type A CBMs.

[Figure 5.1: Structures of type A carbohydrate-binding modules. Aromatic residues shown or proposed to be involved in cellulose binding are shown in red. The solvent surfaces of the molecules are shown in transparent gray. Panel A: NMR structure of bm1TrCel7A (Mattinen et al, 1997). Panel B: NMR structure of bm2aCfXyn10A (CBM2a) (Xu et al, 1995). Panel C: crystal structure of bm3CtCipC (Tormo et al, 1996). Panel D: NMR structure of bm5EcCel5A (Brun et al., 1997).]

5.3 Type B carbohydrate-binding modules.

Type B binding modules differ from type A binding modules because they show specificity for soluble polysaccharides. Some bind to insoluble substrates, such as amorphous cellulose and insoluble xylan. They do not bind mono- or disaccharides. The binding sites of type B CBMs are extended and may appear as a groove. The type B binding modules currently include members of families 2b and 4.

The type B binding modules can be considered as "chain binders" because the suitability of a sugar as a ligand is in part determined by its degree of polymerization. The minimum chain length appears to be three sugar units or longer (Tomme et al, 1996). The family 4 CBMs bind to amorphous cellulose and the family 2b CBMs bind to insoluble xylan. However, these substrates have a significant non-crystalline content. The family 4 CBMs do not bind crystalline cellulose (Tomme et al, 1996). Binding to insoluble cellulose, therefore, is presumably to exposed single chains in the amorphous regions. Based on the architecture of the type B CBM binding sites, the adsorption of all type B CBMs to insoluble substrates is most likely to occur at these amorphous regions rather than the crystalline regions.

The binding sites of the type B CBMs are extended shallow grooves (Figure 5.2).
The conformation of these binding sites makes them well suited to accommodate sugar chains. Aromatic amino acid residues in these binding sites are important in binding substrate (Kormos, 1998; Johnson et al, 1996a; Johnson et al, 1996b; Simpson et al, 1999). Although this has not yet been directly observed, it has been proposed that these binding sites accommodate the entire width of the sugar chain (Johnson et al, 1996a). The binding site would then be capable of interacting with the entire pyranose ring of the sugar residues and the hydroxyl substituents on both sides of the sugar chain. This is in contrast to the type A CBMs, which can only access the edge of the sugar chains in crystalline cellulose.

[Figure 5.2: Structures of type B carbohydrate-binding modules. Aromatic residues shown or proposed to be involved in sugar binding are shown in red. The solvent surfaces of the molecules are shown in transparent gray. Panel A: NMR structure of bm4CfCel9B (Johnson et al, 1996). The hydrophobic base of the binding cleft is shown in yellow. Panel B: NMR structure of bm2bCfXyn11A showing an end-on view of the binding cleft at the bottom of the molecule (Simpson et al., 1999).]

5.4 Type C carbohydrate-binding modules.

The type C CBMs are the least well-characterised type of CBM. They may bind soluble or insoluble polysaccharides but are differentiated from type A and B CBMs by their ability to bind mono- and disaccharides. Family 13 is the best-characterised family that meets these criteria. Less complete studies on members of family 6 and 9 CBMs suggest that those families are also type C CBMs.

The ability of the type C CBMs to bind small soluble sugars implies that they have binding sites of a corresponding size. Compared to the extended binding sites of the type A and B CBMs the type C binding sites are predicted to be relatively small, more closely resembling the sugar binding sites of lectins.
Structural data is available only for the family 13 CBMs, which, as discussed in Chapter 4, have typical lectin-like binding sites (Figure 5.3). Some members of families 6 and 9 can be desorbed from insoluble cellulose with glucose or cellobiose (Winterhalter et al, 1995; Sakka et al, 1998). Quantitative measurement of binding by fluorescence spectroscopy, UV spectroscopy, and microcalorimetry indicates that these CBMs do bind these sugars (Boraston, McLean, Alam, and Creagh, unpublished). Therefore, they are classified tentatively as type C CBMs.

[Figure 5.3: Structures of type C carbohydrate-binding modules. Residues shown or proposed to be involved in ligand binding are shown in red, blue, and yellow. The solvent surfaces of the molecules are shown in transparent gray. Panel A: X-ray crystal structure of RTB module 1 (Rutenber et al., 1991). The bound lactose molecule is shown in green. Panel B: Model of CBM13 showing the three carbohydrate binding sites.]

5.5 Unifying properties of the CBM types.

Functional or structural characterization of all types of CBMs has shown aromatic residues to be important to substrate binding. This is extremely common to carbohydrate-binding proteins (Drickamer, 1997; Drickamer, 1995; Weis et al, 1996). X-ray crystallography studies of lectins and enzymes in complex with sugar show aromatic residues stacked against the pyranose rings of the sugar (Weis et al, 1996). In lectins the angle of the sugar relative to the aromatic amino acid can vary from 17° to 52° (Weis et al, 1996). Mutation shows these interactions to be pivotal to substrate binding in lectins and enzymes (Frankel et al, 1996; Frankel et al, 1996; Sphyris et al, 1995; Koivula et al, 1998). Of the family 13 CBMs, the stacking of aromatic residues against the pyranose rings of the sugar has been observed only in ricin-toxin B-chain.
However, the ubiquity of this interaction in other protein-carbohydrate complexes and the demonstrated importance of aromatic residues in CBM-carbohydrate interactions strongly imply similar roles for the aromatic residues.

Another common feature of CBM binding sites that has been noted but not well investigated is the frequent occurrence of aspartic acid, glutamic acid, asparagine, and glutamine amino acid residues. In lectins and enzymes complexed with sugar and studied by X-ray crystallography these residues have been observed to hydrogen-bond with the sugar hydroxyl groups. This has been demonstrated directly by X-ray crystallography of the family 13 CBM (type C), ricin-toxin B-chain, and indirectly by mutation of the related CBM13 (type C). Mutations of bm1TrCel7A (type A) and bm4CfCel9B (type B) indicate that such amino acid residues are important in binding to cellulose (Kormos, 1998; Linder et al, 1995a). However, these studies do not demonstrate directly how these amino acid residues interact with the substrate. Future high-resolution structure studies of CBM-carbohydrate complexes will undoubtedly identify a similar role for these potential hydrogen-bonding residues.

5.6 Summary.

This organization of CBMs is similar to the organization of the families of glycosyl hydrolases into clans. There are 77 glycosyl hydrolase families, each family based on amino acid sequence similarity, which are grouped into 10 clans, each clan based on structural fold similarities (Coutinho & Henrissat, 1999). In addition to sequence and structural information, the grouping of the CBMs into types provides functional information. Ultimately, drawing comparisons between the CBM families through this general classification may generate insights into the biological functions of CBMs through a better understanding of the general substrate-binding mechanisms.

Chapter 6

Bibliography

6.1 Bibliography

Assouline, Z., Graham, R., Miller, R.C.J., Warren, A.J. & Kilburn, D.G. (1995) Biotechnol.Prog. 11, 45-49
Assouline, Z., Shen, H., Kilburn, D.G. & Warren, R.A. (1993) Protein Eng. 6, 787-792
Atalla, R.H. (1993) in Trichoderma reesei Cellulases and Other Hydrolases (Suominen, P. & Reinikainen, T., eds), The structures of native celluloses, pp. 25-39, Foundation for Biotechnical and Industrial Fermentation, Helsinki
Austin, J.C., Rodgers, K.R. & Spiro, T.G. (1993) Methods Enzymol. 226, 374-396
Bayer, E.A., Shimon, L.J., Shoham, Y. & Lamed, R. (1998) J.Struct.Biol. 124, 221-234
Betzler, M., Dyson, P. & Schrempf, H. (1987) J.Bacteriol. 169, 4804-4810
Blackwell, J. (1982) in Cellulose and Other Natural Polymer Systems. Biogenesis, Structure, and Degradation (Brown, R.M., Jr., ed.), The macromolecular organization of cellulose and chitin, pp. 403-428, Plenum Press, New York
Blake, J.D. & Richards, G.N. (1971) Carbohydr.Res. 17, 253-268
Blum, D.L., Li, X.L., Chen, H. & Ljungdahl, L.G. (1999) Appl.Environ.Microbiol. 65, 3990-3995
Bolam, D.N., Ciruela, A., McQueen-Mason, S., Simpson, P., Williamson, M.P., Rixon, J.E., Boraston, A., Hazlewood, G.P. & Gilbert, H.J. (1998) Biochem.J. 331, 775-781
Boraston, A.B., McLean, B.W., Kormos, J.M., Alam, M., Gilkes, N.R., Haynes, C., Tomme, P., Warren, R.A.J. & Kilburn, D.G. (1999) Royal Society of Chemistry, Special Publication. In Press.
Bouveng, H.O., Garegg, P.G. & Lindberg, B. (1958) Chem. and Ind. 52, 1727
Brun, E., Gans, P., Marion, D. & Barras, F. (1995) Eur.J.Biochem. 231, 142-148
Brun, E., Moriaud, F., Gans, P., Blackledge, M.J., Barras, F. & Marion, D. (1997) Biochemistry 36, 16074-16086
Cann, I.K., Kocherginskaya, S., King, M.R., White, B.A. & Mackie, R.I. (1999) J.Bacteriol. 181, 1643-1651
Carrard, G. & Linder, M. (1999) Eur.J.Biochem. 262, 637-643
Chaplin, M.F. (1986) in Carbohydrate Analysis: a Practical Approach (Chaplin, M.F. & Kennedy, J.F., eds.), Monosaccharides, pp. 1-36, IRL Press, Oxford
Clare, J.J., Rayment, F.B., Ballantine, S.P., Sreekrishna, K. & Romanos, M.A. (1991) Biotechnology (N.Y.) 9, 455-460
Coutinho, J.B., Gilkes, N.R., Warren, R.A., Kilburn, D.G. & Miller, R.C.J. (1992) Mol.Microbiol. 6, 1243-1252
Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-Active Enzymes server at URL: http://afmb.cnrs-mrs.fr/~pedro/CAZY/db.html
Creagh, A.L., Ong, E., Jervis, E., Kilburn, D.G. & Haynes, C.A. (1996) Proc.Natl.Acad.Sci.U.S.A. 93, 12229-12234
Cregg, J.M., Vedvick, T.S. & Raschke, W.C. (1993) Biotechnology (N.Y.) 11, 905-910
Das, M.K., Khan, M.I. & Surolia, A. (1981) Biochem.J. 195, 341-343
Din, N., Forsythe, I.J., Burtnick, L.D., Gilkes, N.R., Miller, R.C., Warren, R.A. & Kilburn, D.G. (1994a) Molec.Microbiol. 11, 747-755
Din, N., Forsythe, I.J., Burtnick, L.D., Gilkes, N.R., Miller, R.C.J., Warren, R.A. & Kilburn, D.G. (1994b) Mol.Microbiol. 11, 747-755
Doheny, J.G., Jervis, E.J., Guarna, M.M., Humphries, R.K., Warren, R.A. & Kilburn, D.G. (1999) Biochem.J. 339, 429-434
Drickamer, K. (1995) Nat.Struct.Biol. 2, 437-439
Drickamer, K. (1997) Structure. 5, 465-468
Duman, J.G., Miele, R.G., Liang, H., Grella, D.K., Sim, K.L., Castellino, F.J. & Bretthauer, R.K. (1998) Biotechnol.Appl.Biochem. 28, 39-45
Dupont, C., Roberge, M., Shareck, F., Morosoli, R. & Kluepfel, D. (1998) Biochem.J. 330, 41-45
Dwek, R.A. (1995) Biochem.Soc.Trans. 23, 1-25
Eftink, M.R. (1991) Methods Biochem.Anal. 35, 127-205
Eftink, M.R. (1997) Methods Enzymol. 278, 221-257
Eftink, M.R. & Ghiron, C.A. (1981) Anal.Biochem. 114, 199-227
Elbein, A.D. (1984) CRC Crit.Rev.Biochem. 16, 21-49
Frankel, A., Tagge, E., Chandler, J., Burbage, C. & Willingham, M. (1996) Protein Eng. 9, 371-379
Frankel, A.E., Burbage, C., Fu, T., Tagge, E., Chandler, J. & Willingham, M.C. (1996) Biochemistry 35, 14749-14756
Fu, D., Chen, L. & O'Neill, R.A. (1994) Carbohydr.Res. 261, 173-186
Fujino, Y., Ogata, K., Nagamine, T. & Ushida, K. (1998) Biosci.Biotechnol.Biochem. 62, 1795-1798
Gavel, Y. & von Heijne, G. (1990) Protein Eng. 3, 433-442
Gilkes, N.R., Jervis, E., Henrissat, B., Tekant, B., Miller, R.C.J., Warren, R.A. & Kilburn, D.G. (1992) J.Biol.Chem. 267, 6743-6749
Gilkes, N.R., Kilburn, D.G., Miller, R.C.J. & Warren, R.A. (1989) J.Biol.Chem. 264, 17802-17808
Gilkes, N.R., Warren, R.A., Miller, R.C.J. & Kilburn, D.G. (1988) J.Biol.Chem. 263, 10401-10407
Gill, J., Rixon, J.E., Bolam, D.N., McQueen-Mason, S., Simpson, P.J., Williamson, M.P., Hazlewood, G.P. & Gilbert, H.J. (1999) Biochem.J. 342, 473-480
Glaudemans, C.P.J. & Timell, T.E. (1958) J. Am. Chem. Soc. 80, 941
Glyko, Inc. FACE monosaccharide composition kit manual. 1995
Glyko, Inc. FACE O-linked oligosaccharide kit manual. 1995
Glyko, Inc. FACE N-linked oligosaccharide kit manual. 1995
Goldstein, M.A., Takagi, M., Hashida, S., Shoseyov, O., Doi, R.H. & Segel, I.H. (1993) J.Bacteriol. 175, 5762-5768
Graham, R.W., Greenwood, J.M., Warren, R.A., Kilburn, D.G. & Trimbur, D.E. (1995) Gene 158, 51-54
Greek, L.S. Development and characterization of an optical fiber based instrument for ultraviolet resonance Raman spectroscopy of biomolecules. (1998) The University of British Columbia, Vancouver, British Columbia. Ph.D Thesis.
Greenwood, J.M., Gilkes, N.R., Kilburn, D.G., Miller, R.C.J. & Warren, R.A. (1989) FEBS Lett. 244, 127-131
Greenwood, J.M., Ong, E., Gilkes, N.R., Warren, R.A., Miller, R.C., Jr. & Kilburn, D.G. (1992) Protein Engineering
Grinna, L.S. & Tschopp, J.F. (1989) Yeast. 5, 107-115
Guarna, M.M., Cote, H.C., Amandoron, E.A., MacGillivray, R.T., Warren, R.A. & Kilburn, D.G. (1996) Ann.N.Y.Acad.Sci. 799, 397-400
Guex, N. & Peitsch, M.C. (1999) Electrophoresis 18, 2714-2723
Hall, J.A., Gehring, K. & Nikaido, H. (1997a) J.Biol.Chem. 272, 17605-17609
Hall, J.A., Thorgeirsson, T.E., Liu, J., Shin, Y.K. & Nikaido, H. (1997b) J.Biol.Chem. 272, 17610-17614
Henrissat, B., Teeri, T.T. & Warren, R.A. (1998) FEBS Lett. 425, 352-354
Heredia, A., Jimenez, A. & Guillen, R. (1995) Z.Lebensm.Unters.Forsch. 200, 24-31
Higgins, D.R. & Cregg, J.M. (1998) Methods Mol.Biol. 103, 1-15
Hirabayashi, J., Dutta, S.K. & Kasai, K. (1998) J.Biol.Chem. 273, 14450-14460
Houston, L.L. & Dooley, T.P. (1982) J.Biol.Chem. 257, 4147-4151
Invitrogen. Pichia pastoris expression manual. 1995
Jervis, E.J., Haynes, C.A. & Kilburn, D.G. (1997) J.Biol.Chem. 272, 24016-24023
Johnson, P.E., Creagh, A.L., Brun, E., Joe, K., Tomme, P., Haynes, C.A. & McIntosh, L.P. (1998) Biochemistry 37, 12772-12781
Johnson, P.E., Joshi, M.D., Tomme, P., Kilburn, D.G. & McIntosh, L.P. (1996a) Biochemistry 35, 14381-14394
Johnson, P.E., Tomme, P., Joshi, M.D. & McIntosh, L.P. (1996b) Biochemistry 35, 13895-13906
Koivula, A., Kinnari, T., Harjunpaa, V., Ruohonen, L., Teleman, A., Drakenberg, T., Rouvinen, J., Jones, T.A. & Teeri, T.T. (1998) FEBS Lett. 429, 341-346
Kormos, J. Mutational analysis of CBDN1. (1998) The University of British Columbia, Vancouver, British Columbia. M.Sc Thesis.
Krogh, A., Brown, M., Mian, I.S., Sjolander, K. & Haussler, D. (1994) J.Mol.Biol. 235, 1501-1531
Kukuruzinska, M.A. & Lennon, K. (1998) Crit.Rev.Oral Biol.Med. 9, 415-448
Le, K.D., Gilkes, N.R., Kilburn, D.G., Miller, R.C.J., Saddler, J.N. & Warren, R.A. (1994) Enzyme Microb.Technol. 16, 496-500
Lehle, L. (1992) Antonie Van Leeuwenhoek 61, 133-134
Lemieux, R.U. (1996) Accounts of Chemical Research 29, 373-380
Lemieux, R.U., Delbaere, L.T., Beierbeck, H. & Spohr, U. (1991) Ciba.Found.Symp. 158, 231-245
Lever, M. (1973) Biochem.Med. 7, 274-281
Linder, M., Mattinen, M.L., Kontteli, M., Lindeberg, G., Stahlberg, J., Drakenberg, T., Reinikainen, T., Pettersson, G. & Annila, A. (1995a) Protein Science 4, 1056-1064
Linder, M., Mattinen, M.L., Kontteli, M., Lindeberg, G., Stahlberg, J., Drakenberg, T., Reinikainen, T., Pettersson, G. & Annila, A. (1995b) Protein Sci. 4, 1056-1064
Linder, M. & Teeri, T.T. (1996) Proc.Natl.Acad.Sci.U.S.A. 93, 12251-12255
Lis, H. & Sharon, N. (1993) Eur.J.Biochem. 218, 1-27
Lis, H. & Sharon, N. (1998) Chemical Reviews 98, 637-674
Liu, G.Y., Grygon, C.A. & Spiro, T.G. (1989) Biochemistry 28, 5046-5050
Mach, H., Middaugh, C.R. & Lewis, R.V. (1992) Anal.Biochem. 200, 74-80
Malburg, S.R., Malburg, L.M.J., Liu, T., Iyo, A.H. & Forsberg, C.W. (1997) Appl.Environ.Microbiol. 63, 2449-2453
Mattinen, M.L., Linder, M., Drakenberg, T. & Annila, A. (1998) Eur.J.Biochem. 256, 279-286
Mattinen, M.L., Linder, M., Teleman, A. & Annila, A. (1997) FEBS Lett. 407, 291-296
Miele, R.G., Castellino, F.J. & Bretthauer, R.K. (1997a) Biotechnol.Appl.Biochem. 26, 79-83
Miele, R.G., Nilsen, S.L., Brito, T., Bretthauer, R.K. & Castellino, F.J. (1997b) Biotechnol.Appl.Biochem. 25, 151-157
Millward-Sadler, S.J., Davidson, K., Hazlewood, G.P., Black, G.W., Gilbert, H.J. & Clarke, J.H. (1995) Biochem.J. 312, 39-48
Miura, T., Takeuchi, H. & Harada, I. (1988) Biochemistry 27, 88-94
Morag, E., Lapidot, A., Govorko, D., Lamed, R., Wilchek, M., Bayer, E.A. & Shoham, Y. (1995) Appl.Environ.Microbiol. 61, 1980-1986
Murzin, A.G., Lesk, A.M. & Chothia, C. (1992) J.Mol.Biol. 223, 531-543
Nagao, M., Matsumoto, S., Masuda, S. & Sasaki, R. (1993) Blood 81, 2503-2510
Nagy, T., Simpson, P., Williamson, M.P., Hazlewood, G.P., Gilbert, H.J. & Orosz, L. (1998) FEBS Lett. 429, 312-316
Nordberg, K.E., Bartonek-Roxa, E. & Holst, O. (1997) Biochim.Biophys.Acta 1353, 118-124
Ong, E., Gilkes, N.R., Miller, R.C.J., Warren, A.J. & Kilburn, D.G. (1991) Enzyme Microb.Technol. 13, 59-65
Ong, E., Gilkes, N.R., Miller, R.C., Jr., Warren, R.A. & Kilburn, D.G. (1993) Biotechnol.Bioeng. 42, 401-409
Peterson, C.B. & Blackburn, M.N. (1985) J.Biol.Chem. 260, 610-615
Plummer, T.H.J. & Tarentino, A.L. (1991) Glycobiology. 1, 257-263
Poole, D.B., Hazlewood, G.P., Huskisson, N.S., Virden, R. & Gilbert, H.J. (1993) FEMS Microbiol.Lett. 106, 77-84
Poole, D.M., Hazlewood, G.P., Huskisson, N.S., Virden, R. & Gilbert, H.J. (1993) FEMS Microbiol.Lett. 80, 77-83
Quiocho, F.A. (1986) Annual Review of Biochemistry 55, 287-315
Quiocho, F.A. (1988) Current Topics in Microbiology and Immunology 139, 135-148
Ramirez, C., Fung, J., Miller, R.C.J., Antony, R., Warren, J. & Kilburn, D.G. (1993) Biotechnology (N.Y.) 11, 1570-1573
Reinikainen, T., Ruohonen, L., Nevanen, T., Laaksonen, L., Kraulis, P., Jones, T.A., Knowles, J.K. & Teeri, T.T. (1992) Proteins 14, 475-482
Rutenber, E., Ready, M. & Robertus, J.D. (1987) Nature 326, 624-626
Rutenber, E. & Robertus, J.D. (1991) Proteins 10, 260-269
Sakka, K., Karita, S., Kimura, T. & Ohmiya, K. (1998) Ann.N.Y.Acad.Sci. 864, 485-488
Sakka, K., Takada, G., Karita, S. & Ohmiya, K. (1996) Ann.N.Y.Acad.Sci. 782, 241-251
Sakon, J., Irwin, D., Wilson, D.B. & Karplus, P.A. (1997) Nat.Struct.Biol. 4, 810-818
Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual. Cold Spring Harbour Laboratory Press, New York
Sarko, A. (1986) in Cellulose: Structure, Modification and Hydrolysis (Young, R.A. & Rowell, R.M., eds), Recent X-ray crystallographic studies of celluloses, pp. 29-66, Wiley-Interscience, New York
Selvendran, R.R. & O'Neill, M.A. (1987) Methods Biochem.Anal. 32, 25-153
Shpigel, E., Goldlust, A., Efroni, G., Avraham, A., Eshel, A., Dekel, M. & Shoseyov, O. (1999) Biotechnol.Bioeng. 65, 17-23
Simpson, P.J., Bolam, D.N., Cooper, A., Ciruela, A., Hazlewood, G.P., Gilbert, H.J. & Williamson, M.P. (1999) Structure.Fold.Des. 7, 853-864
Sphyris, N., Lord, J.M., Wales, R. & Roberts, L.M. (1995) J.Biol.Chem. 270, 20292-20297
Stoll, D. The mannan degrading system of C.fimi. (1998) The University of British Columbia, Vancouver, British Columbia. Ph.D Thesis.
Sweet, R.M., Wright, H.T., Janin, J., Chothia, C.H. & Blow, D.M. (1974) Biochemistry 13, 4212-4228
Tahirov, T.H., Lu, T.H., Liaw, Y.C., Chen, Y.L. & Lin, J.Y. (1995) J.Mol.Biol. 250, 354-367
Takeo, K. (1985) Electrophoresis 5, 187-195
Tanner, W. & Lehle, L. (1987) Biochim.Biophys.Acta 906, 81-99
Tarentino, A.L., Quinones, G., Schrader, W.P., Changchien, L.M. & Plummer, T.H.J. (1992) J.Biol.Chem. 267, 3868-3872
Thompson, J.D., Higgins, D.G. & Gibson, T.J. (1994) Nucleic Acids Research 22, 4673-4680
Timell, T.E. (1959) J. Am. Chem. Soc. 81, 4989
Tomme, P., Boraston, A., McLean, B., Kormos, J., Creagh, A.L., Sturch, K., Gilkes, N.R., Haynes, C.A., Warren, R.A.J. & Kilburn, D.G. (1998) J.Chromatogr.B 715, 283-296
Tomme, P., Creagh, A.L., Kilburn, D.G. & Haynes, C.A. (1996) Biochemistry 35, 13885-13894
Tomme, P., Van Tilbeurgh, H., Pettersson, G., Van Damme, J., Vandekerckhove, J., Knowles, J., Teeri, T. & Claeyssens, M. (1988) Eur.J.Biochem. 170, 575-581
Tomme, P., Warren, R.A., Miller, R.C., Jr., Kilburn, D.G. & Gilkes, N.R. (1995) in Enzymatic Degradation of Insoluble Polysaccharides (Saddler, J.N. & Penner, M., eds.), Cellulose-binding domains: classification and properties, pp. in American Chemical Society.
Tormo, J., Lamed, R., Chirino, A.J., Morag, E., Bayer, E.A., Shoham, Y. & Steitz, T.A. (1996) EMBO J. 15, 5739-5751
Verostek, M.F. & Trimble, R.B. (1995) Glycobiology. 5, 671-681
Vincent, P., Shareck, F., Dupont, C., Morosoli, R. & Kluepfel, D. (1997) Biochem.J. 322, 845-852
Warren, R.A., Beck, C.F., Gilkes, N.R., Kilburn, D.G., Langsford, M.L., Miller, R.C. Jr., O'Neill, G.P., Scheufens, M. & Wong, W.K. (1986) Proteins. 1, 345-341
Weis, W.I. (1994) Structure. 2, 147-150
Weis, W.I. & Drickamer, K. (1996) Annu.Rev.Biochem. 65, 441-473
Winterhalter, C., Heinrich, P., Candussio, A., Wich, G. & Liebl, W. (1995) Mol.Microbiol. 15, 431-444
Xie, R.L. & Long, G.L. (1995) J.Biol.Chem. 270, 23212-23217
Xu, G.Y., Ong, E., Gilkes, N.R., Kilburn, D.G., Muhandiram, D.R., Harris-Brandts, M., Carver, J.P., Kay, L.E. & Harvey, T.S. (1995) Biochemistry 34, 6993-7009
Zentz, C., Frenoy, J.P. & Bourrillon, R. (1978) Biochim.Biophys.Acta 536, 18-26
