Open Collections

UBC Theses and Dissertations

A metagenomic search for glycoside phosphorylases using a phosphate dependent 2,4-dinitrophenyl glycoside… Macdonald, Spence 2014

A METAGENOMIC SEARCH FOR GLYCOSIDE PHOSPHORYLASES USING A PHOSPHATE DEPENDENT 2,4-DINITROPHENYL GLYCOSIDE COLORIMETRIC ASSAY

by

Spence Macdonald

B.Sc., Wilfrid Laurier University, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

December 2014

© Spence Macdonald, 2014

Abstract

Carbohydrate active enzymes (CAZymes) comprise a large class of enzymes responsible for the assembly and degradation of glycans in biological systems. CAZymes are widely employed in industry, being used in brewing and food processing, animal feed preparation, industrial pulp and paper applications and, increasingly, in biofuel development. While the use of CAZymes is cost-effective in glycan degradation, glycan assembly generally requires the use of expensive nucleotide sugar phosphates as a starting material. The high cost of these materials makes an assembly approach towards industrial-scale glycan synthesis difficult and usually non-viable. One class of CAZyme that has received little attention from industry is that of the glycoside phosphorylases (GPases), which offer a potential solution to the high costs associated with glycan synthesis.
These enzymes bypass the need for expensive nucleotide sugar phosphates, and thus we believe that approaches employing GPases will be of high market value. The bottleneck in this approach to glycan synthesis is currently the very limited range of GPases available, which limits the classes of glycan that can be assembled. To help increase the spectrum of known GPases available, we have turned to metagenomics as a means to discover new enzymes belonging to this family. This involves high-throughput (HT) screening of bacterial genome fragments, recovered directly from the environment, for expression of novel GPases. Here, we report the development of a novel HT screening methodology that allows the screening of large libraries constructed from metagenomic DNA. A dual approach is described using functional screening and bioinformatic techniques. By using a synthetic substrate analogue that produces a colorimetric response when processed by a GPase, we are able to identify functional candidates from libraries containing upwards of 25 000 clones at a time. Likewise, utilising previous sequence data accumulated by our collaborators (Hallam Lab), we were able to identify GPases based on sequence homology. To date, through this screening methodology we have discovered 5 new GPases and a new class of CAZymes: the stereochemistry-retaining β-glycoside phosphorylases.

Preface

This thesis is original, unpublished, independent work by the author, Spence Macdonald.

Table of contents

Abstract
Preface
Table of contents
List of tables
List of figures
List of schemes
List of abbreviations
Acknowledgements
1 Introduction
  1.1 Carbohydrates in biology
  1.2 Commercial applications of glycosidases
  1.3 Glycoside phosphorylases (GPases)
  1.4 Catalytic mechanism
  1.5 Applications of GPases
  1.6 Metagenomics
  1.7 Aim of thesis
2 Results and discussion
  2.1 Development of a phosphate-dependent screen for GPases
    2.1.1 Isolating GPase activity
    2.1.2 Known cellobiose and cellodextrin phosphorylases as test enzymes
    2.1.3 Chromogenic substrates
  2.2 Metagenomic screen
    2.2.1 Source of environmental DNA
    2.2.2 Expression system
    2.2.3 FOS62 library
    2.2.4 TolDC library
    2.2.5 Consolidation plate
    2.2.6 TLC validation
    2.2.7 B5 sub-library
  2.3 BglX characterization
    2.3.1 Phosphate-dependent cleavage
    2.3.2 Anomeric configuration of sugar-1-phosphates
    2.3.3 Phosphate reactivation of the glycosyl-enzyme intermediate
    2.3.4 Phosphate and the catalytic dyad
  2.4 Bioinformatic screen
    2.4.1 GH94 sequence homology screen
    2.4.2 Natural substrates
    2.4.3 GH94 candidate's activity towards DNPGlc
  2.5 Conclusions and future directions
3 Methods
  3.1 Cloning
  3.2 Protein expression
  3.3 Immobilized metal affinity chromatography
  3.4 TLC
  3.5 General spectroscopy methods
  3.6 BglX reactivation
  3.7 1H NMR analysis
  3.8 High throughput functional screen
  3.9 Consolidation
  3.10 Sub-library
References
Appendix

List of tables

Table 1: List of known GPases, their CAZy family and reaction details. (8)
Table 2: Annotations of candidate GH94s found through sequence homology screening. MW calculated at: http://proteome.gs.washington.edu/cgi-bin/aa_calc.pl (accessed November 2014)

List of figures

Figure 1: Metagenomic workflow outlining the process of constructing a metagenomic library, from isolation of eDNA to functional and sequence-based screening. Figure generously provided by Zach Armstrong.
Figure 2: Methodology employed to differentiate glycosidases and GPases.
Figure 3: Activity profile of (A) CBP, (B) CDP and (C) phosphate dependency of CBP. 30 μg of purified enzyme was incubated for 60 min at 40 °C; samples were spotted on a TLC plate at the indicated times. Forward reaction starting material: (A) CBP – 10 mM cellobiose and 100 mM phosphate; (B) CDP – 10 mM cellotriose and 100 mM phosphate. Reverse reaction starting material: (A) CBP – 10 mM D-glucose and 10 mM Glc-1-P; (B) CDP – 10 mM cellobiose and 10 mM Glc-1-P. (C) CBP was incubated with 10 mM cellobiose and 0 or 100 mM phosphate.
Figure 4: Chemical structures of assayed chromogenic substrates.
Figure 5: Phosphorolysis of DNPGlc by CBP and CDP. (A) 5 μg of purified enzyme was incubated with 2 mM DNPGlc and 0 or 100 mM phosphate for 60 min at 30 °C (200 μL reaction volume). Background was measured by running a reaction with no enzyme. DNP concentration was determined by measuring absorbance at 400 nm. (B) TLC reactions were prepared by combining 50 μg of purified enzyme with 10 mM DNPGlc and 0 or 100 mM phosphate and incubated for 2 h at 30 °C (20 μL reaction volume).
Figure 6: (A) BLAST report of the Hallam Lab's fosmid end sequences against the CAZy library. A positive was determined according to the following search parameters: minimum read length: 50 bp; minimum BLAST score: 0.32. (B) Magnified GH94 results predicting 6 positive hits from the TolDC library.
Figure 7: Picture of a section of a 384-well FOS62 plate highlighting the reduction in background signal when reducing incubation time from 18 h to 6 h. The assay plate was incubated for 18 h at 37 °C. Photos were taken at 6 and 18 h. Accompanying black and white photos were altered to change the yellow channel to black, emphasizing positive signals. Positive clones are located at B2, C2, F1, G2, G3, H2, and H4.
Figure 8: Comparison of raw 400 nm signal measurements with those normalized to 600 nm. Plus (blue) and minus (green) phosphate sample FOS62 assay plates were incubated for 6 h at 37 °C and absorbance measurements were taken at 400 and 600 nm. (A) displays a bar graph of 400 nm only and (B) shows the same 400 nm signals normalized with their corresponding 600 nm value. Asterisks indicate positive clones identified by visual inspection (yellow wells). Red dotted lines indicate the mean.
Figure 9: Workflow outlining the identification of GPases from a metagenomic library.
Figure 10: 400/600 nm absorbance measurements from 22656 clones from the TolDC library. Blue dotted line indicates 5 standard deviations from the mean (4.9).
Figure 11: (A) 400/600 nm absorbance measurements in the presence (red) and absence (black) of phosphate from the 238 clones from the TolDC consolidation plate. The CBP expression strain was included in the screen as a positive control (black box). (B) The ΔPi plot shows the difference in the average 400/600 nm absorbance in the presence and absence of phosphate.
Those clones with a ΔPi greater than half the ΔPi of the positive control (blue) were designated phosphate dependent (red) and selected for further validation.
Figure 12: TLC validation of (A) the positive hits from the TolDC library and (B) the B5 fosmid clone from the FOS62 library. Whole cell lysates from the indicated clones were incubated with 1 mM DNPGlc for 1 h at 30 °C.
Figure 13: Workflow outlining the generation of a sub-library from a fosmid containing a gene of interest. A sub-library built of fosmid fragments ligated into pUC19 facilitates traditional sequencing methods, such as Sanger sequencing.
Figure 14: BglX activity toward 10 mM DNPGlc in the presence of 0 and 100 mM phosphate. Reaction was incubated for 1 h at 37 °C.
Figure 15: (A) Phosphate-dependent activity towards pNPGlc, pNPGlcNAc and DNPGlc. 30 μg of purified BglX and 10 mM substrate were incubated for 60 min at 37 °C in the absence or presence of 100 mM phosphate. (B) Reaction scheme displaying the phosphate-dependent activity of BglX towards pNPGlc and pNPGlcNAc.
Figure 16: 1H NMR spectra for BglX and 10 mM pNPGlcNAc or pNPGlc with 0 or 100 mM phosphate.
Unfortunately the βGlcNAc and βGlc anomeric protons fall directly under the large peak from residual HOD, thus only the peaks from α-anomers are shown. An αGlc-1-P standard was included to contrast the βGlc-1-P signal.
Figure 17: (A) BglX inactivation by incubating 100 μM BglX with 10 mM DNP2FGlc in Buffer D. (B) Inactivated 2FGlc-BglX was reactivated by incubation with increasing concentrations of phosphate. Samples were incubated at 25 °C in the indicated concentration of phosphate. Aliquots were assayed in Buffer D containing 50 mM pNPGlc and 20 mM phosphate at the indicated time points. (C) Rate constants derived from traces in (B) were plotted against phosphate concentration to determine kreact/Kreact.
Figure 18: Multiple peptide sequence alignment for CtCBP, GH94A, B and C. Aligned using ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2/). Symbol legend: (*) indicates a position which has a single, fully conserved residue. (:) indicates conservation between groups of strongly similar properties. (.) indicates conservation between groups of weakly similar properties.
Figure 19: GH94 activity towards cellobiose and cellotriose with and without phosphate (lanes 1-4) and towards Glc-1-P and glucose (lane 5) and Glc-1-P and cellobiose (lane 6).
25 μg of purified enzyme was incubated with 5 mM of the indicated substrate(s) and 0 or 100 mM phosphate. GH94A, D and E reactions were incubated at 37 °C for 30 min; the GH94C incubation time was extended to 16 h.
Figure 20: Phosphorolysis of DNPGlc by GH94A, C-E. (A) 50 μg of purified enzyme was incubated with 1 mM DNPGlc and 0 or 100 mM phosphate for 20 h at 30 °C (200 μL reaction volume). (BG) Background was measured by running the reaction with no enzyme. DNP concentration was determined by measuring absorbance at 400 nm.
Figure 21: Schematic diagram of the glucose-6-phosphate dehydrogenase activity assay. Glucose-1-phosphate produced by a GPase is converted to glucose-6-phosphate by phosphoglucomutase. Through conversion of glucose-6-phosphate to 6-phosphogluconolactone, glucose-6-phosphate dehydrogenase reduces NAD+ to NADH, which can be detected by measuring absorbance at 340 nm.

List of schemes

Scheme 1: Reaction catalyzed by glucosidases and GPases
Scheme 2: Catalytic mechanisms of (A) β-inverting and (B) β-retaining glycosidases and (C) β-inverting and (D) β-retaining GPases
Scheme 3: (A) Reaction catalyzed by C. thermocellum CBP (CtCBP) and (B) reaction catalyzed by C. thermocellum CDP (CtCDP).
Scheme 4: Chromogenic assay target reaction for GPases.
Scheme 5: Peptidoglycan recycling reaction catalyzed by N-acetylglucosaminidases.
Scheme 6: Catalytic mechanisms of β-retaining glycosidases and GPases utilizing a His-Asp acid/base catalytic dyad

List of abbreviations

ATP             Adenosine triphosphate
BAC             Bacterial artificial chromosome
BCR             Biochemical reactor
BLAST           Basic local alignment search tool
CAZymes         Carbohydrate active enzymes
CBP             Cellobiose phosphorylase
CDP             Cellodextrin phosphorylase
Cl-MU-Glc       6-Chloro-4-methylumbelliferyl β-D-glucopyranoside
DDAO-Glc        7-Hydroxy-9H-(1,3-dichloro-9,9-dimethylacridin-2-one) β-D-glucopyranoside
DNP             2,4-Dinitrophenyl
DNP2FGlc        2,4-Dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside
DNPGlc          2,4-Dinitrophenyl β-D-glucopyranoside
eDNA            Environmental DNA
GH              Glycoside hydrolase
Glc-1-P         Glucose-1-phosphate
GlcNAc          N-Acetylglucosamine
GlcNAc-1-P      N-Acetylglucosamine-1-phosphate
GlcNAc-MurNAc   N-Acetyl-4-O-[2-(acetylamino)-2-deoxy-β-D-glucopyranosyl]-muramic acid
GPase           Glycoside phosphorylase
GT              Glycosyltransferase
MS              Mass spectrometry
MU-Glc          4-Methylumbelliferyl β-D-glucopyranoside
MurNAc          N-Acetylmuramic acid
NDP             Nucleoside diphosphate
NMR             Nuclear magnetic resonance
ORF             Open reading frame
pNP             p-Nitrophenyl
pNPGlc          p-Nitrophenyl β-D-glucopyranoside
pNPGlcNAc       p-Nitrophenyl N-acetyl-β-D-glucosaminide
Rf              Retention factor
TLC             Thin-layer chromatography

Acknowledgements

I would like to thank my supervisor, Stephen G. Withers, for his guidance and patience, and for giving me the opportunity to study in his lab. I would also like to thank all members of the Withers group, past and present, specifically Zach Armstrong, Dr. Markus Blaukopf and Dr. Hongming Chen. My appreciation also goes to Dr. Hallam and the Hallam Lab members, specifically Sam Kheirandish, who provided all the metagenomic material needed for this study.

1 Introduction

1.1 Carbohydrates in biology

Carbohydrates play a wide range of roles in biological systems. These roles include forming structural biopolymers, such as cellulose and chitin, which provide cells with rigidity and mechanical strength, as well as functioning as storage polymers, such as glycogen and starch, which allow the cell to sequester energy. Oligosaccharides add another layer of complexity by playing a key role in mediating cell-cell interactions, typically in the form of glyco-conjugated cell surface proteins or lipids (1).
While cell surface glycans are often used for the purpose of endogenous cell recognition and communication, exogenous invaders, such as pathogens, can recognize and use surface glycans to gain entry to a cell. Understanding the mechanisms by which a pathogen interacts with the host has, in many cases, already led to the discovery and rational design of probes and inhibitors to reduce or prevent deadly human infections (2-4).

The assembly and degradation of these structures is achieved using a wide variety of carbohydrate-active enzymes (CAZymes). Synthesis is primarily achieved using glycosyltransferases (GTs), while hydrolysis is mediated by glycoside hydrolases (GHs), also known as glycosidases. An online database (www.cazy.org) has been created which classifies these CAZymes, as well as related carbohydrate modifying enzymes, into families based on amino acid sequence (5). This classification is invaluable in unifying our understanding and in predicting function, structure and mechanism.

1.2 Commercial applications of glycosidases

Glycosidases are among the most widely employed enzymes in industry, ranging from their massive use in the brewing and food processing sectors, through to applications in animal feed preparation, the pulp and paper industry, laundry detergents and, increasingly, the biofuel industry (6).
The most widely used of these enzymes are the amylases (starch conversion, ethanolic biofuel and laundry detergents), xylanases (pulp and paper, animal feeds) and the cellulases (laundry detergents, stone-washed jeans and cellulosic ethanol). However, a variety of CAZymes are employed in other tasks. CAZymes are also used in the synthesis of glycans. Two classes of such enzymes used currently are: i) the transglycosylases (7), which interconvert glycosides (e.g. generation of galacto- and fructo-oligosaccharides from lactose and sucrose respectively, for use as prebiotics); ii) the NDP-sugar-dependent glycosyltransferases, though the application of this class has been severely limited by the costs of the nucleotide phosphosugars required. One class of enzyme that largely seems to have escaped the attention of industry is that of the glycoside phosphorylases (GPases).

1.3 Glycoside phosphorylases (GPases)

GPases cleave the glycosidic linkage using phosphate rather than a water molecule, generating a sugar phosphate product (Scheme 1) (8). This is advantageous for the host organism as it directly generates a glycolytic intermediate, which is more efficiently metabolized than the parent sugar since no ATP is required to phosphorylate it.
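The contrast drawn in Scheme 1, and the ATP saving that follows from it, can be written out explicitly. The block below is an illustrative sketch rather than a reproduction of the original scheme: Glc-OR denotes a generic glucoside with leaving group R (notation assumed here), and hexokinase is taken as a representative glucose kinase, which the text does not name.

```latex
\begin{align*}
\text{Glycosidase (hydrolysis):}\quad
  & \text{Glc-OR} + \mathrm{H_2O} \longrightarrow \text{Glc} + \text{ROH} \\
\text{GPase (phosphorolysis):}\quad
  & \text{Glc-OR} + \mathrm{P_i} \rightleftharpoons \text{Glc-1-P} + \text{ROH} \\[6pt]
\text{Entry of free glucose:}\quad
  & \text{Glc} + \text{ATP} \xrightarrow{\text{hexokinase}} \text{Glc-6-P} + \text{ADP} \\
\text{Entry of Glc-1-P:}\quad
  & \text{Glc-1-P} \xrightarrow{\text{phosphoglucomutase}} \text{Glc-6-P}
\end{align*}
```

Both routes converge on glucose-6-phosphate, but only the hydrolytic route spends an ATP to get there.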
Since the sugar phosphate product is of higher free energy than the parent free sugar, the equilibrium constant for the phosphorylase reaction is close to unity, as opposed to a glycosidase, where the equilibrium heavily favours hydrolysis, driven by 55 M water. As noted below, this can be exploited. The GPase class of enzymes has been known for some time, particularly through classic enzymes such as glycogen phosphorylase (9), but relatively few other examples were known until recently. As more bacterial genomes are sequenced, it is now apparent that various organisms may be utilizing more GPases in the degradation of glycans than previously thought (8). Accordingly, GPases have now been discovered which degrade substrates such as cellodextrins, chitin, starch and the N-linked glycans of glycoproteins (10-14). It is probable that numerous other carbohydrates are degraded by a similar mechanism, suggesting that an abundance of GPases remains to be discovered in nature.

Scheme 1: Reaction catalyzed by glucosidases and GPases

GPases reported to date (Table 1) catalyze the reversible release of a monosaccharide-1-phosphate from the non-reducing end of their corresponding glycan, resulting in inversion or retention of stereochemistry of the product (8).
An exception is α-1,4-glucan:maltose-1-phosphate maltosyltransferase, which releases a disaccharide-1-phosphate, α-D-Mal-1-P, instead of a monosaccharide-1-phosphate (15). The mechanisms governing catalysis dictate the stereochemistry of the phosphorylated product, which allows for a broader, mechanism-based classification of GPases. Those from GH65 are considered α-inverting enzymes because they invert the stereochemistry of the original glycosidic linkage. In this case the α-configuration of the natural substrate is flipped to produce β-sugar-1-phosphates. In contrast, GH13, GT4 and GT35 contain α-retaining GPases that produce α-sugar-1-phosphates from α-anomers, retaining the stereochemistry of the substrates. GH94, GH112 and GH130 contain β-inverting GPases, which cleave β-anomers to produce α-sugar-1-phosphates. Missing are the β-retaining GPases, which have not yet been shown to exist.

1.4 Catalytic mechanism

Glycosidases use a water molecule to hydrolytically cleave glycosidic linkages with either net retention or inversion of stereochemistry. Inverting glycosidases use a single displacement mechanism in which a carboxylate residue, acting as a general base catalyst, deprotonates a water molecule in the active site, assisting it to attack and form a covalent bond with the anomeric carbon (Scheme 2A). In concert with this, the glycosidic oxygen is protonated by a second carboxylic acid residue, acting as an acid/base catalyst, thereby facilitating bond cleavage (16).
In contrast, retaining glycosidases use a double displacement mechanism (Scheme 2B). In the first step, a nucleophilic carboxylate residue attacks the anomeric carbon, forming a covalent intermediate with opposite stereochemistry. In the process an acid/base carboxylic acid residue protonates the leaving group oxygen, making it a better leaving group and thereby enhancing bond breakage. In the second step, the acid/base carboxylate residue deprotonates a water molecule in the active site, facilitating nucleophilic attack on the anomeric carbon of the intermediate and releasing the sugar product and free enzyme. The result is net retention of stereochemistry of the product glycan (16).

Table 1: List of known GPases, their CAZy family and reaction details. (8)

EC         Name                                          Family  Mechanism  Cleaved bond            Product
2.4.1.7    Sucrose phosphorylase                         GH13    Retaining  D-Glc-α1,β2-D-Fru       α-D-Glc-1-P
2.4.99.16  α-1,4-Glucan:maltose-1-P maltosyltransferase  GH13    Retaining  D-Glc-(α1,4-D-Glc)n     α-D-Mal-1-P
2.4.1.8    Maltose phosphorylase                         GH65    Inverting  D-Glc-α1,4-D-Glc        β-D-Glc-1-P
2.4.1.64   Trehalose phosphorylase                       GH65    Inverting  D-Glc-α1,α1-D-Glc       β-D-Glc-1-P
2.4.1.216  Trehalose 6-phosphate phosphorylase           GH65    Inverting  D-Glc-α1,α1-D-Glc-6P    β-D-Glc-1-P
2.4.1.230  Kojibiose phosphorylase                       GH65    Inverting  D-Glc-(α1,2-D-Glc)n     β-D-Glc-1-P
2.4.1.279  Nigerose phosphorylase                        GH65    Inverting  D-Glc-α1,3-D-Glc        β-D-Glc-1-P
2.4.1.282  3-O-α-D-Glucosyl-L-rhamnose phosphorylase     GH65    Inverting  D-Glc-α1,3-L-Rha        β-D-Glc-1-P
2.4.1.20   Cellobiose phosphorylase                      GH94    Inverting  D-Glc-β1,4-D-Glc        α-D-Glc-1-P
2.4.1.31   Laminaribiose phosphorylase                   GH94    Inverting  D-Glc-β1,3-D-Glc        α-D-Glc-1-P
2.4.1.49   Cellodextrin phosphorylase                    GH94    Inverting  D-Glc-(β1,4-D-Glc)n     α-D-Glc-1-P
2.4.1.280  N,N'-Diacetylchitobiose phosphorylase         GH94    Inverting  D-GlcNAc-β1,4-D-GlcNAc  α-D-GlcNAc-1-P
2.4.1.211  Galacto-N-biose/Lacto-N-biose phosphorylase   GH112   Inverting  D-Gal-β1,3-D-Gal/Glc    α-D-Gal-1-P
2.4.1.247  D-Galactosyl-β-1,4-L-rhamnose phosphorylase   GH112   Inverting  D-Gal-β1,4-L-Rha        α-D-Gal-1-P
2.4.1.281  D-Mannosyl-β-1,4-D-glucose phosphorylase      GH130   Inverting  D-Man-β1,4-D-Glc        α-D-Man-1-P
2.4.1.-    β-1,4-Mannooligosaccharide phosphorylase      GH130   Inverting  D-Man-(β1,4-D-Man)n     α-D-Man-1-P
2.4.1.231  Trehalose phosphorylase                       GT4     Retaining  D-Glc-α1,α1-D-Glc       α-D-Glc-1-P
2.4.1.1    Glycogen phosphorylase                        GT35    Retaining  D-Glc-(α1,4-D-Glc)n     α-D-Glc-1-P

Scheme 2: Catalytic mechanisms of (A) β-inverting and (B) β-retaining glycosidases and (C) β-inverting and (D) β-retaining GPases, where B represents the acid/base residue.

Modified versions of these mechanisms are shared with GPases, which cleave the glycosidic bond via phosphorolysis, where a phosphate molecule acts in the place of water, generating sugar-1-phosphate derivatives (Scheme 2C and D). Subtle differences are seen in the mechanism because the acceptor substrate, phosphate, carries a negative charge at physiological pH. Inverting GPases lack the catalytic base residue because phosphate is already ionized and therefore does not need to be deprotonated (17). Retaining GPases preserve their catalytic nucleophile residue, as it is still required to perform the primary attack on the anomeric carbon. However, their acid/base residue may be displaced.
While the catalytic mechanisms of these two classes of enzymes are similar, their equilibrium constants are not. The glycosidase reaction favours degradation because it is driven by the high concentration of water (55 M). GPases, on the other hand, use the free energy associated with the glycosidic linkage to drive the formation of the sugar-1-phosphates, resulting in an equilibrium constant close to 1.

1.5 Applications of GPases

The potential utility of this class of enzymes lies in their balanced reaction, which allows one to control the direction of the reaction in vitro. By including large amounts of inexpensive phosphate we can drive the reaction towards glycan degradation, thereby providing the means to manufacture large amounts of activated sugar phosphate intermediates from cheap glycan sources. The products could then be used in the synthesis of custom oligosaccharides by using the appropriate GPase and driving the reaction towards synthesis, either by including large amounts of the now inexpensive sugar-1-phosphates or by partitioning out the inorganic phosphate. In a few select cases, the precedent for the practical use of GPases to synthesize special carbohydrates has already been set. Examples include the glucosylation of vitamin C (18) and glycerol (19), as well as the synthesis of prebiotic sugars (20).
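
The leverage provided by excess phosphate can be illustrated with a short mass-action calculation. The sketch below is not part of the original work. It assumes an idealized reaction, glycoside + phosphate ⇌ sugar-1-phosphate + released sugar, with an equilibrium constant of exactly 1 (the text states only that it is close to 1), and the function name and concentrations are hypothetical choices made for illustration.

```python
import math


def equilibrium_sugar_phosphate(glycoside_mM, phosphate_mM, k_eq=1.0):
    """Return the equilibrium sugar-1-phosphate concentration (mM).

    Models glycoside + Pi <=> sugar-1-P + released sugar, starting with no
    products present. k_eq = 1.0 is an assumption standing in for
    "close to 1".
    """
    a, b = glycoside_mM, phosphate_mM
    if math.isclose(k_eq, 1.0):
        # x^2 = (a - x)(b - x) reduces to a linear equation when K = 1.
        return a * b / (a + b)
    # General case: (K - 1) x^2 - K (a + b) x + K a b = 0.
    # This root is the one that lies between 0 and min(a, b).
    disc = (k_eq * (a + b)) ** 2 - 4.0 * (k_eq - 1.0) * k_eq * a * b
    return (k_eq * (a + b) - math.sqrt(disc)) / (2.0 * (k_eq - 1.0))


# Equimolar reactants: half of the glycoside is converted.
print(equilibrium_sugar_phosphate(10.0, 10.0))  # 5.0
# Ten-fold excess of phosphate: about 91 % is converted.
print(round(equilibrium_sugar_phosphate(10.0, 100.0), 2))  # 9.09
```

Under these assumptions a ten-fold excess of phosphate pushes conversion from 50 % to roughly 91 %, and by symmetry an excess of sugar-1-phosphate, or removal of inorganic phosphate, pushes the reaction back towards synthesis.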
We believe there is a large market for such approaches that bypass the need for expensive nucleotide sugar phosphates. Another exciting possibility is the degradation of lignocellulose by phosphorylases, which promise to improve the efficiency of biomass to bioethanol conversion when expressed within ethanol-producing strains by directly producing intermediates for glycolysis. This has never, to our knowledge, been considered.

The bottleneck in this approach, for glycan synthesis in particular, is the very limited range of GPases available and hence the narrow range of substrate specificities on offer. To compensate, recent studies have pursued the broadening of acceptor and donor specificities through protein engineering of naturally occurring GPases. De Groeve and colleagues used directed evolution in combination with site saturation mutagenesis to alter the donor specificity of cellobiose phosphorylase from Cellulomonas uda (21). By growing clones on minimal media containing lactose they found an enzyme variant with a 7.5-fold increase in activity towards lactose. Further investigation revealed that the increase in activity was due to two mutations (T508→I508 and N667→A667) in close proximity to the active site. The crystal structure revealed that residue 667 was located adjacent to the donor binding site while residue 508 was located at the entrance to the active site (21).
To highlight the potential utility of GPases, this new “lactose” phosphorylase was then quickly applied to synthesize large quantities of expensive α-galactose-1-phosphate from inexpensive lactose and phosphate (22). This group has since gone on to develop a high-throughput screen (HTS) for the identification of GPases from directed evolution libraries, based on the measurement of inorganic phosphate released after the acceptor is glycosylated (23). Another important advance towards achieving broader substrate specificity is detailed in a study performed by Nakai and colleagues. They used a rational protein engineering approach to develop maltose phosphorylase variants with selective activity towards maltose (α-1,4), trehalose (α-1,1) and kojibiose (α-1,2) (24). The authors were able to identify a key determinant of regiospecificity and rationally modify it to produce new enzymes with activity towards specific anomeric bond configurations. While these approaches proved moderately successful at broadening substrate specificity, they only provide improvements on enzymes already discovered. To help expand the spectrum of known GPases, and with it the range of available substrate specificities, we have turned to metagenomics as a source of new GPases.

1.6 Metagenomics

Metagenomics is the study of genetic material recovered directly from the environment.
By cloning environmental DNA (eDNA) into a surrogate host, vast libraries of metagenomic clones can be generated for the purpose of functional screening and sequence homology studies (Figure 1). A major benefit of metagenomics is that the source organism does not need to be cultured (25,26). It is estimated that 99% of microbial organisms cannot currently be cultured in a laboratory environment (27). By probing metagenomic libraries using high-throughput activity assays we can obtain a unique look into the proteomes of these mysterious bacteria and hopefully identify interesting enzyme activities.

Our collaborators in the Hallam lab have constructed metagenomic libraries of eDNA totalling over one million clones, each containing ~40 kb genomic fragments cloned into bacterial fosmids. The metagenomic material has been collected from a range of sample types, from ocean water to soil to a mammalian GI tract. These libraries represent an excellent resource from which to discover new GPases.

Two approaches are commonly used when screening metagenomic libraries, each with its own advantages and disadvantages. In function-based screening one assays a metagenomic clone’s ability to carry out a specific target reaction, for example one involving a chromogenic or fluorescent reporter molecule. In this way one may identify a class of enzyme for which sequence information was not previously available.
The primary disadvantage of function-based screening is poor or absent expression of anonymous eDNA by the host strain’s expression machinery (28). Thus, only a fraction of the genes is likely to be expressed in an active form. It is difficult to estimate what percentage of genes will be expressed. As a rough guide, a study by Gabor and colleagues suggests that ~40% of enzyme activities can be recovered by random cloning in E. coli (29).

Figure 1: Metagenomic workflow outlining the process of constructing a metagenomic library from isolation of eDNA to functional and sequence-based screening. Figure generously donated by Zach Armstrong.

To arrive at this approximation the authors put forward a set of formulas that describe the likelihood of E. coli being able to express a gene from a pool of 32 taxonomically distinct prokaryotic genomes. The percentage is likely to be much lower the further the codon usage of the host is from that of the organism from which the gene derived. However, progress is being made on this front, as several studies have reported improved expression by engineering ribosomal proteins to broaden the specificity of the translation machinery (30-32).
While getting a positive hit from a functional screen may still rely largely on luck, this is not a show-stopper for the screen described below, as we are only interested in what we find, not in what we miss. As long as we obtain a reasonable number of hits we shall be able to use this approach.

The second approach, sequence homology-based screening, can be done much faster, provided the eDNA sequence information is already known, and does not rely on biased expression machinery. However, the costs associated with sequencing an entire metagenomic library can reach upwards of $250 000, although that price will likely fall significantly in the near future. Also, genes identified by this method are limited to those that are homologous to known genes, a limitation not shared with function-based screens. Therefore, homology-based screens are unlikely to uncover any enzymes that employ a novel mechanism toward a target reaction or that sit in a new sequence-based family.

1.7 Aim of thesis

The aim of this thesis was to screen for GPase activity in the metagenomic resources available through the Hallam lab. I have taken a balanced approach to achieving this objective by employing both function- and sequence homology-based screens. Here, I report:

1. The development of a phosphate-dependent colorimetric assay to identify GPases from metagenomic libraries.
2. A functional screen of 2 metagenomic libraries for GPases.
3. A sequence homology screen for GH94 genes.
4. Preliminary characterization of positive hits.

2 Results and discussion

2.1 Development of a phosphate-dependent screen for GPases

2.1.1 Isolating GPase activity

A central challenge in designing a screen that identifies GPases from metagenomic libraries is distinguishing phosphorylase activity from the related glycosidase activity in a high-throughput manner. One approach would be to carry out reactions and analyse for the formation of sugar phosphates. However, it is hard to envisage a truly high-throughput method of achieving this, since even MS analysis would be difficult on the scale envisaged. Further, it is quite probable that the sugar phosphates formed would be degraded rapidly by host enzymes. My approach was to focus on the role of phosphate, which does not participate in glycosidase-mediated hydrolysis. Glycosidases carry out a similar reaction to GPases; however, where GPases use a phosphate molecule to cleave the glycosidic linkage, glycosidases use H2O (Scheme 1). Any simple screening methodology that employed phosphate in an aqueous assay buffer would likely result in the identification of both glycosidase and GPase activities. Since GPases are far less abundant in nature than glycosidases, a pool of positive hits would be primarily glycosidases with the odd GPase scattered throughout.
To tease out the GPases I proposed a methodology employing parallel screens, one in the presence and the other in the absence of phosphate. GPases will only be active in the presence of phosphate while glycosidases will be active in both. By comparing the two screens we can identify which hits are GPases and which are glycosidases (Figure 2).

Figure 2: Methodology employed to differentiate glycosidases and GPases. (Panel labels: 0 mM phosphate, 100 mM phosphate; glycosidase, GPase, no activity.)

2.1.2 Known cellobiose and cellodextrin phosphorylases as test enzymes

For this approach to be successful GPases must only be active in the presence of phosphate. To evaluate whether the reagents we developed would indeed behave in this way we cloned two known GPases, cellobiose phosphorylase (CBP) and cellodextrin phosphorylase (CDP) from Clostridium thermocellum (3.1 Cloning), as test enzymes. CBP carries out the phosphorylase reaction shown in Scheme 3A, while CDP (Scheme 3B) sequentially cleaves glucose residues from the non-reducing terminus of cello-oligosaccharides, liberating α-glucose-1-phosphate in the same way. As can be seen, both are inverting β-glycoside phosphorylases and thus both form α-glucose-1-phosphate. CBP and CDP activities were confirmed by monitoring the production of sugar-1-phosphates from oligosaccharides by TLC (Figure 3A and B; lanes 1-3).
When phosphate is absent no substrate cleavage occurs (Figure 3C). Additionally, because the equilibrium constant is close to 1, the reverse reaction was also monitored with sugar-1-phosphates and an appropriate acceptor (glucose or cellobiose) as the starting material (Figure 3A and B; lanes 4-6). After the full incubation time, equilibrium was reached, as indicated by starting material and products being present in roughly equal amounts.

2.1.3 Chromogenic substrates

To visually identify active clones we assayed a set of aryl glycoside substrates against our two test GPases. My hope was that these compounds would bind at the active site in the place of the natural substrate, releasing a coloured leaving group after being processed by the enzyme. The leaving group could then be detected by measuring the absorbance of the assay liquid. These compounds are commonly used to assay and identify glycosidases, but there have been no reports in the literature of their use for the assay of GPase activity. It was expected that the reaction could occur with the phosphate displacing the aglycone. Glycosyl fluorides have been shown to function in this way (33,34); however, these compounds are not convenient for use in a high-throughput screen. We tested 5 aryl glycosides in the presence of phosphate to determine which was the best candidate (Figure 4).
Three of the aryl glycosides contained a fluorescent aglycone: 4-methylumbelliferyl β-D-glucopyranoside (MU-Glc), 6-chloro-4-methylumbelliferyl β-D-glucopyranoside (Cl-MU-Glc) and 7-hydroxy-9H-(1,3-dichloro-9,9-dimethylacridin-2-one) β-D-glucopyranoside (DDAO-Glc). Unfortunately, none of these compounds was processed by CBP or CDP.

Scheme 3: (A) Reaction catalyzed by C. thermocellum CBP and (B) reaction catalyzed by C. thermocellum CDP.

Figure 3: Activity profile of (A) CBP, (B) CDP and (C) phosphate dependency of CBP. 30 μg of purified enzyme was incubated for 60 min at 40 °C and samples were spotted on a TLC plate at the indicated times. Forward reaction starting material: (A) CBP: 10 mM cellobiose and 100 mM phosphate; (B) CDP: 10 mM cellotriose and 100 mM phosphate. Reverse reaction starting material: (A) CBP: 10 mM D-glucose and 10 mM Glc-1-P; (B) CDP: 10 mM cellobiose and 10 mM Glc-1-P. (C) CBP was incubated with 10 mM cellobiose and 0 or 100 mM phosphate.

Figure 4: Chemical structures of assayed chromogenic substrates: 4-methylumbelliferyl β-D-glucopyranoside (MU-Glc), 6-chloro-4-methylumbelliferyl β-D-glucopyranoside (Cl-MU-Glc), 7-hydroxy-9H-(1,3-dichloro-9,9-dimethylacridin-2-one) β-D-glucopyranoside (DDAO-Glc), p-nitrophenyl β-D-glucopyranoside (pNPGlc) and 2,4-dinitrophenyl β-D-glucopyranoside (DNPGlc).

The other 2 candidate aryl glycosides were p-nitrophenyl β-D-glucopyranoside (pNPGlc) and 2,4-dinitrophenyl β-D-glucopyranoside (DNPGlc), which release aglycones that absorb in the visible wavelength range. Limited activity was observed with pNPGlc in the presence of phosphate, but the best candidate was the highly activated chromogenic substrate DNPGlc. The low pKa (4.0) of the DNP released provides the substrate with high reactivity and also ensures that DNP is fully ionized over a wide range of pH conditions, maximising colour development. In the presence of phosphate, GPases accept DNPGlc in place of the natural substrate, synthesizing glucose-1-phosphate while releasing a coloured leaving group that can be detected at 400 nm (Scheme 4). As can be seen in Figure 5A, both enzymes are active in the presence of phosphate, with minimal activity in its absence. Furthermore, Glc-1-P is produced by CBP and CDP, demonstrating the reaction shown in Scheme 4 (Figure 5B). This, therefore, provides the basis of the screen, in which lysed extracts from metagenomic clones are assayed in the presence and absence of phosphate.
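
The comparison of the two parallel assays can be sketched in a few lines of code. This is an illustrative sketch rather than the procedure used in this work: the function name, the use of a single shared cutoff for both plates and the example signal values are all assumptions, and the "unclassified" category (signal only without phosphate) is added here for completeness and does not appear in the scheme described above.

```python
def classify_clone(signal_plus_pi, signal_minus_pi, cutoff):
    """Classify one clone from its signals with and without phosphate.

    A signal above `cutoff` is treated as active. Using one cutoff for
    both plates is an assumption made for this sketch.
    """
    active_plus = signal_plus_pi > cutoff
    active_minus = signal_minus_pi > cutoff
    if active_plus and not active_minus:
        return "GPase candidate"   # active only when phosphate is present
    if active_plus and active_minus:
        return "glycosidase"       # active with or without phosphate
    if active_minus:
        return "unclassified"      # active only without phosphate
    return "no activity"


# Hypothetical signals for three wells, given as (plus Pi, minus Pi).
wells = {"A1": (3.1, 0.4), "A2": (2.9, 3.0), "A3": (0.3, 0.2)}
for well, (plus_pi, minus_pi) in wells.items():
    print(well, classify_clone(plus_pi, minus_pi, cutoff=1.0))
```

In practice a candidate identified this way would still need confirmation, for example by detecting the sugar-1-phosphate product directly.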
GPases will give a signal only in the presence of phosphate while glycosidases will be active in both.

Scheme 4: Chromogenic assay target reaction for GPases. (Scheme labels: β-DNP-Glc, CBP or CDP, 100 mM phosphate, α-Glc-1-P, DNP; the released DNP appears bright yellow.)

Figure 5: Phosphorolysis of DNPGlc by CBP and CDP. (A) 5 μg of purified enzyme was incubated with 2 mM DNPGlc and 0 or 100 mM phosphate for 60 min at 30 °C (200 μL reaction volume). Background was measured by running a reaction with no enzyme. DNP concentration was determined by measuring absorbance at 400 nm. (B) TLC reactions were prepared by combining 50 μg of purified enzyme with 10 mM DNPGlc and 0 or 100 mM phosphate and incubated for 2 h at 30 °C (20 μL reaction volume).

2.2 Metagenomic screen

2.2.1 Source of environmental DNA

To identify environmental clones that possess GPase activity, two metagenomic libraries were screened. The first library, designated FOS62, was constructed from eDNA recovered from a biochemical reactor (BCR) system located in Trail, British Columbia (35). The BCR was designed to aid in the remediation of contaminated water produced as a by-product of local smelting operations (36). The BCR was fuelled mostly by cellulose-rich pulp mill biosolids, which promote the growth of microorganisms capable of removing arsenic, cadmium and zinc from smelting wastewater (35-37). Given that the main carbon source is cellulosic biosolids, the BCR is likely enriched in microorganisms capable of cleaving β-1,4-glycosidic linkages. The second library, designated TolDC, was collected from a toluene-degrading microbial community located in a hydrocarbon-contaminated site. This library was selected because it was predicted to contain the highest proportion of GH94 enzymes based on fosmid end sequencing (Figure 6). Here, fosmid end sequences were BLASTed against the sequences of the CAZy library. Clones were considered positive if a fosmid end sequence had a read length of at least 50 bp and matched a known CAZy family with a BLAST score ratio of at least 0.32. End sequencing can be used as an indicator of the total content of GH94 genes present in a given library, assuming there is no bias for the specific gene of interest to be at the end of a fosmid. However, the specific fosmids so identified will rarely contain a full ORF. As the name implies, end sequencing only takes into account sequences (~500 bp) at the termini of the eDNA insert. Genes located this close to the termini are likely truncated and therefore unlikely to produce a functional enzyme.
However, it is probable that there will be full-length GH94 genes within fosmid libraries that show high GH94 content in end sequences.

2.2.2 Expression system

Fosmid libraries were constructed by ligating eDNA (~40 000 bp) into the pCC1 copy control vector and transfecting it into the E. coli EPI300 expression host (Figure 1) (37). A concern when working with large fosmids is sequence stability. To maximise stability, low-copy bacterial artificial chromosome (BAC) vectors (38) have often been used (39-42). These vectors use the E. coli F-factor replication mode, hence the name “fosmid”. In this system replication is strictly controlled at the origin of replication, oriS, which limits the copy number to 1-2 per host chromosome (43). The downside to this method is that the low copy number often limits protein expression to below detectable levels. To overcome this limitation the eDNA was cloned using the pCC1 copy control system developed by Wild and colleagues (43). The pCC1 vector allows expression to be “switched on” by controlling the fosmid copy number. The pCC1 vector is based on the original BAC vector, so it still uses an oriS to tightly control replication.
The difference, however, is a second, high-copy origin of replication, oriV, which is normally inactive because the host does not produce the TrfA replication protein needed for replication at oriV. The EPI300 host strain used for this screen contains a copy of a mutant trfA gene under tight control of the araC-ParaBAD promoter (43). Therefore, in the pCC1 system the fosmids are maintained at single-copy levels to ensure stability, but when TrfA expression is induced by arabinose the DNA is amplified up to 100-fold and expression is increased to detectable levels.

Figure 6: (A) BLAST report of the Hallam Lab’s fosmid end sequences against the CAZy library. A positive was determined according to the following search parameters: minimum read length: 50 bp; minimum BLAST score ratio: 0.32. (B) Magnified GH94 results predicting 6 positive hits from the TolDC library.

2.2.3 FOS62 library

The FOS62 library was initially screened (with no modifications) according to the methodology outlined by Mewis and colleagues (44), in which cellulase and glucosidase activity was identified using DNP-cellobioside as the chromogenic substrate. The substrate used in the FOS62 screen, DNPGlc, was found to be much less stable than DNP-cellobioside, resulting in high background and thus in many false positives being identified.
To correct for this in subsequent screens, the assay incubation time was shortened from 18 h to 6 h. This reduced the background significantly, allowing true positive hits to be identified (Figure 7). Also, to help minimize bias introduced by varying cell density, the activity signal was normalized to the absorbance at 600 nm, which is the wavelength commonly used to determine the optical density of liquid bacterial cultures (Figure 8). Data for the FOS62 screen have not been included since the screen was still at a development/optimization stage at the time it was performed. Instead I will discuss the subsequent screen of the TolDC library using the optimized methodology. However, the FOS62 screen did yield one positive hit, which is discussed later.

2.2.4 TolDC library

The TolDC library contained 22655 clones arrayed in 384-well culture plates, which were screened as described in the methods section (3.8 High throughput functional screen) and the workflow in Figure 9. The chromogenic DNPGlc substrate specifically identifies enzymes that are able to hydrolyze and, in the presence of phosphate, phosphorolyse β-1,4-glycosidic linkages by releasing DNP, which absorbs strongly at 400 nm.
Figure 7: Picture of a section of a 384-well FOS62 plate highlighting the reduction in background signal when reducing incubation time from 18 h to 6 h. The assay plate was incubated for 18 h at 37 °C. Photos were taken at 6 and 18 h. Accompanying black and white photos were altered to change the yellow channel to black, emphasizing positive signals. Positive clones are located at B2, C2, F1, G2, G3, H2, and H4.

Figure 8: Comparison of signal measurements taken at 400 nm with those normalized to 600 nm. Plus (blue) and minus (green) phosphate sample FOS62 assay plates were incubated for 6 h at 37 °C and absorbance measurements were taken at 400 and 600 nm. (A) displays a bar graph of 400 nm only and (B) shows the same 400 nm signals normalized with their corresponding 600 nm value. Asterisks indicate positive clones identified by visual inspection (yellow wells). Red dotted lines indicate the mean.

Figure 9: Workflow outlining the identification of GPases from a metagenomic library.

The clones were scored based on a 400/600 nm ratio, and those scoring more than 5 standard deviations above the mean (2.8) were considered active clones. Functional screening using these cut-offs yielded 97 active clones, a positive hit rate of 0.43 % or 1 in 234 clones (Figure 10).
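The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `score_clones` and `hit_rate` are hypothetical, the plate data are synthetic, and the use of the sample standard deviation is an assumption, since the estimator used in the screen is not specified here.

```python
from statistics import mean, stdev


def score_clones(a400, a600, n_sd=5.0):
    """Score wells by their A400/A600 ratio and flag outliers.

    a400, a600: equal-length lists of absorbance readings (a600 non-zero).
    Returns (ratios, threshold, hits); hits are the indices of wells whose
    ratio exceeds mean + n_sd * SD.
    """
    ratios = [x / y for x, y in zip(a400, a600)]
    # Sample standard deviation; the choice of estimator is an assumption.
    threshold = mean(ratios) + n_sd * stdev(ratios)
    hits = [i for i, r in enumerate(ratios) if r > threshold]
    return ratios, threshold, hits


def hit_rate(n_hits, n_clones):
    """Return the hit rate as a percentage and as '1 in N' clones."""
    return 100.0 * n_hits / n_clones, round(n_clones / n_hits)


# Synthetic plate: 49 background wells and one strongly yellow well.
a400 = [0.5] * 49 + [5.0]
a600 = [0.5] * 50
ratios, threshold, hits = score_clones(a400, a600)
print(hits)                 # indices of flagged wells
print(hit_rate(97, 22655))  # clone counts quoted in the text
```

Dividing by the 600 nm reading before thresholding means that a dense culture with a modest 400 nm signal is not flagged simply because it contains more cells.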
These 97 clones were re-arrayed into a 384-well consolidation plate. To limit the number of overlooked active clones, and to utilize the extra 287 wells, I also included clones that met the following parameters:

1. Absorbance at 400 nm greater than 3 standard deviations above the mean (0.54): 113 clones. These were included to ensure that clones that gave a large 400 nm signal independent of cell density were not overlooked. However, the majority of clones from this group were identified due to high cell density.

2. Positive clones from the end sequence BLAST report (Figure 6): 46 clones (only 6 of the 46 clones were identified as containing GH94s by end sequencing). It is highly unlikely that these clones will contain a GPase ORF, but given the extra space available on the validation plate I elected to include them.

These loosened parameters increased the number of clones in the 384-well consolidation plate from 97 to 238, although it is likely that the additional clones do not contain any GPases.

2.2.5 Consolidation plate

To identify GPase activity the consolidation plate was screened in the presence and absence of phosphate (as described in the methods 3.9 Consolidation). To demonstrate that the screen is able to identify GPase activity from an assay mixture containing cell lysates, CBP expression cells were used as a positive control.
Figure 11A displays the screening results with (red trace) and without (black trace) phosphate. The difference between the 400/600 ratio scores in the presence and absence of phosphate, or ΔPi, is displayed in Figure 11B. Those 11 clones with a ΔPi greater than 0.8, or half the signal of the positive control, were designated for further validation. To distinguish true GPase activity, TLC was used to look for the formation of Glc-1-P. This step is useful because phosphate-dependent activity may be a result of glycosidases whose activity is enhanced by phosphate, perhaps where phosphate acts as an allosteric activator.

Figure 10: 400/600 nm absorbance measurements from 22656 clones from the TolDC library. Blue dotted line indicates 5 standard deviations from the mean (4.9). (Plot title: TolDC Scatter Plot; x-axis: Well; y-axis: 400 nm/600 nm.)

Figure 11: (A) 400/600 nm absorbance measurements in the presence (red) and absence (black) of phosphate from the 238 clones from the TolDC consolidation plate. The CBP expression strain was included in the screen as a positive control (black box). (B) The ΔPi plot shows the difference in the average 400/600 nm absorbance in the presence and absence of phosphate. Those clones with a ΔPi greater than half the ΔPi of the positive control (blue) were designated phosphate dependent (red) and selected for further validation.
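The ΔPi selection used for the consolidation plate can be illustrated with a short Python sketch. The function name `phosphate_dependent`, the well labels and the absorbance ratios below are hypothetical, and the positive control ΔPi of 1.6 is inferred only from the statement that the 0.8 cut-off is half the control signal.

```python
def phosphate_dependent(plus_pi, minus_pi, control_delta, fraction=0.5):
    """Select clones whose phosphate-dependent signal exceeds the cut-off.

    plus_pi, minus_pi: dicts mapping clone label -> 400/600 ratio measured
    with and without phosphate. The cut-off is `fraction` of the positive
    control's delta-Pi.
    """
    cutoff = fraction * control_delta
    deltas = {clone: plus_pi[clone] - minus_pi[clone] for clone in plus_pi}
    selected = sorted(c for c, d in deltas.items() if d > cutoff)
    return deltas, cutoff, selected


# Hypothetical ratios for three wells of a consolidation plate.
plus_pi = {"A1": 2.0, "B2": 1.2, "C3": 3.0}
minus_pi = {"A1": 1.0, "B2": 1.1, "C3": 2.5}
deltas, cutoff, selected = phosphate_dependent(plus_pi, minus_pi, control_delta=1.6)
print(cutoff, selected)
```

Note that well C3 has the highest absolute signal but is not selected, because most of that signal is present without phosphate and is therefore more consistent with glycosidase activity.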
2.2.6 TLC validation

The 11 clones were screened for Glc-1-P production by TLC (Figure 12). Low levels of glucose can be seen in all lanes due to background hydrolysis and cleavage by endogenous E. coli enzymes present in the cell lysate, as demonstrated by the faint glucose spot in the negative control lane (Figure 12A, lane 1). The three lanes containing more intense glucose signals were identified as glycosidases (Figure 12A, lanes 3, 5 and 8). Importantly, these had already been identified in previous screens by the Hallam lab and the fosmids were shown to contain CAZymes. The M12 and O13 clones each contained 3 CAZymes belonging to the families GH3, GH16 and GH30, and the I04 clone contained a GH1 and a GH13. A DNA alignment of the whole fosmid sequences of M12 and O13 revealed that they possessed overlapping fosmid sequences, explaining why they both contained the same set of CAZy genes. Unfortunately, none of the 11 clones from the TolDC library could be shown to produce Glc-1-P by TLC (Figure 12A). It is possible that Glc-1-P is being degraded within the mixture by endogenous E. coli enzymes present in the cell lysate. We do see Glc-1-P being produced in the positive control; however, in this case CBP was expressed from a pET vector. The inducible T7 promoter ensures high expression levels in the positive control.
Presumably the expression levels of potential GPases from fosmid DNA are much lower. As a result, clones producing GPases from a fosmid may have their Glc-1-P processed by endogenous E. coli enzymes, thereby suppressing the identification of positive hits. Potential solutions to this issue are discussed below.

Figure 12: TLC validation of (A) the positive hits from the TolDC library and (B) the B5 fosmid clone from the FOS62 library. Whole cell lysates from the indicated clones were incubated with 1 mM DNPGlc for 1 h at 30 °C.

One phosphate-activated clone from the FOS62 library (designated B5) was indeed seen to produce Glc-1-P in the presence of phosphate (Figure 12B). Interestingly, the B5 clone also appeared to be able to hydrolyze the substrate in the absence of phosphate, but with phosphate present, phosphorolysis dominated. This raises the question of why Glc-1-P was so easily identified in the B5 clone of the FOS62 library, but not in clones from the TolDC library. A likely explanation, as will be shown below, is that the B5 clone produces the β-anomer of Glc-1-P, while other GPases may be producing the α-anomer, the latter being a key metabolic intermediate in glycolysis and thus readily degraded.
Evidence for the formation of β-Glc-1-P is presented below. First, however, in order to investigate the dual glycosidase/GPase activity of the B5 fosmid, and in the absence of a full fosmid sequence (at that time), a sub-library was generated to identify the gene responsible.

2.2.7 B5 sub-library

Due to the large size of the fosmid, identifying the gene of interest on the B5 fosmid through traditional sequencing methods could not be done in a time-efficient manner. To facilitate Sanger sequencing we elected to generate a sub-library in which each clone contains a fragment of the original fosmid ligated into a pUC19 vector. The sub-library was then screened with DNPGlc and phosphate (as described above). Those clones that possess the same activity as the original fosmid will contain the gene of interest on a much smaller plasmid that can be sequenced by traditional methods. The detailed description of the sub-library construction can be found in the methodology section (3.10 Sub-library) and is outlined in the workflow in Figure 13. The sub-library consisted of 336 clones, 18 of which showed GPase activity (5.4 % positive hit rate). All the positive hits sequenced contained the same coding region, which included a CAZy GH3 N-acetylglucosaminidase (hereafter referred to as BglX).
GH3 is a CAZy family containing β-glucosidases, glucosaminidases and xylosidases, with a small number having dual glucosidase-glucosaminidase activity. BglX was heterologously expressed and purified and was shown to be responsible for the original phosphate-dependent activity. It had the unusual activity of acting as a GPase in the presence of phosphate and as a glycosidase in its absence (Figure 14). This was somewhat surprising, as neither BglX (previously unknown in any case) nor any other GH3 N-acetylglucosaminidase was known to possess phosphorylase activity.

Figure 13: Workflow outlining the generation of a sub-library from a fosmid containing a gene of interest. A sub-library built of fosmid fragments ligated into pUC19 facilitates traditional sequencing methods, such as Sanger sequencing.

Figure 14: BglX activity toward 10 mM DNPGlc in the presence of 0 and 100 mM phosphate. The reaction was incubated for 1 h at 37 °C.

2.3 BglX characterization

Through homology comparison BglX was predicted to belong to a GH3 sub-group, the N-acetylglucosaminidases.
These enzymes play a role, probably amongst others, in the recycling of peptidoglycan by removing the non-reducing end N-acetylglucosamine from the disaccharide product (GlcNAc-anhMurNAc-peptide) released by lytic transglycosylases and muramidases in Gram-negative and Gram-positive bacteria, respectively (45) (Scheme 5). The resultant (anhydro)-MurNAc peptide is an activator of β-lactamase production in some Gram-negative bacteria, rendering the GH3 N-acetylglucosaminidases a possible therapeutic target (46-48). Accordingly, considerable attention has been devoted to the generation of potent, selective inhibitors for this group of enzymes. The discovery that phosphate may play an important role in this reaction could have implications for future generations of inhibitor design.

2.3.1 Phosphate-dependent cleavage

To determine whether phosphate was truly a substrate of BglX, and that this activity was not an artefact of using the highly activated substrate, I shifted to less activated ones. p-Nitrophenyl N-acetyl-β-D-glucosaminide (pNPGlcNAc) and p-nitrophenyl β-D-glucopyranoside (pNPGlc) are 1000 times less reactive than their DNP counterparts due to the differences in leaving group pKa (pNP pKa = 7, DNP pKa = 4). Also, they have previously been used in a study to show that one GH3 N-acetylglucosaminidase (Nag3 from Cellulomonas fimi) has dual β-glucosaminidase/β-glucosidase activity (49).
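The 1000-fold reactivity difference quoted for the pNP and DNP substrates can be related to the leaving group pKa values through a Brønsted-type relationship. The following is only a sketch: the slope β_lg ≈ −1 is an assumed illustrative value, not one measured in this work, and k denotes the intrinsic reactivity of each glycoside.

```latex
\log_{10}\frac{k_{\mathrm{pNP}}}{k_{\mathrm{DNP}}}
  = \beta_{\mathrm{lg}}\,\bigl(\mathrm{p}K_a^{\mathrm{pNP}} - \mathrm{p}K_a^{\mathrm{DNP}}\bigr)
  \approx (-1)(7 - 4) = -3
\quad\Longrightarrow\quad
\frac{k_{\mathrm{pNP}}}{k_{\mathrm{DNP}}} \approx 10^{-3}
```

Under that assumption, a difference of three pKa units corresponds to the three orders of magnitude stated above.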
TLC analysis indeed revealed that BglX employs phosphate to cleave its substrate, independent of the pKa of the phenolate leaving group or the substituent at the C-2 position (Figure 15A). In the absence of phosphate, hydrolysis is observed, as shown by the presence of product signals with the same retention factors (Rf) as glucose and N-acetylglucosamine, respectively. When phosphate is present the Rf values of the products were consistent with sugar-1-phosphate versions of the starting material. Confirmation of the formation of sugar phosphates and determination of their anomeric stereochemistry was obtained through 1H NMR analysis.

Scheme 5: Peptidoglycan recycling reaction catalyzed by N-acetylglucosaminidases. (Reaction shown: GlcNAc-anhMurNAc-peptide + H2O → (anhydrous)-MurNAc-peptide + GlcNAc, catalyzed by N-acetylglucosaminidase.)

Figure 15: (A) Phosphate-dependent activity towards pNPGlc, pNPGlcNAc and DNPGlc. 30 μg of purified BglX and 10 mM substrate were incubated for 60 min at 37 °C in the absence or presence of 100 mM phosphate. (B) Reaction scheme displaying the phosphate-dependent activity of BglX towards pNPGlc and pNPGlcNAc.

2.3.2 Anomeric configuration of sugar-1-phosphates

The NMR work was performed in close collaboration with Dr. Markus Blaukopf.
N-Acetylglucosaminidases cleave their substrates using a two-step double-displacement mechanism, retaining stereochemistry at the anomeric centre (Scheme 6). It was therefore possible that the sugar-1-phosphate products could also have the β-configuration, if the same mechanism is followed but water is replaced by phosphate. Alternatively, the phosphate might be able to bind and carry out a direct displacement, which would invert the configuration and give the α-anomers. 1H NMR analysis was used to show that the products are indeed the β-anomers, β-glucose-1-phosphate and β-N-acetylglucosamine-1-phosphate, with anomeric stereochemistries being confirmed by the large J1,2 couplings of 8.4 Hz seen for both Glc-1-P and GlcNAc-1-P (Figure 16). These results therefore confirm that BglX functions as a phosphorylase in each case and, further, that it is a β-retaining phosphorylase, the first such enzyme identified. Indeed, not only is this the first reported example of a retaining β-glycoside phosphorylase, but also, to our knowledge, this is the first reported occurrence of βGlcNAc-1-phosphate itself in a biochemical system.

2.3.3 Phosphate reactivation of the glycosyl-enzyme intermediate

Further insights into the role and importance of the phosphate reaction were obtained by using an approach developed for the study of retaining β-glucosidases.
This involved trapping the reaction intermediate as the 2-fluoroglucosyl-enzyme species by reacting the enzyme with 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside (DNP2FGlc) and then measuring reactivation of the purified species in the presence of phosphate (Figure 17A). The substitution of a fluorine at the C-2 position acts to destabilize the glycosylation and deglycosylation transition states, thus making the reaction very slow (50). By including a highly reactive leaving group, such as DNP, the high activation energy of the glycosylation step can be overcome, allowing the glycosyl-enzyme intermediate to form and accumulate (50). Turnover, or deglycosylation, of this trapped species through hydrolysis remains very slow as a consequence of the fluorine at C-2. However, turnover by transglycosylation onto an appropriate acceptor should occur more rapidly due to recruitment of the binding energy associated with the aglycone. We tested whether phosphate would act as such a species by monitoring rates of reactivation of the 2-fluoro-glucosyl enzyme (from which excess inactivator had been removed) in the presence of phosphate. As can be seen in Figure 17B, phosphate indeed stimulates the turnover of the glycosyl-enzyme intermediate, and in a concentration-dependent manner. From these data it was possible to extract the value for the reactivation efficiency constant (kreact/Kreact) of 0.005 s⁻¹ M⁻¹. Phosphate would therefore seem to be the natural substrate for this enzyme, thus providing additional evidence supporting the classification of BglX, and perhaps by extension other N-acetylglucosaminidases, as a GPase.

Scheme 6: Catalytic mechanisms of β-retaining glycosidases and GPases utilizing a His-Asp acid/base catalytic dyad.

Figure 16: 1H NMR spectra for BglX and 10 mM pNPGlcNAc or pNPGlc with 0 or 100 mM phosphate. Unfortunately, the βGlcNAc and βGlc anomeric protons fall right under the large peaks from residual HOD, thus only the peaks from α-anomers are shown. An αGlc-1-P standard was included to contrast the βGlc-1-P signal.

Figure 17: (A) BglX inactivation by incubating 100 μM BglX with 10 mM DNP2FGlc in Buffer D. (B) Inactivated 2FGlc-BglX was reactivated by incubation with increasing concentrations of phosphate. Samples were incubated at 25 °C in the indicated concentration of phosphate. Aliquots were assayed in Buffer D containing 50 mM pNPGlc and 20 mM phosphate at the indicated time points. (C) Rate constants derived from traces in (B) were plotted against phosphate concentration to determine kreact/Kreact.

2.3.4 Phosphate and the catalytic dyad

BglX, like other GH3 β-N-acetylglucosaminidases, contains a characteristic signature sequence present only within the N-acetylglucosaminidases of GH3.
This was postulated on the basis of sequence alone to form a binding site for the NAc moiety (49). Subsequent structural studies confirmed that this sequence forms a flexible loop in the active site (51). The signature sequence (K-H-(FI)-P-G-(HL)-G-X(4)-D-(ST)-H) indeed contains the residues that provide the contact points for the N-acetyl group of GlcNAc (italicized) and also contains the catalytic acid/base residues (bold). Unlike other GH3s, N-acetylglucosaminidases possess a unique catalytic dyad that serves as the acid/base residue in place of the glutamic acid seen in all other cases (52). The dyad consists of a histidine, stabilized by an aspartic acid, as shown in Scheme 6. The reason for substitution of the acid/base Glu in GH3 N-acetylglucosaminidases by histidine was not previously obvious. One suggestion was that a neutral histidine would allow turnover of GlcNAc-MurNAc substrates from which the peptide chain has been hydrolysed, since the MurNAc residue bears an anionic lactyl moiety that might be repelled by an active site glutamate acid/base residue, but not by a neutral His. Further, the lactyl moiety could be involved in substrate-assisted catalysis (51,52). However, subsequent studies rendered this explanation unlikely (51).
A more likely possibility in light of the findings reported in this thesis is that the neutral charge state of histidine permits an anionic phosphate to perform nucleophilic attack on the anomeric centre of the glycosyl-enzyme intermediate. If the acid/base residue were a Glu or Asp (as shown in Scheme 3), its negative charge during the glycosyl-enzyme intermediate stage would provide Coulombic screening against the binding of anions. Given that, until now, phosphate was not known to participate in the reaction, this possibility was not previously considered.

2.4 Bioinformatic screen

2.4.1 GH94 sequence homology screen

The second approach taken to identify new GPases was to perform a sequence-based search of full fosmid sequences available from the Hallam lab. The open source software MetaPathways v1.0 (53) was used, which employs BLAST (54) to annotate predicted CAZy genes based on sequence homology. From a total of 145200 fosmid contigs screened, 5 ORFs were predicted to be GH94s. These sequences were designated GH94A-E, and their functional annotations are displayed in Table 2. Inspection of the multiple sequence alignment revealed that GH94A, B and C have high sequence homology to CBP from C. thermocellum, as indicated by the pairwise alignment scores (PAS) of 58.7, 65.4 and 58.4, respectively (Figure 18).
The pairwise scores are a percentage representation of the amino acid identity shared between the two sequences. GH94D does not share high sequence similarity with CtCBP or CtCDP, according to PAS of 31.8 and 16.4, respectively. However, this dissimilarity may be explained by GH94D being strongly predicted to be a chitobiose phosphorylase. GH94E, predicted to be a cellobiose phosphorylase from the family Lachnospiraceae, also did not appear to share significant homology with either CtCBP (PAS: 18.6) or CtCDP (PAS: 12.0). To determine the activity of the 5 predicted GH94 enzymes, the 5 ORFs were cloned into expression plasmids, expressed and purified as described in the methods section (3.2 Protein expression). Unfortunately, the his-tagged version of GH94B expressed as inclusion bodies, and

Table 2: Annotations of candidate GH94s found through sequence homology screening. MW calculated at: http://proteome.gs.washington.edu/cgi-bin/aa_calc.pl (accessed November 2014)

GH94    Top BLAST annotation        MW (kDa)
A       cellobiose phosphorylase    91.2
B       cellobiose phosphorylase    92.3
C       cellobiose phosphorylase    94.3
D       chitobiose phosphorylase    91.4
E       cellobiose phosphorylase    103.6

Figure 18: Multiple peptide sequence alignment for CtCBP, GH94A, B and C. Aligned using ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2/). Symbol legend: (*) indicates a position which has a single, fully conserved residue. (:) indicates conservation between groups of strongly similar properties.
(.) indicates conservation between groups of weakly similar properties.

CtCBP           MKFGFFDDANKEYVITVPRTPYPWINYLGTENFFSLISNTAGGYCFYRDARLRRITRYRY 60
GH94A           MKFGYFDDKNREYVITTPKTPLPWINYLGNEGFFTLISNTCGGYSFYKDAKLRRITRFRY 60
GH94B           MQYGHFDDQRREYVITNPVTPWPWINYLGNEDFFSLISNSAGGYSFYKDAKFRRLNRYRY 60
GH94C           MRFGYFDDESKEYVITRPDTPLPWINYLGTDAYFGLISNTAGGYSFYKDARLRRLTRYRY 60
                *::*.***  :***** * ** *******.: :* ****:.***.**:**::**:.*:**

CtCBP           NNVPIDMGGRYFYIYDNGD--FWSPGWSPVKRELESYECRHGLGYTKIAGKRNGIKAEVT 118
GH94A           NNVPRDFGGKYLYVKDGET--IWNPGWMPTKTALDSYHCHHGLGYSKWVSSKNGIGVTLT 118
GH94B           NNMPMDNGGRYFFIKDGDC--TWSPGWKPVKTTLDSYECRHGLSYTRISGAKNGLIAETL 118
GH94C           NNSPLDMGGRYIYIRDAENGAYWSPSWMPTRSNLDTYECRHGMGYTNIRSSKNGITAQTR 120
                ** * * **:*::: *      *.*.* *.:  *::*.*:**:.*:.  . :**: .

CtCBP           FFVPLNYNGEVQKLILKNEGQDKKKITLFSFIEFCLWNAYDDMTNFQRNFSTGEVEI--- 175
GH94A           VFVPLHENCELTKAVIKNTGKTEKTIKLYGVLEWCLWNAVDDATNFQRNYSTGEVEV--- 175
GH94B           FFVPLKTWAEVQKVKLTNTSDETKTIQFFSFNEWCLWNAEDDQNNLQRNLSTGEVEI--- 175
GH94C           FFVPLNSDLEVWDFTITNDRTDNALLDLFATVEFALWDAWDDSTNFQRNFNTGEVEVPDL 180
                .****:   *: .  :.*       : ::.  *:.**:* ** .*:*** .*****:

CtCBP           ---------EGSVIYHKTEYRERRNHYAFYSVNAKISGFDSDRDSFIGLYNGFDAPQAVV 226
GH94A           ---------EPSTVYHKTEYRERRNHYAFYHVNAKTTGYETDLDTFLGQNGGWNDPETVA 226
GH94B           ---------EGSTLYHKTEYNERRNHYAFYHLNTEIDGFDTDRESFVGLYNEISGPQAVL 226
GH94C           QEGSALAIEERSTIYHKTEYRERRNHFAYFACSEPLSGFDTQRADFLGPYRGWDQPLAVE 240
                         * *.:******.*****:*::  .    *::::   *:*     . * :*

CtCBP           NGKSNNSVADGWAPIASHSIEIELNPGEQKEYVFIIGYVENKDEEKWE--SKGVINKKKA 284
GH94A           KGACSNSLASGWSPIACHEITLVLKPGEEKTLVFNLGYVENEEDKKFE--SPNVINKDKA 284
GH94B           DGKPRNSHAYGWSPIASHYKKITLKAGESQELIFVLGYIENPDEKKWE--SKGIINKEPA 284
GH94C           EGQSSNSVAHGWQPIGSHHLKLSLKPGESKRVIFLLGYHENPMDEKFDPVNPNTINKASV 300
                .*   ** * ** **..*   : *:.**.:  :* :** **  ::*::  . . ***  .

CtCBP           YEMIEQFNTVEKVDKAFEELKSYWNALLSKYFLESHDEKLNRMVNIWNQYQCMVTFNMSR 344
GH94A           HELIRKFDTAASFDDALEELKRYWSTLLGKFSIHSDEEKFDRMINIWNQYQCMITFNFSR 344
GH94B           KELIAKFDTVEKVNAALAELAAYWDKLLSVYNIESADDKLNRMVNIWNQYQCMVTFNMSR 344
GH94C           LPIIRYFLAPARIDQAFQELKAYWQQLLNKLIVNTPDDDTNRTVNIWNAYQCMITFNMSR 360
                  :*  * :   .: *: **  **. **.   :.: ::. :* :**** ****:***:**

CtCBP           SASYFESGIGRGMGFRDSNQDLLGFVHQIPERARERLLDLAATQLEDGGAYHQYQPLTKK 404
GH94A           SASYFESGTGRGMGFRDSCQDLLGFVHLIPERARQRILDIAAIQSKDGSTYHQYQPLTKR 404
GH94B           SASFFEVGIGRGMGFRDSNQDLIGFVHQIPERAKERILDIAATQLATGGAYHQYQPLTKR 404
GH94C           SASYFESGIGRGMGFRDSTQDLLGFVHMIPERARERLLDLAATQLSNGGAFHQYQPLTKR 420
                ***:** * ********* ***:**** *****::*:**:** *   *.::********:

CtCBP           GNNEIGSNFNDDPLWLILATAAYIKETGDYSILKEQVPFNNDPSKADTMFEHLTRSFYHV 464
GH94A           GNADVGGGFNDDPLWLVACTSAYIKETGDFSILDEQVQFNNEEGSEEPLMNHLRASIEHT 464
GH94B           GNDAIGGDFNDDPLWLILSATSYIKETGDFSILDEVVPYENDETKAQPLYDHLKRSFYFT 464
GH94C           GNNDVGSDFNDDPHWLVLATAAYVKETGDLSILDEPAPYQNQPGTEMPLFEHLQRAVQYT 480
                **  :*..***** **: .:::*:***** ***.* . ::*:  .  .: :**  :. ..

CtCBP           VNNLGPHGLPLIGRADWNDCLNLNCFSTVPDESFQTTTS-KDGKVAESVMIAGMFVFIGK 523
GH94A           MKNLGPHGLPLIGRADWNDCLNLNCFSKTPGESFQTASN-YESGKAESVFIAGMFVKYGK 523
GH94B           VNNLGPHGLPLIGRADWNDCLNLNCFSSNPNESFQTTQNNTKGSKAESIMIAGLFVLYGR 524
GH94C           LDRLGPHQLPLIGRADWNDCLNLNCFSDTPGQSFQTTTN-QDGSTAESVMIAGLFILSCQ 539
                :..**** *******************  *.:****: .  ..  ***::***:*:   :

CtCBP           DYVKLCEYMGLEE--------EARKAQQHIDAMKEAILKYGYDGEWFLRAYDDFGRKVGS 575
GH94A           EYAEICKLTDRLS--------EAETVLKAVDDIEKATIQDGWDGEWFLRAYDAFEHKVGS 575
GH94B           DFVELSNRIGKTD--------EAIAAQKHVDAMVESVKIHGWDGEWYLRAYDFFGKKVGS 576
GH94C           EMAQLAPLYASASSSNAFDLYNASFYEEKIALMEKAVWEAGWDGEWFRRAYDAFGEPLGS 599
                : .::.      .        :*    : :  : ::    *:****: **** * . :**

CtCBP           KENEEGKIFIESQGFCVMAEIGLEDGKALKALDSVKKYLDTPYGLVLQNPAFTRYYIEYG 635
GH94A           HECDEGKIFIEPQGFCVMAGIGQKQGYGKKALASVNKYLVNDYGVELLAPCYTKYHIELG 635
GH94B           DENDEGKIFIESQGWCTMAEIGKDEGLVAKSLQAVKERLDCQYGIVLNNPAFTKYVIEYG 636
GH94C           HLNDEGKIFIEPQGLCVMAGLGIKDGKARQALDAVRERLATPHGIVLLNPAFKRYQLRLG 659
                .  :*******.** *.** :* .:*   ::* :*.: *   :*: *  *.:.:* :. *

CtCBP           EISTYPPGYKENAGIFCHNNAWIICAETVVGRGDMAFDYYRKIAPAYIEDVSDIHKLEPY 695
GH94A           EISSYPPGYKENGAVFCHNNPWIVIASAMEGLNENTWKLYTKNCPAYIEDQSEIHRTEPY 695
GH94B           EISTYPKGYKENAGIFCHNNPWIMIGETKVKNADRAWEYYTKICPAYLEDISELHRTEPY 696
GH94C           EITSYPPGYKENGSVFCHSNPWIMIAETCIGRGDHAFDYYKRINPSAREAISDVHCCEPY 719
                **::** *****..:***.*.**: ..:     : ::. * :  *:  *  *::*  ***

CtCBP           VYAQMVAGKDAKRHGEAKNSWLTGTAAWNFVAISQWILGVKPDYDGLKIDPCIPKAWDGY 755
GH94A           VYSQMIAGRSAKNYGEAKNSFLTGTASWTFVAASQAILGIQPDFRGLIVKPCLPKKIKHV 755
GH94B           VYAQMIAGKDAAKPGEAKNSWLTGTAAWNFYAISQHILGLKPQYDGLMLEPCIPASVGDF 756
GH94C           VYAQTIAGKEAPTHGEAKNSWLTGNASWNYVAITQYILGIKPTHEGLSVHPVIPNDWEGF 779
                **:* :**:.*   ******:***.*:*.: * :* ***::* . ** :.* :*

CtCBP           KVTRYFRGSTYEITVKNPNHVSKGVAKITVDGNEISGNILPVFND-GKTHKVEVIMG 811
GH94A           EITRVFRGATYRIKVIN---NKTGQHSMTVNGVECDGYLIPFEG--PKEYDVVVNL- 806
GH94B           LINRIFRGKKLEITVKN--NFCTGEVKIFLNGNVIEGNIIPLDKL-SASNQVLVELN 810
GH94C           EAKRIFRGVEYNITVER--QGSGDSVQLIVDGQRIKGNVIPIPPQGTREVNVLVELG 834
                  .* ***   .*.* .      .  .: ::*   .* ::*.
.* * :

therefore activity assays could not be performed for this clone. The other 4 candidates were expressed in the soluble fraction of the cell lysate and could be purified using immobilized metal affinity chromatography allowing analysis, as detailed below.

2.4.2 Natural substrates

TLC analysis was performed to determine the likely natural substrates of GH94A, C, D and E. Activities towards cellobiose, cellotriose, chitobiose and chitotriose were assayed both in the presence and absence of phosphate. From this initial screen, chitobiose and chitotriose were eliminated as possible substrates, as none of the GH94 candidates cleaved them in either the presence or absence of phosphate (data not shown). This was surprising, as GH94D was annotated as a chitobiose phosphorylase: we therefore expected to see phosphate-dependent cleavage of chitobiose and/or chitotriose. Nevertheless, phosphate-dependent activity was detected towards cellobiose for GH94A and GH94C (Figure 19; lanes 1 and 2) and towards cellotriose for GH94D and GH94E (Figure 19; lanes 3 and 4). These results suggest that GH94A and GH94C are cellobiose phosphorylases, although activity from GH94C could only be detected after extended incubation times. The cleavage of cellotriose by only GH94D and GH94E indicates that these enzymes are likely cellodextrin phosphorylases.
The assay of the reverse reactions supported these identities, as GH94A and GH94C produced cellobiose from Glc-1-P and glucose while GH94D and GH94E produced cellodextrins from Glc-1-P and cellobiose with an equilibrium constant near unity (Figure 19; lanes 5 and 6). GH94E also had interesting activity towards Glc-1-P and glucose, producing unknown products that have an Rf in between cellobiose and cellotriose and just below glucose (Figure 19; lane 5). To determine the identity of these products, NMR will have to be performed in future studies. Likewise, these enzymes will be further characterized in the coming months, with a focus on defining their substrate specificity and potential industrial applications.

Figure 19: GH94 activity towards cellobiose and cellotriose with and without phosphate (lanes 1-4), towards Glc-1-P and glucose (lane 5) and towards Glc-1-P and cellobiose (lane 6). 25 μg of purified enzyme was incubated with 5 mM of the indicated substrate(s) and 0 or 100 mM phosphate. GH94A, D and E reactions were incubated at 37 °C for 30 min; the GH94C incubation time was extended to 16 h.

2.4.3 GH94 candidates' activity towards DNPGlc

The GH94 candidates provide us with the opportunity to test whether DNPGlc serves as a suitable substrate by testing against other GPases.
Purified GH94 GPases along with CBP and CDP were assayed with DNPGlc in the presence and absence of phosphate (Figure 20). Surprisingly, we found that only GH94D gave an appreciable phosphate-dependent signal, comparable with CBP from C. thermocellum. While GH94A and GH94E did also show a slight increase in activity when phosphate was present, it is likely that these candidates, along with GH94C, would not have been identified as GPases if they had been present in our functional screen. This highlights perhaps one of the most significant drawbacks of using a substrate whereby activity is measured by reading absorbance in the visible wavelength range. The sensitivity of DNPGlc is too low to detect GPases with minimal activity, such as GH94A and E. In our functional screen, our positive control gave a signal that was ~2 times higher in the presence of phosphate than in its absence (according to Figure 11A: CBP box). This does not provide much room to detect GPases that exhibit lower activity than the positive control, and likely led to those low activity GPases being designated as negative hits. To deal with this issue we have been revisiting the use of fluorescent glycoside compounds as an alternative to DNPGlc. When energy (in the form of light) is added to chromogenic substrates, it is absorbed and dissipated as heat throughout the surrounding medium.
We can determine the concentration of the substrate by measuring how much energy (or light) is absorbed as it passes through a solution containing the compound. Fluorescent compounds absorb light in the same way as their chromogenic counterparts, but instead of that energy being dissipated as heat, a portion of it is emitted as a photon (or light). Measuring the emitted light provides a much more sensitive method of determining activity, where positive hits are orders of magnitude above the background. Therefore, a fluorescent glucoside substrate would allow us to detect low activity GPases from the functional screen, as even minimal activity can be distinguished from the baseline. This aspect of the screen is discussed further in the next section.

Figure 20: Phosphorolysis of DNPGlc by GH94A, C-E. (A) 50 μg of purified enzyme was incubated with 1 mM DNPGlc and 0 or 100 mM phosphate for 20 h at 30 °C (200 μL reaction volume). (BG) Background was measured by running the reaction with no enzyme. DNP concentration was determined by measuring absorbance at 400 nm.

2.5 Conclusions and future directions

This study has outlined a methodology to identify novel GPases from metagenomic libraries.
By employing both sequence homology and function-based screening techniques we were able to identify 5 new GH94 CAZymes and a new class of CAZyme: the β-retaining phosphorylases. This is a promising start toward scaling up the screen to include additional metagenomic libraries. However, there is still room for several screening parameters to be optimized.

A problematic aspect of the functional screen that will need to be improved on in the future is the final validation process by TLC. It is apparent that endogenous E. coli enzymes readily degrade αGlc-1-P, preventing its detection. To overcome this limitation we are considering 2 alternatives. The first approach, not involving TLC, could be to carry out spectrophotometric assays at 340 nm in which phosphoglucomutase and glucose-6-phosphate dehydrogenase are added to the mixtures, along with NAD+. Here the expectation is that any αGlc-1-P produced would be rapidly converted to 6-phosphogluconolactone, along with NAD+ being converted to NADH, as shown in Figure 21. The second approach, and our most likely course of action, will be to turn to next generation, whole-fosmid sequencing of clones deemed to contain GPases, so that candidate genes can be recognized rapidly. This method alone is not ideal because we will only recognize those that share homology with other known GPases.
An advantage of performing the functional screen is that there is the possibility to discover new GPases (such as BglX), which may not share sequence homology with other known GPases. Therefore, if a clone is shown to have GPase activity, but no obvious candidate can be found through sequencing, we will generate a sub-library to identify the gene of interest. Full fosmid sequencing is becoming more and more feasible as the price of next generation sequencing decreases. Also, in the future, when this screen is scaled up and many more candidate clones are identified, the cost will decrease even further due to a volume discount. Currently the cost for sequencing whole fosmids is ~$5000 per 384 well plate through the Faculty of Pharmaceutical Sciences at UBC. Direct sequencing of the fosmids will also allow us to skip the lengthy sub-library generation phase of the screen; a major benefit where many candidate clones need to be identified.

Figure 21: Schematic diagram of the glucose-6-phosphate dehydrogenase activity assay. Glucose-1-phosphate produced from a GPase is converted to glucose-6-phosphate by phosphoglucomutase. Through conversion of glucose-6-phosphate to 6-phosphogluconolactone, glucose-6-phosphate dehydrogenase reduces NAD+ to NADH, which can be detected by measuring absorbance at 340 nm.
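The coupled assay proposed above (Figure 21) would be read out as the rise in NADH absorbance at 340 nm. A minimal sketch of that conversion follows. It assumes the commonly quoted NADH extinction coefficient of 6220 M^-1 cm^-1 at 340 nm, one NADH formed per αGlc-1-P, and complete conversion by the coupling enzymes. None of these values, nor the function name, come from this work, and the path length of a microplate well would have to be measured rather than taken as 1 cm.

```python
# Hypothetical helper: estimate Glc-1-P formed from the NADH signal of the
# phosphoglucomutase / glucose-6-phosphate dehydrogenase coupled assay.

NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1; literature value, not measured in this work


def glc1p_micromolar(a340_sample, a340_blank, path_cm=1.0):
    """Return the Glc-1-P concentration (in uM) implied by the rise in A340.

    Assumes one NADH is produced per Glc-1-P and that the coupling
    enzymes drive the reaction to completion.
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    delta_a = a340_sample - a340_blank
    # Beer-Lambert law: c = A / (epsilon * l)
    molar = delta_a / (NADH_EPSILON_340 * path_cm)
    return molar * 1e6


if __name__ == "__main__":
    # Illustrative numbers only, not data from the screen.
    print(round(glc1p_micromolar(0.361, 0.050), 1))
```

Because endogenous E. coli enzymes may also produce or consume NADH, a lysate-only blank would presumably be needed alongside the no-enzyme background.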
The main advantage of using DNPGlc as the reporter substrate is that it has a highly activated leaving group, which allows us to identify a broader range of GPases than a less activated substrate would. A drawback of using such an activated substrate is that other enzymes (such as those endogenous to E. coli present in the cell lysate) can cleave it as well, producing elevated background signals. Furthermore, the phenolate that is released absorbs light in the visible range, which can often be difficult to distinguish from the background, a problem encountered in this screen. In future studies, we will investigate activated substrates which release leaving groups that can be detected via fluorescence. Such compounds will be beneficial because positive signals will be orders of magnitude above the background, unlike DNP, where positive hit signals are only 2-3 times above the background. Additionally, the use of sensitive fluorescent reagents will allow us to use lower substrate concentrations, further reducing background signal. We have recently been testing a fluorescent compound known as 6,8-difluorocoumarin glucoside, which is processed by CBP only in the presence of phosphate. We hope to test this compound on a small metagenomic library in the near future.
Switching from a chromogenic substrate to a fluorescent one will hopefully allow us to identify low activity GPases which otherwise would have been too close to the baseline. Amending the screening methodology with the improvements noted above will hopefully decrease the false positive rate, minimize the amount of time and resources used and maximize the number of new GPases discovered.

The screen outlined here has not only provided a strong foundation on which future screens can be performed with a reasonable expectation of success, it has already resulted in the discovery of 5 new CAZymes and the identification of a new class of enzyme, the β-retaining glycoside phosphorylases. Resulting directly from this work we have recently submitted a paper to JBC proposing the reclassification of a sub-set of N-acetylglucosaminidases as phosphorylases (Macdonald et al., 2014, in revision). We hope that this screen will continue to provide as much insight into GPases as it already has, as I plan to carry on this work to a PhD in the Withers lab.

3 Methods

3.1 Cloning

pET expression constructs were made according to the polymerase incomplete primer extension (PIPE) cloning method (55). PCRs were all done using a Bio-Rad MyCycler™ thermal cycler. Primer sequences and PCR variables are listed in Appendix table 1. PCR products were confirmed by running on a 1% agarose gel.
Confirmed PCR products were directly transformed into DH5α E. coli. 2 μL of the vector PCR product was combined with 2 μL of the insert PCR product and then added to 100 μL DH5α chemically competent cells and transformed by heat shock. Cells were recovered by adding 300 μL LB media and incubated for 1 h at 37 °C. Following recovery, cells were spread on LB agar plates containing 50 μg/mL kanamycin and incubated for 18 h at 37 °C. Resulting colonies were used to inoculate overnight cultures which were used for plasmid minipreps and sequencing. Plasmid preparation was done using QIAprep Spin Miniprep Kits. Sequencing was done at Genewiz (South Plainfield, NJ, USA) using T7 and T7term sequencing primers. Constructs confirmed by sequencing were transformed into BL21 (DE3) E. coli. All DH5α and BL21 strains were stored in 1 mL LB media containing 15 % glycerol at -70 °C.

3.2 Protein expression

LB media containing 25 μg/mL kanamycin was inoculated with 1/100 of overnight culture. Expression cultures were grown at 37 °C until OD600 = 0.5 (~3 h). Cells were induced with 0.5 mM IPTG and grown for 18 h at 20 °C (3 h at 37 °C for CBP and CDP expression).
Cells were harvested by centrifuging at 6 000 rpm for 6 min in a Beckman Coulter Avanti® J-E floor centrifuge (JA-10 rotor) followed by re-suspension (20 mL per 1 L of original culture volume) in Buffer A (50 mM HEPES (pH 7.0), 100 mM NaCl, 2 % glycerol, 5 mM MgSO4 and 5 mM imidazole; NaCl concentration was increased to 600 mM for BglX buffers). Cells were lysed with an Avestin C3 homogenizer with an average cell pressure of 16 000 psi. The soluble fraction was isolated by centrifuging the lysate at 15 000 rpm for 30 min (JA-20 rotor). The soluble fraction was either stored at -20 °C until needed or immediately purified.

3.3 Immobilized metal affinity chromatography

Proteins containing a hexa-histidine tag were purified by immobilized metal affinity chromatography (IMAC) on a GE ÄKTA purifier equipped with a UV and conductance detector and an automatic fraction collector. The soluble fraction of the cell lysate was loaded on to a pre-equilibrated 5 mL HisTrap™ FF column from GE using a P-1 peristaltic pump from GE. Excess lysate was removed from the column with 10 column volumes (CV) of Buffer A before it was transferred to the pre-equilibrated ÄKTA purifier. The column was washed with 10 CV Buffer B (Buffer A containing 25 mM imidazole) followed by re-equilibration with 5 CV Buffer A.
The protein was eluted with a 20 mL gradient (0-100 %) of Buffer A to Buffer C (Buffer A containing 600 mM imidazole). The automatic fraction collector collected 1 mL elution fractions. Fractions were loaded on SDS PAGE and those containing the highest protein concentration were pooled, concentrated and dialysed against Buffer D (Buffer A containing 0 mM imidazole). Protein samples were stored at -70 °C.

3.4 TLC

TLCs were done using TLC silica gel 60 F254 plates (EMD Millipore Corporation, Billerica, MA, USA). The plates were eluted with a mobile phase of 1-butanol, methanol, ammonium hydroxide and water in a 5:4:4:1 ratio, respectively. TLC stain composition: anisaldehyde reagent (92.5% ethanol, 4% H2SO4, 1.5% acetic acid, 2% p-anisaldehyde), and molybdate reagent (2.5% w/v ammonium molybdate, 1% w/v ceric ammonium sulfate and 10% H2SO4). Once dried, the pNPGlcNAc plate was stained with anisaldehyde or molybdate reagent and heated until the product signals became visible.

3.5 General spectroscopy methods

Spectroscopy measurements were performed in matched 1 cm path length quartz cuvettes using a Varian Cary 300 Bio UV-visible spectrophotometer with an automatic cell changer and circulating waterbath, at 25 °C in HEPES buffer. Reactions contained potassium phosphate (pH 7.0) where specified.
Hydrolysis/phosphorolysis rates were calculated by measuring absorbance changes as a function of time and converting these to concentration with the following extinction coefficients: 7280 M⁻¹ cm⁻¹ (pNP) and 12460 M⁻¹ cm⁻¹ (2,4DNP) at 25 °C, pH 7.0. End point measurements were converted to concentration using the same extinction coefficients. Non-linear regression was performed using GraphPad Prism v6.0.

3.6 BglX reactivation

BglX was inactivated by incubating 100 μM of active enzyme with 10 mM of the inactivator DNP2FGlc at 25 °C for 210 min in Buffer D (50 μL). Excess DNP2FGlc was reduced to ~2 μM by dilution with Buffer D and re-concentration with Amicon centrifugal filters. Reactivation was monitored by incubating samples of the inactive BglX-2FGlc complex (50 μM) in Buffer D with 0, 1, 5, 10, 25, 50 or 100 mM phosphate at 25 °C (50 μL reaction volume) and transferring 5 μL aliquots at the indicated time points from the reactivation reaction to an assay solution (195 μL) containing Buffer D, 50 mM pNPGlc and 20 mM phosphate. Turnover rates were calculated as described above. Reactivation data were fit using a first-order expansion equation: $A_t = A_\infty (1 - e^{-kt})$. The rate constants from each trace (k) were plotted against phosphate concentration. To obtain $k_{react}/K_{react}$, linear regression was performed by fitting to $Y = m[X] + Y_0$, where the slope (m) = $k_{react}/K_{react}$ (min⁻¹ mM⁻¹).
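The calculations in Sections 3.5 and 3.6 can be sketched in a few lines. The fits in this work were done in GraphPad Prism; the code below is only an illustrative stand-in using the Python standard library, and every function name in it is hypothetical. It converts absorbance to concentration with the stated extinction coefficients, fits the first-order expansion equation by a golden-section search over k (for a fixed k the best A∞ has a closed form), and then takes the slope of k against phosphate concentration as kreact/Kreact. The search assumes the true k lies between the `k_lo` and `k_hi` bounds and that the residual has a single minimum there.

```python
import math

# Extinction coefficients from Section 3.5 (M^-1 cm^-1, 25 degC, pH 7.0).
EPSILON = {"pNP": 7280.0, "2,4DNP": 12460.0}


def absorbance_to_molar(delta_a, chromophore, path_cm=1.0):
    """Beer-Lambert conversion: c = A / (epsilon * l)."""
    return delta_a / (EPSILON[chromophore] * path_cm)


def fit_first_order(times, activities, k_lo=1e-4, k_hi=10.0, iters=200):
    """Least-squares fit of A_t = A_inf * (1 - exp(-k * t)); returns (A_inf, k)."""

    def best_a_inf(k):
        # For fixed k the model is linear in A_inf, so solve it directly.
        f = [1.0 - math.exp(-k * t) for t in times]
        return sum(y * x for y, x in zip(activities, f)) / sum(x * x for x in f)

    def sse(log_k):
        k = math.exp(log_k)
        a = best_a_inf(k)
        return sum((y - a * (1.0 - math.exp(-k * t))) ** 2
                   for t, y in zip(times, activities))

    # Golden-section search on log(k) between the two bounds.
    lo, hi = math.log(k_lo), math.log(k_hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if sse(m1) <= sse(m2):
            hi = m2
        else:
            lo = m1
    k = math.exp((lo + hi) / 2.0)
    return best_a_inf(k), k


def linear_fit(x, y):
    """Ordinary least squares for Y = m[X] + Y0; returns (m, Y0)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, my - m * mx


if __name__ == "__main__":
    # Synthetic traces only (not thesis data): k rises linearly with phosphate.
    phosphate_mm = [0, 1, 5, 10, 25, 50, 100]
    times_min = [5, 10, 20, 40, 80, 160]
    k_obs = []
    for p in phosphate_mm:
        k_true = 0.002 + 0.001 * p
        trace = [1.0 * (1.0 - math.exp(-k_true * t)) for t in times_min]
        k_obs.append(fit_first_order(times_min, trace)[1])
    slope, intercept = linear_fit(phosphate_mm, k_obs)
    print(f"kreact/Kreact ~ {slope:.4g} min^-1 mM^-1")
```

With real, noisy traces a dedicated non-linear regression routine that also reports confidence intervals would be the more appropriate tool.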
3.7 1H NMR analysis

10 mM substrate (pNPGlcNAc or pNPGlc) was incubated with 1 mg/mL BglX and 0 or 100 mM phosphate in Buffer D for 2 h (500 μL reaction volume) at 37 °C. After the incubation time BglX was removed with Amicon centrifugal filters and the filtrate was freeze-dried using a SpeedVac (Savant SV 100) concentrator and dissolved in D2O (500 μL). 1H spectra were recorded on a Bruker 400 MHz Avance with inverse probehead. Per sample, 16 scans employing presaturation water suppression (Bruker standard pulse program "zgcppr") were recorded.

3.8 High throughput functional screen

The screening methodology was modeled on the functional metagenomic screen reported by Mewis and colleagues (44). Master plates were replicated into 384-well plates containing 50 μL LB media with 100 μg/mL arabinose and 12.5 μg/mL chloramphenicol using a QPix2 robot. The replicated plates were grown for 18 h at 37 °C. The assay was performed by adding 50 μL of 2x assay buffer (composition: 200 mM potassium phosphate (pH 7.0), 2 % Triton X-100, 40 mM NaCl, 200 μM DNPGlc) and incubating at 37 °C for 6 h. The assay temperature was set to 37 °C because it is the optimal temperature for E. coli growth and protein expression. After the 6 h incubation, absorbance measurements were taken at 400 and 600 nm.
The optical density reading at 600 nm was taken to normalize the activity scores by controlling for differences in cell densities. Fosmid clones were scored by calculating the 400/600 nm ratio and those greater than 5 SD were re-arrayed to a 384-well consolidation plate.

3.9 Consolidation

Six replicates of the consolidation plate were made and screened (as described above), in triplicate, in the presence and absence of phosphate. Assay buffer (+ phosphate): 50 mM HEPES (pH 7.0), 200 mM potassium phosphate (pH 7.0), 2 % Triton X-100, 40 mM NaCl, 200 μM DNPGlc. Assay buffer (- phosphate): 50 mM HEPES (pH 7.0), 2 % Triton X-100, 40 mM NaCl, 200 μM DNPGlc.

3.10 Sub-library

Fosmids were isolated from the corresponding clones using a QIAprep Spin Miniprep Kit and sheared to ~3000 bp fragments using a Covaris M220 Focused-ultrasonicator™. To prepare the fragments for blunt end ligation they were incubated with an end repair enzyme mix (42.5 μL sheared fosmid DNA, 5 μL 10x end repair buffer, 2.5 μL end repair enzymes) for 10 min at 20 °C and then directly loaded on a 1 % agarose gel. The DNA band (at ~3000 bp) was excised and purified using a QIAquick Gel Extraction Kit. pUC19 was prepared for blunt end ligation by digestion with SmaI (Fermentas).
A 50 μL digestion reaction containing 30 μL pUC19 (86.5 ng/μL), 1 μL SmaI and 5 μL 10x FastDigest Buffer (Fermentas) was incubated for 5 min at 37 °C. SmaI was inactivated by incubating the digestion reaction for 5 min at 65 °C. Fosmid fragments and linearized pUC19 were ligated using T4 DNA ligase (Thermo Scientific). Three 20 μL ligation reactions, each containing 1.75 μL linearized pUC19 (28.2 ng/μL), 11 μL fosmid fragments (13.4 ng/μL), 2 μL 10x T4 buffer and 1 μL T4 DNA ligase, were incubated for 18 h at 16 °C. 5 μL of ligation mixture was transformed into 50 μL chemically competent DH5α by heat shock (x 12). After recovery, the cells were spread on LB agar plates containing 50 μg/mL ampicillin, 50 μg/mL X-Gal and 0.1 mM IPTG and incubated for 18 h at 37 °C. 336 white colonies were manually transferred (using sterile toothpicks) to 4 96-well culture plates (master plates), each well containing 200 μL LB media with 10 % glycerol and 100 μg/mL ampicillin. The inoculated plates were incubated for 18 h at 37 °C and then stored at -70 °C, thawing occasionally to make replicates. Screening of the sub-libraries was performed as described above.

References

1. Crocker, P. R., and Feizi, T. (1996) Carbohydrate recognition systems: functional triads in cell-cell interactions. Curr. Opin. Struct. Biol. 6, 679-691
2. Frommer, W., Junge, B., Müller, L., Schmidt, D., and Truscheit, E. (1979) New enzyme inhibitors from microorganisms. Planta Med. 35, 195-217
3. Kim, J.-H., Resende, R., Wennekes, T., Chen, H.-M., Bance, N., Buchini, S., Watts, A. G., Pilling, P., Streltsov, V. A., Petric, M., Liggins, R., Barrett, S., McKimm-Breschkin, J. L., Niikura, M., and Withers, S. G. (2013) Mechanism-based covalent neuraminidase inhibitors with broad-spectrum influenza antiviral activity. Science 340, 71-75
4. Tsai, C.-S., Yen, H.-Y., Lin, M.-I., Tsai, T.-I., Wang, S.-Y., Huang, W.-I., Hsu, T.-L., Cheng, Y.-S. E., Fang, J.-M., and Wong, C.-H. (2013) Cell-permeable probe for identification and imaging of sialidases. Proc. Natl. Acad. Sci. U. S. A. 110, 2466-2471
5. Cantarel, B. L., Coutinho, P. M., and Rancurel…, C. (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 37, D233-238
6. Buchholz, K., and Seibel, J. (2008) Industrial carbohydrate biotransformations. Carbohydr. Res. 343, 1966-1979
7. Wang, L. X., and Huang, W. (2009) Enzymatic transglycosylation for glycoconjugate synthesis. Curr. Opin. Chem. Biol. 13, 592-600
8. Nakai, H., Kitaoka, M., Svensson, B., and Ohtsubo, K. i. (2013) Recent development of phosphorylases possessing large potential for oligosaccharide synthesis. Curr. Opin. Chem. Biol. 17, 301-309
9. Johnson, L. N., and Barford, D. (1990) Glycogen phosphorylase. The structural basis of the allosteric response and comparison with other allosteric proteins. J. Biol. Chem. 265, 2409-2412
10. Dron, D., Krzewinski, F., Brassart, C., and Bouquelet, S. (1999) β-1,3-Galactosyl-N-acetylhexosamine phosphorylase from Bifidobacterium bifidum DSM 20082: characterization, partial purification and relation to mucin degradation. Biotechnol. Appl. Biochem. 29, 3-10
11. Park, J. K., Keyhani, N. O., and Roseman, S. (2000) Chitin catabolism in the marine bacterium Vibrio furnissii. Identification, molecular cloning, and characterization of a N,N'-diacetylchitobiose phosphorylase. J. Biol. Chem. 275, 33077-33083
12. Sawano, T., Saburi, W., Hamura, K., Matsui, H., and Mori, H. (2013) Characterization of Ruminococcus albus cellodextrin phosphorylase and identification of a key phenylalanine residue for acceptor specificity and affinity to the phosphate group. FEBS J. 280, 4463-4473
13. Senoura, T., Ito, S., Taguchi, H., Higa, M., and Hamada…, S. (2011) New microbial mannan catabolic pathway that involves a novel mannosylglucose phosphorylase. Biochem. Biophys. Res. Commun. 408, 701-706
14. Suzuki, M., Kaneda, K., Nakai, Y., Kitaoka, M., and Taniguchi, H. (2009) Synthesis of cellobiose from starch by the successive actions of two phosphorylases. New Biotechnol. 26, 137-142
15. Elbein, A. D., Pastuszak, I., Tackett, A. J., Wilson, T., and Pan, Y. T. (2010) Last step in the conversion of trehalose to glycogen: a mycobacterial enzyme that transfers maltose from maltose 1-phosphate to glycogen. J. Biol. Chem. 285, 9803-9812
16. Kittl, R., and Withers, S. G. (2010) New approaches to enzymatic glycoside synthesis through directed evolution. Carbohydr. Res. 345, 1272-1279
17. Luley-Goedl, C., and Nidetzky, B. (2010) Carbohydrate synthesis by disaccharide phosphorylases: Reactions, catalytic mechanisms and application in the glycosciences. Biotechnol. J. 5, 1324-1338
18. Kwon, T., Kim, C. T., and Lee, J.-H. (2007) Transglucosylation of ascorbic acid to ascorbic acid 2-glucoside by a recombinant sucrose phosphorylase from Bifidobacterium longum. Biotechnol. Lett. 29, 611-615
19. Goedl, C., Sawangwan, T., Mueller, M., Schwarz, A., and Nidetzky, B. (2008) A High-Yielding Biocatalytic Process for the Production of 2-O-(α-D-glucopyranosyl)-sn-glycerol, a Natural Osmolyte and Useful Moisturizing Ingredient. Angew. Chem. Int. Ed. 47, 10086-10089
20. Nishimoto, M., and Kitaoka, M. (2007) Practical preparation of lacto-N-biose I, a candidate for the bifidus factor in human milk. Biosci. Biotechnol. Biochem. 71, 2101-2104
21. De Groeve, M. R., De Baere, M., Hoflack, L., Desmet, T., Vandamme, E. J., and Soetaert, W. (2009) Creating lactose phosphorylase enzymes by directed evolution of cellobiose phosphorylase. Protein Eng. Des. Sel. 22, 393-399
22. De Groeve, M. R., Depreitere, V., Desmet, T., and Soetaert, W. (2009) Enzymatic production of alpha-D-galactose 1-phosphate by lactose phosphorolysis. Biotechnol. Lett. 31, 1873-1877
23. De Groeve, M. R., Tran, G. H., Van Hoorebeke, A., Stout, J., Desmet, T., Savvides, S. N., and Soetaert, W. (2010) Development and application of a screening assay for glycoside phosphorylases. Anal. Biochem. 401, 162-167
24. Nakai, H., Petersen, B. O., Westphal, Y., Dilokpimol, A., Abou Hachem, M., Duus, J. Ø., Schols, H. A., and Svensson, B. (2010) Rational engineering of Lactobacillus acidophilus NCFM maltose phosphorylase into either trehalose or kojibiose dual specificity phosphorylase. Protein Eng. Des. Sel. 23, 781-787
25. Healy, F. G., Ray, R. M., Aldrich, H. C., and Wilkie, A. C. (1995) Direct isolation of functional genes encoding cellulases from the microbial consortia in a thermophilic, anaerobic digester maintained on lignocellulose. Appl. Microbiol. Biotechnol. 43, 667-674
26. Handelsman, J., Rondon, M. R., Brady, S. F., and Clardy, J. (1998) Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chem. Biol. 5, R245-249
27. Amann, R. I., Ludwig, W., and Schleifer, K.-H. (1995) Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59, 143-169
28. Uchiyama, T., and Miyazaki, K.
(2009)	  Functional	  metagenomics	  for	  enzyme	  discovery:	   challenges	   to	   efficient	   screening.	  Curr.	  Opin.	  Biotechnol.	  20,	   616-­‐622	  29.	   Gabor,	  E.	  M.,	  and	  Alkema,	  W.	  B.	  L.	  (2004)	  Quantifying	  the	  accessibility	  of	  the	  metagenome	  by	  random	  expression	  cloning	  techniques.	  Environ.	  Microbiol.	  6,	  879-­‐886	  30.	   Das,	  G.,	  Dineshkumar,	  T.	  K.,	   and	  Thanedar,	  S.	   (2005)	  Acquisition	  of	  a	   stable	  mutation	   in	   metY	   allows	   efficient	   initiation	   from	   an	   amber	   codon	   in	  Escherichia	  coli.	  Microbiology	  151,	  1741-­‐1750	  31.	   Maar,	  D.,	  Liveris,	  D.,	  Sussman,	  J.	  K.,	  and	  Ringquist,	  S.	  (2008)	  A	  single	  mutation	  in	  the	  IF3	  N-­‐terminal	  domain	  perturbs	  the	  fidelity	  of	  translation	  initiation	  at	  three	  levels.	  J.	  Mol.	  Biol.	  28,	  937-­‐944	  32.	   O'Connor,	  M.,	   Gregory,	   S.	   T.,	   and	  Dahlberg,	   A.	   E.	   (2004)	  Multiple	   defects	   in	  translation	   associated	  with	   altered	   ribosomal	   protein	   L4.	  Nucleic	  Acids	  Res.	  32,	  5750-­‐5756	  	   63	  33.	   Nidetzky,	  B.,	  Griessler,	  R.,	  and	  Schwarz,	  A.	   (2004)	  Cellobiose	  phosphorylase	  from	  Cellulomonas	  uda:	  gene	  cloning	  and	  expression	   in	  Escherichia	  coli,	  and	  application	  of	   the	  recombinant	  enzyme	   in	  a	   ‘glycosynthase-­‐type’	   reaction.	   J.	  Mol.	  Catal.	  B:	  Enzym.	  29,	  241-­‐248	  34.	   Nakai,	   H.,	   Hachem,	   M.	   A.,	   Petersen,	   B.	   O.,	   Westphal,	   Y.,	   Mannerstedt,	   K.,	  Baumann,	   M.	   J.,	   Dilokpimol,	   A.,	   Schols,	   H.	   A.,	   Duus,	   J.	   Ø.,	   and	   Svensson,	   B.	  (2010)	   Efficient	   chemoenzymatic	   oligosaccharide	   synthesis	   by	   reverse	  phosphorolysis	   using	   cellobiose	   phosphorylase	   and	   cellodextrin	  phosphorylase	  from	  Clostridium	  thermocellum.	  
Biochimie	  92,	  1818-­‐1826	  35.	   Kawaja,	  J.	  D.	  E.,	  Morin,	  K.,	  and	  Gould,	  W.	  D.	  (2005)	  A	  duplicate	  column	  study	  of	  arsenic,	  cadmium	  and	  zinc	  treatment	  in	  an	  anaerobic	  bioreactor	  based	  on	  a	   system	   operated	   by	   Teck	   Cominco	   in	   Trail,	   British	   Columbia.	   In:	   British	  Columbia	  Mine	  Reclamation	  Symposium	  2005,	  Abbotsford,	  British	  Columbia,	  Canada.	  	  36.	   Al,	  M.,	  Evans,	  L.	   J.,	  Gould,	  D.	  W.,	  and	  Duncan,	  W.	  F.	  A.	   (2011)	  The	   long	   term	  operation	  of	  a	  biologically	  based	  treatment	  system	  that	  removes	  As,	  S	  and	  Zn	  from	   industrial	   (smelter	   operation)	   landfill	   seepage.	   Appl.	   Geochem.	   26,	  1886–1896	  37.	   Mewis,	  K.,	  Armstrong,	  Z.,	  Song,	  Y.	  C.,	  Baldwin,	  S.	  A.,	  Withers,	  S.	  G.,	  and	  Hallam,	  S.	  J.	  (2013)	  Biomining	  active	  cellulases	  from	  a	  mining	  bioremediation	  system.	  J.	  Biotechnol.	  167,	  462-­‐471	  38.	   Shizuya,	  H.,	  Birren,	  B.,	  and	  Kim,	  U.	  J.	  (1992)	  Cloning	  and	  stable	  maintenance	  of	  300-­‐kilobase-­‐pair	  fragments	  of	  human	  DNA	  in	  Escherichia	  coli	  using	  an	  F-­‐factor-­‐based	  vector.	  Proc.	  Natl.	  Acad.	  Sci.	  U.	  S.	  A.	  89,	  8794-­‐8797	  39.	   Woo,	   S.-­‐S.,	   Jiang,	   J.,	   Gill,	   B.	   S.,	   Paterson,	   A.	   H.,	   and	   Wing,	   R.	   A.	   (1994)	  Construction	  and	  characterization	  of	  bacterial	  artificial	  chromosome	  library	  of	  Sorghum	  bicolor.	  Nucleic	  Acids	  Res.	  22,	  4922-­‐4931	  40.	   Cai,	   L.,	   Taylor,	   J.	   F.,	  Wing,	   R.	   A.,	   Gallagher,	   D.	   S.,	  Woo,	   S.	   S.,	   and	  Davis,	   S.	   K.	  (1995)	   Construction	   and	   characterization	   of	   a	   bovine	   bacterial	   artificial	  chromosome	  library.	  Genomics	  29,	  413-­‐425	  41.	   Diaz-­‐Perez,	   S.	   V.,	   Crouch,	   V.	  
W.,	   and	   Orbach,	  M.	   J.	   (1996)	   Construction	   and	  Characterization	   of	   a	   Magnaporthe	   grisea	   Bacterial	   Artificial	   Chromosome	  Library.	  Fungal	  Genet.	  Biol.	  20,	  280-­‐288	  42.	   Kim,	  U.	  J.,	  Birren,	  B.	  W.,	  Slepak,	  T.,	  Mancino,	  V.,	  Boysen,	  C.,	  Kang,	  H.	  L.,	  Simon,	  M.	   I.,	   and	  Shizuya,	  H.	   (1996)	  Construction	   and	   characterization	  of	   a	  human	  bacterial	  artificial	  chromosome	  library.	  Genomics	  34,	  213-­‐218	  	   64	  43.	   Wild,	   J.,	   Hradecna,	   Z.,	   and	   Szybalski,	   W.	   (2002)	   Conditionally	   amplifiable	  BACs:	   switching	   from	  single-­‐copy	   to	  high-­‐copy	  vectors	  and	  genomic	  clones.	  Genome	  Res.	  12,	  1434-­‐1444	  44.	   Mewis,	  K.,	  Taupp,	  M.,	   and	  Hallam,	  S.	   J.	   (2010)	  A	  high	   throughput	  screen	   for	  biomining	  cellulase	  activity	  from	  metagenomic	  libraries.	  J	  Vis	  Exp	  48	  45.	   Reith,	  J.,	  and	  Mayer,	  C.	  (2011)	  Peptidoglycan	  turnover	  and	  recycling	  in	  Gram-­‐positive	  bacteria.	  Appl.	  Microbiol.	  Biotechnol.	  92,	  1-­‐11	  46.	   Balcewich,	  M.	  D.,	  Stubbs,	  K.	  A.,	  He,	  Y.,	  James,	  T.	  W.,	  Davies,	  G.	  J.,	  Vocadlo,	  D.	  J.,	  and	  Mark,	  B.	  L.	  (2009)	  Insight	  into	  a	  strategy	  for	  attenuating	  AmpC-­‐mediated	  beta-­‐lactam	   resistance:	   structural	   basis	   for	   selective	   inhibition	   of	   the	  glycoside	  hydrolase	  NagZ.	  Protein	  Sci.	  18,	  1541-­‐1551	  47.	   Mark,	   B.	   L.,	   Vocadlo,	   D.	   J.,	   and	   Oliver,	   A.	   (2011)	   Providing	   beta-­‐lactams	   a	  helping	  hand:	  targeting	  the	  AmpC	  beta-­‐lactamase	  induction	  pathway.	  Future	  Microbiol.	  6,	  1415-­‐1427	  48.	   Stubbs,	  K.	  A.,	  Bacik,	  J.	  P.,	  Perley-­‐Robertson,	  G.	  E.,	  Whitworth,	  G.	  E.,	  Gloster,	  T.	  M.,	   Vocadlo,	   D.	   J.,	   and	   Mark,	   B.	   L.	   
(2013)	   The	   Development	   of	   Selective	  Inhibitors	   of	   NagZ:	   Increased	   Susceptibility	   of	   Gram-­‐Negative	   Bacteria	   to	  beta-­‐Lactams.	  ChemBioChem	  14,	  1973-­‐1981	  49.	   Mayer,	   C.,	   Vocadlo,	   D.	   J.,	   Mah,	   M.,	   Rupitz,	   K.,	   Stoll,	   D.,	   Warren,	   R.	   A.,	   and	  Withers,	  S.	  G.	  (2006)	  Characterization	  of	  a	  beta-­‐N-­‐acetylhexosaminidase	  and	  a	   beta-­‐N-­‐acetylglucosaminidase/beta-­‐glucosidase	   from	   Cellulomonas	   fimi.	  FEBS	  J.	  273,	  2929-­‐2941	  50.	   Withers,	  S.	  G.,	  Rupitz,	  K.,	  and	  Street,	  I.	  P.	  (1988)	  2-­‐Deoxy-­‐2-­‐fluoro-­‐D-­‐glycosyl	  fluorides.	  A	  new	  class	  of	  specific	  mechanism-­‐based	  glycosidase	   inhibitors.	   J.	  Biol.	  Chem.	  263,	  7929-­‐7932	  51.	   Bacik,	   J.-­‐P.	   P.,	  Whitworth,	   G.	   E.,	   Stubbs,	   K.	   A.,	   Vocadlo,	  D.	   J.,	   and	  Mark,	   B.	   L.	  (2012)	  Active	  site	  plasticity	  within	  the	  glycoside	  hydrolase	  NagZ	  underlies	  a	  dynamic	  mechanism	  of	  substrate	  distortion.	  Chem.	  Biol.	  19,	  1471-­‐1482	  52.	   Litzinger,	   S.,	   Fischer,	   S.,	   Polzer,	   P.,	   Diederichs,	   K.,	  Welte,	  W.,	   and	  Mayer,	   C.	  (2010)	   Structural	   and	   kinetic	   analysis	   of	   Bacillus	   subtilis	   N-­‐acetylglucosaminidase	   reveals	   a	   unique	   Asp-­‐His	   dyad	   mechanism.	   J.	   Biol.	  Chem.	  285,	  35675-­‐35684	  53.	   Konwar,	   K.	   M.,	   Hanson,	   N.	   W.,	   Pagé,	   A.	   P.,	   and	   Hallam,	   S.	   J.	   (2012)	  MetaPathways:	   a	   modular	   pipeline	   for	   constructing	   pathway/genome	  databases	  from	  environmental	  sequence	  information.	  BMC	  Bioinformatics	  14,	  202	  	   65	  54.	   Altschul,	  S.	  F.,	  Gish,	  W.,	  Miller,	  W.,	  Myers,	  E.	  W.,	  and	  Lipman,	  D.	  J.	  (1990)	  Basic	  local	  alignment	  search	  tool.	  J.	  Mol.	  Biol.	  215,	  403-­‐410	  55.	   
Klock, H. E., Koesema, E. J., Knuth, M. W., and Lesley, S. A. (2008) Combining the polymerase incomplete primer extension method for cloning and mutagenesis with microscreening to accelerate structural genomics efforts. Proteins 71, 982-994

Appendix

Appendix table 1: PIPE cloning primers and PCR parameters

Thermal cycler parameters:
95 °C, 2 min
95 °C, 30 s
An. °C, 30 s
72 °C, Ex. time
16 °C, hold
x25
An: Anneal temperature (see chart)
Ex: Extension time (see chart)

Primers:
F: GAAGTAATTATGGGACACCATCACCATCACCATTGAGATCCGGCTGCT
R: TCATCAAAAAAACCGAACTTCATGGTATATCTCCTTCTTAAAG
F: TAAGAAGGAGATATACCATGAAGTTCGGTTTTTTTGA
R: AGCAGCCGGATCTCAATGGTGATGGTGATGGTGTCCCATAATTACTTCAACT
F: ACTCTTAAGTTTAAACACCATCACCATCACCATTGAGATCCGGCTGCT
R: CTCGCTGTTACTTTAGTAATCATGGTATATCTCCTTCTTAAAG
F: TAAGAAGGAGATATACCATGATTACTAAAGTAACAGCGAG
R: AGCAGCCGGATCTCAATGGTGATGGTGATGGTGTTTAAACTTAAGAGTCACTATATG
F: GTCTGTGGGACGCACGCCTATGAGATCCGGCTGCTA
R: GAAAGAATGGATTCGTACATGGTATATCTCCTTCTTAAAGT
F: GAAGGAGATATACCATGTACGAATCCATTCTTTCCATCAAAC
R: GTTAGCAGCCGGATCTCATAGGCGTGCGTCCCACAGACC
F: GGTCTGTGGGACGCACGCCTACACCATCACCATCACC
R: GAAAGAATGGATTCGTACATGGTATATCTCCTTCTTAAAGT
F: GAAGGAGATATACCATGTACGAATCCATTCTTTCCATCAAAC
R: AATGGTGATGGTGATGGTGTAGGCGTGCGTCCCACAGACC
F: GACGTTGTCGTTAATTTGTGAGATCCGGCTGCTAAC
R: TCAAAATAACCAAATTTCATGGTATATCTCCTTCTTAAAG
F: AGAAGGAGATATACCATGAAATTTGGTTATTTTGACGATA
R: TTGTTAGCAGCCGGATCTCACAAATTAACGACAACGTC
F: CGACGTTGTCGTTAATTTGCACCATCACCATCACCAT
R: TCAAAATAACCAAATTTCATGGTATATCTCCTTCTTAAAG
F: AGAAGGAGATATACCATGAAATTTGGTTATTTTGACGATA
R: ATGGTGATGGTGATGGTGCAAATTAACGACAACGTC
F: AGGTATTGGTGGAACTAAACTGAGATCCGGCTGCTAAC
R: GAAATGTCCGTATTGCATGGTATATCTCCTTCTTAAAGTTA
F: TAAGAAGGAGATATACCATGCAATACGGACATTTCGATGA
R: TTGTTAGCAGCCGGATCTCAGTTTAGTTCCACCAATACCTG
F: CAGGTATTGGTGGAACTAAACCACCATCACCATCACCAT
R: GAAATGTCCGTATTGCATGGTATATCTCCTTCTTAAAGTTA
F: TAAGAAGGAGATATACCATGCAATACGGACATTTCGATGA
R: GGTGATGGTGATGGTGGTTTAGTTCCACCAATACCTGAT
F: ATGTGCTGGTTGAGTTGGGATGAGATCCGGCTGCTAAC
R: CAAAATAACCAAAACGCATGGTATATCTCCTTCTTAAAGTTA
F: TAACTTTAAGAAGGAGATATACCATGCGTTTTGGTTATTT
R: TTGTTAGCAGCCGGATCTCATCCCAACTCAACCAGCAC
F: GTGCTGGTTGAGTTGGGACACCATCACCATCACCAT
R: CAAAATAACCAAAACGCATGGTATATCTCCTTCTTAAAGTTA
F: TAACTTTAAGAAGGAGATATACCATGCGTTTTGGTTATTT
R: CAATGGTGATGGTGATGGTGTCCCAACTCAACCAGCAC
F: GAAATTGAATTCTATATGTCATGAGATCCGGCTGCTAAC
R: CAAAATACCCATATCTCATGGTATATCTCCTTCTTAAAGTTA
F: AAGAAGGAGATATACCATGAGATATGGGTATTTTGATGAAG
R: GTTAGCAGCCGGATCTCATGACATATAGAATTCAATTTCTG
F: GAAATTGAATTCTATATGTCACACCATCACCATCACCAT
R: CAAAATACCCATATCTCATGGTATATCTCCTTCTTAAAGTTA
F: AAGAAGGAGATATACCATGAGATATGGGTATTTTGATGAAG
R: GGTGATGGTGATGGTGTGACATATAGAATTCAATTTCTG
F: GGATCAAAGTGGAACTGATGTGAGATCCGGCTGCTAAC
R: TATCCTGAAATTTTATATATTTCATGGTATATCTCCTTCTTAAAG
F: GGAGATATACCATGAAATATATAAAATTTCAGGATAAAAATGG
R: TTTGTTAGCAGCCGGATCTCACATCAGTTCCACTTTGATCC
F: GGATCAAAGTGGAACTGATGCACCATCACCATCACCAT
R: TATCCTGAAATTTTATATATTTCATGGTATATCTCCTTCTTAAAG
F: GGAGATATACCATGAAATATATAAAATTTCAGGATAAAAATGG
R: GGTGATGGTGATGGTGCATCAGTTCCACTTTGATCC

Construct; V/I; Template*; Annealing Temp; Extension Time
pET28-GH94E.his; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94E.his; I; pET28-GH94E; 60 °C; 90 s
pET28-GH94E; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94E; I; Beaver-11-F23 (cells); 60 °C; 90 s
pET28-GH94D.his; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94D.his; I; pET28-GH94D; 55 °C; 90 s
pET28-GH94D; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94D; I; Beaver-11-F23 (cells); 55 °C; 90 s
pET28-GH94C.his; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94C.his; I; pET28-GH94C; 60 °C; 90 s
pET28-GH94C; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94C; I; CO003-01-D22 (cells); 60 °C; 90 s
pET28-GH94B.his; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94B.his; I; pET28-GH94B; 60 °C; 90 s
pET28-GH94B; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94B; I; FOS62-3-J14 (cells); 58 °C; 90 s
pET28-GH94A.his; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94A.his; I; pET28-GH94A; 54 °C; 90 s
pET28-GH94A; V; pET28-Cf.CBP.his; 58 °C; 4 min
pET28-GH94A; I; FOS-42-D11 (cells); 54 °C; 90 s
pET28-BglX; V; pET28-Cf.CBP.his; 56 °C; 4 min
pET28-BglX; I; pUC19-3E10; 72 °C; 1 min
pET28-BglX.his; V; pET28-Cf.CBP.his; 56 °C; 4 min
pET28-BglX.his; I; pUC19-3E10; 72 °C; 1 min
pET28-Cf.CBP.his; V; pET28a; 55 °C; 4 min
pET28-Cf.CBP.his; I; C. thermocellum genomic DNA (ATCC 27405); 51 °C; 2 min
pET28-Cf.CDP.his; V; pET28a; 55 °C; 4 min
pET28-Cf.CDP.his; I; C. thermocellum genomic DNA (ATCC 27405); 51 °C; 2 min

*Where the template contains the same resistance gene as the product construct, no more than 5 ng of template was used in the PCR.
