UBC Theses and Dissertations


Structural and functional studies of the N-Terminal domain of the GABPα Kang, Hyun-Seo 2005

STRUCTURAL AND FUNCTIONAL STUDIES OF THE N-TERMINAL DOMAIN OF GABPα

by Hyun-Seo Kang
B.Sc., University of Oregon, 2002

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA
April 2005
© Hyun-Seo Kang, 2005

M.Sc. Thesis - Hyun-Seo Kang

Abstract

GA-binding protein (GABP) is a heterotetrameric α₂β₂ transcription factor composed of two structurally dissimilar subunits, GABPα and GABPβ. The modular DNA-binding subunit, GABPα, was known previously to consist of a DNA-binding ETS domain and a PNT domain, which presumably mediates protein-protein interactions with other components of the signalling or transcription machinery. The transactivation subunit, GABPβ, consists of three domains, namely the ankyrin repeats, which bind GABPα, a leucine zipper that forms β₂ dimers, and a transactivation domain. GABP is known to regulate the expression of genes involved in many different cellular functions, such as cell cycle control, apoptosis, and viral pathogen expression.

An investigation of the N-terminal region of GABPα revealed the existence of a third structured domain. Using partial proteolysis and NMR spectroscopy, this new domain was localized to residues 35-121, and was found to be flanked by unstructured residues and independent of the adjacent PNT domain. The gene encoding this domain was cloned and expressed, yielding a new truncation fragment, GABPα³⁵⁻¹²¹. As a step towards understanding its function, the structure of GABPα³⁵⁻¹²¹ was determined by NMR spectroscopy. The protein is composed of a 5-stranded β-sheet crossed by a distorted helix. Although globally resembling ubiquitin, GABPα³⁵⁻¹²¹ adopts a novel fold, as evident from the arrangement of its secondary structure elements. An analysis of GABPα³⁵⁻¹²¹ for features indicative of a macromolecular binding interface revealed only a region of negative charge. Therefore, the structure of GABPα³⁵⁻¹²¹ provides few clues to its function.

To determine the function of GABPα³⁵⁻¹²¹, the following strategies will be pursued, based on the hypothesis that this domain is a protein-protein interaction module. First, potential interactions with other domains of GABPα and GABPβ will be examined using approaches such as native gel electrophoresis, chemical crosslinking, and NMR spectroscopy. Second, interactions of GABPα³⁵⁻¹²¹ with reported partners of GABP, such as ATF, CBP/p300, or Sp1, will be tested. Finally, unbiased affinity methods will be used to identify proteins from cellular extracts that bind specifically to GABPα³⁵⁻¹²¹.

Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
Abbreviations
Acknowledgements
Chapter 1 - Introduction
1.1  Eukaryotic transcription
1.2  Ets transcription factor family
1.3.1  Cellular functions of GABP
1.3.2  Quaternary and domain structures of GABP
1.4  Investigating the N-terminal region of GABPα
1.4.1  Evidence implicating the interactions of the N-terminal residues of GABPα with other folded domains in GABP
1.4.2  Possible involvement of the N-terminal domain with previously reported GABP protein partners
1.5  Thesis overview
Chapter 2 - Materials & Methods
2.1  Cloning
2.2  Protein expression and purification
2.3  Limited trypsin digestion
2.4  NMR spectroscopy
2.5  Residual dipolar couplings (RDCs) measurements
2.6  NMR spectral assignments and structure determination
2.7  Native gel electrophoresis
Chapter 3 - Identification and cloning of the new GABPα domain
3.2  Expression of residues 1-169 from GABPα
3.3  Identification of the boundaries of the structured region in GABPα¹⁻¹⁶⁹
3.3.1  Limited trypsin digestion
3.3.2  Assigning the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹
3.4  Sub-cloning and characterization of GABPα³⁵⁻¹²¹
Chapter 4 - Strategies for assigning spectra and gathering spectral information for NMR-based structure calculations
4.1  Assigning 2D and 3D NMR spectra
4.1.1  Assignment of resonances from backbone nuclei
4.1.2  Assignment of resonances from aliphatic sidechain nuclei
4.1.3  Assigning resonances from aromatic residues
4.2  Secondary structure
Chapter 5 - The solution structure of GABPα³⁵⁻¹²¹
5.1  Obtaining dihedral angle information (φ, ψ, χ₁) from NMR spectra
5.1.1  φ and ψ angles
5.1.2  χ₁ angle of residues with Hβ protons
5.1.3  χ₁ angles of Val, Val, Ile
5.2  Residual dipolar coupling constants
5.3  Structure calculation
5.4  Structure overview
5.4.1  β-strand structure
5.4.2  Loop structure
5.4.3  Distorted helix
5.5  Dynamics from amide ¹⁵N relaxation
5.6  Surface features
5.7  Structural comparisons
5.8  Conclusion
Chapter 6 - Concluding remarks and future directions
6.1  Domain interactions within GABP
6.2  Potential protein partners reported in the literature
6.2.1  ATF-1
6.2.2  CBP/p300
6.2.3  Sp1 and Sp3
6.3  Unbiased screens for interacting protein: "fishing" for protein partners
6.4  Future directions
References

List of Figures

Figure 1.1. Overview of eukaryotic gene expression
Figure 1.2. Schematic diagram of Ets family protein
Figure 1.3. Structures of ETS and PNT domains
Figure 1.4. Gene expression by GABP, NRF-2, and E4TF1
Figure 1.5. Model of the GABPα₂β₂ heterotetramer complex on DNA
Figure 1.6. Structure of the GABPα/β heterodimer complex on DNA
Figure 2.1. Oligonucleotides and PCR protocol for cloning GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹
Figure 3.1. Partially assigned ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹
Figure 3.2. Overlaid ¹H-¹⁵N HSQC spectra of GABPα¹⁻¹⁶⁹, GABPα¹⁶⁸⁻²⁵⁶, and GABPα¹⁻³²⁰
Figure 3.3. Identification of the structured region of GABPα¹⁻¹⁶⁹ by NMR
Figure 3.4. Limited trypsin digestion of GABPα¹⁻¹⁶⁹
Figure 3.5. The fully assigned ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹
Figure 4.1. Heteronuclear experiments used to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹
Figure 4.2. Strategies for assigning protein backbone resonances
Figure 4.3. Stereospecific assignments of the Gln and Asn NH₂ resonances of GABPα³⁵⁻¹²¹
Figure 4.4. Assignment of resonances from aromatic sidechain in GABPα³⁵⁻¹²¹
Figure 4.5. His66 adopts a neutral Nε2H tautomeric form
Figure 4.6. Secondary structure prediction of GABPα³⁵⁻¹²¹
Figure 5.1. Iterative assignment of ambiguous restraints by ARIA
Figure 5.2. Solution structure of GABPα³⁵⁻¹²¹
Figure 5.3. Residual dipolar coupling constants (observed vs. experimental)
Figure 5.4. Identification of a bulge between β-strands S4 and S5 in GABPα³⁵⁻¹²¹
Figure 5.5. Interactions within the proline-containing α-helix in GABPα³⁵⁻¹²¹
Figure 5.6. ¹⁵N-relaxation analysis of GABPα³⁵⁻¹²¹
Figure 5.7. Surface properties of GABPα³⁵⁻¹²¹
Figure 5.8. Secondary structure comparisons of GABPα³⁵⁻¹²¹ and ubiquitin (1UBQ)
Figure 5.9. The surface properties of ubiquitin (1UBQ)
Figure 5.10. Comparison of surface lysines of GABPα³⁵⁻¹²¹ and ubiquitin
Figure 6.1. The structure of the full length GABPα
Figure 6.2. Preliminary binding studies of GABP domains by native gel

List of Tables

Table 1.1. Various nomenclature used for GABP subunits
Table 2.1. The compositions of M9T minimal media used for ¹³C or ¹⁵N isotope-labelling
Table 5.1. NMR restraints and statistics for the ensemble of ten structures calculated for GABPα³⁵⁻¹²¹

Abbreviations

1D  one-dimensional
2D  two-dimensional
3D  three-dimensional
ATF  activating transcription factor
ATP  adenosine triphosphate
AR  ankyrin repeat
CBP  CREB binding protein
CREB  cyclic AMP-responsive element binding protein
CSI  chemical shift index
CT  constant time
Da  dalton
D₂O  deuterium oxide
DNA  deoxyribonucleic acid
DTT  dithiothreitol
ERG  early response genes
ETS (Ets)  E26 transformation specific
ESI-MS  electrospray ionization mass spectrometry
GABP  GA-binding protein
GST  glutathione-S-transferase
HEPES  N-2-hydroxyethylpiperazine-N'-2-ethanesulphonic acid
hGABP  human GA-binding protein
HMBC  heteronuclear multiple bond correlation
HMQC  heteronuclear multiple quantum coherence
HIV  human immunodeficiency virus
HSQC  heteronuclear single quantum coherence
IL-16  interleukin 16
IPAP  in-phase anti-phase
IPTG  isopropyl-β-D-thiogalactopyranoside
kDa  kilodalton
LB  Luria-Bertani
LZ  leucine zipper
MAP  mitogen activated protein
MW  molecular weight
Ni-NTA  nickel-nitrilotriacetic acid
NMR  nuclear magnetic resonance
NOE  nuclear Overhauser effect
NOESY  nuclear Overhauser effect spectroscopy
NRF  nuclear respiratory factor
PAGE  polyacrylamide gel electrophoresis
PCR  polymerase chain reaction
PDB  protein data bank (http://www.rcsb.org/pdb/)
PNT  pointed
ppm  parts per million
RCO4  rat cytochrome c oxidase subunit IV
RDC  residual dipolar coupling
rms  root mean square
SDS  sodium dodecyl sulphate
TAD  transactivation domain
TEL  translocation ets leukemia
TOCSY  total correlation spectroscopy
Tris  tris(hydroxymethyl)aminomethane
VP16  viral protein 16

Acknowledgements

The work presented in this thesis would not have been accomplished without the invaluable support of numerous people. First of all, I would like to thank my supervisor, Dr. Lawrence McIntosh, who gave me the opportunity to work on this project. His excellent scientific insight and guidance throughout the project inspired me to complete this thesis. The former and current GABP collaborators, Drs. Cameron Mackereth and Manuela Schaerpf, deserve special mention for their help throughout this project. I acknowledge Cameron for introducing me to this exciting project and Manuela, aka Frog, for being such a supportive collaborator and a mentor in climbing. I also wish to thank Dr. Gary Yalloway for his time and effort in analyzing the mass spectrometric results, and I sincerely wish him good health. Dr.
Greg Lee has also been a great collaborator on my second project, on Ets-1, and provided much advice on technical problems in calculating the structure of GABPα. I would like to acknowledge all the former and current members of Dr. Lawrence McIntosh's lab. Lastly, I would like to thank my parents and my sister, especially for their endless moral support.

Chapter 1 - Introduction

1.1  Eukaryotic transcription

The key strategy for eukaryotic gene regulation is based on the synchronization of cis- and trans-regulatory elements (for reviews see Buratowski, 1994; Lemon and Tjian, 2000; Spiegelman and Heinrich, 2004). The cis-regulatory elements are commonly DNA sequences represented by the initiator sequence, the TATA-box, and either enhancers in higher eukaryotes or upstream activation sequences (UAS) in yeast, located in various regions relative to the associated structural genes. The trans-regulatory elements, generally referred to as transcription factors, include basal transcription factors, DNA-binding transactivators, and co-activators. To form the transcription initiation complex, the basal transcription factors must associate either directly or indirectly with RNA polymerase on the initiator sequence and TATA-box. Recruitment of this complex to a gene promoter depends on the activities of a multitude of specific DNA-binding transcription factors (Fig. 1.1). Due to their modular structure, these transcription factors are usually also involved in protein-protein or protein-ligand interactions. Their interactions with proteins of the initiation complex can be direct, or can be facilitated indirectly through other proteins known as co-activators or co-repressors. In parallel, these transcription factors are also involved in chromatin remodelling.
Unlike in prokaryotes, chromatin structure in eukaryotes strongly represses gene expression, so transcription must be coupled to chromatin remodelling. Ligand binding by transcription factors modulates their conformations, thereby regulating transcription. Similarly, transcription factors are also regulated through post-translational modifications, such as phosphorylation, acetylation, sumoylation, and ubiquitinylation, in response to signal transduction cascades (Tootle and Rebay, 2005). Thus, the structural and functional characterization of specific transcription factors remains a critical challenge for understanding the normal and aberrant control of eukaryotic gene expression.

Figure 1.1. Overview of eukaryotic gene expression. In addition to DNA-binding, specific transcription factors are involved in protein-protein interactions with other transcription-related elements, such as the transcription initiation complex, co-activators/co-repressors, or chromatin remodelling-related proteins.

1.2  Ets transcription factor family

The founding member of the Ets family, ets-1, was discovered as part of the tripartite oncogene of the avian E26 (E26 transformation specific) retrovirus (Leprince et al., 1983; Nunn et al., 1983). Since then, the Ets transcription factor family has grown to over 30 recognized members from diverse metazoan species (Fig. 1.2) (Graves and Petersen, 1998). These proteins have key roles in regulating cellular differentiation, development, transformation, and proliferation, particularly in blood cell lineages. The Ets family proteins are defined by the conserved ~85 residue ETS domain. The ETS domain contains a DNA-binding motif called the winged helix-turn-helix, which specifically recognizes promoters containing a conserved core sequence, 5'-GGA-3' (Fig. 1.3 (a)).
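As a rough illustration of what recognizing a conserved core sequence means at the sequence level, the sketch below scans a DNA string for the 5'-GGA-3' core on both strands. The function names and example sequences are hypothetical and are not taken from this thesis. Real ETS site selection also depends on the bases flanking the core, which this sketch ignores.

```python
# Minimal sketch: locate the ETS core motif (5'-GGA-3') on either DNA strand.
# Function names and test sequences are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_core_sites(seq: str, core: str = "GGA") -> list:
    """Return sorted (start, strand) pairs for every occurrence of `core`.

    `start` is always the 0-based index on the given (top) strand, so a '-'
    hit marks where the reverse complement of `core` sits in `seq`.
    """
    seq = seq.upper()
    n, k = len(seq), len(core)
    hits = []
    # Top strand: compare each window directly with the core.
    for i in range(n - k + 1):
        if seq[i:i + k] == core:
            hits.append((i, "+"))
    # Bottom strand: scan the reverse complement, then map back to top-strand coordinates.
    rc = reverse_complement(seq)
    for j in range(n - k + 1):
        if rc[j:j + k] == core:
            hits.append((n - j - k, "-"))
    return sorted(hits)


if __name__ == "__main__":
    print(find_core_sites("ACCGGAAGT"))
    print(find_core_sites("TTCC"))
```

Reporting both strands matters because a promoter can present the core in either orientation relative to the gene.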
Through numerous biochemical and structural studies, the molecular bases for DNA-binding by several ETS domains are well established. In addition to the ETS domain, approximately one third of all Ets family proteins share another conserved domain, called the PNT (pointed) domain (Fig. 1.2 and Fig. 1.3 (b)). Although not as well characterized as the ETS domain, the PNT domain is involved in a variety of protein-protein interactions. For example, in Ets-1, it serves both as a docking site for the MAP kinase Erk2, facilitating phosphorylation of an adjacent N-terminal phosphoacceptor, Thr38, and as an interaction platform for the CBP/p300 co-activator (Seidel and Graves, 2002). In the case of Tel, the PNT domain mediates the polymerization that is required for transcriptional repression (Poirel et al., 2000; Potter et al., 2000). In addition, Ets family proteins usually have one or more poorly characterized transactivation domains.

Figure 1.2. Schematic diagram of Ets family proteins. Schematic diagrams of the domain structures of selected Ets family proteins are shown. The PNT and ETS domains are highlighted in red and blue, respectively.

Figure 1.3. Structures of ETS and PNT domains. (a) The ETS domain contains a DNA-binding winged helix-turn-helix motif that recognizes a conserved core 5'-GGA-3' sequence (Petersen et al., 1995). (b) Ribbon diagrams of the tertiary structures of PNT domains from the Ets family proteins Erg, GABPα, Tel, and Ets-1. The proteins share an architecture of a core four α-helix bundle (H2, red; H3, green; H4, cyan; H5, blue). Note that an N-terminal helix (H1, purple) is specific to GABPα and Ets-1 (Mackereth et al., 2004).
1.3  GA-binding protein (GABP)

One member of the Ets family, GABP, was first discovered as a transactivator for the immediate early (IE) gene of herpes simplex virus 1 (HSV-1) in rat liver (Fig. 1.4 (a)) (Lamarco et al., 1991; Thompson et al., 1991). The IE gene of HSV-1 is induced by the virion-associated protein VP16 (Triezenberg et al., 1988). However, instead of interacting directly with the promoter region, VP16 forms a complex with a cellular transcription factor, Oct-1, on the first cis-regulatory element of the IE gene. Subsequently, the heterotetrameric GABP binds the second cis-regulatory element, known as the purine-rich region due to its core GA sequence.

Paralleling the discovery of the first GA-binding protein in rat liver, two other proteins, nuclear respiratory factor 2 (NRF-2) (Virbasius and Scarpulla, 1990; Virbasius and Scarpulla, 1991) and E4 transcriptional factor 1 (E4TF1) (Watanabe et al., 1988), were identified by the Scarpulla and Handa groups, respectively. Although discovered in distinct biological pathways, both were later confirmed to be the same protein as GABP. NRF-2 was identified in the course of studying the rat cytochrome c oxidase subunit IV (RCO4) gene, which is a respiratory electron carrier gene (Fig. 1.4 (b)). The ATP synthase β-subunit gene also contains a binding site for NRF-2 in its promoter region. Along with the discovery of GABP in connection with herpes virus expression, the Handa group discovered that the E4TF1 protein activates the promoter of the E4 region of adenovirus type 5 in human cells (Fig. 1.4 (c)).

1.3.1  Cellular functions of GABP

The varied nomenclature used for GABP reflects its diverse roles in different biological pathways (Bassuk and Leiden, 1997; Dittmer and Nordheim, 1998; Ghosh and Kolodkin, 1998; Rosmarin et al., 2004; Wasylyk et al., 1998).
GA-binding protein (GABP) is a transcription factor that controls many different genes related to cell cycle control, apoptosis, and viral pathogen expression. In the cell cycle, GABP has been reported to control the G1/S restriction point either by regulating expression of the gene for the cell-cycle associated component retinoblastoma (Rb), or by physically interacting with another cell-cycle associated factor, E2F1. As a part of apoptosis, GABP cooperates with both Sp1 and PU.1 to activate transcription of the β2 leukocyte integrin gene, CD18, which is required for adhesion of white blood cells to endothelium in order to kill foreign cells. Interestingly, it has also been shown that GABP regulates expression of some of the most important viral pathogens, such as adenoviruses, herpes viruses, and HIV. The adenovirus early 4 (E4) promoter is activated by GABP with the cooperation of other factors, including ATF/CREB. GABP also interacts with p300, a cellular target of the adenoviral protein E1A, to control the induction of the IL-16 promoter (Bannert et al., 1999). The fact that viruses frequently exploit GABP to achieve gene expression supports the idea that GABP has powerful transcriptional properties.

Figure 1.4. Gene expression by GABP, NRF-2, and E4TF1. GABP, also known as NRF-2 and E4TF1, was identified through three parallel routes. (a) The immediate early (IE) gene of herpes simplex virus 1 (HSV-1) is regulated by two cis-regulatory elements in its promoter region, namely a purine-rich hexanucleotide and a nonanucleotide sequence. The heterotetrameric GABP complex and the Oct-1/VP16 complex bind these sites, respectively, thereby activating IE gene expression (Lamarco et al., 1991). (b) One of the mammalian respiratory electron carrier genes, RCO4 (rat cytochrome c oxidase subunit 4 gene), is activated by the binding of NRF-2 (nuclear respiratory factor 2) to its promoter region (Virbasius et al., 1993). (c) The E4 promoter of adenovirus type 5 is activated upon the binding of the cellular factor E4TF1 and an adenovirus E1A gene product (Watanabe et al., 1990).

1.3.2  Quaternary and domain structures of GABP

GABP is composed of two structurally dissimilar proteins, GABPα and GABPβ, which are often referred to as the DNA-binding subunit and the transactivation subunit, respectively. Indeed, GABP is distinct among the Ets family in having these two functions mediated via separate polypeptide chains. These two subunits form a heterotetrameric α₂β₂ complex on DNA with tandem promoter sites (Fig. 1.5) (Graves, 1998; Rosmarin et al., 2004). However, it is still controversial whether GABP exists as an α₂β₂ tetramer or an αβ heterodimer in solution.

Figure 1.5. Model of the GABPα₂β₂ heterotetramer complex on DNA. (a) Based on the X-ray crystal structure of the GABP ETS/ankyrin repeats complex in Fig. 1.6, a model of the full heterotetrameric GABP complex on a tandem purine-rich DNA sequence was proposed. Two heterodimers of GABPα/GABPβ are connected through the homodimerization of the leucine zipper (LZ) domains to form a heterotetramer. The locations of the PNT domains and leucine zipper domains are shown schematically. The sequence of the purine-rich region is shown in tandem with the conserved 5'-GGA-3' sequence in red (Graves, 1998). (b) Schematic diagrams of GABPα and GABPβ are shown with their known domains. Although not depicted in (a), the N-terminal domain of GABPα is the subject of this study.
[Panel (b) labels: GABPα, residues 1-454, with the unidentified N-terminal domain (35-121), PNT (168-254), and ETS (320-430); GABPβ, domain boundaries marked at residues 1, 5, 156, 258, 327, 334, and 382.]

GABPα is composed of three structurally distinct regions (Fig. 1.5 (b)). First, the C-terminal ETS domain (residues 320-430) binds purine-rich regions in DNA, preferably with the sequence 5'-GGAA/T-3' (Batchelor et al., 1998). The crystal structure of the ETS domain was solved as a DNA-bound complex. Second, a PNT domain (residues 170-250) of unknown function is located in the middle of GABPα. The solution structure of the PNT domain was solved by NMR spectroscopy by the McIntosh group (Mackereth et al., 2004). Finally, as discussed in this thesis, the N-terminal region of GABPα, which has been essentially unstudied to date, contains a novel domain of unknown function.

GABPβ, known as the transactivation subunit, also contains three domains (Fig. 1.5 (b)), specifically the ankyrin repeats (AR), the transactivation domain (TAD), and the leucine-zipper domain (LZ). The N-terminal AR directly interact with the GABPα ETS domain, as shown in the GABP complex structure. The C-terminal LZ domain forms a coiled-coil leading to a GABPβ homodimer. Unlike these well-characterized domains, the location and characteristics of the TAD in GABPβ are poorly defined.

The biological specificity of GABP may arise in large part from the diversity of GABPβ. As summarized in Table 1.1 (Rosmarin et al., 2004), there are four GABPβ1 isoforms arising from alternative mRNA splicing, as well as a GABPβ2 encoded by a separate gene.
Significant differences are observed in the domain structures of these GABPβ variants, particularly with respect to the TAD or LZ, prompting the idea that combinatorial association of GABPα with various GABPβ partners can lead to alternative regulation of gene expression.

Table 1.1. Various nomenclature used for GABP subunits. GABPβ1 comprises four isoforms likely arising from alternative splicing of mRNA, whereas GABPβ2 is expressed from a distinct gene. Rosmarin et al. (Rosmarin et al., 2004) proposed a new nomenclature based on the molecular weights of these species (37, 38, 41, and 42 kDa). The domain structures of the GABPβ forms are shown (AR: ankyrin repeats; TAD: transactivation domain; LZ: leucine-zipper).

Group (reference): names used
Proposed nomenclature: GABPα, GABPβ1-42, GABPβ1-41, GABPβ1-38, GABPβ1-37, GABPβ2
McKnight (Thompson et al. 1991): GABPα, GABPβ1, GABPβ2
McKnight (De la Brousse et al. 1994): GABPβ1-1, GABPβ1-2, GABPβ2-1
Handa (Watanabe et al. 1993): E4TF1-60, E4TF1-53, E4TF1-47
Handa (Sawa et al. 1996): GABPα, GABPβ2, GABPβ1, GABPγ2, GABPγ1
Scarpulla (Gugneja et al. 1995): NRF-2α, NRF-2β1, NRF-2β2, NRF-2γ1, NRF-2γ2

1.3.2.1  Crystal structure of the GABPαβ complex with DNA

The ETS domain of GABPα is the most extensively studied part of this transcription regulator due to its function in DNA-binding. As shown by Batchelor et al. (Batchelor et al., 1998), the ETS domain of GABPα is composed of four antiparallel β-strands that pack against five α-helices (Fig. 1.6). This is essentially the same topology as that of the ETS domains of other Ets proteins, such as Ets-1 (Lee et al., 2005) and Fli-1 (Liang et al., 1994). Therefore, as expected, the ETS domain of GABPα binds DNA through its winged helix-turn-helix motif, where the helix after the turn interacts with the major groove of DNA. Despite their role in enhancing promoter affinity, the four-and-a-half ankyrin repeats (AR) of GABPβ do not directly contact DNA in the ternary complex. Instead, the AR interact with the first, fourth and fifth helices of GABPα to indirectly stabilize the association of the ETS domain with DNA. Most importantly, Gln321 of GABPα hydrogen bonds to both Lys69 of GABPβ and the DNA backbone. Thus, GABPβ may fine-tune the conformation of GABPα to regulate its affinity for promoter elements.

Figure 1.6. Structure of the GABPα/β heterodimer complex on DNA. X-ray crystal structure of the complex of the GABPα ETS domain (yellow) and the GABPβ ankyrin repeats (green) bound to DNA (Batchelor et al., 1998).

1.3.2.2  Solution structure of the PNT domain

The solution structure of the PNT domain in GABPα was solved by NMR spectroscopic methods (Mackereth et al., 2004). Together with the PNT domain of Erg, another Ets family protein identified in a colon cancer cell line (Fig. 1.3 (b)), it shares a conserved fold of four core helices. This fold was shown previously for the PNT domains of Ets-1 (Slupsky et al., 1998) and Tel (Kim et al., 2001). However, the GABPα PNT domain also has an additional N-terminal helix that is present in Ets-1 but not in Erg (Mackereth et al., 2004) or Tel (Kim et al., 2001). The additional N-terminal helix may provide a docking site for a kinase, as seen with Ets-1, or a binding interface for another protein partner. However, such a partner has yet to be identified for GABPα.

1.4  Investigating the N-terminal region of GABPα

Over the course of studying the PNT domain of GABPα, Cameron Mackereth and Dr. Manuela Schaerpf in the McIntosh lab found evidence for a new structured domain in the previously uncharacterized N-terminal region of GABPα (residues 1-168) (unpublished data).
As described in chapter 3, preliminary analysis of the NMR spectra of a large fragment of GABPα (residues 1-320) revealed a set of resonances from a folded domain independent of the PNT domain (residues 168-254). The latter conclusion arises from the invariant NMR spectra of the PNT domain in the absence or presence of the additional N-terminal residues. Furthermore, a BLAST (Altschul et al., 1990) search failed to identify any proteins with significant sequence similarity to the 168 N-terminal residues of GABPα, suggesting that this is a unique feature of this transcription factor.

1.4.1  Evidence implicating the interactions of the N-terminal residues of GABPα with other folded domains in GABP

Although the N-terminal region of GABPα may not interact intramolecularly with the PNT domain, it could still associate with the ETS, AR, TAD, and/or LZ domains of GABP. In fact, in a study of the mechanism of GABP assembly, Chinenov et al. (Chinenov et al., 2000) have shown by analytical ultracentrifugation that deletion of residues 1-316 of GABPα enhances formation of α₂β₂ tetramers over αβ dimers in solution. This result suggests a possible participation of the N-terminal or PNT domains in regulating assembly of the heterotetrameric GABP complex.

1.4.2  Possible involvement of the N-terminal domain with previously reported GABP protein partners

Several reports have shown that GABP has various protein partners, some of which may bind GABPα at sites other than its ETS domain. Sawada et al. (Sawada et al., 1999) have documented functionally synergistic interactions between GABPα and ATF1/CREB for the activation of the adenovirus early 4 (E4) gene. Furthermore, they have also provided some evidence, from GST pull-down assays and surface plasmon resonance experiments, that ATF1 interacts with the N-terminal region of GABPα. Galvagni et al.
(Galvagni et al.,  2001) have suggested that S p l and Sp3 activate the utrophin promoter in co-operation with GABP.  They have also shown experimentally that S p l and Sp3 physically interact with  G A B P a through their zinc finger domains.  G A B P a also controls the induction of the  interleukin 16 promoter in T lyphocytes in cooperation with the coactivator CBP/p300. Furthermore, Bannert et al. (Bannert et al, 1999) have also shown that G A B P a interacts specifically to the C-terminal portion of p300. These results suggest possible functions of the N-terminal domain in G A B P a in the assembly of multiple protein complexes necessary for transcriptional regulation. However, the precise nature of these interactions and the segments of G A B P a involved remain to be established.  1.5  Thesis overview Since the discovery of G A B P in the late 1980, numerous groups have published data on  this protein. However, due to its primary function as transcription regulator, most of these studies focused on the DNA-binding ETS domain.  In contrast, the N-terminal region of  G A B P a remained largely uncharacterized until the preliminary studies by Mackereth and Schaerpf. Therefore the major goal of this thesis was to dissect the biological function of the N terminal region of G A B P a through structural analysis, followed  by  structure-directed  biochemical studies. To achieve this goal, as outlined in chapter 3, I first delineated the new structured elements in G A B P a using limited proteolysis and N M R spectroscopy. Based on these results, residues 35-121 were discovered to constitute the new folded domain in G A B P a . Preliminary 16  Chapter 1. Introduction 35  121  N M R spectroscopic studies of a truncation fragment encompassing this domain G A B P a demonstrated that it contains a well-folded structure with a predominant (3-strand conformation. In chapter 4, a detailed analysis of the N M R spectra of G A B P a " 3 5  assigned  1 3  35  is presented. 
Based on the assigned ¹³C, ¹⁵N, and ¹H resonances, the secondary structure features of GABPα³⁵⁻¹²¹ were predicted using several different computational algorithms. Together these indicated that GABPα³⁵⁻¹²¹ is composed of five β-strands and a helix. The complete NMR assignments were combined to interpret the NOESY spectra of GABPα³⁵⁻¹²¹ and calculate its tertiary structure, as summarized in chapter 5. In parallel with these NOE data, additional dihedral angle (Φ, χ1) and residual dipolar coupling restraints were measured. The resulting final structural ensemble revealed that GABPα³⁵⁻¹²¹ is composed of a five-stranded β-sheet crossed by a distorted helix. Although its 3-D structure resembles ubiquitin, GABPα³⁵⁻¹²¹ adopts a novel fold. While exciting, this precluded the determination of its function based on structural similarities to any characterized protein. An analysis of the surface features of GABPα³⁵⁻¹²¹ also yielded only limited insights into its potential function. Thus chapter 6 describes ongoing experiments for identifying the function of the new GABPα domain based on the investigation of previously reported protein binding partners.

The determination of the tertiary structure of GABPα³⁵⁻¹²¹ has also completed the full structural characterization of GABPα. This is critical for understanding the mechanism of intact GABP in its biological context.

Chapter 2 - Materials & Methods

2.1  Cloning

Genes (rat liver) encoding GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were sub-cloned from the GABPα¹⁻³²⁰ gene within the pET28 vector by employing two different pairs of oligonucleotides (SIGMA) as primers for PCR. Each oligonucleotide was designed with a convenient restriction enzyme cleavage site so that the PCR product could be ligated to the host vector pET28 (Novagen) (Fig. 2.1 (a)).
In GABP-121RW, a codon for tryptophan was engineered in front of the stop codon (bold) for the convenience of measuring protein concentration. Pfu Turbo DNA polymerase (Fermentas) was employed to amplify the sequences of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹, as illustrated in Fig. 2.1 (b). All the PCR products were separated on a 1.8 % agarose gel and purified using the QIAquick Gel Extraction kit (Qiagen). Both the purified PCR products and the pET28 host vector were incubated with NdeI and XhoI or NdeI and EcoRI restriction enzymes (Fermentas) overnight for GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹, respectively. The digested PCR products were ligated into the cleaved pET28 host vector (Novagen) at 15 °C overnight and then transformed into E. coli DH5α cells (Novagen) by electroporation. The cells were incubated overnight on agar plates at 37 °C with selection for kanamycin resistance. The final plasmids were purified with a plasmid preparation kit (Qiagen) and sequenced for the inserted PCR products by NAPS (Nucleic Acid Protein Service unit, UBC). The confirmed plasmids were introduced into E. coli BL21 cells by electroporation. The pET28 vector-fused GABP constructs are designed to express proteins with a His6-tag at the N-terminus. The cleavage of the His6-tag by thrombin results in three additional amino acids (Gly-Ser-His) at the N-terminus of the protein.
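The primer design described in this section can be checked with a short script. The following is a minimal sketch, not part of the original work: it takes the four oligonucleotide sequences as listed in Fig. 2.1 (a), looks for the recognition sequence of the intended restriction enzyme in each, and confirms that the reverse complement of GABP-121RW reads a Trp codon (TGG) immediately followed by a stop codon (TGA). The helper names `reverse_complement` and `find_site` are illustrative assumptions.

```python
# Recognition sequences of the restriction enzymes used for cloning.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG", "EcoRI": "GAATTC"}

# Primers transcribed from Fig. 2.1 (a), written 5' to 3',
# paired with the restriction enzyme each one is meant to carry.
PRIMERS = {
    "GABPa-1F": ("GCGGACATATGACTAAGAGAGAAGC", "NdeI"),
    "GABPa-REV": ("GCTTCCTCGAGTCATGCAGCAGCCCATCTC", "XhoI"),
    "GABP-35F": ("GCGGACATATGGCTGAATGTGTAAGC", "NdeI"),
    "GABP-121RW": ("GCTCAGAATTCTCACCACTCGACCGTTTCCGC", "EcoRI"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(primer_name: str) -> int:
    """Return the 0-based index of the intended restriction site, or -1."""
    seq, enzyme = PRIMERS[primer_name]
    return seq.find(SITES[enzyme])


if __name__ == "__main__":
    for name, (seq, enzyme) in PRIMERS.items():
        print(f"{name}: {enzyme} site at index {find_site(name)}")
    # Coding-strand view of the reverse primer for the 35-121 construct.
    print(reverse_complement(PRIMERS["GABP-121RW"][0]))
```

Reading the reverse primer on the coding strand makes the engineered Trp codon and the stop codon easy to spot, which is harder to do by eye on the primer as synthesized.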
(a) Oligonucleotides for GABPα¹⁻¹⁶⁹
GABPα-1F (forward, NdeI): 5' GCGGACATATGACTAAGAGAGAAGC 3'
GABPα-REV (reverse, XhoI): 5' GCTTCCTCGAGTCATGCAGCAGCCCATCTC 3'

Oligonucleotides for GABPα³⁵⁻¹²¹
GABP-35F (forward, NdeI): 5' GCGGACATATGGCTGAATGTGTAAGC 3'
GABP-121RW (reverse, EcoRI): 5' GCTCAGAATTCTCACCACTCGACCGTTTCCGC 3'

(b) PCR settings
initial (95 °C, 5 min), followed by 20 cycles of denaturing (95 °C, 1 min), annealing (55 °C, 30 sec) and extension (72 °C, 30 sec), then ending (72 °C, 5 min)

Figure 2.1. Oligonucleotides and PCR protocol for cloning GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹. (a) The oligonucleotide sequences are shown for the forward and reverse primers of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹, with the restriction enzyme sites underlined. The stop codon is highlighted in bold for the reverse primers, and the engineered Trp codon is italicized in GABP-121RW. (b) The GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ constructs were amplified by Pfu Turbo polymerase (Fermentas) during 20 PCR cycles. The temperatures and the time periods are indicated for each step.

2.2  Protein expression and purification

GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were expressed in E. coli BL21(λDE3) cells grown at 37 °C in Luria broth (LB) or minimal M9T medium (see Table 2.1 for the specific isotope compositions of each sample). After induction with 1 mM IPTG for 4 hours, the cells were harvested by centrifugation at 5K for 10 min. The cell pellet was resuspended in Ni-NTA (Qiagen) column binding buffer (5 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol) and lysed by passage through a French press three times at 10,000 psi, followed by 15 min of sonication.
The lysate was spun at 15K for 1 hour, and the supernatant transferred onto the Ni-NTA column pre-equilibrated with binding buffer. The column was washed with 60 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol and then GABPα proteins were eluted with 250 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol. Following SDS-PAGE analysis, the fractions with the GABPα protein were pooled and dialyzed overnight in 20 mM sodium phosphate, 20 mM NaCl, pH 7.2 with a few crystals of thrombin (Roche) added for cleavage of the N-terminal His6-tag. Thrombin and the cleaved His6-tag were removed by incubating the dialyzed protein solution with p-aminobenzamidine beads (SIGMA) and TALON metal affinity resin (BD Biosciences) at room temperature with mild shaking for 15 min. Then the p-aminobenzamidine beads and the resin were removed by centrifugation at 5K. Once the cleaved proteins were isolated, they were kept reduced by addition of 2 mM DTT (dithiothreitol) (Bioshop). For limited trypsin digestion and NMR measurements, the proteins were concentrated to ~0.1 mM and 1.0-1.5 mM, respectively. The concentrations of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were measured using the predicted extinction coefficients of 8654 M⁻¹cm⁻¹ and 7127 M⁻¹cm⁻¹ at 280 nm, respectively, calculated with the ProtParam program (http://www.expasy.org/tools/protparam.html). The masses of the final samples were confirmed by ESI-MS. A mass of 10245 Da was observed (10248 Da expected) for unlabelled GABPα³⁵⁻¹²¹ (ESI-MS operated by Dr. Gary Yalloway).

Table 2.1. The compositions of M9T minimal media used for ¹³C or ¹⁵N isotope-labelling.
¹³C sample: carbon source ¹³C6-D-glucose 3 g/L; nitrogen source (¹⁴NH4)2SO4 1 g/L
¹⁵N sample: carbon source ¹²C6-D-glucose 10 g/L; nitrogen source (¹⁵NH4)2SO4 1 g/L
¹⁵N & ¹³C sample: carbon source ¹³C6-D-glucose 3 g/L; nitrogen source (¹⁵NH4)2SO4 1 g/L
¹⁵N & 10% ¹³C sample: carbon source ¹²C6-D-glucose 2.7 g/L plus ¹³C6-D-glucose 0.3 g/L; nitrogen source (¹⁵NH4)2SO4 1 g/L
Common salts and miscellaneous trace metals (all samples): Na2HPO4 6 g/L, KH2PO4 3 g/L, NaCl 0.5 g/L, MgSO4 1 mM, CaCl2 0.1 mM, FeCl3 10 µM, B1 0.5 mg/ml, kanamycin 35 mg/L

2.3  Limited trypsin digestion

GABPα¹⁻¹⁶⁹ was digested with trypsin at 25 °C in 25 mM Tris (pH 7.9), 50 mM KCl, 0.1 mM EDTA and 1 mM DTT at a ratio of 1:250 (w/w) of enzyme to protein. At defined intervals, aliquots were removed and proteolysis stopped with SDS buffer, followed by heat inactivation at 95 °C for 5 min. Samples were analyzed both by SDS-PAGE using a 15% acrylamide gel and by ESI-MS (operated by Dr. Shouming He).

2.4  NMR spectroscopy

All isotopically-labeled (¹³C, ¹⁵N, ¹⁵N/¹³C, ¹⁵N/10% ¹³C) NMR samples contained 1.0-1.5 mM protein in 20 mM sodium phosphate (pH 7.0), 50 mM NaCl, and 2 mM DTT, with 10% D2O (by volume) added for the signal lock. Spectra were recorded at 30 °C using 500 MHz Varian Unity, 600 MHz Varian Inova, or 800 MHz Varian Inova NMR spectrometers equipped with triple resonance gradient probes. All the pulse sequences were provided by Dr. Lewis Kay (University of Toronto). Spectra were processed and analyzed using NMRpipe (Delaglio et al, 1995), nmrDraw (Delaglio et al, 1995), and Sparky (Goddard and Kneller).

2.5  Residual dipolar couplings (RDCs) measurements

Residual dipolar couplings were acquired on a uniformly ¹³C- and ¹⁵N-labeled GABPα³⁵⁻¹²¹ protein sample diffused in a 5% acrylamide gel prepared with 29:1 acrylamide : bisacrylamide stock (Ishii et al, 2001). The gel was cast as a cylinder with a 6 mm diameter and 20 mm length. After dialysis in the NMR sample buffer, the gel was soaked in the protein solution (~500 µl) overnight.
The gel pellet with the diffused protein was loaded into a 5 mm silanized bottomless NMR tube using a gel funnel apparatus, causing the pellet to stretch approximately to twice its original length (Chou et al., 2001). This yielded a splitting of ~4 Hz in a ²H-NMR spectrum of the ²H₂O lock solvent. The backbone ¹H-¹⁵N and ¹³C'-¹³Cα RDCs were analyzed from the ¹H-¹⁵N-IPAP-HSQC and 2D/3D HNCO-based coupling spectra recorded on the partially aligned protein samples, as well as on an unaligned reference sample (Clore et al, 1998b).

2.6  NMR spectral assignments and structure determination

Spectral assignment strategies and structure calculations are described in chapters 4 and 5, along with the details of analysis of the resulting structures.

2.7  Native gel electrophoresis

Six different GABP proteins were expressed and purified following previously described procedures: GABPα³⁵⁻¹²¹ (chapter 2.2), the PNT domain of GABPα (residues 168-254) (Mackereth et al, 2004), GABPα¹⁻³²⁰ (residues 1-320), full-length GABPα (residues 1-454) and GABPβ (residues 1-382) (Chinenov et al, 2000), and a complex of the ETS domain (GABPα (residues 311-430)) and the ankyrin repeat (GABPβ2-1 (residues 1-157)) (Batchelor et al, 1998). All the purified proteins were dialyzed into 20 mM sodium phosphate, 50 mM NaCl, pH 7.5. The proteins were mixed in different combinations at a 1:1 molar ratio. Following 4 hours incubation at room temperature, the protein samples were prepared in loading buffers with or without SDS and analyzed on SDS-PAGE and native (50 mM Trizma, pH 7.0) gels with 15 % acrylamide.

Chapter 3 - Identification and cloning of the new GABPα domain

Previous studies demonstrated that GABPα was modular, containing at least the ETS and PNT domains.
Although the DNA-binding ETS domain has been studied extensively, the function of the PNT domain remains to be established. It may serve as a protein-protein interaction module, possibly for MAP-kinase docking to enhance the phosphorylation at the adjacent Thr280. In the course of studying the structure and function of the PNT domain, Schaerpf and Mackereth noted that the long N-terminal sequence (residues 1-167) of GABPα, preceding its PNT domain, contains some structured elements (unpublished data). When the ¹H-¹⁵N HSQC spectra of the GABPα¹⁻³²⁰ and GABPα¹⁶⁸⁻²⁵⁶ constructs were compared with each other, Mackereth found that a subset of the signals corresponding to amides of residues 1-167 exhibited a well-resolved and dispersed pattern, indicative of a significantly structured region. In this current study, limited trypsin digestion and the partial assignment of the ¹H-¹⁵N HSQC spectrum of a new construct, GABPα¹⁻¹⁶⁹, helped determine that the N-terminal structured domain is composed of residues 35-121. This led to the cloning and expression of a new truncation species, GABPα³⁵⁻¹²¹, that proved to be readily amenable for structural analysis.

3.2  Expression of residues 1-169 from GABPα

Based on the proposal of a new structural domain within the N-terminal sequence of GABPα, the gene encoding residues 1-169 was sub-cloned into a pET28 vector for expression in E. coli. The resulting ¹⁵N-labeled GABPα¹⁻¹⁶⁹ was purified and characterized by NMR spectroscopy. A preliminary examination of the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹ (Fig. 3.1) provided three insights into the nature of the N-terminal domain. First, the spectrum contains at least 65-70 well-dispersed peaks (blue), corresponding to the amide resonances of the structured residues, along with many poorly resolved peaks (black) with random-coil chemical shifts. Hence the size of the structured region in GABPα¹⁻¹⁶⁹ is likely ~8 kDa. Second, when compared to the ¹H-¹⁵N HSQC spectra of GABPα¹⁶⁸⁻²⁵⁴ and GABPα¹⁻³²⁰ (Fig. 3.2), the chemical shifts of the dispersed peaks from the domain in GABPα¹⁻¹⁶⁹ and from the PNT domain in GABPα¹⁶⁸⁻²⁵⁴ superimpose well with those in the larger fragment GABPα¹⁻³²⁰. This indicates that the new domain is structurally independent of the PNT domain. Last, to confirm that the dispersed peaks in the ¹H-¹⁵N HSQC of GABPα¹⁻¹⁶⁹ correspond to the relatively rigid part of the protein, the backbone dynamics were measured by ¹⁵N-relaxation experiments. In particular, heteronuclear ¹H-{¹⁵N} NOE ratios, which are ~0.8 for rigid amides and less for those showing mobility on the sub-nsec time scale, were determined. Indeed as summarized in Fig. 3.3, amides with dispersed peaks in the spectrum of GABPα¹⁻¹⁶⁹ exhibited high ¹H-{¹⁵N} NOE values, whereas those with random coil shifts showed low NOE ratios. These three observations led to further investigations towards identifying the location of the structured elements in GABPα¹⁻¹⁶⁹, with the goal of removing the flexible residues, and thereby obtaining a more resolved spectrum for further structural analysis.

Figure 3.1. Partially assigned ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹. The dispersed peaks (blue) located outside of the random-coil chemical shift region (black) correspond to residues 35-121. Resonance pairs of Gln and Asn NH₂ groups are connected with horizontal lines. The aliased peaks (A41, G85) are marked with an asterisk. Weak peaks not seen at this contour level are identified with "X".

Figure 3.2. Overlaid ¹H-¹⁵N HSQC spectra of GABPα¹⁻¹⁶⁹, GABPα¹⁻³²⁰, and GABPα¹⁶⁸⁻²⁵⁶. The overlay demonstrates that the spectra of GABPα¹⁻¹⁶⁹ (red) and GABPα¹⁶⁸⁻²⁵⁶ (green) together superimpose on part of the spectrum of GABPα¹⁻³²⁰ (blue), suggesting that the region of residues 1-169 is independent from the PNT domain, GABPα¹⁶⁸⁻²⁵⁶. The schematic diagram shows the location corresponding to each construct (N-terminal domain, residues 35-121; PNT domain, ending at residue 254; ETS domain, ending at residue 430; full length, 454 residues).

Figure 3.3. Identification of the structured region of GABPα¹⁻¹⁶⁹ by NMR. The structured region of GABPα¹⁻¹⁶⁹ was identified based on chemical shift and heteronuclear ¹H-{¹⁵N} NOE measurements. The deviation of the difference between the Cα and Cβ chemical shifts from the corresponding difference between the Cα and Cβ random coil chemical shifts indicates the secondary structure. Negative and positive values correspond to β-strand and helical conformations, respectively, in Δ(δCα - δCβ) and Δ¹³Cα, and the opposite holds for Δ¹³Cβ. The ¹H-{¹⁵N} NOE values for the assigned peaks were ~0.8, suggesting rigid elements, whereas the unassigned peaks resulted in large negative ¹H-{¹⁵N} NOE values, indicative of random coil behavior. However, due to the absence of their assignment, the actual values for the unassigned peaks are not shown. The horizontal axis shows residue number (1-169) and the sequence, which begins MTKREAEELIEIEIKTTEKAECTEESIVEQTYTPAECVSQAIDINEPIGNLKKLLEPR...

3.3  Identification of the boundaries of the structured region in GABPα¹⁻¹⁶⁹

The boundaries of the structured region in GABPα¹⁻¹⁶⁹ were investigated by two methods, specifically limited trypsin proteolysis and the partial assignment of the ¹H-¹⁵N HSQC spectrum of this protein.

3.3.1  Limited trypsin digestion

Limited proteolysis often provides an avenue for distinguishing unstructured versus structured regions within a protein. When GABPα¹⁻¹⁶⁹ was incubated with trypsin, two fragments were observed within 2 min. Within 10-30 min, the full length protein was completely absent, leaving two predominant fragments with ESI-MS-determined masses of 3741 Da and 5426 Da (Fig 3.4 (a)). According to the predicted trypsin digestion sites in GABPα¹⁻¹⁶⁹, these two masses correspond to residues 20-53 or 116-150 and 59-107, respectively. After 30 min incubation, a new band started appearing around the 14.4 kDa protein marker. Unfortunately, the mass of the band was not verified by mass spectrometry, but it may arise due to oxidative crosslinking of Cys residues in the cleaved GABP fragments.
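The prediction of trypsin digestion sites used in this analysis can be sketched in a few lines. The following is an illustrative example only: it assumes the common rule that trypsin cuts on the C-terminal side of Lys and Arg except when the next residue is Pro (it is not stated here which rule was actually applied), and it uses only the first 58 residues of GABPα as they appear along the sequence axis of Fig. 3.3. The function names are hypothetical.

```python
# First 58 residues of GABPa as read from the sequence axis of Fig. 3.3.
NTERM_1_58 = "MTKREAEELIEIEIKTTEKAECTEESIVEQTYTPAECVSQAIDINEPIGNLKKLLEPR"


def tryptic_sites(sequence: str) -> list[int]:
    """1-based residue numbers after which trypsin is predicted to cut.

    Assumed rule: cut after K or R unless the following residue is P.
    The last residue is never reported, since there is nothing to release.
    """
    sites = []
    for index in range(len(sequence) - 1):
        if sequence[index] in "KR" and sequence[index + 1] != "P":
            sites.append(index + 1)
    return sites


def tryptic_fragments(sequence: str) -> list[tuple[int, int, str]]:
    """(first residue, last residue, peptide) for a complete digest."""
    fragments = []
    start = 1
    for site in tryptic_sites(sequence) + [len(sequence)]:
        fragments.append((start, site, sequence[start - 1:site]))
        start = site + 1
    return fragments


if __name__ == "__main__":
    print(tryptic_sites(NTERM_1_58))
    for first, last, peptide in tryptic_fragments(NTERM_1_58):
        print(f"{first}-{last}: {peptide}")
```

For this N-terminal segment the predicted sites include Lys19, Lys52 and Lys53, so a fragment spanning residues 20-53 would retain one uncut internal site at Lys52.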
To help interpret these proteolysis data, the predicted trypsin cleavage sites are shown in a schematic of GABPα¹⁻¹⁶⁹ in Fig 3.4 (b). Although the 3741 Da fragment could arise from residues 20-53 or 116-150, the former contains an additional potential cleavage site whereas the latter is the longest sequence of GABPα¹⁻¹⁶⁹ expected upon complete digestion of the protein. Any smaller fragments would not be observed using this SDS-PAGE system. Thus, the fragment is likely due to cleavages after Lys115 and Lys150. In contrast, the 5426 Da fragment corresponds to residues 59-107, indicating that R79 and K87 are protected from trypsin digestion. Although not conclusive, these results suggest that the structural domain in GABPα¹⁻¹⁶⁹ falls near the middle of this sequence and not at its termini. Interestingly, when the subsequently determined secondary structure of this domain is aligned with the predicted cleavage sites in Fig. 3.4 (b), the protected R79 and K87 lie within a loop between two β-strands, whereas the cleaved R58 falls at the end of an α-helix. Thus the pattern of trypsin cleavage is not readily explained based on the structure of the new domain in GABPα.

Figure 3.4. Limited trypsin digestion of GABPα¹⁻¹⁶⁹. (a) The tryptic digest of GABPα¹⁻¹⁶⁹ was analyzed by SDS-PAGE using a 15% acrylamide gel, with samples taken at 2, 10, 30, 60 and 120 min. Three fragments were observed to be protected during the 2 hr digestion. The masses of two of the fragments were confirmed by mass spectrometry as shown on the right side with the matched sequences (MWobs = 19037, GABPα residues 1-169; MWobs = 5426, GABPα residues 59-107; MWobs = 3741, GABPα residues 20-53 or 116-150). The mass of the band that appeared near the 14.4 kDa protein marker (M) was not confirmed. However, the band may arise from digestion of trypsin or possible oxidative crosslinking of the GABP fragments. (b) The expected trypsin proteolysis sites (arrows) are shown for GABPα¹⁻¹⁶⁹, along with the measured cleavage sites (dark arrow). Fragments smaller than ~3 kDa were not resolved on this gel. The secondary structure of GABPα³⁵⁻¹²¹ (see chapter 5) is shown above the corresponding region.

3.3.2  Assigning the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹

To determine the boundaries of the structured region of GABPα¹⁻¹⁶⁹, its main-chain ¹H, ¹⁵N, and ¹³C resonances were assigned. The dispersed peaks in a ¹H-¹⁵N HSQC spectrum (Fig. 3.1) correspond to the structured parts of a protein. Furthermore, ¹³Cα and ¹³Cβ chemical shifts provide secondary structural information on that protein. Most of the dispersed peaks in GABPα¹⁻¹⁶⁹ were assigned using a combination of HNCACB and CBCA(CO)NNH spectra (Sattler et al, 1999). These experiments provide through-bond scalar correlations between the ¹HN and ¹⁵N of residue i with the ¹³Cα and ¹³Cβ shifts of residues i and i-1 or only residue i-1, respectively. The peaks in the random coil chemical shift region of the ¹H-¹⁵N HSQC spectrum were not fully assigned due to their significant overlap. As shown in Fig. 3.1 and 3.3, all of the assigned resonances in the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹ correspond to residues 37-118. As discussed earlier in section 3.2, these residues exhibited high ¹H-{¹⁵N} NOE values (Fig. 3.3), further supporting their location within a structured region.
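The heteronuclear NOE criterion referred to here can be illustrated with a small calculation. This is a sketch under stated assumptions rather than the analysis actually performed: the ¹H-{¹⁵N} NOE of an amide is taken as the ratio of its peak intensity recorded with proton saturation to that recorded without, and the cutoff of 0.6 used to separate "rigid" from "flexible" amides is an arbitrary illustrative choice (only the benchmark of ~0.8 for rigid amides comes from section 3.2). All intensities and residue numbers in the example are invented.

```python
def het_noe(intensity_saturated: float, intensity_reference: float) -> float:
    """Steady-state 1H-{15N} NOE as the ratio I(saturated) / I(reference)."""
    if intensity_reference == 0:
        raise ValueError("reference intensity must be non-zero")
    return intensity_saturated / intensity_reference


def classify_amides(noe_by_residue: dict[int, float],
                    cutoff: float = 0.6) -> dict[int, str]:
    """Label each amide 'rigid' or 'flexible' using an assumed NOE cutoff."""
    return {
        residue: "rigid" if noe >= cutoff else "flexible"
        for residue, noe in noe_by_residue.items()
    }


if __name__ == "__main__":
    # Hypothetical (saturated, reference) peak intensities, arbitrary units.
    peaks = {60: (80.0, 100.0), 75: (76.0, 100.0), 150: (-30.0, 100.0)}
    noes = {res: het_noe(sat, ref) for res, (sat, ref) in peaks.items()}
    print(noes)
    print(classify_amides(noes))
```

In practice the cutoff would be chosen with the measurement uncertainty and the field strength in mind; the sketch only shows how the ratio separates amides near 0.8 from those with low or negative values.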
The assigned ¹³Cα and ¹³Cβ resonances were also examined for their deviations from the expected random coil chemical shifts of each residue. In addition, the differences between the observed and reference random coil ¹³Cα and ¹³Cβ chemical shifts (Δδ) were investigated as a sensitive predictor of secondary structure. As shown in the top two panels of Fig. 3.3, most of the ¹³Cα and ¹³Cβ resonances for residues 37-118 deviated from their random coil chemical shifts, providing further evidence for their participation in a structured region. In contrast, any of the residues outside of this region with assigned ¹HN and ¹⁵N signals showed essentially random coil ¹³Cα and ¹³Cβ shifts, indicative of a disordered conformation. Finally, the observation that many residues in the range of 37-118 had negative (δ¹³Cα(obs-ref) - δ¹³Cβ(obs-ref)) shifts relative to the corresponding random coil values suggested that the structured domain is composed of significant β-strand structure (see chapter 4 for a more detailed discussion of chemical shift data).

3.4  Sub-cloning and characterization of GABPα³⁵⁻¹²¹

Although the NMR analysis of GABPα¹⁻¹⁶⁹ suggested that residues 37-118 form the structured core, two and three additional residues were added to the N- and C-terminal boundaries, respectively, as a "margin of error" and to include a tryptophan for quantitative purposes. As a result, the gene encoding GABPα³⁵⁻¹²¹ was sub-cloned into the pET28 vector. The protein was bacterially over-expressed and analyzed by NMR spectroscopy. GABPα³⁵⁻¹²¹ yielded an excellent quality ¹H-¹⁵N HSQC spectrum, indicative of a well-folded protein (Fig. 3.5).
Most of the overlapping peaks in the spectrum of GABPα¹⁻¹⁶⁹ were absent, while the remaining dispersed peaks superimposed closely with those observed with the larger construct (Fig. 3.2). In the heteronuclear ¹H-{¹⁵N} NOE experiment, the NOE values for the dispersed peaks remained as they were for GABPα¹⁻¹⁶⁹ (compare Fig 3.3 and Fig. 5.6). Both of these results indicate that the core structured region (residues 35-121) is independent of the flexible N- and C-terminal ends (residues 1-34 and 122-169) of GABPα¹⁻¹⁶⁹. Armed with the available mainchain NMR assignment of GABPα¹⁻¹⁶⁹, the full ¹H, ¹³C, and ¹⁵N resonances of GABPα³⁵⁻¹²¹ were assigned (Fig. 3.5 and Chapter 4.1.1). Further analysis of the ¹H-{¹⁵N} NOE spectra of this protein demonstrated that residues 35-37 and residues 119-122 at its N- and C-terminal ends, respectively, are flexible (Fig. 5.6), indicating that its boundaries were indeed well defined.

Figure 3.5. The fully assigned ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹. The aliased peaks are marked with an asterisk. The unassigned peaks, likely from twice aliased arginine sidechain resonances, are marked with a triangle.
Chapter 4 - Strategies for assigning spectra and gathering spectral information for NMR-based structure calculations

4.1  Assigning 2D and 3D NMR spectra

4.1.1  Assignment of resonances from backbone nuclei

The ¹H-¹⁵N HSQC spectrum is a two-dimensional NMR spectrum showing the correlated resonances between directly bonded ¹H and ¹⁵N nuclei in molecules such as peptides and proteins. Since the pattern of the correlated resonances is usually distinct for each protein, it is called the "NMR fingerprint". Typically, the spectrum contains at least the same number of peaks as there are non-proline residues. In addition, the ¹H-¹⁵N HSQC spectrum contains the correlated ¹H-¹⁵N signals from the side-chains of Trp, Gln, Asn, and Arg.

To assign the ¹H-¹⁵N HSQC spectrum of uniformly ¹³C/¹⁵N-labeled GABPα³⁵⁻¹²¹, two three-dimensional NMR spectra, HNCACB and CBCA(CO)NNH (see Fig. 4.1) (Sattler et al, 1999) (Grzesiek et al, 1993), were employed. In both of these spectra, ¹³Cα and ¹³Cβ resonances are correlated to the amide ¹HN and ¹⁵N resonances observed in the ¹H-¹⁵N HSQC spectrum. The two experiments are distinct as the CBCA(CO)NNH spectrum correlates the amide of residue i to the ¹³Cα and ¹³Cβ resonances of its previous residue (i-1), whereas in the HNCACB, the ¹³Cα and ¹³Cβ resonances of both residues i-1 and i can be observed. Further, ¹³Cα and ¹³Cβ signals are distinguished in the HNCACB experiment by their opposite signal phases (i.e. positive and negative). By combining these two experiments, along with knowledge of the expected ¹³C chemical shifts of each amino acid type and the protein sequence, the peaks
in the ¹H-¹⁵N HSQC spectrum are sequentially assigned based on connecting ¹³Cα and ¹³Cβ resonances (Fig. 4.2).

Figure 4.1. Heteronuclear experiments used to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹. Simplified schematic diagrams are shown for the selected heteronuclear experiments utilized to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹: HNCACB, CBCA(CO)NH, HNCO, CBCACO(CA)HA, NOESY-HSQC, TOCSY-HSQC, H(CCO)-TOCSY-NH, C(CO)-TOCSY-NH, HCCH-TOCSY, (Hβ)Cβ(CγCδ)Hδ, (Hβ)Cβ(CγCδCε)Hε, and HMQC-J (HNHA). Shaded atoms are detected in the experiments. A square denotes through-bond scalar connections and a circle denotes through-space NOE connections.

The chemical shift values for the carbonyl carbons in the protein backbone are readily obtained from a HNCO spectrum. In the HNCO experiment (see Fig. 4.1) (Ikura et al, 1990; Muhandiram and Kay, 1994), the carbonyl ¹³C resonance of residue i-1 in the backbone is correlated to the adjacent amide ¹HN and ¹⁵N resonances of residue i. Once the peaks for the
backbone amides are known in the ¹H-¹⁵N HSQC spectrum, the corresponding ¹³C signals in the HNCO are immediately assigned.

4.1.2  Assignment of resonances from aliphatic sidechain nuclei

To obtain the chemical shift values of aliphatic sidechain nuclei, three three-dimensional spectra were assigned in concert: C(CO)-TOCSY-NH (Grzesiek et al, 1993; Logan et al, 1992), H(CCO)-TOCSY-NH (Logan et al, 1992; Montelione et al, 1992) and HCCH-TOCSY (Bax et al, 1990; Kay et al, 1993) (see fig. 4.1) (Sattler et al, 1999). The C(CO)-TOCSY-NH and H(CCO)-TOCSY-NH spectra correlate all non-carbonyl ¹³C (i.e. ¹³Cα, ¹³Cβ, ¹³Cγ, etc) and ¹H (i.e. ¹Hα, ¹Hβ, ¹Hγ, etc) signals of residue i-1 to the amide ¹HN and ¹⁵N of residue i. Although most of the ¹³C signals in the C(CO)-TOCSY-NH spectrum can be assigned confidently based on the distinct chemical shifts known for each amino acid type, some of the proton peaks in the H(CCO)-TOCSY-NH may be ambiguous. To reduce any ambiguities, a HCCH-TOCSY spectrum is also considered. This spectrum provides the correlation of all ¹H signals in an amino acid sidechain to each ¹H-¹³C pair in that sidechain.

Figure 4.2. Strategies for assigning protein backbone resonances. The ¹³Cα and ¹³Cβ connectivity in the HNCACB and CBCA(CO)NNH spectra of GABPα³⁵⁻¹²¹ is shown for strips of S80 (¹H 9.42 ppm, ¹⁵N 119.9 ppm), L81 (¹H 9.28 ppm, ¹⁵N 121.0 ppm) and F82 (¹H 8.44 ppm, ¹⁵N 121.7 ppm). The spectra are sequentially assigned based on the connectivities of the ¹HN, ¹⁵N, ¹³Cα and ¹³Cβ resonances. The ¹³Cα and ¹³Cβ resonances of residues i and i-1, and of residue i-1 only, are detected in the HNCACB and CBCA(CO)NNH spectra, respectively. In HNCACB, the signs of the ¹³Cα and ¹³Cβ peaks are positive (black) and negative (red), respectively.

4.1.2.1  Stereospecific assignments

The side-chain chemical shifts obtained from the previous assignment do not distinguish signals from prochiral nuclei such as the ¹Hβ2 and ¹Hβ3 on methylenes and the methyls of Val and Leu. Therefore the following experiments have been performed to obtain the stereospecific assignments of the nuclei at prochiral centers (Sattler et al, 1999).

β-methylenes in R, D, N, C, E, H, L, K, M, Q, F, S, Y, and W side-chains

The resonances from the ¹Hβ2 and ¹Hβ3 of these residues were stereospecifically assigned using HNHB (Archer et al, 1991) and short mixing time (30 ms) ¹⁵N-TOCSY-HSQC spectra (Marion et al, 1989a). This approach relies on the fact that ³J(N-Hβ) and ³J(Hα-Hβ) couplings are largest for ¹⁵N-Hβ or Hα-Hβ nuclei in a trans dihedral conformation. This also yields the χ1 dihedral angle of the sidechain about the Cα-Cβ bond according to a staggered rotamer model.

Stereospecific assignments of leucine and valine methyls

The assignment of methyl resonances is one of the most crucial parts of the structure determination of a protein due to their importance in forming the protein core. Therefore incomplete or incorrect methyl assignments can easily lead to reduced accuracy of an NMR-derived structure.
Most methyl resonances are readily observed due to their upfield ¹H and ¹³C shifts and can be assigned, though not stereospecifically, from the C(CO)-TOCSY-NH and H(CCO)-TOCSY-NH spectra. However, additional experiments are necessary for obtaining the stereospecific assignment of the methyls of valine and leucine. These experiments also confirm the methyl group assignments of Ile, Thr, Ala, and Met.

The signals from the diastereotopic methyls of Val and Leu were assigned following the approach of Neri et al. (Neri et al., 1989). A ¹H-¹³C HSQC spectrum is recorded for a sample that is 10% nonrandomly and fractionally ¹³C-enriched. Due to the metabolic pathways for the biosynthesis of leucine and valine, signals from the pro-R methyls (Leu-δ1 and Val-γ1) appear as doublets in the ¹³C dimension, whereas signals from the pro-S methyls (Leu-δ2 and Val-γ2) are singlets. In parallel, a constant time ¹H-¹³C HSQC spectrum is recorded. Again, due to the patterns of ¹³C labelling, this discriminates the signals corresponding to Val-γ1, Leu-δ1 and Ile-γ2 from those corresponding to Val-γ2 and Leu-δ2. The assignments of the Thr-γ2 and Ile-δ1 methyls were confirmed based on their disappearance in the constant time ¹H-¹³C HSQC spectrum compared to the normal ¹H-¹³C HSQC spectrum.

Assignment of Asn and Gln sidechain NH2 signals

The sidechain amide resonances of Gln and Asn generally appear in the up-field region of a ¹H-¹⁵N HSQC spectrum. Each sidechain yields a pair of peaks due to two geminal amide protons coupled to a common ¹⁵N nucleus. These signals are assigned to specific amino acids based on correlations to the sidechain ¹³Cα/β (Asn) or ¹³Cβ/γ (Gln) resonances in an HNCACB spectrum. Using the EZ-HMQC-NH2 spectrum, as described in McIntosh et al. (McIntosh et al., 1997), the side-chain amides of the Gln and Asn residues are stereospecifically assigned.
In this experiment, the peaks from the Z (¹Hδ22 or ¹Hε22) protons appear with significantly greater intensities than those from the corresponding E (¹Hδ21 or ¹Hε21) protons (Fig. 4.3). In a random coil polypeptide, most E protons resonate down-field relative to their corresponding Z protons. However, for Gln84, this pattern is reversed (Fig. 4.3), suggesting that the side-chain amide of Gln84 might play a distinct role in the structure of GABPα³⁵⁻¹²¹. Indeed, based on the structure shown in Chapter 5, the Nε2 of Gln84 forms a hydrogen bond to the amide of Asp76. The ¹⁵Nδ2 of Asn45 is unusually downfield shifted, which is likely due to a hydrogen bond to Gly90.

[Figure 4.3: (a) ¹H-¹⁵N HSQC and (b) EZ-HMQC-NH2 spectra of the sidechain amide region (axes: ¹H (ppm) and ¹⁵N (ppm)), with labelled Nδ2-Hδ21/Hδ22 (Asn) and Nε2-Hε21/Hε22 (Gln) peaks, above a schematic of the Gln and Asn sidechain amide groups marking the E and Z protons.]

Figure 4.3. Stereospecific assignments of the Gln and Asn NH2 resonances of GABPα³⁵⁻¹²¹. The peaks corresponding to the sidechain NH2 groups are shown in red in the (a) ¹H-¹⁵N HSQC spectrum and the (b) EZ-HMQC-NH2 spectrum.
In the EZ-HMQC-NH2 spectrum (b), the Z protons (Hδ22 or Hε22) appear with greater intensities than the E protons (Hδ21 or Hε21). In GABPα³⁵⁻¹²¹, most of the peaks corresponding to the Z proton in a sidechain are located up-field of those for the corresponding E proton. However, for Q84, the reversed pattern is observed.

There is, however, no obvious structural explanation for the unusual down-field shifts of the ¹⁵N and ¹H signals of Gln40 and Asn50.

4.1.3 Assigning resonances from aromatic residues

In many cases, aromatic rings are important in the formation of a protein's structure through favourable interactions with other core residues. To assign the side chain resonances of Tyr, Phe, His and Trp, the (Hβ)Cβ(CγCδ)Hδ and (Hβ)Cβ(CγCδCε)Hε experiments (Yamazaki et al., 1993) (Fig. 4.1 and 4.4) were used to correlate the ¹Hδ and ¹Hε of the aromatic ring spin system directly to the previously identified ¹³Cβ of these amino acids. The remaining ¹H and ¹³C signals from the aromatic rings were identified in a constant time ¹H-¹³C HSQC spectrum. The distinct chemical shifts of each ring type facilitated this process. The assignments were confirmed by using a ¹³C-edited NOESY spectrum to identify intraresidue ¹H-¹H NOE interactions. The ¹⁵Nε1H of the indole ring was readily identified in the ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹ by its downfield chemical shifts. This was also confirmed using a ¹⁵N-edited NOESY spectrum to provide NOE correlations to the adjacent ¹Hδ1 and ¹Hζ2 protons.

A histidine imidazole ring can be protonated or in one of two possible deprotonated tautomeric states, depending on the pH of the environment and its location in the protein.
Using a long range HMBC experiment (Bax and Marion, 1988; Pelton et al., 1993), His66 was found to be in the neutral Nε2H tautomeric state, based on the large difference between the chemical shifts of its two imidazole nitrogens (160 ppm vs 250 ppm) and the pattern of scalar couplings with the carbon-bonded ring protons (Fig. 4.5).

[Figure 4.4: (a) (Hβ)Cβ(CγCδ)Hδ spectrum, (b) (Hβ)Cβ(CγCδCε)Hε spectrum and (c) aromatic constant time ¹H-¹³C HSQC spectrum (axes: ¹H (ppm) and ¹³C (ppm)), with labelled peaks for H66, F82, Y101 and W122.]

Figure 4.4. Assignment of resonances from aromatic sidechains in GABPα³⁵⁻¹²¹. The resonances of ¹Hδ and ¹Hε nuclei in the aromatic residues (H66, F82, Y101, W122) are assigned from the (Hβ)Cβ(CγCδ)Hδ (a) and (Hβ)Cβ(CγCδCε)Hε (b) spectra, respectively, based on correlation to their ¹³Cβ resonances. The remaining aromatic side-chain ¹H and ¹³C signals are assigned using a constant time ¹H-¹³C HSQC (c). Due to rapid ring-flipping, the Hδ1/δ2 and Hε1/ε2 of F82 and Y101 are degenerate and identified with an asterisk. The negative peaks (red) arise from ¹³C nuclei of Trp and His in the constant time experiment. The four strong unassigned peaks in the upfield portion of (a) and (b) and the two negative peaks in (c) are likely from the contaminating His6-tag.

[Figure 4.5: long range ¹H-¹⁵N HMBC spectrum of the histidine imidazole region (¹H 6.6-7.8 ppm; ¹⁵N 160-260 ppm), with labelled His66 Nε2-Hε1, Nε2-Hδ2 and Nδ1-Hε1 correlations.]

Figure 4.5. His66 adopts a neutral Nε2H tautomeric form. This conclusion is based on the large difference in the ¹⁵N shifts of its two ¹⁵N nuclei and the coupling patterns to the adjacent carbon-bonded protons. Note that a proton directly bonded to ¹⁵Nε2 is not observed due to rapid exchange with water. Additional peaks in the spectrum arise from the contaminating His6-tag.

4.2 Secondary structure

The secondary structural elements of GABPα³⁵⁻¹²¹ were predicted based on measured ³J(HN-Hα) coupling constants combined with the chemical shift values of backbone nuclei. These coupling constants are dependent on the φ dihedral angles, whereas chemical shifts are reflective of both the φ and ψ dihedral angles. First, ³J(HN-Hα) coupling constants were determined from the ratio of HN vs. Hα peak intensities in an HNHA experiment (Kuboniwa et al., 1994). As depicted in Fig. 4.6, most residues in GABPα³⁵⁻¹²¹ show coupling constants > 8 Hz, indicative of an extended conformation such as that in a β-strand. However, some regions (residues 48-56, 80-83) have helical or turn conformations as suggested by coupling constants < 6 Hz. Second, the secondary structural components of this protein were predicted based on the chemical shifts of the backbone nuclei (¹HN, ¹⁵N, ¹³C', ¹³Cα, ¹Hα, ¹³Cβ). In one approach, the observed ¹³Cα - ¹³Cβ chemical shift differences were compared to those for a random coil polypeptide, i.e. $\Delta(\delta C^{\alpha} - \delta C^{\beta}) = (\delta C^{\alpha} - \delta C^{\beta})_{\mathrm{obs}} - (\delta C^{\alpha} - \delta C^{\beta})_{\mathrm{ref}}$. Negative and positive values correspond to β-strands and α-helices, respectively.
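As a rough illustration of the shift-difference criterion just described, the sketch below evaluates (Cα - Cβ)obs - (Cα - Cβ)ref for a few residues and maps the sign onto a helix or strand label. The function names, the 1.0 ppm cutoff and all chemical shift values are hypothetical placeholders chosen for the example; they are not data for GABPα³⁵⁻¹²¹ and not the thresholds used in this work.

```python
# Sketch only: sign of the secondary shift difference as a per-residue indicator.
# All shift values and the cutoff are hypothetical, not measured data.

def secondary_shift_difference(ca_obs, cb_obs, ca_ref, cb_ref):
    """Return (Ca - Cb)obs - (Ca - Cb)ref in ppm."""
    return (ca_obs - cb_obs) - (ca_ref - cb_ref)


def classify(delta, cutoff=1.0):
    """Positive values suggest helix, negative values suggest strand.

    The cutoff (ppm) is an assumed tolerance around zero.
    """
    if delta > cutoff:
        return "helix"
    if delta < -cutoff:
        return "strand"
    return "coil"


# (label, Ca observed, Cb observed, Ca random coil, Cb random coil), all in ppm
example = [
    ("res_A", 58.0, 40.0, 55.0, 42.0),
    ("res_B", 53.0, 45.0, 55.0, 42.0),
    ("res_C", 55.2, 42.1, 55.0, 42.0),
]

results = {}
for label, ca, cb, ca_rc, cb_rc in example:
    delta = secondary_shift_difference(ca, cb, ca_rc, cb_rc)
    results[label] = (round(delta, 2), classify(delta))
    print(label, results[label])
```

Using the difference rather than either shift alone cancels referencing offsets common to both carbons, which is one reason this quantity is convenient as a first-pass indicator.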
In the chemical shift index (CSI) approach (Wishart et al., 1992), mainchain nuclei are assigned values of +1, 0, or -1, depending upon their chemical shifts relative to those of the corresponding amino acid in a random coil polypeptide. A consensus of these comparisons is made, along with structure-based smoothing rules, to yield a final CSI of -1 for a residue in an α-helix and +1 for a residue in a β-strand. Finally, in the TALOS approach (Cornilescu et al., 1999), (φ, ψ) angles for each residue are determined by a comparison of mainchain chemical shifts with those in a database of known protein structures. To facilitate comparison to the CSI method, a value of +1 was assigned to residues with (φ, ψ) angles indicative of a β-strand conformation and -1 to those with a helical conformation.

Using these approaches, it is clear that GABPα³⁵⁻¹²¹ is composed of three large and at least one small β-strand, as well as one helix. Overall, this agrees well with the final tertiary structure of GABPα³⁵⁻¹²¹ determined in Chapter 5 using all available NMR data. Several sequence-based database searches, such as PROF (Ouali and King, 2000), Jpred (Cuff et al., 1998), and PSIpred (Jones, 1999), also predicted that GABPα³⁵⁻¹²¹ is mainly composed of β-strands.

[Figure 4.6: schematic of strands S1-S5 and the helix above four per-residue plots (residues 35-119) of the TALOS, CSI, Δ(δCα-δCβ)obs-ref and ³J(HN-Hα) results.]

Figure 4.6. Secondary structure prediction of GABPα³⁵⁻¹²¹. The secondary structure of GABPα³⁵⁻¹²¹ was calculated by four different methods. For the TALOS and CSI approaches, the secondary structure is based on the chemical shifts of the backbone nuclei (¹HN, ¹⁵N, ¹³Cα, ¹³Cβ, ¹³C') versus those in a database of known protein structures or for a random coil polypeptide, respectively. For these two methods, values of +1, 0, -1 correspond to β-strand, random coil, and helix conformations, respectively. Similarly, the differences of ¹³Cα - ¹³Cβ shifts are compared to those of a random coil polypeptide, given as Δ(δCα-δCβ)obs-ref. Negative values correspond to a β-strand conformation, whereas positive values correspond to a helical conformation. Finally, residues with helical conformations have ³J(HN-Hα) coupling constants lower than 6 Hz (dashed line). The predicted secondary structure agrees well with that found in the final tertiary structure ensemble of GABPα³⁵⁻¹²¹, as shown in the schematic diagram of the five β-strands and one helix at the top.

Chapter 5 - The solution structure of GABPα³⁵⁻¹²¹

Determination of the tertiary structure of "unknown" proteins or protein domains is one method to unveil their function. This approach, known as reverse genetics, is based on comparative studies with the structures of functionally known proteins. In an attempt to discover the biological role of the N-terminal region of GABPα, the solution structure of GABPα³⁵⁻¹²¹ was determined using NMR spectroscopy. GABPα³⁵⁻¹²¹ adopts a novel fold with five β-strands and a distorted helix. Despite its unique topology, structural database searches suggested that the fold of this domain is related to that of ubiquitin or ubiquitin-like proteins.
In addition, GABPα³⁵⁻¹²¹ exhibits several interesting structural features, such as a putative sumoylation site and a patch of acidic residues on its surface. Although structural determination of GABPα³⁵⁻¹²¹ has provided several important insights into the properties of this protein domain, its function in the native transcription factor is still unclear.

5.1 Obtaining dihedral angle information (φ, ψ, χ1) from NMR spectra

5.1.1 φ and ψ angles

The φ and ψ angles were based on an analysis of mainchain ¹H, ¹³C, and ¹⁵N chemical shifts using the program TALOS (Cornilescu et al., 1999). The predicted dihedral angles were only used as dihedral restraints when the φ values agreed with those expected from the measured ³J(HN-Hα) coupling constants via a parameterized Karplus equation.

5.1.2 χ1 angle of residues with Hβ protons

The χ1 angles for residues with prochiral Hβ protons were determined in concert with the stereospecific assignment(s) of the proton resonances. The angles were restrained to ±60° or 180° according to a staggered rotamer model.

5.1.3 χ1 angles of Val, Thr, Ile

The χ1 restraints for Val, Thr, and Ile were determined from measured ³J(N-Cγ) and ³J(C'-Cγ) coupling constants using ¹⁵N-Cγ and ¹³C'-Cγ spin echo experiments (Grzesiek et al., 1993; Vuister et al., 1993). The angles were set to ±60° or 180° based on a staggered rotamer model. Restraints were not utilized for residues showing ³J(N-Cγ) and ³J(C'-Cγ) couplings indicative of rotamer averaging.

5.2 Residual dipolar coupling constants

Residual dipolar coupling (RDC) constants provide a powerful parameter for solving protein structures, yielding valuable information about the orientational relationships between selected bond vectors in the protein. Unlike NOE restraints, RDCs are entirely independent of the distances among different bond vectors and thus provide more long-range or global structural restraints.
For GABPα³⁵⁻¹²¹, ¹³Cα-¹³C' and ¹HN-¹⁵N RDCs were measured using protein weakly aligned in a stretched polyacrylamide gel (Ishii et al., 2001). The SANI restraints in ARIA were used to convert these couplings into structural restraints. For this, the values of the alignment tensor (R = 0.3, Da = 5.3 Hz) were determined by a combination of a histogram method and a grid search to find a global energy minimum by ARIA/CNS (Clore et al., 1998a).

5.3 Structure calculation

Calculation of the tertiary structure of GABPα³⁵⁻¹²¹ mainly relied on assigning NOE peaks to obtain through-space distance restraints between protons. These data were obtained from several NOESY spectra, including 3D ¹⁵N-HSQC-NOESY (Marion et al., 1989b), simultaneous ¹³C- and ¹⁵N-NOESY-HSQC (Pascal et al., 1994), aromatic ¹³C-NOESY (Slupsky et al., 1998), and constant time methyl-methyl and amide-methyl NOESY spectra (Zwahlen et al., 1998; Zwahlen et al., 1997). The ¹H, ¹³C, and ¹⁵N spectral assignments of GABPα³⁵⁻¹²¹ from the previous chapter were required for this process. The structure of GABPα³⁵⁻¹²¹ was calculated by ARIA (v1.2) (Linge et al., 2001), which automatically and iteratively interprets and assigns NOE restraints. Dihedral angle restraints (φ, ψ, χ1) and residual dipolar coupling restraints (¹⁵N-¹HN, ¹³C'-¹³Cα) were also utilized in the structure calculations. It is also important to note that no hydrogen bond restraints were included. In this protocol, ARIA iteratively generates nine complete sets of GABPα³⁵⁻¹²¹ structures, obtaining the lowest energy structures in the final iteration (Fig. 5.1). The ten lowest energy structures in this set are then used to generate a final ensemble of ten water-refined structures of GABPα³⁵⁻¹²¹ (details of the structural statistics are reported in Table 5.1).
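To illustrate how an alignment tensor such as the one quoted in Section 5.2 (Da = 5.3 Hz, R = 0.3) maps a bond-vector orientation onto a coupling, the following sketch evaluates the commonly used expression D(θ, φ) = Da[(3cos²θ - 1) + (3/2) R sin²θ cos 2φ], where θ and φ are the polar angles of the bond vector in the principal axis frame of the tensor. It is an assumption that this convention matches the SANI term as used in this work. The function names and the example orientations are illustrative only, and the quality factor shown (rms of the differences divided by rms of the observed values) is one common definition that may differ from what a given program reports.

```python
import math

DA_HZ = 5.3        # axial component quoted in Section 5.2
RHOMBICITY = 0.3   # rhombicity R quoted in Section 5.2


def rdc(theta_deg, phi_deg, da=DA_HZ, rhombicity=RHOMBICITY):
    """Coupling (Hz) for a bond vector with polar angles (theta, phi),
    given in degrees in the principal axis frame of the alignment tensor."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return da * (axial + rhombic)


def q_factor(observed, calculated):
    """rms(observed - calculated) / rms(observed); an assumed definition."""
    num = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    den = sum(o ** 2 for o in observed)
    return math.sqrt(num / den)


# Hypothetical orientations, chosen only to show the extremes of the range.
for theta, phi in [(0.0, 0.0), (90.0, 0.0), (90.0, 90.0)]:
    print(f"theta={theta:5.1f} phi={phi:5.1f} D={rdc(theta, phi):7.3f} Hz")
```

With these tensor values the coupling spans from 2Da = 10.6 Hz for a vector along the principal z axis to -Da(1 + 1.5R) = -7.685 Hz for a vector along the y axis, which shows why a single coupling restricts orientation but does not fix it uniquely.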
Figure 5.1. Iterative assignment of ambiguous restraints by ARIA. The iterative ARIA protocol assigns NOEs and calculates the lowest-energy structures in combination with several structural restraints, including dihedral angles (φ, ψ, χ1) and residual dipolar couplings (¹⁵N-¹HN, ¹³C'-¹³Cα).

Table 5.1. NMR restraints and statistics for the ensemble of ten structures calculated for GABPα³⁵⁻¹²¹.

A. Summary of restraints
   NOEs
      Unambiguous                                        2745
      Ambiguous                                           923
      Total                                              3668
   Dihedral angles (φ, ψ, χ1)                      51, 51, 34
   Residual dipolar couplings (¹HN-¹⁵N)                    52

B. Deviation from restraints
   NOE restraints (Å)                           0.027 ± 0.001
   Dihedral angle restraints (deg.)             0.85 ± 0.06
   Residual dipolar coupling restraints (Hz)    1.81 ± 0.02
   Residues in allowed region of
      Ramachandran plot (%)                     98.1

Deviation from idealized geometry
   Bond lengths (Å)                             0.005 ± 0.000
   Bond angles (deg.)                           0.73 ± 0.01
   Improper angles (deg.)                       2.29 ± 0.08

Mean energies (a), kcal·mol⁻¹
   E(vdw)                                       167 ± 10
   E(bonds)                                     35 ± 2
   E(angles)                                    488 ± 6
   E(impr)                                      133 ± 7
   E(NOE)                                       126 ± 9
   E(cdih)                                      7 ± 1
   E(sani)                                      170 ± 5

rmsd from average structure, Å        Backbone        Heavy atoms
   Structured residues (b)            0.15 ± 0.02     0.41 ± 0.02
   All residues                       0.59 ± 0.09     0.97 ± 0.06

(a) Final ARIA/CNS energies for van der Waals (vdw), bonds, angles, NOE restraints, dihedral restraints (cdih), and residual dipolar coupling restraints (SANI).
(b) Structured residues: 36-43, 48-59, 67-70, 73-74, 91-99, 107-115.

5.4 Structure overview

Figure 5.2. Solution structure of GABPα³⁵⁻¹²¹. (a) Ribbon diagram of the lowest energy structure of GABPα³⁵⁻¹²¹. The protein is composed of a five stranded β-sheet (orange; S1: 36-43, S2: 67-70, S3: 73-74, S4: 91-99, S5: 107-115) connected in both antiparallel and parallel manners, and a distorted helix (48-59; red). The two prominent S3/S4 (75-90) and S4/S5 (100-106) loops are shown in cyan. (b) Superimposition of the Cα traces of the ten water-refined structures, showing low rms deviation in the structured region. The N- and C-terminal tails and the S3/S4 loop exhibit more disorder. (c) The superimposed methyl-containing residues (Ala, Ile, Leu, Val, Met, Thr) of the ten water-refined structures are shown in the same view as (b). The methyls are well defined in the middle of the protein, whereas those on the surface of the protein are disordered (colors correspond to the secondary structures as in (a)).

GABPα³⁵⁻¹²¹ is composed of a five stranded β-sheet (S1: 36-43, S2: 67-70, S3: 73-74, S4: 91-99, S5: 107-115) connected in both antiparallel and parallel manners, and a distorted helix (48-59) lying across this sheet (Fig. 5.2 (a)). The secondary structure of GABPα³⁵⁻¹²¹ was defined based on the consensus results for the ten water-refined structures using the programs Promotif (Hutchinson and Thornton, 1996) and Vadar (Willard et al., 2003). The members of the superimposed structural ensemble of GABPα³⁵⁻¹²¹ show good agreement with each other (Fig. 5.2 (b)), especially in the regions of regular secondary structure. This is reflected by the low root-mean-square deviations of 0.15 Å and 0.41 Å for all backbone and heavy atoms, respectively, for the structured region.
Additionally, excellent agreement was found between the measured RDCs and those predicted from the final structure of GABPα³⁵⁻¹²¹ (Fig. 5.3). Several interesting aspects of the structure of GABPα³⁵⁻¹²¹ are discussed as follows.

5.4.1 β-strand structure

The β-sheet of GABPα³⁵⁻¹²¹ contains a distinct bulge in the middle of S5. As shown in Fig. 5.4, the discontinuous linkage of S4 and S5 at the bulge results in an altered hydrogen bond pattern between Ser95 in S4 and Leu111 in S5. That is, the HN of Ser95 in S4 is hydrogen-bonded to the CO of Glu112 in S5 instead of the CO of Leu111 as would be expected without a bulge (Fig. 5.4, top). The N-terminal half (91-94) of S4 interacts with the C-terminal half (112-115) of the "opposite side" of S5 due to the resulting 180° rotation of the C-terminal end of S5 after the bulge. The change in the register of the hydrogen bond pattern between S4 and S5 at the bulge is confirmed by manual inspection of the mainchain NOE patterns in this region, as shown in Fig. 5.4 (bottom).

5.4.2 Loop structure

Two distinct loops between S3 and S4 (75-90) and S4 and S5 (100-106), highlighted in cyan (Fig. 5.2 (a)), protrude from the surface of the protein. Consistent with this exposure, these loop regions showed increased rms deviations in the GABPα³⁵⁻¹²¹ structural ensemble of the ten lowest energy structures.

[Figure 5.3: plot of calculated versus experimental dipolar coupling constants (Hz).]

Figure 5.3. Residual dipolar coupling constants (calculated vs. experimental). The 52 experimental ¹H-¹⁵N residual dipolar coupling constants were measured and compared to those calculated from the final structure of GABPα³⁵⁻¹²¹. The excellent agreement of the values, reflected in the low R-factor (0.182) and Q-factor (0.176) calculated with the "RDCnrs" program (Skrynnikov et al., 2000), is evidence of a high quality structure.

[Figure 5.4: top panel, hydrogen bonding patterns between S4 and S5; bottom panel, manually assigned NOE patterns between S4 and S5.]

Figure 5.4. Identification of a bulge between β-strands S4 and S5 in GABPα³⁵⁻¹²¹. The hydrogen bond pattern (top panel) between S4 and S5 observed in the solution structure of GABPα³⁵⁻¹²¹ is confirmed by manual examination of the mainchain NOE patterns (bottom panel) in this region. S5 contains a bulge at L111, causing an altered hydrogen bond pattern to residues in S4. Note that hydrogen bond restraints were not used in the structure calculations.

5.4.3 Distorted helix

An examination of the helix in GABPα³⁵⁻¹²¹ revealed two interesting features. First, the N-terminal portion of the helix adopts a 3₁₀ helical conformation, as shown clearly in the ribbon diagram of Fig. 5.2. This helical conformation was confirmed by examination of the hydrogen bond and NOE patterns. Specifically, hydrogen bonds were observed between P47/N50 and I48/L51, corresponding to the hydrogen bond pattern of a 3₁₀ helix (HN(i), CO(i-3)). Furthermore, Hα(i)-HN(i+4) NOE interactions, diagnostic of a regular α-helix, appear only after N50 (Fig. 5.5). Second, it is interesting that a proline residue, P57, exists in the middle of this helix, causing a kink along its axis (Fig. 5.2 (a) and Fig. 5.5). A proline residue is generally known as a helix breaker as it is not able to form a hydrogen bond to the corresponding i-4 residue (Lys53). As seen in Fig. 5.5, due to the presence of this proline, the residues preceding (Glu56) and following (Arg58) the proline also cannot hydrogen bond to Lys52 and Leu54, respectively.
However, the regular geometry of the helix is regained after Arg58, with Arg58/Leu55 and Leu59/Glu56 forming hydrogen bonds. Complementing these hydrogen bonding patterns, the conclusion that the helix spans residues 48-59 is based on several additional criteria: (1) (φ, ψ) angles predicted by TALOS (Cornilescu et al., 1999) define this region as a continuous helix; (2) patterns of (¹³Cα - ¹³Cβ) shifts relative to a random coil are indicative of a helix; (3) NOE patterns between sequential HN and Hα demonstrate the helical nature of the residues. However, a discontinuity near Pro57 is observed in the mainchain ³J(HN-Hα) couplings (Fig. 5.5).

[Figure 5.5: structure of the helix with hydrogen bonds, aligned with per-residue rows for TALOS, CSI, ³J(HN-Hα) group, Cα-Cβ shift difference, and dαN(i,i+3) and dαN(i,i+4) NOEs.]

Figure 5.5. Interactions within the proline-containing α-helix in GABPα³⁵⁻¹²¹. The distorted helix, due to the existence of Pro57, in GABPα³⁵⁻¹²¹ is identified with several secondary structure determination methods: TALOS, CSI, ³J(HN-Hα), Δ(δCα-δCβ)obs-ref and NOE-derived parameters. The helix is illustrated with hydrogen bonds (dashed line, < 3.2 Å) in Rasmol. The 3₁₀ helical conformation and a kink were observed as noted. In TALOS, helical residues (H) are predicted based on estimated φ and ψ angles within the ranges of -57 ± 20° and -47 ± 20°, respectively, with nine out of ten database scores. For CSI analysis, ¹Hα, ¹³Cα and ¹³Cβ chemical shifts are used to predict β-strand, random coil and α-helix conformations, as +1, 0 and -1, respectively. Coupling constants are grouped as 6 (³J(HN-Hα) < 6 Hz), 7 (6 Hz < ³J(HN-Hα) < 7 Hz), 8 (³J(HN-Hα) > 8 Hz), with the small coupling constants indicative of helical conformation. Δ(δCα-δCβ)obs-ref is > 0 for helix. The NOE patterns of Hα of residue i to the HN of residues i+2, i+3 and i+4 are also shown. Note that dαN(i,i+3) and dαN(i,i+4) correspond to 3₁₀ and α helical conformations, respectively.

5.5 Dynamics from amide ¹⁵N relaxation

Complementing the structural analysis of GABPα³⁵⁻¹²¹, the global and the internal backbone dynamics of the protein were studied by ¹⁵N heteronuclear relaxation (¹⁵N-T1, ¹⁵N-T2, ¹H-{¹⁵N} NOE) experiments. Of the 87 amides in the protein, 11 were not included in the analysis due to the absence of ¹HN resonances (N-terminus or prolines) or significant overlap (L59, S95, Q97, E105, I113, E118) of peaks in the ¹H-¹⁵N HSQC spectrum. Using Tensor2 (Dosset et al., 2000), the isotropic global tumbling time, τc, of the protein was calculated as 4.82 ± 0.04 ns at 30 °C from the T1 and T2 relaxation values of amides in the well-ordered region of the protein. This global correlation time of GABPα³⁵⁻¹²¹ confirms that the 10.2 kDa protein is a monomer in solution (Daragan and Mayo, 1997).

The internal dynamics of the protein were then calculated by the anisotropic Lipari-Szabo model-free approach (Lipari and Szabo, 1982; Kay et al., 1989; Mandel et al., 1995). In this approach, the internal mobility of individual backbone amides on a psec-nsec timescale is cast as a general order parameter, S², along with a characteristic correlation time of the local internal motion, τi. In addition to these two parameters, the chemical shift exchange contribution term, Rex, has also been introduced for some sites where the R2 (1/T2) results cannot be fit by the simple Lipari-Szabo approach because of interconversion of the chemical environment of a residue on the µsec-msec timescale.
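The role of S² and τi in this analysis can be sketched with the simplest form of the model-free spectral density, J(ω) = (2/5)[S²τc/(1 + ω²τc²) + (1 - S²)τ/(1 + ω²τ²)] with 1/τ = 1/τc + 1/τi. The sketch below assumes isotropic overall tumbling, which is a simplification relative to the anisotropic treatment named above, and it does not attempt to reproduce the Tensor2 fitting. The function name, the internal correlation time and the S² values in the usage lines are hypothetical.

```python
TAU_C = 4.82e-9  # overall tumbling time reported in the text, in seconds


def spectral_density(omega, s2, tau_i, tau_c=TAU_C):
    """Simple model-free J(omega) for isotropic tumbling.

    omega is in rad/s; tau_i and tau_c are in seconds; s2 is the order
    parameter between 0 (unrestricted) and 1 (rigid).
    """
    tau = 1.0 / (1.0 / tau_c + 1.0 / tau_i)  # effective time of the internal term
    overall = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (overall + internal)


# Hypothetical comparison of a rigid and a flexible amide at zero frequency.
rigid = spectral_density(0.0, s2=0.9, tau_i=50e-12)
flexible = spectral_density(0.0, s2=0.5, tau_i=50e-12)
print(f"J(0) rigid    = {rigid:.3e} s")
print(f"J(0) flexible = {flexible:.3e} s")
```

Because τi is much shorter than τc, lowering S² shifts weight from the slow overall term to the fast internal term and reduces J(0), which is the behaviour behind the reading of low S² values as local flexibility.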
[Figure 5.6: per-residue plots of the ¹⁵N relaxation data and the derived parameters against residue number.]

Figure 5.6. ¹⁵N-relaxation analysis of GABPα³⁵⁻¹²¹. The order parameter, S², and the chemical shift exchange contribution term, Rex, for the analysis of the internal mobility of the protein were derived from the ¹⁵N T1 and T2 lifetimes and steady-state ¹H-{¹⁵N} NOE values. The regular structured region is well-defined with low flexibility (high NOE and S² values), whereas local motion is observed in the S3/S4 and S4/S5 loops. Prolines and residues with poorly resolved amide chemical shifts are shown with a star and a triangle, respectively.

The relaxation properties of GABPα³⁵⁻¹²¹ (Figure 5.6) show that the structural core of the protein, residues 36-115, is relatively rigid with generally high S² values. This provides additional evidence for a well-folded core. The bulge in the middle of S5, shown in Fig. 5.4, is also rigid as seen by S95 and L111 having high heteronuclear ¹⁵N-NOE and S² values. Similarly, although the hydrogen-bond pattern for the helix is disrupted at Pro57 (Fig. 5.5), the relaxation analysis demonstrated no changes in dynamics through the helix on this fast timescale. However, several local differences in the dynamics are observed. First, besides the N- and C-terminal tails, there are three relatively flexible regions in the protein, namely the N-terminal region of the helix/S2 loop (residues 61-63), the C-terminal region of the S3/S4 loop (residues 87-91), and the S4/S5 loop (residues 101-104). Among these, the S4/S5 loop demonstrates the most striking flexibility on the psec-nsec timescale with low heteronuclear NOE and S² values.
In particular, Y101 exhibits the least rigidity in this loop and high surface-accessibility as measured by Vadar (Willard et al., 2003). The mobility of the H/S2 loop is not as distinct as that of the other two flexible regions in the protein core. However, the high Rex terms of S62 and L63 still suggest that this loop undergoes intermediate time scale (msec-µsec) motions. The flexibility of the C-terminal end of the S3/S4 loop is interesting for two reasons. First, it is part of the longest loop (75-90) in GABPα³⁵⁻¹²¹. However, only the C-terminal end of the loop exhibits enhanced flexibility compared to the rest of the protein. Second, and most interestingly, a putative sumoylation site, K87 (Fig. 5.7 (c)), is located in this flexible region, suggesting accessibility for the sumoylation enzymes.

5.6 Surface features

The surface properties of a protein dictate its function, and the inspection of these properties provides clues to its possible biological roles. As expected, most of the non-polar residues in GABPα³⁵⁻¹²¹ are located in the middle of its tertiary structure, forming a stable hydrophobic core. This is seen in Figure 5.2 (c), where the internal methyls of GABPα³⁵⁻¹²¹ superimpose especially well among the ten water-refined structures, indicative of a well-folded protein core. Although most of the hydrophobic residues are located in the core of the protein, some are located on its surface. The existence of an obvious hydrophobic patch is often indicative of a site for binding other molecules. However, in the case of GABPα³⁵⁻¹²¹, these surface hydrophobic residues are well-scattered among polar residues. Thus the protein lacks any distinct hydrophobic patch (Fig. 5.7 (a)).

Figure 5.7.
Surface properties of GABPα. (a) The hydrophobic residues (cyan) are scattered on the surface of the protein without forming any obvious patch. (b) The secondary structures are highlighted on the surface of the structure, with the prominent S4/S5 loop. A shallow groove is indicated. (S1: orange, S2: yellow, S3: light green, S4: blue, S5: dark green, H: brown, N-terminal residue: purple, C-terminal residue: dark grey.) (c) A cluster of acidic residues is observed in the loops of S3/S4 (D76, D78, D89), S1/H (D43, E46), and S4/S5 (E105), the C-terminal sides of S2 and S5 (E67, E112), and the C-terminal end (E118). These form a strip of negative charges along the protein surface. It is also noted that the putative sumoylation site (K87) is located on the surface of the protein near this negative patch. (d) A ribbon diagram showing the secondary structure of GABPα³⁵⁻¹²¹. The same color scheme is used as defined in (b).

In contrast, groups of acidic residues are observed in the loops of S3/S4 (Asp76, Asp78, Asp89), S1/H (Asp43, Glu46), and S4/S5 (Glu105), the C-terminal sides of S2 and S5 (Glu67, Glu112), and the C-terminal end (Glu118). Together these form a predominant region of negative electrostatic potential, which could be an interface for interaction with positively charged molecules (Fig. 5.7 (c)).

Along with the types of residues, grooves on the surface are also a good indication of binding to other molecules. Although no distinct cleft, such as that typical of an enzyme active site, was observed, a shallow groove does exist near the helix of GABPα³⁵⁻¹²¹ (Fig. 5.7 (b)). Alternatively, the exposed, flexible S4/S5 loop could present residues to bind within a complementary cleft in a potential partner macromolecule.

Potential consensus post-translational modification sites on GABPα³⁵⁻¹²¹ were also examined. The sequence (-VKTD-) matches a consensus sumoylation site (ψKxD/E; ψ: hydrophobic residue, x: any residue, D/E: acidic residue) (Schwartz and Hochstrasser, 2003). Furthermore, Lys87 (Fig. 5.7 (c)) is located at the C-terminal end of the long loop between S3 and S4, and is exposed on the surface of the protein. Thus, it could be accessible to the sumoylating enzymes. Since Lys87 is located next to the patch of acidic residues, sumoylation of the site could physically interfere with any molecule binding to this negatively charged region of GABPα³⁵⁻¹²¹.

5.7  Structural comparisons

Although the previously described structural information provides general clues for discovering the biological role of this new domain, this approach did not yield a specific function for GABPα³⁵⁻¹²¹. Structure similarity searches were therefore performed by comparison with the 3D protein structures in the Protein Data Bank (PDB). Although there was no exact structure that mimics the fold of GABPα³⁵⁻¹²¹, a structural similarity search by DALI (Distance Matrix Alignment; Dietmann et al., 2001) revealed that this domain resembles ubiquitin or the ubiquitin-like proteins, with a Z-score of 4.2.

Ubiquitin is a well-studied protein that is crucial for the degradation of proteins by the 26S proteasome (for review see Pickart, 2004). As part of this biological process, the covalent attachment of the C-terminal Gly-Gly residues of ubiquitin to lysine residues on the target protein, known as ubiquitinylation, is required. Although GABPα³⁵⁻¹²¹ and ubiquitin have no significant sequence similarity, the tertiary structure of ubiquitin is also composed of a β-sheet with five β-strands and an α-helix. However, there are several significant differences between the two protein structures (Fig. 5.8). First, the linear orders of the secondary structures are different.
Ubiquitin has an additional β-strand at its beginning, whereas GABPα has it at the end (Fig. 5.8 (c)). Second, S5 in ubiquitin runs in the opposite direction to that in GABPα. Finally, the large flexible S4/S5 loop in GABPα is missing in ubiquitin (Fig. 5.8 (a)). Thus, GABPα³⁵⁻¹²¹ has a novel fold. Although exciting, this does not provide any obvious clues to its function.

Given their global structural similarity, additional properties of GABPα³⁵⁻¹²¹ that may resemble ubiquitin were examined. First, the general surface properties of GABPα³⁵⁻¹²¹ were compared to those of ubiquitin. While GABPα³⁵⁻¹²¹ has a notable strip of negative charges, ubiquitin does not have any obvious patch of predominant charge on its surface (Fig. 5.9). One face of the molecule carries a mixture of negatively and positively charged groups, whereas the opposite face contains few charged residues. Perhaps the most interesting feature of the surface of ubiquitin is the existence of a hydrophobic patch. This non-polar patch, surrounded by several positive charges, is known to be a binding interface for the ubiquitin-interacting motif (UIM) (Fisher et al., 2003; Polo et al., 2002; Raiborg et al., 2002; Shih et al., 2002). UIM is a

Figure 5.8. Secondary structure comparisons of GABPα³⁵⁻¹²¹ and ubiquitin (1UBQ). (a) The global folds of GABPα³⁵⁻¹²¹ and ubiquitin are clearly similar. (b) Topology diagrams of these two proteins show that the direction of β-strand S5 in GABPα³⁵⁻¹²¹ is opposite to that of the corresponding strand, S5, in ubiquitin. (c) The order of the secondary structures is different in these two proteins, as GABPα³⁵⁻¹²¹ contains an additional strand, S4, between S3 and S5, while the corresponding strand S1 is located at the N-terminal end of ubiquitin.

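The consensus sumoylation motif quoted in Section 5.6 (ψKxD/E) lends itself to a simple sequence scan. The following is a minimal sketch of such a scan, not the procedure used in this work. The set of residues accepted as hydrophobic, the function name `find_sumo_sites`, and the toy fragment with its numbering offset are all assumptions made for illustration; only the VKTD tetrapeptide is taken from the text.

```python
import re

# Assumed hydrophobic residues allowed at the psi position (illustrative choice).
HYDROPHOBIC = "AVILMFW"

# psi-K-x-D/E, wrapped in a lookahead so overlapping candidates are all reported.
SUMO_MOTIF = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z][DE]))")


def find_sumo_sites(sequence, first_residue=1):
    """Return (lysine_residue_number, tetrapeptide) for each psi-K-x-D/E match."""
    hits = []
    for match in SUMO_MOTIF.finditer(sequence.upper()):
        # The lysine is the second residue of the matched tetrapeptide.
        lysine_number = first_residue + match.start(1) + 1
        hits.append((lysine_number, match.group(1)))
    return hits


# Hypothetical fragment: only the VKTD tetrapeptide comes from the text; the
# flanking residues and the numbering offset are invented for illustration.
toy_fragment = "GSAEVKTDLQKAE"
print(find_sumo_sites(toy_fragment, first_residue=82))  # [(87, 'VKTD')]
```

A sequence match of this kind only flags candidate lysines. Whether a site is plausible still depends on the structural context discussed in the text, such as surface exposure and local flexibility.
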
Figure 5.9. The surface properties of ubiquitin (1UBQ). The hydrophobic, acidic, and basic residues are shown in yellow, red, and blue, respectively. The protein in the left panel is oriented as in the ribbon diagram shown in Fig. 5.8. Note that the front side of the protein contains a notable hydrophobic patch for binding ubiquitin-interacting motifs (ref). [Panel label: binding interface for the ubiquitin-interacting motif.]

~30-residue sequence motif that contains a hydrophobic sequence of alternating large and small residues (Leu-Ala-Leu-Ala-Leu), which is used for interacting with the hydrophobic patch on ubiquitin. However, unlike ubiquitin, the surface of GABPα³⁵⁻¹²¹ does not have such an obvious hydrophobic patch.

As part of the general ubiquitinylation process, ubiquitin covalently linked to the target molecule undergoes further ubiquitinylation to form polyubiquitin chains. Therefore, ubiquitin itself has target lysine residues on its surface, including a common site (K48) and several alternative sites (K6, K29, K63) (Pickart, 2004). As shown in Fig. 5.10, comparison of these residues with the lysines on the surface of GABPα³⁵⁻¹²¹ reveals that K52/K53 and K115 of GABPα³⁵⁻¹²¹ are located in positions corresponding to K29 and K63 of ubiquitin, respectively. It is also interesting to note that K115, the counterpart of K63 of ubiquitin, is located near the S3/S4 loop, where the predicted sumoylation site (K87) of GABPα³⁵⁻¹²¹ resides.

5.8  Conclusion

The primary goal of determining the structure of the new domain in GABPα was to gain some insight into its function based on structural comparisons with other proteins of known function. The solution structure of this new domain, solved by NMR spectroscopy, has been discussed in this chapter.
Although the protein contains a novel fold, the structure did not provide immediate functional clues. In contrast with the closest tertiary structure from the structure-based similarity search, it did not share any of the unique properties of ubiquitin, such as a hydrophobic patch for the ubiquitin-interacting motif. However, the structural information on this domain will be indispensable for addressing the overall goal of this project, especially in building a model containing structural details of full-length GABP in complex with other potential protein partners.

Figure 5.10. Comparison of surface lysines of GABPα³⁵⁻¹²¹ and ubiquitin. Ubiquitin contains a common ubiquitination site (K48) and three alternative sites (K6, K29, K63). GABPα³⁵⁻¹²¹ (a) also contains three lysines (K52, K53, K115) located at relatively similar positions to those (K29, K63) on ubiquitin (b), respectively. The putative sumoylation site, K87, is labelled in red in GABPα³⁵⁻¹²¹.

Chapter 6 - Concluding remarks and future directions

The goals of this thesis were:
I. to identify a new domain in the N-terminal region of GABPα.
II. to determine its tertiary structure.
III. to dissect its biological function based on its structural information.

The identification of the N-terminal domain of GABPα and the determination of its structure provided two significant advances. First, the analysis of GABPα³⁵⁻¹²¹ completed the structural characterization of the full-length GABPα as three domains joined by flexible linker sequences (Fig. 6.1). The structural description of the full-length protein is critical for understanding its function in a biological context.
Second, the determination of the tertiary structure of GABPα³⁵⁻¹²¹ has revealed a novel protein fold, confirming an initial hypothesis based on the lack of any sequence similarity to the protein databases. However, ironically, this precludes identification of the function of GABPα³⁵⁻¹²¹ based on comparison to proteins of known function.

Although the first and second goals were accomplished successfully, the third, defining the function of GABPα³⁵⁻¹²¹, remains to be met. Given that "structure" did not yield "function", investigations of this question are currently being performed through biochemical approaches. GABPα³⁵⁻¹²¹ is too small to be an enzyme and thus probably serves a role in the assembly of transcription complexes through protein-protein interactions with other components of the transcriptional machinery. This is consistent with the various partnerships described for GABPα in Chapter 1. However, it is also possible that GABPα³⁵⁻¹²¹ binds to peptides or small molecular ligands.

Figure 6.1. The structure of the full-length GABPα. The model of GABPα is based on the structures of the three domains (N-terminal, PNT, and ETS domains) joined by flexible linker sequences. A MAP kinase phosphorylation site (T280) is shown as a red sphere. The linker regions are flexible, as is evident from their random coil chemical shifts in constructs such as GABPα¹⁻³²⁰.

6.1  Domain interactions within GABP

As a first step toward defining the core of the N-terminal domain, it was of interest to search for interactions of this domain with other portions of GABP.
The GA-binding protein is a heterotetrameric complex (αβ₂α), composed of the DNA-binding subunit GABPα and the transactivation subunit GABPβ. Each subunit is composed of three domains: the ETS, PNT, and N-terminal domains for GABPα and the AR, TAD, and LZ domains for GABPβ. Although the physical association of the ETS domain and the AR has been examined at a molecular level, the possibility of other interactions has only been vaguely addressed. In the course of studying the formation of multimeric complexes of the GABP subunits, Chinenov et al. (2000) observed no αβ₂α heterotetramer formation without binding to tandem target sites positioned on the same side of DNA. Thus isolated GABP may exist as an αβ heterodimer. However, deletion of the N-terminal portion of GABPα, including the N-terminal domain and the PNT domain, allowed the formation of a heterotetrameric complex in solution. This suggests that the N-terminal sequence may interfere with the association of αβ dimers to form tetramers.

To investigate whether the N-terminal domain or the PNT domain is physically associated with any other GABP domain, various fragments of GABP were purified and analyzed by native gel electrophoresis (Fig. 6.2). This preliminary native gel analysis for domain interactions did not provide any obvious evidence of binding beyond that expected for full-length GABPα and GABPβ. Neither the mixtures of the N-terminal domain nor those of the PNT domain with other parts of GABP resulted in the formation of a new complex.

Figure 6.2. Preliminary binding studies of GABP domains by native gel. The purified GABP domains were run on both native (a) and SDS (b) gels either individually or as mixtures following 4 hrs of incubation at room temperature.
The individual domains are labeled as follows: a, GABPα¹⁻³²⁰ (residues 1-320); b, GABPα¹⁶⁸⁻²⁵⁴ (prep. 1, PNT domain of GABPα, residues 168-254); c, GABPα¹⁶⁸⁻²⁵⁴ (prep. 2, PNT domain of GABPα, residues 168-254); d, GABPα³⁵⁻¹²¹; e, full-length GABPα (residues 1-454); f, GABPβ (residues 1-382); g, complex of the ETS domain (GABPα, residues 311-430) and the ankyrin repeats (GABPβ2-1, residues 1-157); M, protein marker with the molecular weights (kDa) labeled. Sample c is simply a repeat preparation of sample b. The band corresponding to the GABPα/GABPβ complex is marked with a box on the gel.

[Figure 6.2 panels: (a) native gel; (b) SDS-PAGE, with marker bands at 116.0, 66.2, 45.0, 35.0, 25.0, 18.4, and 14.4 kDa. Lanes in both panels: a, M, b, c, d, e, f, g, e+f, b+d, b+f, d+f, d+g, b+g, a+f.]

Despite these initial negative results, there are still several combinations of mixtures for the PNT domain and the N-terminal domain that might form physical interactions, such as with the ETS domain or with individual GABPβ domains. Alternatively, chemical cross-linking can be used to identify possible partnerships. Because cross-linked products can be resolved on SDS-PAGE gels, any cross-linked protein can be analyzed more confidently. Lastly, any possible partnerships can be confirmed by chemical shift perturbation mapping, using NMR spectroscopy to monitor the titration of a ¹⁵N-labeled protein with an unlabeled partner. Note that this method has already revealed that the N-terminal and PNT domains of GABPα do not interact, as the spectrum of GABPα¹⁻³²⁰ is essentially the sum of the spectra of GABPα¹⁻¹⁶⁹ and GABPα¹⁶⁸⁻²⁵⁶.

6.2  Potential protein partners reported in the literature
Not surprisingly, GABP has been reported to associate with many transcription factors. Understanding the binding of GABP to these proteins can provide insights into its roles in various biological contexts. This understanding includes identifying which domains of GABP are involved in these potential associations. Currently, such binding studies are underway for three major GABP partners: ATF-1, p300, and Sp1/Sp3.

6.2.1  ATF-1

As discussed earlier in Chapter 1, the early 4 (E4) promoter of adenovirus type 5 is known to be regulated by hGABP (E4TF1) and the adenovirus E1A gene product (Watanabe et al., 1988). In 1999, Sawada et al. (1999) showed that the activating transcription factor 1 (ATF-1) and the cAMP response element-binding protein (CREB) are associated with this promoter as well. In their report, synergistic transactivation was achieved with the combination of hGABP and selected members of the ATF/CREB family. Furthermore, they showed that the DNA-binding bZip domain directly interacts with the N-terminal region (the N-terminal domain and the PNT domain) of GABPα. Several clones encoding ATF have been obtained, and protein preparations for in vitro binding studies with GABPα³⁵⁻¹²¹ and with the GABPα PNT domain are in progress.

6.2.2  CBP/p300

Interleukin 16 (IL-16) is a chemotactic cytokine that binds the CD4 receptor and affects the activation of T cells and the replication of HIV. In the course of studying the expression of IL-16, Bannert et al. (1999) found that the promoter region of human IL-16 contains three purine-rich sequences indicative of Ets binding sites. Subsequently, they demonstrated that GABP binds these sites in concert with the co-activators CREB binding protein (CBP)/p300. Furthermore, they identified a direct interaction of the C-terminal region of p300 with GABPα.
Clones encoding CBP or p300, as required to test for binding to GABPα³⁵⁻¹²¹ in vitro, are currently being obtained.

6.2.3  Sp1 and Sp3

The utrophin gene codes for a large cytoskeletal protein. The promoter region of this gene contains a functional GABP binding site and three functional GC elements that are recognized by the Sp1 and Sp3 factors. Galvagni et al. (2001) have reported that Sp1 and Sp3 can cooperate with GABP for the activation of the utrophin promoter. More interestingly, they observed a synergistic transactivation similar to that seen with ATF-1/hGABP. Additionally, they showed that the DNA-binding zinc finger domains of both Sp1 and Sp3 physically interact with GABPα. Clones of these proteins for binding studies with GABPα³⁵⁻¹²¹ are also currently being obtained.

Following the preparation of these proteins, the binding studies will be carried out by various methods, including chemical cross-linking, native gel electrophoresis, gel filtration chromatography, and NMR spectroscopy. In the event that binding is detected, further deletion studies of the protein partners will be carried out to identify a minimal GABPα³⁵⁻¹²¹-binding region.

6.3  Unbiased screens for interacting protein: "fishing" for protein partners

As an alternative strategy to testing for binding of GABPα³⁵⁻¹²¹ to transcription partners, the identification of interacting proteins by affinity methods will also be attempted. With this method, in addition to the previously reported protein partners, other protein partners for each domain can be identified from cell extracts. GABP is known for its widespread expression, being especially abundant in liver, muscle, and hematopoietic cells (Rosmarin et al., 2004).
Although there are several protein tags available for this method, GST tags will be used because they are already attached to some of the GABPα constructs. Four different constructs of GABPα will be used individually as bait in each test: full-length GABPα, the N-terminal domain, the PNT domain, and the N-terminal and PNT domains together. Once purified, each construct will be loaded onto a separate GST column in parallel. Next, the calf thymus cell extract will be applied to each construct-bound GST column. After several washing steps, the GABPα constructs will be eluted together with their bound protein partners. The four elutions from the different GABPα constructs will be treated separately, and individual protein partners will be isolated by SDS-PAGE. Initially, the band patterns will be compared within each run and across the four parallel runs. The bands of interest will then be excised and digested with a known protease (such as trypsin or chymotrypsin). Finally, the fragments will be identified by mass spectrometry and database searching.

6.4  Future directions

The ultimate goal of this project is not only to understand the structural and functional basis for the N-terminal domain of GABPα but also to gain better insights into the roles of GABP as both a transcriptional regulator and a component of certain signal transduction pathways. Among Ets family members, one of the prominent characteristics of the GABP complex is that the ETS DNA-binding domain and the transactivation domain are located in different proteins, GABPα and GABPβ, respectively. However, the structural basis for understanding how this unique arrangement contributes to the function of GABP has been insufficient. Although the individual structures of all the domains in GABPα are now available, the structural and functional relationships of these domains are not well characterized.
Combining the domain interaction and protein partnership studies from the previous section with the early structural work will provide better insights into the functional and biological roles of the individual components in GABP. As the final stage of this project, I will attempt to solve the structure of the full GABP complex, possibly with some other factors, using X-ray crystallography.

References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J. (1990). Basic Local Alignment Search Tool. Journal of Molecular Biology. 215:403-10.
Archer, S. J., Ikura, M., Torchia, D. A., Bax, A. (1991). An alternative 3D-NMR technique for correlating backbone ¹⁵N with side-chain Hβ resonances in larger proteins. Journal of Magnetic Resonance. 95:636-41.
Bannert, N., Avots, A., Baier, M., Serfling, E., Kurth, R. (1999). GA-binding protein factors, in concert with the coactivator CREB binding protein/p300, control the induction of the interleukin 16 promoter in T lymphocytes. Proceedings of the National Academy of Sciences of the United States of America. 96:1541-6.
Bassuk, A. G., Leiden, J. M. (1997). The role of Ets transcription factors in the development and function of the mammalian immune system. Advances in Immunology, Vol 64. 64:65-104.
Batchelor, A. H., Piper, D. E., de la Brousse, F. C., McKnight, S. L., Wolberger, C. (1998). The structure of GABP alpha/beta: An ETS domain ankyrin repeat heterodimer bound to DNA. Science. 279:1037-41.
Bax, A., Clore, G., Gronenborn, A. (1990). ¹H-¹H correlation via isotropic mixing of ¹³C magnetization, a new three-dimensional approach for assigning ¹H and ¹³C spectra of ¹³C-enriched proteins. Journal of Magnetic Resonance. 88:425-31.
Bax, A., Marion, D. (1988). Improved Resolution and Sensitivity in ¹H-Detected Multiple-Bond Correlation Spectroscopy. Journal of Magnetic Resonance. 78:186-91.
Buratowski, S. (1994). The Basics of Basal Transcription by RNA-Polymerase-II. Cell. 77:1-3.
Chinenov, Y., Henzl, M., Martin, M. E. (2000). The alpha and beta subunits of the GA-binding protein form a stable heterodimer in solution - Revised model of heterotetrameric complex assembly. Journal of Biological Chemistry. 275:7749-56.
Chou, J. J., Gaemers, S., Howder, B., Louis, J. M., Bax, A. (2001). A simple apparatus for generating stretched polyacrylamide gels, yielding uniform alignment of proteins and detergent micelles. Journal of Biomolecular NMR. 21:377-82.
Clore, G. M., Gronenborn, A. M., Bax, A. (1998a). A robust method for determining the magnitude of the fully asymmetric alignment tensor of oriented macromolecules in the absence of structural information. Journal of Magnetic Resonance. 133:216-21.
Clore, G. M., Gronenborn, A. M., Tjandra, N. (1998b). Direct structure refinement against residual dipolar couplings in the presence of rhombicity of unknown magnitude. Journal of Magnetic Resonance. 131:159-62.
Cornilescu, G., Delaglio, F., Bax, A. (1999). Protein backbone angle restraints from searching a database for chemical shift and sequence homology. Journal of Biomolecular NMR. 13:289-302.
Cuff, J. A., Clamp, M. E., Siddiqui, A. S., Finlay, M., Barton, G. J. (1998). JPred: a consensus secondary structure prediction server. Bioinformatics. 14:892-3.
Daragan, V. A., Mayo, K. H. (1997). Motional model analyses of protein and peptide dynamics using C-13 and N-15 NMR relaxation. Progress in Nuclear Magnetic Resonance Spectroscopy. 31:63-105.
Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., Bax, A. (1995). Nmrpipe - a Multidimensional Spectral Processing System Based on Unix Pipes. Journal of Biomolecular NMR. 6:277-93.
Dietmann, S., Park, J., Notredame, C., Heger, A., Lappe, M., Holm, L. (2001). A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3. Nucleic Acids Research. 29:55-7.
Dittmer, J., Nordheim, A. (1998). Ets transcription factors and human disease. Biochimica et Biophysica Acta-Reviews on Cancer. 1377:F1-F11.
Dosset, P., Hus, J. C., Blackledge, M., Marion, D. (2000). Efficient analysis of macromolecular rotational diffusion from heteronuclear relaxation data. Journal of Biomolecular NMR. 16:23-8.
Fisher, R. D., Wang, B., Alam, S. L., Higginson, D. S., Robinson, H., Sundquist, W. I., Hill, C. P. (2003). Structure and ubiquitin binding of the ubiquitin-interacting motif. Journal of Biological Chemistry. 278:28976-84.
Galvagni, F., Capo, S., Oliviero, S. (2001). Sp1 and Sp3 physically interact and co-operate with GABP for the activation of the utrophin promoter. Journal of Molecular Biology. 306:985-96.
Ghosh, A., Kolodkin, A. L. (1998). Specification of neuronal connectivity: ETS marks the spot. Cell. 95:303-6.
Graves, B. J. (1998). Transcription - Inner workings of a transcription factor partnership. Science. 279:1000-2.
Graves, B. J., Petersen, J. M. (1998). Specificity within the ets family of transcription factors. Advances in Cancer Research, Vol 75. 75:1-55.
Grzesiek, S., Anglister, J., Bax, A. (1993). Correlation of backbone amide and aliphatic side-chain resonances in ¹³C/¹⁵N proteins by isotropic mixing of ¹³C magnetization. Journal of Magnetic Resonance. 101:114-9.
Hutchinson, E. G., Thornton, J. M. (1996). PROMOTIF - A program to identify and analyze structural motifs in proteins. Protein Science. 5:212-20.
Ikura, M., Kay, L. E., Bax, A. (1990). A novel approach for sequential assignment of ¹H, ¹³C, and ¹⁵N spectra of proteins: heteronuclear triple-resonance three-dimensional NMR spectroscopy. Application to calmodulin. Biochemistry. 29:4659-67.
Ishii, Y., Markus, M. A., Tycko, R. (2001). Controlling residual dipolar couplings in high-resolution NMR of proteins by strain induced alignment in a gel. Journal of Biomolecular NMR. 21:141-51.
Jones, D. T. (1999). Protein secondary structure prediction based on position-specific scoring matrices. Journal of Molecular Biology. 292:195-202.
Kay, L., Xu, G.-Y., Singer, A., Muhandiram, D., Forman-Kay, J. (1993). A gradient-enhanced HCCH-TOCSY experiment for recording side-chains ¹H and ¹³C correlations in H₂O samples of proteins. Journal of Magnetic Resonance. 101:333-7.
Kay, L. E., Torchia, D. A., Bax, A. (1989). Backbone dynamics of proteins as studied by ¹⁵N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry. 28:8972-9.
Kim, C. A., Phillips, M. L., Kim, W., Gingery, M., Tran, H. H., Robinson, M. A., Faham, S., Bowie, J. U. (2001). Polymerization of the SAM domain of TEL in leukemogenesis and transcriptional repression. EMBO Journal. 20:4173-82.
Kuboniwa, H., Grzesiek, S., Delaglio, F., Bax, A. (1994). Measurement of HN-Hα J couplings in calcium-free calmodulin using 2D and 3D water-flip-back methods. Journal of Biomolecular NMR. 4:871-8.
Lamarco, K., Thompson, C. C., Byers, B. P., Walton, E. M., McKnight, S. L. (1991). Identification of Ets-Related and Notch-Related Subunits in GA-Binding Protein. Science. 253:789-92.
Lee, G. M., Donaldson, L. W., Pufall, M. A., Kang, H. S., Pot, I., Graves, B. J., McIntosh, L. P. (2005). The structural and dynamic basis of Ets-1 DNA binding autoinhibition. Journal of Biological Chemistry. 280:7088-99.
Lemon, B., Tjian, R. (2000). Orchestrated response: a symphony of transcription factors for gene control. Genes & Development. 14:2551-69.
Leprince, D., Gegonne, A., Coll, J., Detaisne, C., Schneeberger, A., Lagrou, C., Stehelin, D. (1983). A Putative 2nd-Cell-Derived Oncogene of the Avian Leukemia Retrovirus-E26. Nature. 306:395-7.
Liang, H., Mao, X. H., Olejniczak, E. T., Nettesheim, D. G., Yu, L. P., Meadows, R. P., Thompson, C. B., Fesik, S. W. (1994). Solution Structure of the Ets Domain of Fli-1 When Bound to DNA. Nature Structural Biology. 1:871-6.
Linge, J. P., O'Donoghue, S. I., Nilges, M. (2001). Automated assignment of ambiguous nuclear overhauser effects with ARIA. Nuclear Magnetic Resonance of Biological Macromolecules, Pt B. 339:71-90.
Lipari, G., Szabo, A. (1982). Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. Analysis of experimental results. Journal of the American Chemical Society. 104:4559-70.
Logan, T. M., Olejniczak, E. T., Xu, R. X., Fesik, S. W. (1992). Side-Chain and Backbone Assignments in Isotopically Labeled Proteins from 2 Heteronuclear Triple Resonance Experiments. FEBS Letters. 314:413-8.
Mackereth, C. D., Scharpf, M., Gentile, L. N., Macintosh, S. E., Slupsky, C. M., McIntosh, L. P. (2004). Diversity in structure and function of the Ets family PNT domains. Journal of Molecular Biology. 342:1249-64.
Mandel, A. M., Akke, M., Palmer, A. G. (1995). Backbone Dynamics of Escherichia-Coli Ribonuclease-HI - Correlations with Structure and Function in an Active Enzyme. Journal of Molecular Biology. 144-63.
Marion, D., Driscoll, P., Kay, L., Wingfield, P., Bax, A., Gronenborn, A., Clore, G. (1989a). HSQC-TOCSY experiment. Biochemistry. 28:6150.
Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M., Clore, G. M. (1989b). Overcoming the overlap problem in the assignment of ¹H NMR spectra of larger proteins by use of three-dimensional heteronuclear ¹H-¹⁵N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1 beta. Biochemistry. 28:6150-6.
McIntosh, L. P., Brun, E., Kay, L. E. (1997). Stereospecific assignment of the NH₂ resonances from the primary amides of asparagine and glutamine side chains in isotopically labeled proteins. Journal of Biomolecular NMR. 9:306-12.
Montelione, G. T., Lyons, B. A., Emerson, S. D., Tashiro, M. (1992). An efficient triple resonance experiment using carbon-13 isotropic mixing for determining sequence-specific resonance assignments of isotopically-enriched proteins. Journal of the American Chemical Society. 114:10974-5.
Muhandiram, D., Kay, L. (1994). Gradient-enhanced triple-resonance three-dimensional NMR experiments with improved sensitivity. Journal of Magnetic Resonance. 103:203-16.
Neri, D., Szyperski, T., Otting, G., Senn, H., Wuethrich, K. (1989). Stereospecific nuclear magnetic resonance assignments of the methyl groups of valine and leucine in the DNA binding domain of the 434 repressor by biosynthetically directed fractional ¹³C labeling. Biochemistry. 28:7510-6.
Nunn, M. F., Seeburg, P. H., Moscovici, C., Duesberg, P. H. (1983). Tripartite Structure of the Avian Erythroblastosis Virus-E26 Transforming Gene. Nature. 306:391-5.
Ouali, M., King, R. D. (2000). Cascaded multiple classifiers for secondary structure prediction. Protein Science. 9:1162-76.
Pascal, S. M., Muhandiram, D. R., Yamazaki, T., Kay, J. D. F., Kay, L. E. (1994). Simultaneous Acquisition of ¹⁵N and ¹³C-Edited NOE Spectra of Proteins Dissolved in H₂O. Journal of Magnetic Resonance. 103:197-201.
Pelton, J. G., Torchia, D. A., Meadow, N. D., Roseman, S. (1993). Tautomeric states of the active-site histidines of phosphorylated and unphosphorylated IIIGlc, a signal-transducing protein from Escherichia coli, using two-dimensional heteronuclear NMR techniques. Protein Science. 2:543-58.
Petersen, J. M., Skalicky, J. J., Donaldson, L. W., McIntosh, L. P., Alber, T., Graves, B. J. (1995). Modulation of transcription factor Ets-1 DNA binding: DNA-induced unfolding of an alpha helix. Science. 269:1866-9.
Pickart, C. (2004). Back to the future with ubiquitin. Cell. 23:181-90.
Poirel, H., Lopez, R. G., Lacronique, V., Della Valle, V., Mauchauffe, M., Berger, R., Ghysdael, J., Bernard, O. A. (2000). Characterization of a novel ETS gene, TELB, encoding a protein structurally and functionally related to TEL. Oncogene. 19:4802-6.
Polo, S., Sigismund, S., Faretta, M., Guidi, M., Capua, M. R., Bossi, G., Chen, H., De Camilli, P., Di Fiore, P. P. (2002). A single motif responsible for ubiquitin recognition and monoubiquitination in endocytic proteins. Nature. 416:451-5.
Potter, M. D., Buijs, A., Kreider, B., van Rompaey, L., Grosveld, G. C. (2000). Identification and characterization of a new human ETS-family transcription factor, TEL2, that is expressed in hematopoietic tissues and can associate with TEL1/ETV6. Blood. 95:3341-8.
Raiborg, C., Bache, K. G., Gillooly, D. J., Madshus, I. H., Stang, E., Stenmark, H. (2002). Hrs sorts ubiquitinated proteins into clathrin-coated microdomains of early endosomes. Nature Cell Biology. 4:394-8.
Rosmarin, A. G., Resendes, K. K., Yang, Z. F., McMillan, J. N., Fleming, S. L. (2004). GA-binding protein transcription factor: a review of GABP as an integrator of intracellular signaling and protein-protein interactions. Blood Cells Molecules and Diseases. 32:143-54.
Sattler, M., Schleucher, J., Griesinger, C. (1999). Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Progress in NMR Spectroscopy. 34:93-158.
Sawada, J., Simizu, N., Suzuki, F., Sawa, C., Goto, M., Hasegawa, M., Imai, T., Watanabe, H., Handa, H. (1999). Synergistic transcriptional activation by hGABP and select members of the activation transcription factor/cAMP response element-binding protein family. Journal of Biological Chemistry. 274:35475-82.
Schwartz, D. C., Hochstrasser, M. (2003). A superfamily of protein tags: ubiquitin, SUMO and related modifiers. Trends in Biochemical Sciences. 28:321-8.
Seidel, J. J., Graves, B. J. (2002). An ERK2 docking site in the Pointed domain distinguishes a subset of ETS transcription factors. Genes & Development. 16:127-37.
Shih, S. C., Katzmann, D. J., Schnell, J. D., Sutanto, M., Emr, S. D., Hicke, L. (2002). Epsins and Vps27p/Hrs contain ubiquitin-binding domains that function in receptor endocytosis. Nature Cell Biology. 4:389-93.
Skrynnikov, N. R., Goto, N. K., Yang, D. W., Choy, W. Y., Tolman, J. R., Mueller, G. A., Kay, L. E. (2000). Orienting domains in proteins using dipolar couplings measured by liquid-state NMR: Differences in solution and crystal forms of maltodextrin binding protein loaded with beta-cyclodextrin. Journal of Molecular Biology. 295:1265-73.
Slupsky, C. M., Gentile, L. N., McIntosh, L. P. (1998). Assigning the NMR spectra of aromatic amino acids in proteins: analysis of two Ets pointed domains. Biochemistry and Cell Biology-Biochimie et Biologie Cellulaire. 76:379-90.
Spiegelman, B. M., Heinrich, R. (2004). Biological control through regulated transcriptional coactivators. Cell. 119:157-67.
Thompson, C. C., Brown, T. A., McKnight, S. L. (1991). Convergence of Ets-Related and Notch-Related Structural Motifs in a Heteromeric DNA-Binding Complex. Science. 253:762-8.
Tootle, T. L., Rebay, I. (2005). Post-translational modifications influence transcription factor activity: a view from the ETS superfamily. Bioessays. 27:285-98.
Triezenberg, S. J., Lamarco, K. L., McKnight, S. L. (1988). Evidence of DNA - Protein Interactions That Mediate Hsv-1 Immediate Early Gene Activation by Vp16. Genes & Development. 2:730-42.
Virbasius, J. V., Scarpulla, R. C. (1990). The Rat Cytochrome-c-Oxidase Subunit-Iv Gene Family - Tissue-Specific and Hormonal Differences in Subunit-IV and Cytochrome-c Messenger-RNA Expression. Nucleic Acids Research. 18:6581-6.
Virbasius, J. V., Scarpulla, R. C. (1991). Transcriptional Activation through Ets Domain Binding-Sites in the Cytochrome-c-Oxidase Subunit-IV Gene.
Molecular and Cellular Biology. 11:5631-8.  88  References Virbasius, J. V . , Virbasius, C. M . A . , Scarpulla, R. C. (1993). Identity of Gabp with Nrf-2, a Multisubunit Activator of Cytochrome-Oxidase Expression, Reveals a Cellular Role for an Ets Domain Activator of Viral Promoters. Genes & Development. 7:380-92. Vuister, G. W., Wang, A . C , Bax, A . (1993). Measurement of 3-bond nitrogen carbon J15  couplings in proteins uniformly enriched in Chemical Societyll5:5334-5.  N and  13  C. Journal of the American  Wasylyk, B . , Hagman, J., Gutierrez-Hartmann, A . (1998). Ets transcription factors: nuclear effectors of the Ras-MAP-kinase signaling pathway. Trends in Biochemical Sciences. 23:213-6. Watanabe, H., Imai, T., Sharp, P. A., Handa, H . (1988). Identification of 2 Transcription Factors That Bind to Specific Elements in the Promoter of the Adenovirus Early-Region 4. Molecular and Cellular Biology. 8:1290-300. Watanabe, H . , Wada, T., Handa, H . (1990). Transcription Factor E4TF1 Contains 2 Subunits with Different Functions. EMBO Journal. 9:841-7. Willard, L., Ranjan, A., Zhang, H . Y., Monzavi, H., Boyko, R. F., Sykes, B. D., Wishart, D. S. (2003). V A D A R : a web server for quantitative evaluation of protein structure quality. Nucleic Acids Research. 31:3316-9. Wishart, D., Richards, F., Sykes, B . (1992). The chemical shift index: a fast and simple method for the assignment of protein secondary structure through N M R spectroscopy. Biochemistry. 31:1647-51. Yamazaki, T., Foreman-Kay, J. D., Kay, L . (1993). Two-Dimensional N M R Experiments for Correlating C and ' H Chemical Shifts of Aromatic Residues in C-Labelled Proteins via Scalar Couplings. Journal of the American Chemical Society 115:11054-5. 1 3  P  5 / e  13  Zwahlen, C , Gardner, K . H . , Sarma, S. P., Horita, D. A . , Byrd, R. A . , Kay, L . E . (1998). An •I O  N M R experiment for measuring methyl-methyl NOEs in C-labeled proteins with high resolution. 
Journal of the American Chemical Society. 120:7617-25. Zwahlen, C , Legault, P., Vincent, S. F. J., Greenblatt, J., Konrat, R., Kay, L . E . (1997). Methods for measurement of intermolecular NOEs by multi-nuclear N M R spectroscopy: application to a bacteriophage L N-peptide/boxB R N A complex. Journal of the American Chemical Society. 119:6711-21.  89  
