UBC Theses and Dissertations


Structural and functional studies of the N-Terminal domain of the GABPα Kang, Hyun-Seo 2005


STRUCTURAL AND FUNCTIONAL STUDIES OF THE N-TERMINAL DOMAIN OF GABPα

by

Hyun-Seo Kang
B.Sc., University of Oregon, 2002

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
(Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA
April 2005
© Hyun-Seo Kang, 2005

M.Sc. Thesis - Hyun-Seo Kang

Abstract

GA-binding protein (GABP) is a heterotetrameric αβ₂α transcription factor composed of two structurally dissimilar subunits, GABPα and GABPβ. The modular DNA-binding subunit, GABPα, was known previously to consist of a DNA-binding ETS domain and a PNT domain, which presumably mediates protein-protein interactions with other components of the signalling or transcription machinery. The transactivation subunit, GABPβ, consists of three domains, namely the ankyrin repeats, which bind GABPα, a leucine zipper that forms β₂ dimers, and a transactivation domain. GABP is known to regulate the expression of genes involved in many different cellular functions, such as cell cycle control, apoptosis, and viral pathogen expression.

An investigation of the N-terminal region of GABPα revealed the existence of a third structured domain. Using partial proteolysis and NMR spectroscopy, this new domain was localized to residues 35-121, and was found to be flanked by unstructured residues and independent of the adjacent PNT domain. The gene encoding this domain was cloned and expressed, yielding a new truncation fragment, GABPα³⁵⁻¹²¹. As a step towards dissecting its function, the structure of GABPα³⁵⁻¹²¹ was determined by NMR spectroscopy. The protein is composed of a 5-stranded β-sheet crossed by a distorted helix. Although globally resembling ubiquitin, GABPα³⁵⁻¹²¹ adopts a novel fold, as is evident from the arrangement of its secondary structure elements.
An analysis of GABPα³⁵⁻¹²¹ for features indicative of a macromolecular binding interface revealed only a region of negative charge. Therefore, the structure of GABPα³⁵⁻¹²¹ provides few clues to its function. To determine the function of GABPα³⁵⁻¹²¹, three strategies will be pursued based on the hypothesis that this domain is a protein-protein interaction module. First, potential interactions with other domains of GABPα and GABPβ will be examined using approaches such as native gel electrophoresis, chemical crosslinking, and NMR spectroscopy. Second, interactions of GABPα³⁵⁻¹²¹ with reported partners of GABP, such as ATF, CBP/p300, or Sp1, will be tested. Finally, unbiased affinity methods will be used to identify proteins from cellular extracts that bind specifically to GABPα³⁵⁻¹²¹.

Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
Abbreviations
Acknowledgements
Chapter 1 - Introduction
1.1 Eukaryotic transcription
1.2 Ets transcription factor family
1.3.1 Cellular functions of GABP
1.3.2 Quaternary and domain structures of GABP
1.4 Investigating the N-terminal region of GABPα
1.4.1 Evidence implicating the interactions of the N-terminal residues of GABPα with other folded domains in GABP
1.4.2 Possible involvement of the N-terminal domain with previously reported GABP protein partners
1.5 Thesis overview
Chapter 2 - Materials & Methods
2.1 Cloning
2.2 Protein expression and purification
2.3 Limited trypsin digestion
2.4 NMR spectroscopy
2.5 Residual dipolar couplings (RDCs) measurements
2.6 NMR spectral assignments and structure determination
2.7 Native gel electrophoresis
Chapter 3 - Identification and cloning of the new GABPα domain
3.2 Expression of residues 1-169 from GABPα
3.3 Identification of the boundaries of the structured region in GABPα¹⁻¹⁶⁹
3.3.1 Limited trypsin digestion
3.3.2 Assigning the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹
3.4 Sub-cloning and characterization of GABPα³⁵⁻¹²¹
Chapter 4 - Strategies for assigning spectra and gathering spectral information for NMR-based structure calculations
4.1 Assigning 2D and 3D NMR spectra
4.1.1 Assignment of resonances from backbone nuclei
4.1.2 Assignment of resonances from aliphatic sidechain nuclei
4.1.3 Assigning resonances from aromatic residues
4.2 Secondary structure
Chapter 5 - The solution structure of GABPα³⁵⁻¹²¹
5.1 Obtaining dihedral angle information (φ, ψ, χ₁) from NMR spectra
5.1.1 φ and ψ angles
5.1.2 χ₁ angle of residues with Hβ,β′ protons
5.1.3 χ₁ angles of Val, Val, Ile
5.2 Residual dipolar coupling constants
5.3 Structure calculation
5.4 Structure overview
5.4.1 β-strand structure
5.4.2 Loop structure
5.4.3 Distorted helix
5.5 Dynamics from amide ¹⁵N relaxation
5.6 Surface features
5.7 Structural comparisons
5.8 Conclusion
Chapter 6 - Concluding remarks and future directions
6.1 Domain interactions within GABP
6.2 Potential protein partners reported in the literature
6.2.1 ATF-1
6.2.2 CBP/p300
6.2.3 Sp1 and Sp3
6.3 Unbiased screens for interacting protein: "fishing" for protein partners
6.4 Future directions
References

List of Figures

Figure 1.1. Overview of eukaryotic gene expression
Figure 1.2. Schematic diagram of Ets family proteins
Figure 1.3. Structures of ETS and PNT domains
Figure 1.4. Gene expression by GABP, NRF-2, and E4TF1
Figure 1.5. Model of the GABP αβ₂α heterotetramer complex on DNA
Figure 1.6. Structure of the GABPα/β heterodimer complex on DNA
Figure 2.1. Oligonucleotides and PCR protocol for cloning GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹
Figure 3.1. Partially assigned ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹
Figure 3.2. Overlaid ¹H-¹⁵N HSQC spectra of GABPα¹⁻¹⁶⁹, GABPα¹⁶⁸⁻²⁵⁶, and GABPα¹⁻³²⁰
Figure 3.3. Identification of the structured region of GABPα¹⁻¹⁶⁹ by NMR
Figure 3.4. Limited trypsin digestion of GABPα¹⁻¹⁶⁹
Figure 3.5. The fully assigned ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹
Figure 4.1. Heteronuclear experiments used to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹
Figure 4.2. Strategies for assigning protein backbone resonances
Figure 4.3. Stereospecific assignments of the Gln and Asn NH₂ resonances of GABPα³⁵⁻¹²¹
Figure 4.4. Assignment of resonances from aromatic sidechains in GABPα³⁵⁻¹²¹
Figure 4.5. His66 adopts a neutral Nε2H tautomeric form
Figure 4.6. Secondary structure prediction of GABPα³⁵⁻¹²¹
Figure 5.1. Iterative assignment of ambiguous restraints by ARIA
Figure 5.2. Solution structure of GABPα³⁵⁻¹²¹
Figure 5.3. Residual dipolar coupling constants (observed vs. experimental)
Figure 5.4. Identification of a bulge between β-strands S4 and S5 in GABPα³⁵⁻¹²¹
Figure 5.5. Interactions within the proline-containing α-helix in GABPα³⁵⁻¹²¹
Figure 5.6. ¹⁵N-relaxation analysis of GABPα³⁵⁻¹²¹
Figure 5.7. Surface properties of GABPα³⁵⁻¹²¹
Figure 5.8. Secondary structure comparisons of GABPα³⁵⁻¹²¹ and ubiquitin (1UBQ)
Figure 5.9. The surface properties of ubiquitin (1UBQ)
Figure 5.10. Comparison of surface lysines of GABPα³⁵⁻¹²¹ and ubiquitin
Figure 6.1. The structure of the full length GABPα
Figure 6.2. Preliminary binding studies of GABP domains by native gel

List of Tables

Table 1.1. Various nomenclature used for GABP subunits
Table 2.1. The compositions of M9T minimal media used for ¹³C or ¹⁵N isotope-labelling
Table 5.1. NMR restraints and statistics for the ensemble of ten structures calculated for GABPα³⁵⁻¹²¹

Abbreviations

1D: one-dimensional
2D: two-dimensional
3D: three-dimensional
ATF: activating transcription factor
ATP: adenosine triphosphate
AR: ankyrin repeat
CBP: CREB binding protein
CREB: cyclic AMP-responsive element binding protein
CSI: chemical shift index
CT: constant time
Da: dalton
D₂O: deuterium oxide
DNA: deoxyribonucleic acid
DTT: dithiothreitol
ERG: early response genes
ETS (Ets): E26 transformation specific
ESI-MS: electrospray ionization mass spectrometry
GABP: GA-binding protein
GST: glutathione-S-transferase
HEPES: N-2-hydroxyethylpiperazine-N'-2-ethanesulphonic acid
hGABP: human GA-binding protein
HMBC: heteronuclear multiple bond correlation
HMQC: heteronuclear multiple quantum coherence
HIV: human immunodeficiency virus
HSQC: heteronuclear single quantum coherence
IL-16: interleukin 16
IPAP: in-phase anti-phase
IPTG: isopropyl-β-D-thiogalactopyranoside
kDa: kilodalton
LB: Luria-Bertani
LZ: leucine zipper
MAP: mitogen activated protein
MW: molecular weight
Ni-NTA: nickel-nitrilotriacetic acid
NMR: nuclear magnetic resonance
NOE: nuclear Overhauser effect
NOESY: nuclear Overhauser effect spectroscopy
NRF: nuclear respiratory factor
PAGE: polyacrylamide gel electrophoresis
PCR: polymerase chain reaction
PDB: protein data bank (http://www.rcsb.org/pdb/)
PNT: pointed
ppm: parts per million
RCO4: rat cytochrome C oxidase subunit IV
RDC: residual dipolar coupling
rms: root mean square
SDS: sodium dodecyl sulphate
TAD: transactivation domain
TEL: translocation ets leukemia
TOCSY: total correlation spectroscopy
Tris: tris(hydroxymethyl)aminomethane
VP16: viral protein 16

Acknowledgements

The work presented in this thesis would not have been accomplished without the invaluable support of numerous people. First of all, I would like to thank my supervisor, Dr. Lawrence McIntosh, who gave me the opportunity to work on this project. His excellent scientific insight and guidance throughout the project inspired me to complete this thesis. The former and current GABP collaborators, Drs. Cameron Mackereth and Manuela Schaerpf, must be mentioned for their help throughout this project. I acknowledge Cameron for introducing me to this exciting project and Manuela, aka Frog, for being such a supportive collaborator and a mentor in climbing. I also wish to thank Dr. Gary Yalloway for his time and effort in analyzing the mass spectrometric results, and I sincerely wish him good health. Dr. Greg Lee has also been a great collaborator on my second project on Ets-1 and provided much advice on technical problems in calculating the structure of GABPα. I would like to acknowledge all the former and current members of Dr. Lawrence McIntosh's lab. Lastly, I would like to thank my parents and my sister especially for their endless moral support.

Chapter 1 - Introduction

1.1 Eukaryotic transcription

The key strategy for eukaryotic gene regulation is based on the synchronization of cis- and trans-regulatory elements (for reviews see (Buratowski, 1994; Lemon and Tjian, 2000; Spiegelman and Heinrich, 2004)). The cis-regulatory elements are commonly DNA sequences represented by the initiator sequence, the TATA-box, and either enhancers in higher eukaryotes or up-stream activation sequences (UAS) in yeast, located in various regions relative to the associated structural genes. The trans-regulatory elements, generally referred to as transcription factors, include basal transcription factors, DNA-binding transactivators, and coactivators.
To form the transcription initiation complex, the basal transcription factors must associate either directly or indirectly with RNA polymerase on the initiator sequence and TATA-box. Recruitment of this complex to a gene promoter is dependent upon the activities of a multitude of specific DNA-binding transcription factors (Fig. 1.1). Due to their modular structure, these transcription factors are usually also involved in protein-protein or protein-ligand interactions. Such interactions with the initiation complex-associated proteins can be direct, or can be facilitated indirectly through other proteins known as co-activators or co-repressors. In parallel, these transcription factors are also involved in chromatin remodelling. Unlike in prokaryotes, chromatin structure in eukaryotes strongly represses gene expression, so transcription must be coupled with chromatin remodelling. Ligand binding by transcription factors modulates their conformations, thereby regulating transcription. Similarly, transcription factors are also regulated through post-translational modifications, such as phosphorylation, acetylation, sumoylation, and ubiquitinylation, in response to signal transduction cascades (Tootle and Rebay, 2005). Thus, the structural and functional characterization of specific transcription factors remains a critical challenge for understanding the normal and aberrant control of eukaryotic gene expression.

Figure 1.1. Overview of eukaryotic gene expression. In addition to DNA-binding, specific transcription factors are involved in protein-protein interactions with other transcription-related elements, such as the transcription initiation complex, coactivators/corepressors, or chromatin remodelling-related proteins.

1.2 Ets transcription factor family

The founding member of the Ets family, ets-1, was discovered as part of the tripartite oncogene of the avian E26 (E26 transformation specific) retrovirus (Leprince et al., 1983; Nunn et al., 1983). Since then, the Ets transcription factor family has grown to over 30 recognized members from diverse metazoan species (Fig. 1.2) (Graves and Petersen, 1998). These proteins have key roles in regulating cellular differentiation, development, transformation, and proliferation, particularly in blood cell lineages. The Ets family proteins are defined by the conserved ~85 residue ETS domain. The ETS domain contains a DNA-binding motif called the winged helix-turn-helix, which specifically recognizes promoters containing a conserved core sequence, 5'-GGA-3' (Fig. 1.3 (a)). Through numerous biochemical and structural studies, the molecular bases for DNA-binding by several ETS domains are well established. In addition to the ETS domain, approximately one third of all Ets family proteins share another conserved domain, called the PNT (pointed) domain (Fig. 1.2 and Fig. 1.3 (b)). Although not as well characterized as the ETS domain, the PNT domain is involved in a variety of protein-protein interactions. For example, in Ets-1, it serves both as a docking site for the MAP kinase Erk2, facilitating phosphorylation of an adjacent N-terminal phosphoacceptor, Thr38, and as an interaction platform for the CBP/p300 co-activator (Seidel and Graves, 2002). In the case of Tel, the PNT domain mediates the polymerization that is required for transcriptional repression (Poirel et al., 2000; Potter et al., 2000). In addition, Ets family proteins usually have one or more poorly characterized transactivation domains.

Figure 1.2. Schematic diagram of Ets family proteins. Schematic diagrams of the domain structures of selected Ets family transcription factors are shown.
The PNT and ETS domains are highlighted in red and blue, respectively.

Figure 1.3. Structures of ETS and PNT domains. (a) The ETS domain contains a DNA-binding winged helix-turn-helix motif that recognizes a conserved core 5'-GGA-3' sequence (Petersen et al., 1995). (b) Ribbon diagrams of the tertiary structures of PNT domains from the Ets family proteins Erg, GABPα, Tel, and Ets-1. The proteins share an architecture of a core four α-helix bundle (H2, red; H3, green; H4, cyan; H5, blue). Note that an N-terminal helix (H1, purple) is specific to GABPα and Ets-1 (Mackereth et al., 2004).

1.3 GA-binding protein (GABP)

One member of the Ets family, GABP, was first discovered as a transactivator for the immediate early (IE) gene of the herpes simplex virus 1 (HSV-1) in the rat liver (Fig. 1.4 (a)) (Lamarco et al., 1991; Thompson et al., 1991). The IE gene of HSV-1 is induced by the virion-associated protein, VP16 (Triezenberg et al., 1988). However, instead of interacting directly with the promoter region, VP16 forms a complex with a cellular transcription factor, Oct-1, on the first cis-regulatory element of the IE gene. Subsequently, the heterotetrameric GABP binds the second cis-regulatory element, known as the purine-rich region due to its core GA sequence. Paralleling the discovery of the first GA-binding protein in the rat liver, two other proteins, nuclear respiratory factor 2 (NRF-2) (Virbasius and Scarpulla, 1990; Virbasius and Scarpulla, 1991) and E4 transcription factor 1 (E4TF1) (Watanabe et al., 1988), were identified by the Scarpulla and Handa groups, respectively. Although discovered in distinct biological pathways, both were later confirmed to be the same protein as GABP. NRF-2 was identified in the course of studying the rat cytochrome C oxidase subunit IV (RCO4) gene, which is a respiratory electron carrier gene (Fig. 1.4 (b)).
The ATP synthase β-subunit gene also contains a binding site for NRF-2 in its promoter region. In parallel with the discovery of GABP in the context of herpes virus gene expression, the Handa group discovered that the E4TF1 protein activates the promoter of the E4 region of adenovirus type 5 in human cells (Fig. 1.4 (c)).

Figure 1.4. Gene expression by GABP, NRF-2, and E4TF1. GABP, also known as NRF-2 and E4TF1, was identified through three parallel routes. (a) The immediate early (IE) gene of herpes simplex virus 1 (HSV-1) is regulated by two cis-regulatory elements in its promoter region, namely a purine-rich hexanucleotide and a nonanucleotide sequence. The heterotetrameric GABP complex and the Oct-1/VP16 complex bind these sites, respectively, thereby activating IE gene expression (Lamarco et al., 1991). (b) One of the mammalian respiratory electron carrier genes, RCO4 (rat cytochrome C oxidase subunit 4 gene), is activated by the binding of NRF-2 (nuclear respiratory factor 2) to its promoter region (Virbasius et al., 1993). (c) The E4 promoter of adenovirus type 5 is activated upon the binding of the cellular factor, E4TF1, and an adenovirus E1A gene product (Watanabe et al., 1990).
[Figure 1.4 panel labels: (a) VP16, Oct-1, and GABP αβ₂α on the nonanucleotide sequence and purine-rich region, the cis-regulatory elements in the promoter region of the IE gene of HSV-1; (b) NRF-2 on RCO4, a mammalian respiratory electron carrier gene; (c) E4TF1 and the adenovirus E1A gene product on the E4 region of adenovirus type 5.]

1.3.1 Cellular functions of GABP

The varied nomenclature used for GABP reflects its involvement in diverse biological pathways (Bassuk and Leiden, 1997; Dittmer and Nordheim, 1998; Ghosh and Kolodkin, 1998; Rosmarin et al., 2004; Wasylyk et al., 1998). GA-binding protein (GABP) is a transcription factor that controls many different genes related to cell cycle control, apoptosis, and viral pathogen expression.
In the cell cycle, it has been reported that GABP controls the G1/S restriction point either by regulating the expression of the gene for the cell cycle-associated component, retinoblastoma (Rb), or by physically interacting with another cell cycle-associated factor, E2F1. As a part of apoptosis, GABP cooperates with both Sp1 and PU.1 to activate transcription of the β2 leukocyte integrin gene, CD18, which is required for adhesion of white blood cells to endothelium in order to kill foreign cells. Interestingly, it has also been shown that GABP regulates gene expression in some of the most important viral pathogens, such as adenoviruses, herpes viruses, and HIV. The adenovirus early 4 (E4) promoter is activated by GABP with the cooperation of other factors, including ATF/CREB. GABP also interacts with p300, a cellular target of the adenoviral protein E1A, to control the induction of the IL-16 promoter (Bannert et al., 1999). The fact that viruses frequently exploit GABP to achieve gene expression supports the idea that GABP has powerful transcriptional properties.

1.3.2 Quaternary and domain structures of GABP

GABP is composed of two structurally dissimilar proteins, GABPα and GABPβ, which are often referred to as the DNA-binding subunit and the transactivation subunit, respectively. Indeed, GABP is distinct among the Ets family in having these two functions mediated via separate polypeptide chains. These two subunits form a heterotetrameric αβ₂α complex on DNA with tandem promoter sites (Fig. 1.5) (Graves, 1998; Rosmarin et al., 2004). However, it is still controversial whether GABP exists as an αβ₂α tetramer or an αβ heterodimer in solution.

GABPα is composed of three structurally distinct regions (Fig. 1.5 (b)). First, the C-terminal ETS domain (residues 320-430) binds purine-rich regions in DNA, preferably with the sequence 5'-GGAA/T-3' (Batchelor et al., 1998). The crystal structure of the ETS domain was solved as a DNA-bound complex.
Second, a PNT domain (residues 170-250) of unknown function is located in the middle of GABPα. The solution structure of the PNT domain was solved using NMR spectroscopy by the McIntosh group (Mackereth et al., 2004). Finally, as discussed in this thesis, the N-terminal region of GABPα, which has been essentially unstudied to date, contains a novel domain of unknown function.

GABPβ, known as the transactivation subunit, also contains three domains (Fig. 1.5 (b)), specifically the ankyrin repeats (AR), the transactivation domain (TAD), and the leucine-zipper domain (LZ). The N-terminal AR directly interacts with the GABPα ETS domain, as shown in the GABP complex structure. The C-terminal LZ domain forms a coiled-coil leading to a GABPβ homodimer. Unlike these well-characterized domains, the location and characteristics of the TAD in GABPβ are poorly defined.

Figure 1.5. Model of the GABP αβ₂α heterotetramer complex on DNA. (a) Based on the X-ray crystal structure of the GABP ETS/ankyrin repeats complex in Fig. 1.6, a model of the full heterotetrameric GABP complex on a tandem purine-rich DNA sequence was proposed. Two heterodimers of GABPα/GABPβ are connected through the homodimerization of the leucine zipper (LZ) domains to form a heterotetramer. The locations of the PNT domains and leucine zipper domains are shown schematically. The sequence of the purine-rich region is shown in tandem with the conserved 5'-GGA-3' sequence in red (Graves, 1998). (b) Schematic diagrams of GABPα and GABPβ are shown with their known domains. Although not depicted in (a), the N-terminal domain from GABPα is the subject of this study.
[Figure 1.5 panel labels: (a) activation, inhibition, ankyrin repeats, PNT domain, ETS domain, leucine zipper; (b) GABPα with the unidentified N-terminal domain, PNT, and ETS, residue markers 1, 35, 121, 168, 254, 320, 430, 454; GABPβ with residue markers 1, 5, 156, 258, 327, 334, 382.]
The biological specificity of GABP may arise in large part from the diversity of GABPβ. As summarized in Table 1.1 (Rosmarin et al., 2004), there are four GABPβ1 isoforms arising from alternative mRNA splicing, as well as a GABPβ2 encoded by a separate gene. Significant differences are observed in the domain structures of these GABPβ variants, particularly with respect to the TAD or LZ, prompting the idea that combinatorial association of GABPα with various GABPβ partners can lead to alternative regulation of gene expression.

Table 1.1. Various nomenclature used for GABP subunits. GABPβ1 comprises four isoforms likely arising from alternative splicing of mRNA, whereas GABPβ2 is expressed from a distinct gene. Rosmarin et al. (Rosmarin et al., 2004) proposed a new nomenclature based on the molecular weights of these species (37, 38, 41, and 42 kDa). The domain structures of the GABPβ forms are shown (AR: ankyrin repeats; TAD: transactivation domain; LZ: leucine-zipper).

Proposed nomenclature: GABPα, GABPβ1-42, GABPβ1-41, GABPβ1-38, GABPβ1-37, GABPβ2
McKnight group (Thompson et al., 1991): GABPα, GABPβ1, GABPβ2
McKnight group (De la Brousse et al., 1994): GABPβ1-1, GABPβ1-2, GABPβ2-1
Handa group (Watanabe et al., 1993): E4TF1-60, E4TF1-53, E4TF1-47
Handa group (Sawa et al., 1996): GABPα, GABPβ2, GABPβ1, GABPγ2, GABPγ1
Scarpulla group (Gugneja et al., 1995): NRF-2α, NRF-2β1, NRF-2β2, NRF-2γ1, NRF-2γ2
[Table 1.1 domain diagrams: AR, TAD, and LZ layout for GABPβ1-42, GABPβ1-41, GABPβ1-38, GABPβ1-37, and GABPβ2.]

1.3.2.1 Crystal structure of the GABPα/β complex with DNA

The ETS domain of GABPα is the most extensively studied part of this transcription regulator due to its function in DNA-binding. As shown by Batchelor et al. (Batchelor et al., 1998), the ETS domain of GABPα is composed of four antiparallel β-strands that pack against five α-helices (Fig. 1.6). This is essentially the same topology as the ETS domains of other Ets proteins, such as Ets-1 (Lee et al., 2005) and Fli-1 (Liang et al., 1994). Therefore, as expected, the ETS domain of GABPα binds DNA through its winged helix-turn-helix motif, where the helix after the turn interacts with the major groove of DNA. Despite their role in enhancing promoter affinity, the four-and-a-half ankyrin repeats (AR) of GABPβ do not directly contact DNA in the ternary complex. Instead, the AR interacts with the first, fourth and fifth helices of GABPα to indirectly stabilize the association of the ETS domain with DNA. Most importantly, Gln321 of GABPα hydrogen bonds to both Lys69 of GABPβ and the DNA backbone. Thus, GABPβ may fine-tune the conformation of GABPα to regulate its affinity for promoter elements.

Figure 1.6. Structure of the GABPα/β heterodimer complex on DNA. X-ray crystal structure of the complex of the GABPα ETS domain (yellow) and the GABPβ ankyrin repeats (green) bound to DNA (Batchelor et al., 1998).

1.3.2.2 Solution structure of the PNT domain

The solution structure of the PNT domain in GABPα was solved by NMR spectroscopic methods (Mackereth et al., 2004), along with that of Erg, another Ets family protein identified in a colon cancer cell line (Fig. 1.3 (b)). Both of these structures share a common conserved fold of four core helices. This fold was shown previously for the PNT domains of Ets-1 (Slupsky et al., 1998) and Tel (Kim et al., 2001). However, the GABPα PNT domain also has an additional N-terminal helix that is present in Ets-1 but not in Erg (Mackereth et al., 2004) or Tel (Kim et al., 2001). The additional N-terminal helix may provide a docking site for a kinase, as seen with Ets-1, or a binding interface for another protein partner. However, such a partner has yet to be identified for GABPα.

1.4 Investigating the N-terminal region of GABPα

Over the course of studying the PNT domain of GABPα, Cameron Mackereth and Dr. Manuela Schaerpf in the McIntosh lab found evidence for a new structured domain in the previously uncharacterized N-terminal region of GABPα (residues 1-168) (unpublished data). As described in chapter 3, preliminary analysis of the NMR spectra of a large fragment of GABPα (residues 1-320) revealed a set of resonances from a folded domain independent of the PNT domain (residues 168-254). The latter conclusion arises from the invariant NMR spectra of the PNT domain, whether in the absence or in the presence of the additional N-terminal residues. Furthermore, a BLAST (Altschul et al., 1990) search failed to identify any proteins with significant sequence similarity to the 168 N-terminal residues of GABPα, suggesting that this is a unique feature of this transcription factor.

1.4.1 Evidence implicating the interactions of the N-terminal residues of GABPα with other folded domains in GABP

Although the N-terminal region of GABPα may not interact intramolecularly with the PNT domain, it may still associate with the ETS, AR, TAD, and/or LZ domains of GABP. In fact, in a study of the mechanism of GABP assembly, Chinenov et al. (Chinenov et al., 2000) have shown by analytical ultracentrifugation that deletion of residues 1-316 of GABPα enhances formation of αβ₂α tetramers over αβ dimers in solution. This result suggests a possible participation of the N-terminal or PNT domains in regulating assembly of the heterotetrameric GABP complex.

1.4.2 Possible involvement of the N-terminal domain with previously reported GABP protein partners

Several reports have shown that GABP has various protein partners, some of which may bind GABPα at sites other than its ETS domain. Sawada et al. (Sawada et al., 1999) have documented functionally synergistic interactions between GABPα and ATF1/CREB for the activation of the adenovirus early 4 (E4) gene.
Furthermore, they have also provided some evidence that ATF1 interacts with the N-terminal region of GABPα, using GST-tag pull-down assays and surface plasmon resonance experiments. Galvagni et al. (Galvagni et al., 2001) have suggested that Sp1 and Sp3 activate the utrophin promoter in co-operation with GABP. They have also shown experimentally that Sp1 and Sp3 physically interact with GABPα through their zinc finger domains. GABPα also controls the induction of the interleukin 16 promoter in T lymphocytes in cooperation with the coactivator CBP/p300. Furthermore, Bannert et al. (Bannert et al., 1999) have also shown that GABPα interacts specifically with the C-terminal portion of p300. These results suggest possible functions of the N-terminal domain in GABPα in the assembly of multiple protein complexes necessary for transcriptional regulation. However, the precise nature of these interactions and the segments of GABPα involved remain to be established.

1.5 Thesis overview

Since the discovery of GABP in the late 1980s, numerous groups have published data on this protein. However, due to its primary function as a transcription regulator, most of these studies focused on the DNA-binding ETS domain. In contrast, the N-terminal region of GABPα remained largely uncharacterized until the preliminary studies by Mackereth and Schaerpf. Therefore the major goal of this thesis was to dissect the biological function of the N-terminal region of GABPα through structural analysis, followed by structure-directed biochemical studies.

To achieve this goal, as outlined in chapter 3, I first delineated the new structured elements in GABPα using limited proteolysis and NMR spectroscopy. Based on these results, residues 35-121 were discovered to constitute the new folded domain in GABPα. Preliminary NMR spectroscopic studies of a truncation fragment encompassing this domain, GABPα³⁵⁻¹²¹, demonstrated that it contains a well-folded structure with a predominant β-strand conformation.

In chapter 4, a detailed analysis of the NMR spectra of GABPα³⁵⁻¹²¹ is presented. Based on the assigned ¹³C, ¹⁵N, and ¹H resonances, the secondary structure features of GABPα³⁵⁻¹²¹ were predicted using several different computational algorithms. Together these indicated that GABPα³⁵⁻¹²¹ is composed of five β-strands and a helix.

The complete NMR assignments were combined to interpret the NOESY spectra of GABPα³⁵⁻¹²¹ and calculate its tertiary structure, as summarized in chapter 5. In parallel with these NOE data, additional dihedral angle (φ, χ₁) and residual dipolar coupling restraints were measured. The resulting final structural ensemble revealed that GABPα³⁵⁻¹²¹ is composed of a five-stranded β-sheet crossed by a distorted helix. Although its 3D structure resembles ubiquitin, GABPα³⁵⁻¹²¹ adopts a novel fold. While exciting, this precluded the determination of its function based on structural similarities to any characterized protein. An analysis of the surface features of GABPα³⁵⁻¹²¹ also yielded only limited insights into its potential function. Thus chapter 6 describes ongoing experiments for identifying the function of the new GABPα domain based on the investigation of previously reported protein binding partners.

The determination of the tertiary structure of GABPα³⁵⁻¹²¹ has also completed the full structural characterization of GABPα. This is critical for understanding the mechanism of intact GABP in its biological context.

Chapter 2 - Materials & Methods

2.1 Cloning

Genes (rat liver) encoding GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were sub-cloned from the GABPα¹⁻³²⁰ gene within the pET28 vector by employing two different pairs of oligonucleotides (SIGMA) as primers for PCR. Each oligonucleotide was designed with a convenient restriction enzyme cleavage site so that the PCR product could be ligated to the host vector pET28 (Novagen) (Fig. 2.1 (a)). In GABP-121RW, a codon for tryptophan was engineered in front of the stop codon (bold) for the convenience of measuring protein concentration. Pfu Turbo DNA polymerase (Fermentas) was employed to amplify the sequences of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹, as illustrated in Fig. 2.1 (b). All the PCR products were separated on a 1.8 % agarose gel and purified using the QIAquick Gel Extraction kit (Qiagen). Both the purified PCR products and the pET28 host vector were incubated overnight with NdeI and XhoI (for GABPα¹⁻¹⁶⁹) or NdeI and EcoRI (for GABPα³⁵⁻¹²¹) restriction enzymes (Fermentas). The digested PCR products were ligated into the cleaved pET28 host vector (Novagen) at 15 °C overnight and then transformed into E. coli DH5α cells (Novagen) by electroporation. The cells were incubated overnight on agar plates at 37 °C with selection for kanamycin resistance. The final plasmids were purified with a plasmid preparation kit (Qiagen), and the inserted PCR products were sequenced by NAPS (Nucleic Acid Protein Service unit, UBC). The confirmed plasmids were introduced into E. coli BL21 cells by electroporation. The pET28 vector-fused GABP constructs are designed to express proteins with a His6-tag at the N-terminus. Cleavage of the His6-tag by thrombin leaves three additional amino acids (Gly-Ser-His) at the N-terminus of the protein.
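The primer design described above can be checked with a short script. The following Python sketch is only an illustration: the primer sequences are transcribed from Fig. 2.1 (a), with the italicized Trp codon position of GABP-121RW read as CCA, and the helper names and the deliberately minimal codon table are assumptions made for this example rather than part of the original methods.

```python
# Codon table restricted to the codons that occur in the two primers below.
CODONS = {
    "ATG": "M", "GCT": "A", "GCG": "A", "GAA": "E", "GAG": "E",
    "TGT": "C", "GTA": "V", "GTC": "V", "AGC": "S", "ACG": "T",
    "TGG": "W", "TGA": "*",
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))

def translate(seq):
    """Translate codon by codon, stopping after the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODONS[seq[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)

# Primers for the 35-121 construct, as read from Fig. 2.1 (a).
GABP_35F = "GCGGACATATGGCTGAATGTGTAAGC"           # forward, NdeI site
GABP_121RW = "GCTCAGAATTCTCACCACTCGACCGTTTCCGC"   # reverse, EcoRI site

NDEI_SITE, ECORI_SITE = "CATATG", "GAATTC"

# Forward primer: translation starts at the ATG inside the NdeI site.
forward_orf = translate(GABP_35F[GABP_35F.find("ATG"):])

# Reverse primer: read the coding strand, i.e. the reverse complement.
reverse_orf = translate(reverse_complement(GABP_121RW))

print(NDEI_SITE in GABP_35F, forward_orf)
print(ECORI_SITE in GABP_121RW, reverse_orf)
```

Read this way, the forward primer encodes Met followed by residues 35-39, and the reverse primer encodes residues 117-121 followed by the engineered Trp and the stop codon.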
(a) Oligonucleotides for GABPα¹⁻¹⁶⁹
GABPα-1F (forward): 5' GCGGACATATGACTAAGAGAGAAGC 3' (NdeI)
GABPα-REV (reverse): 5' GCTTCCTCGAGTCATGCAGCAGCCCATCTC 3' (XhoI)
Oligonucleotides for GABPα³⁵⁻¹²¹
GABP-35F (forward): 5' GCGGACATATGGCTGAATGTGTAAGC 3' (NdeI)
GABP-121RW (reverse): 5' GCTCAGAATTCTCACCACTCGACCGTTTCCGC 3' (EcoRI)

(b) PCR settings: initial denaturing (95 °C, 5 min) → denaturing (95 °C, 1 min) → annealing (55 °C, 30 sec) → extending (72 °C, 30 sec) → ending (72 °C, 5 min); 20 cycles.

Figure 2.1. Oligonucleotides and PCR protocol for cloning GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹. (a) The oligonucleotide sequences are shown for the forward and reverse primers of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹, with the restriction enzyme sites underlined. The stop codon is highlighted in bold for the reverse primers, and the engineered Trp codon is italicized in GABP-121RW. (b) The GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ constructs were amplified by Pfu Turbo polymerase (Fermentas) during 20 PCR cycles. The temperatures and the time periods are indicated for each step.

2.2 Protein expression and purification

GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were expressed in E. coli BL21(λDE3) cells grown at 37 °C in Luria broth (LB) or minimal M9T medium (see Table 2.1 for the specific isotope compositions of each sample). After induction with 1 mM IPTG for 4 hours, the cells were harvested by centrifugation at 5K for 10 min. The cell pellet was resuspended in Ni-NTA (Qiagen) column binding buffer (5 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol) and lysed by passage through a French press three times at 10,000 psi, followed by 15 min of sonication. The lysate was spun at 15K for 1 hour, and the supernatant was transferred onto the Ni-NTA column pre-equilibrated with binding buffer.
The column was washed with 60 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol, and the GABPα proteins were then eluted with 250 mM imidazole, 50 mM HEPES (pH 7.5), 500 mM NaCl, 5 % glycerol. Following SDS-PAGE analysis, the fractions containing the GABPα protein were pooled and dialyzed overnight in 20 mM sodium phosphate, 20 mM NaCl, pH 7.2, with a few crystals of thrombin (Roche) added for cleavage of the N-terminal His6-tag. Thrombin and the cleaved His6-tag were removed by incubating the dialyzed protein solution with p-aminobenzamidine beads (SIGMA) and TALON metal affinity resin (BD Biosciences) at room temperature with mild shaking for 15 min. The p-aminobenzamidine beads and the resin were then removed by centrifugation at 5K. Once the cleaved proteins were isolated, they were kept reduced by addition of 2 mM DTT (dithiothreitol) (Bioshop). For limited trypsin digestion and NMR measurements, the proteins were concentrated to ~0.1 mM and 1.0-1.5 mM, respectively. The concentrations of GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹ were measured using the predicted extinction coefficients of 8654 M⁻¹cm⁻¹ and 7127 M⁻¹cm⁻¹ at 280 nm, respectively, calculated with the ProtParam program (http://www.expasy.org/tools/protparam.html). The masses of the final samples were confirmed by ESI-MS (operated by Dr. Gary Yalloway). For unlabelled GABPα³⁵⁻¹²¹, a mass of 10245 Da was observed (10248 Da expected).

Table 2.1. The compositions of M9T minimal media used for ¹³C or ¹⁵N isotope-labelling.
Carbon source
  ¹³C sample: ¹³C6-D-glucose, 3 g/L
  ¹⁵N sample: ¹²C6-D-glucose, 10 g/L
  ¹⁵N & ¹³C sample: ¹³C6-D-glucose, 3 g/L
  ¹⁵N & 10% ¹³C sample: ¹²C6-D-glucose, 2.7 g/L, and ¹³C6-D-glucose, 0.3 g/L
Nitrogen source
  ¹³C sample: (¹⁴NH4)2SO4, 1 g/L
  ¹⁵N sample: (¹⁵NH4)2SO4, 1 g/L
  ¹⁵N & ¹³C sample: (¹⁵NH4)2SO4, 1 g/L
  ¹⁵N & 10% ¹³C sample: (¹⁵NH4)2SO4, 1 g/L
Common salts and miscellaneous trace metals (all samples)
  Na2HPO4, 6 g/L; KH2PO4, 3 g/L; NaCl, 0.5 g/L; MgSO4, 1 mM; CaCl2, 0.1 mM; FeCl3, 10 µM; B1, 0.5 mg/ml; kanamycin, 35 mg/L

2.3 Limited trypsin digestion

GABPα¹⁻¹⁶⁹ was digested with trypsin at 25 °C in 25 mM Tris (pH 7.9), 50 mM KCl, 0.1 mM EDTA and 1 mM DTT at a ratio of 1:250 (w/w) of enzyme to protein. At defined intervals, aliquots were removed and proteolysis was stopped with SDS buffer, followed by heat inactivation at 95 °C for 5 min. Samples were analyzed both by SDS-PAGE using a 15% acrylamide gel and by ESI-MS (operated by Dr. Shouming He).

2.4 NMR spectroscopy

All isotopically-labeled (¹³C, ¹⁵N, ¹⁵N/¹³C, ¹⁵N/10% ¹³C) NMR samples contained 1.0-1.5 mM protein in 20 mM sodium phosphate (pH 7.0), 50 mM NaCl, and 2 mM DTT, with 10% D2O (by volume) added for the signal lock. Spectra were recorded at 30 °C using 500 MHz Varian Unity, 600 MHz Varian Inova, or 800 MHz Varian Inova NMR spectrometers equipped with triple resonance gradient probes. All the pulse sequences were provided by Dr. Lewis Kay (University of Toronto). Spectra were processed and analyzed using NMRPipe (Delaglio et al., 1995), NMRDraw (Delaglio et al., 1995), and Sparky (Goddard and Kneller).

2.5 Residual dipolar coupling (RDC) measurements

Residual dipolar couplings were acquired on a uniformly ¹³C- and ¹⁵N-labeled GABPα³⁵⁻¹²¹ protein sample diffused into a 5% acrylamide gel prepared with 29:1 acrylamide:bisacrylamide stock (Ishii et al., 2001). The gel was cast as a cylinder with a 6 mm diameter and 20 mm length. After dialysis in the NMR sample buffer, the gel was soaked in the protein solution (~500 µl) overnight. The gel pellet with the diffused protein was loaded into a 5 mm silanized bottomless NMR tube using a gel funnel apparatus, causing the pellet to stretch to approximately twice its original length (Chou et al., 2001).
This yielded a splitting of ~4 Hz in a ²H-NMR spectrum of the ¹HO²H lock solvent. The backbone ¹H-¹⁵N and ¹³C'-¹³Cα RDCs were analyzed from the ¹H-¹⁵N IPAP-HSQC and 2D/3D HNCO-based coupling spectra recorded on the partially aligned protein samples, as well as on an unaligned reference sample (Clore et al., 1998b).

2.6 NMR spectral assignments and structure determination

Spectral assignment strategies and structure calculations are described in chapters 4 and 5, along with the details of the analysis of the resulting structures.

2.7 Native gel electrophoresis

Six different GABP proteins were expressed and purified following previously described procedures: GABPα³⁵⁻¹²¹ (chapter 2.2), the PNT domain of GABPα (residues 168-254) (Mackereth et al., 2004), GABPα¹⁻³²⁰ (residues 1-320), full-length GABPα (residues 1-454) and GABPβ (residues 1-382) (Chinenov et al., 2000), and a complex of the ETS domain (GABPα (residues 311-430)) and the ankyrin repeats (GABPβ2-1 (residues 1-157)) (Batchelor et al., 1998). All the purified proteins were dialyzed into 20 mM sodium phosphate, 50 mM NaCl, pH 7.5. The proteins were mixed in different combinations at a 1:1 molar ratio. Following 4 hours of incubation at room temperature, the protein samples were prepared in loading buffers with or without SDS and analyzed on SDS-PAGE and native (50 mM Trizma, pH 7.0) gels with 15 % acrylamide.

Chapter 3 - Identification and cloning of the new GABPα domain

Previous studies demonstrated that GABPα is modular, containing at least the ETS and PNT domains. Although the DNA-binding ETS domain has been studied extensively, the function of the PNT domain remains to be established. It may serve as a protein-protein interaction module, possibly for MAP-kinase docking to enhance the phosphorylation at the adjacent Thr280.
In the course of studying the structure and function of the PNT domain, Schaerpf and Mackereth noted that the long N-terminal sequence (residues 1-167) of GABPα, preceding its PNT domain, contains some structured elements (unpublished data). When the ¹H-¹⁵N HSQC spectra of the GABPα¹⁻³²⁰ and GABPα¹⁶⁸⁻²⁵⁶ constructs were compared with each other, Mackereth found that a subset of the signals corresponding to amides of residues 1-167 exhibited a well-resolved and dispersed pattern, indicative of a significantly structured region. In the current study, limited trypsin digestion and the partial assignment of the ¹H-¹⁵N HSQC spectrum of a new construct, GABPα¹⁻¹⁶⁹, helped determine that the N-terminal structured domain is composed of residues 35-121. This led to the cloning and expression of a new truncation species, GABPα³⁵⁻¹²¹, that proved to be readily amenable to structural analysis.

3.2 Expression of residues 1-169 from GABPα

Based on the proposal of a new structural domain within the N-terminal sequence of GABPα, the gene encoding residues 1-169 was sub-cloned into a pET28 vector for expression in E. coli. The resulting ¹⁵N-labeled GABPα¹⁻¹⁶⁹ was purified and characterized by NMR spectroscopy. A preliminary examination of the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹ (Fig. 3.1) provided three insights into the nature of the N-terminal domain. First, the spectrum contains at least 65-70 well-dispersed peaks (blue), corresponding to the amide resonances of the structured residues, along with many poorly resolved peaks (black) with random-coil chemical shifts. Hence the size of the structured region in GABPα¹⁻¹⁶⁹ is likely ~8 kDa. Second, when the ¹H-¹⁵N HSQC spectra of GABPα¹⁻¹⁶⁹, GABPα¹⁶⁸⁻²⁵⁴, and GABPα¹⁻³²⁰ are compared (Fig. 3.2), the chemical shifts of the dispersed peaks from the new domain in GABPα¹⁻¹⁶⁹ and from the PNT domain in GABPα¹⁶⁸⁻²⁵⁴ superimpose well with those in the larger fragment GABPα¹⁻³²⁰. This indicates that the new domain is structurally independent of the PNT domain. Last, to confirm that the dispersed peaks in the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹ correspond to the relatively rigid part of the protein, the backbone dynamics were measured by ¹⁵N-relaxation experiments. In particular, heteronuclear ¹H-{¹⁵N} NOE ratios, which are ~0.8 for rigid amides and lower for those showing mobility on the sub-nsec time scale, were determined. Indeed, as summarized in Fig. 3.3, amides with dispersed peaks in the spectrum of GABPα¹⁻¹⁶⁹ exhibited high ¹H-{¹⁵N} NOE values, whereas those with random coil shifts showed low NOE ratios. These three observations led to further investigations towards identifying the location of the structured elements in GABPα¹⁻¹⁶⁹, with the goal of removing the flexible residues and thereby obtaining a better resolved spectrum for further structural analysis.

[Figure 3.1: two-dimensional spectrum with per-residue peak labels; axes ¹H (ppm) and ¹⁵N (ppm).]
Figure 3.1. Partially assigned ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹. The dispersed peaks (blue) located outside of the random-coil chemical shift region (black) correspond to residues 35-121. Resonance pairs of Gln and Asn NH2 groups are connected with horizontal lines. The aliased peaks (A41, G85) are marked with an asterisk. Weak peaks not seen at this contour level are identified with "X".

3.3 Identification of the boundaries of the structured region in GABPα¹⁻¹⁶⁹

The boundaries of the structured region in GABPα¹⁻¹⁶⁹ were investigated by two methods, specifically limited trypsin proteolysis and the partial assignment of the ¹H-¹⁵N HSQC spectrum of this protein.

[Figure 3.2: overlaid spectra; axes ¹H (ppm) and ¹⁵N (ppm). Schematic of GABPα showing the N-terminal, PNT, and ETS regions, with residues 1, 35, 121, 169, 254, 320, 430, and 454 marked.]
Figure 3.2. Overlaid ¹H-¹⁵N HSQC spectra of GABPα¹⁻¹⁶⁹, GABPα¹⁶⁸⁻²⁵⁶, and GABPα¹⁻³²⁰. The overlay demonstrates that the sum of the spectra of GABPα¹⁻¹⁶⁹ (red) and GABPα¹⁶⁸⁻²⁵⁶ (green) is contained within the spectrum of GABPα¹⁻³²⁰ (blue), suggesting that the region of residues 1-169 is independent of the PNT domain, GABPα¹⁶⁸⁻²⁵⁶. The schematic diagram shows the location corresponding to each construct.

[Figure 3.3: bar plots of Δ(δCα − δCβ), Δ¹³Cα, Δ¹³Cβ, and ¹H-{¹⁵N} NOE versus residue number (1-169). The sequence axis begins MTKREAEELIEIEIKTTEKAECTEESIVEQTYTPAECVSQAIDINEPIGNLKKLLEPR...]
Figure 3.3. Identification of the structured region of GABPα¹⁻¹⁶⁹ by NMR. The structured region of GABPα¹⁻¹⁶⁹ was identified based on chemical shift and heteronuclear ¹H-{¹⁵N} NOE measurements. The deviations of the observed ¹³Cα and ¹³Cβ chemical shifts, and of their difference, from the corresponding random coil values indicate the secondary structure. Negative and positive values correspond to β-strand and helical conformations, respectively, for Δ(δCα − δCβ) and Δ¹³Cα, and the opposite holds for Δ¹³Cβ. The ¹H-{¹⁵N} NOE values for the assigned peaks were ~0.8, suggesting rigid elements, whereas the unassigned peaks gave large negative ¹H-{¹⁵N} NOE values, indicative of random coil behavior. However, because these peaks were not assigned, their actual values are not shown.

3.3.1 Limited trypsin digestion

Limited proteolysis often provides an avenue for distinguishing unstructured from structured regions within a protein. When GABPα¹⁻¹⁶⁹ was incubated with trypsin, two fragments were observed within 2 min. Within 10-30 min, the full-length protein was completely absent, leaving two predominant fragments with ESI-MS-determined masses of 3741 Da and 5426 Da (Fig. 3.4 (a)). According to the predicted trypsin digestion sites in GABPα¹⁻¹⁶⁹, the 3741 Da mass corresponds to residues 20-53 or 116-150, and the 5426 Da mass to residues 59-107. After 30 min of incubation, a new band started appearing around the 14.4 kDa protein marker. Unfortunately, the mass of this band was not verified by mass spectrometry, but it may arise from oxidative crosslinking of Cys residues in the cleaved GABP fragments. To help interpret these proteolysis data, the predicted trypsin cleavage sites are shown in a schematic of GABPα¹⁻¹⁶⁹ in Fig. 3.4 (b).
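The predicted cleavage sites referred to above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: it uses only the portion of the GABPα sequence (residues 1-58) that is printed along the sequence axis of Fig. 3.3, and it applies the commonly used simplified rule that trypsin cleaves after Lys or Arg unless the next residue is Pro. The thesis does not state which prediction rule was used, and the function names are hypothetical.

```python
# Residues 1-58 of GABPα as printed in Fig. 3.3; the rest of the 1-169
# construct is not reproduced in the text and is therefore not included.
GABPA_1_58 = "MTKREAEELIEIEIKTTEKAECTEESIVEQTYTPAECVSQAIDINEPIGNLKKLLEPR"

def tryptic_sites(seq):
    """Return 1-based residue numbers after which cleavage is predicted.

    Simplified rule (an assumption): cleave C-terminal to K or R unless
    the following residue is P. A K/R at the very end of the excerpt is
    reported too, because the real sequence continues beyond it.
    """
    sites = []
    for i, residue in enumerate(seq):
        following = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and following != "P":
            sites.append(i + 1)
    return sites

def missed_sites(sites, first, last):
    """Predicted sites lying inside the fragment first..last (uncleaved)."""
    return [s for s in sites if first <= s < last]

sites = tryptic_sites(GABPA_1_58)
print(sites)
print(missed_sites(sites, 20, 53))
```

Within this excerpt, a fragment spanning residues 20-53 requires cleavage after Lys19 and Lys53 while leaving Lys52 uncut, which is the additional potential cleavage site noted in the discussion of the 3741 Da fragment.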
Although the 3741 Da fragment could arise from residues 20-53 or 116-150, the former contains an additional potential cleavage site, whereas the latter is the longest sequence of GABPα¹⁻¹⁶⁹ expected upon complete digestion of the protein. Any smaller fragments would not be observed using this SDS-PAGE system. Thus, the fragment is likely due to cleavages after Lys115 and Lys150. In contrast, the 5426 Da fragment corresponds to residues 59-107, indicating that R79 and K87 are protected from trypsin digestion. Although not conclusive, these results suggest that the structural domain in GABPα¹⁻¹⁶⁹ falls near the middle of this sequence and not at its termini. Interestingly, when the subsequently determined secondary structure of this domain is aligned with the predicted cleavage sites in Fig. 3.4 (b), the protected R79 and K87 lie within a loop between two β-strands, whereas the cleaved R58 falls at the end of an α-helix. Thus the pattern of trypsin cleavage is not readily explained based on the structure of the new domain in GABPα.

[Figure 3.4: (a) SDS-PAGE gel with marker lane (M; 116.0, 66.2, 45.0, 35.0, 25.0, 18.4, and 14.4 kDa) and digestion time points of 2', 10', 30', 60', and 120'. Band annotations: MWobs = 19037, GABPα residues (1-169); MWobs not confirmed, unidentified band; MWobs = 5426, GABPα residues (59-107); MWobs = 3741, GABPα residues (20-53) or (116-150). (b) Schematic of residues 1-169 showing the GABPα³⁵⁻¹²¹ secondary structure elements (strands S and helix H) and the trypsin cleavage sites (Arg or Lys).]
Figure 3.4. Limited trypsin digestion of GABPα¹⁻¹⁶⁹. (a) The tryptic digest of GABPα¹⁻¹⁶⁹ was analyzed by SDS-PAGE using a 15% acrylamide gel. Three fragments were observed to be protected over the 2 hr digestion. The masses of two of the fragments were confirmed by mass spectrometry, as shown on the right side with the matched sequences. The mass of the band that appeared near the 14.4 kDa protein marker (M) was not confirmed. However, the band may arise from digestion of trypsin or possible oxidative crosslinking of the GABP fragments. (b) The expected trypsin proteolysis sites (arrows) are shown for GABPα¹⁻¹⁶⁹, along with the measured cleavage sites (dark arrows). Fragments smaller than ~3 kDa were not resolved on this gel. The secondary structure of GABPα³⁵⁻¹²¹ (see chapter 5) is shown above the corresponding region.

3.3.2 Assigning the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹

To determine the boundaries of the structured region of GABPα¹⁻¹⁶⁹, its main-chain ¹H, ¹⁵N, and ¹³C resonances were assigned. The dispersed peaks in a ¹H-¹⁵N HSQC spectrum (Fig. 3.1) correspond to the structured parts of a protein. Furthermore, ¹³Cα and ¹³Cβ chemical shifts provide secondary structural information about that protein. Most of the dispersed peaks in GABPα¹⁻¹⁶⁹ were assigned using a combination of HNCACB and CBCA(CO)NNH spectra (Sattler et al., 1999). These experiments provide through-bond scalar correlations between the ¹HN and ¹⁵N of residue i and the ¹³Cα and ¹³Cβ shifts of residues i and i-1 (HNCACB) or of residue i-1 only (CBCA(CO)NNH). The peaks in the random coil chemical shift region of the ¹H-¹⁵N HSQC spectrum were not fully assigned due to their significant overlap. As shown in Fig. 3.1 and 3.3, all of the assigned resonances in the ¹H-¹⁵N HSQC spectrum of GABPα¹⁻¹⁶⁹ correspond to residues 37-118. As discussed earlier in section 3.2, these residues exhibited high ¹H-{¹⁵N} NOE values (Fig. 3.3), further supporting their location within a structured region. The assigned ¹³Cα and ¹³Cβ resonances were also examined for their deviations from the expected random coil chemical shifts of each residue.
In addition, the differences between the observed and reference random coil ¹³Cα and ¹³Cβ chemical shifts (Δδ) were investigated as a sensitive predictor of secondary structure. As shown in the top two panels of Fig. 3.3, most of the ¹³Cα and ¹³Cβ resonances for residues 37-118 deviated from their random coil chemical shifts, providing further evidence for their participation in a structured region. In contrast, any of the residues outside of this region with assigned ¹HN and ¹⁵N signals showed essentially random coil ¹³Cα and ¹³Cβ shifts, indicative of a disordered conformation. Finally, the observation that many residues in the range of 37-118 had negative (δ¹³Cα(obs−ref) − δ¹³Cβ(obs−ref)) values suggested that the structured domain contains a significant amount of β-strand structure (see chapter 4 for a more detailed discussion of chemical shift data).

3.4 Sub-cloning and characterization of GABPα³⁵⁻¹²¹

Although the NMR analysis of GABPα¹⁻¹⁶⁹ suggested that residues 37-118 form the structured core, two and three additional residues were added to the N- and C-terminal boundaries, respectively, as a "margin of error", and a tryptophan was included for quantitative purposes. As a result, the gene encoding GABPα³⁵⁻¹²¹ was sub-cloned into the pET28 vector. The protein was bacterially over-expressed and analyzed by NMR spectroscopy. GABPα³⁵⁻¹²¹ yielded an excellent quality ¹H-¹⁵N HSQC spectrum, indicative of a well-folded protein (Fig. 3.5). Most of the overlapping peaks in the spectrum of GABPα¹⁻¹⁶⁹ were absent, while the remaining dispersed peaks superimposed closely with those observed with the larger construct (Fig. 3.2). In the heteronuclear ¹H-{¹⁵N} NOE experiment, the NOE values for the dispersed peaks remained as they were for GABPα¹⁻¹⁶⁹ (compare Fig. 3.3 and Fig. 5.6).
Both of these results indicate that the core structured region (residues 35-121) is independent of the flexible N- and C-terminal ends (residues 1-34 and 122-169) of GABPα¹⁻¹⁶⁹. Armed with the available main-chain NMR assignments of GABPα¹⁻¹⁶⁹, the full ¹H, ¹³C, and ¹⁵N resonances of GABPα³⁵⁻¹²¹ were assigned (Fig. 3.5 and Chapter 4.1.1). Further analysis of the ¹H-{¹⁵N} NOE spectra of this protein demonstrated that residues 35-37 and residues 119-122 at its N- and C-terminal ends, respectively, are flexible (Fig. 5.6), indicating that its boundaries were indeed well defined.

[Figure 3.5: two-dimensional spectrum with per-residue peak labels, including Gln and Asn side-chain (ε2, δ2) and Trp122 (ε1) signals; axes ¹H (ppm) and ¹⁵N (ppm).]
Figure 3.5. The fully assigned ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹. The aliased peaks are marked with an asterisk. The unassigned peaks, likely from twice-aliased arginine sidechain resonances, are marked with a triangle.

Chapter 4 - Strategies for assigning spectra and gathering spectral information for NMR-based structure calculations

4.1 Assigning 2D and 3D NMR spectra

4.1.1 Assignment of resonances from backbone nuclei

The ¹H-¹⁵N HSQC spectrum is a two-dimensional NMR spectrum showing the correlated resonances between directly bonded ¹H and ¹⁵N nuclei in molecules such as peptides and proteins.
Since the pattern of the correlated resonances is usually distinct for each protein, it is called the "NMR fingerprint". Typically, the spectrum contains at least the same number of peaks as there are non-proline residues. In addition, the ¹H-¹⁵N HSQC spectrum contains the correlated ¹H-¹⁵N signals from the side-chains of Trp, Gln, Asn, and Arg.

To assign the ¹H-¹⁵N HSQC spectrum of uniformly ¹³C/¹⁵N-labeled GABPα³⁵⁻¹²¹, two three-dimensional NMR spectra, HNCACB and CBCA(CO)NNH (see Fig. 4.1) (Sattler et al., 1999; Grzesiek et al., 1993), were employed. In both of these spectra, ¹³Cα and ¹³Cβ resonances are correlated to the amide ¹HN and ¹⁵N resonances observed in the ¹H-¹⁵N HSQC spectrum. The two experiments differ in that the CBCA(CO)NNH spectrum correlates the amide of residue i to the ¹³Cα and ¹³Cβ resonances of the preceding residue (i-1) only, whereas in the HNCACB spectrum the ¹³Cα and ¹³Cβ resonances of both residues i-1 and i can be observed. Further, ¹³Cα and ¹³Cβ signals are distinguished in the HNCACB experiment by their opposite signal phases (i.e. positive and negative). By combining these two experiments, along with knowledge of the expected ¹³C chemical shifts of each amino acid type and the protein sequence, the peaks in the ¹H-¹⁵N HSQC spectrum are sequentially assigned based on connecting ¹³Cα and ¹³Cβ resonances (Fig. 4.2).

[Figure 4.1: schematic dipeptide diagrams, one per experiment. Panels: HNCACB, CBCA(CO)NH, HNCO, TOCSY-HSQC, NOESY-HSQC, CBCACO(CA)HA, H(CCO)-TOCSY-NH, C(CO)-TOCSY-NH, HCCH-TOCSY, (Hβ)Cβ(CγCδ)Hδ, (Hβ)Cβ(CγCδCε)Hε, and HMQC-J (HNHA).]
Figure 4.1. Heteronuclear experiments used to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹. Simplified schematic diagrams are shown for the selected heteronuclear experiments utilized to assign GABPα¹⁻¹⁶⁹ and GABPα³⁵⁻¹²¹. Shaded atoms are detected in the experiments. A square denotes through-bond scalar connections and a circle denotes through-space NOE connections.

The chemical shift values for the carbonyl carbons in the protein backbone are readily obtained from an HNCO spectrum. In the HNCO experiment (see Fig. 4.1) (Ikura et al., 1990; Muhandiram and Kay, 1994), the carbonyl ¹³C resonance of residue i-1 in the backbone is correlated to the adjacent amide ¹HN and ¹⁵N resonances of residue i. Once the peaks for the backbone amides are known in the ¹H-¹⁵N HSQC spectrum, the corresponding ¹³C signals in the HNCO spectrum are immediately assigned.

4.1.2 Assignment of resonances from aliphatic sidechain nuclei

To obtain the chemical shift values of aliphatic sidechain nuclei, three three-dimensional spectra were assigned in concert: C(CO)-TOCSY-NH (Grzesiek et al., 1993; Logan et al., 1992), H(CCO)TOCSY-NH (Logan et al., 1992; Montelione et al., 1992) and HCCH-TOCSY (Bax et al., 1990; Kay et al., 1993) (see Fig. 4.1) (Sattler et al., 1999).
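Returning briefly to the backbone step of section 4.1.1, the sequential linking of amide peaks through matching ¹³Cα and ¹³Cβ shifts can be sketched in Python. This is an illustrative sketch only: the peak names, the chemical shift values, and the 0.2 ppm matching tolerance are hypothetical and are not taken from the spectra of GABPα³⁵⁻¹²¹, and the thesis assignments were made by inspection in Sparky rather than by such a script.

```python
# Hypothetical amide peaks. "own" holds the (Ca, Cb) shifts of residue i
# (seen in HNCACB); "prev" holds the (Ca, Cb) shifts of residue i-1
# (seen in both HNCACB and CBCA(CO)NNH). All values are invented.
spin_systems = {
    "peak_A": {"own": (58.1, 64.0), "prev": (55.2, 31.0)},
    "peak_B": {"own": (54.3, 43.5), "prev": (58.1, 64.0)},
    "peak_C": {"own": (57.0, 40.2), "prev": (54.3, 43.6)},
}

def shifts_match(a, b, tol=0.2):
    """True when both Ca and Cb shifts agree within tol (ppm)."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))

def link_spin_systems(systems, tol=0.2):
    """Order peaks so that each peak's 'prev' shifts match the 'own'
    shifts of the peak before it. Assumes one unambiguous linear chain."""
    successor = {}
    for name_i, peak_i in systems.items():
        for name_j, peak_j in systems.items():
            if name_i != name_j and shifts_match(peak_i["own"], peak_j["prev"], tol):
                successor[name_i] = name_j
    # The chain starts at the peak that is nobody's successor.
    start = [name for name in systems if name not in successor.values()][0]
    chain = [start]
    while chain[-1] in successor:
        chain.append(successor[chain[-1]])
    return chain

chain = link_spin_systems(spin_systems)
print(chain)
```

In practice, degenerate shifts produce several candidate links per peak, which is why the amino acid type information and the protein sequence mentioned in section 4.1.1 are needed to resolve ambiguities.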
The C(CO)-TOCSY-NH and H(CCO)TOCSY-NH spectra correlate all non-carbonyl ¹³C (i.e. ¹³Cα, ¹³Cβ, ¹³Cγ, etc.) and ¹H (i.e. ¹Hα, ¹Hβ, ¹Hγ, etc.) signals of residue i-1 to the amide ¹HN and ¹⁵N of residue i. Although most of the ¹³C signals in the C(CO)-TOCSY-NH spectrum can be assigned confidently based on the distinct chemical shifts known for each amino acid type, some of the proton peaks in the H(CCO)TOCSY-NH spectrum may be ambiguous. To reduce any ambiguities, an HCCH-TOCSY spectrum is also considered. This spectrum provides the correlation of all ¹H signals in an amino acid sidechain to each ¹H-¹³C pair in that sidechain.

[Figure 4.2: paired CBCA(CO)NNH and HNCACB strips for S80 (¹H 9.42 ppm, ¹⁵N 119.9 ppm), L81 (¹H 9.28 ppm, ¹⁵N 121.0 ppm), and F82 (¹H 8.44 ppm, ¹⁵N 121.7 ppm), with the Cα and Cβ peaks of R79, S80, L81, and F82 labelled; vertical axis ¹³C (ppm), horizontal axis ¹H (ppm).]
Figure 4.2. Strategies for assigning protein backbone resonances. The ¹³Cα and ¹³Cβ connectivities in the HNCACB and CBCA(CO)NNH spectra of GABPα³⁵⁻¹²¹ are shown. The spectra are sequentially assigned based on the connectivities of the ¹HN, ¹⁵N, ¹³Cα, and ¹³Cβ resonances. The ¹³Cα and ¹³Cβ resonances of residues i and i-1 are detected in the HNCACB spectrum, whereas only those of residue i-1 are detected in the CBCA(CO)NNH spectrum. In the HNCACB spectrum, the signs of the ¹³Cα and ¹³Cβ peaks are positive (black) and negative (red), respectively.

4.1.2.1 Stereospecific assignments

The side-chain chemical shifts obtained from the previous assignment do not distinguish signals from prochiral nuclei such as the ¹Hβ2 and ¹Hβ3 on methylenes and the methyls of Val and Leu.
Therefore the following experiments have been performed to obtain the stereospecific assignments of the nuclei at prochiral centers (Sattler et ai, 1999). (^ -methylenes in R, D, N, C, E, H , L, K, M , Q, F, S, Y, and W side-chains The resonances from the 'Ff/32 and 'FT33 of these residues were stereospecifically assigned using HNHB (Archer et al, 1991) and short mixing time (30ms) 1 5N-TOCSY-HSQC spectra (Marion et al, 1989a). This approach relies on the fact that 3 JNH-H0 and 3 J H a - H 3 couplings are largest for 1 5 N - H P or H'VFi'3 nuclei in a trans dihedral conformation. This also yields the %\ dihedral angle of the sidechain about the C a - C p bond according to a staggered rotamer model. Stereospecific assignments of leucine and valine methyls The assignment of methyl resonances is one of the most crucial parts for the structure determination of protein due to their importance in forming the protein core. Therefore incomplete or incorrect methyl assignments can easily lead to reduced accuracy of an NMR-1 13 derived structure. Most methyl resonances are readily observed due to their upfield H and C resonances and can be assigned ambiguously from C(CO)-TOCSY-NH and H(CCO)TOCSY-NH spectra. However, additional experiments are necessary for obtaining the stereospecific assignment of the methyls of valine and leucine. These experiments also confirm the methyl group assignments of He, Thr, Ala, and Met. The signals from the diastereotopic methyls of Val and Leu were assigned following the approach of Neri et al. (Neri et al, 1989). A ] H- 1 3 C HSQC spectrum is recorded for a sample that is 10% nonrandomly and fractionally 13C-enriched. Due to the metabolic pathways for the 38 Chapter 4. Strategies for assigning NMR spectra biosynthesis of leucine and valine, signals from the pro-R methyls (Leu-5l and Val-yl) appear as doublets in the 1 3 C dimension, whereas signals from the pro-S methyls (Leu-82 and Val-y2) 1 13 are singlets. 
In parallel, a constant time ¹H-¹³C HSQC spectrum is recorded. Again, due to the patterns of ¹³C labelling, this discriminates the signals corresponding to Val-γ1, Leu-δ1 and Ile-γ2 from those corresponding to Val-γ2 and Leu-δ2. The assignments of the Thr-γ2 and Ile-δ1 methyls were confirmed based on their disappearance in the constant time ¹H-¹³C HSQC spectrum compared to the normal ¹H-¹³C HSQC spectrum.

Assignment of Asn and Gln sidechain ¹⁵NH2 signals

The sidechain amide resonances of Gln and Asn generally appear in the up-field region of a ¹H-¹⁵N HSQC spectrum. Each sidechain yields a pair of peaks due to two geminal amide protons coupled to a common ¹⁵N nucleus. These signals are assigned to specific amino acids based on correlations to the sidechain ¹³Cα/β (Asn) or ¹³Cβ/γ (Gln) resonances in an HNCACB spectrum. Using the EZ-HMQC-NH2 spectrum, as described by McIntosh et al. (McIntosh et al., 1997), the side-chain amides of Gln and Asn are stereospecifically assigned. In this experiment, the peaks from the Z (¹Hδ22 or ¹Hε22) protons appear with significantly greater intensities than those from the corresponding E (¹Hδ21 or ¹Hε21) protons (Fig. 4.3). In a random coil polypeptide, most E protons resonate down-field relative to their corresponding Z protons. However, for Gln84, this pattern is reversed (Fig. 4.3), suggesting that the side-chain amide of Gln84 might play a distinct role in the structure of GABPα³⁵⁻¹²¹. Indeed, based on the structure shown in Chapter 5, the Nε2 of Gln84 forms a hydrogen bond to the amide of Asp76. The ¹⁵Nδ2 of Asn45 is unusually downfield shifted, which is likely due to a hydrogen bond to Gly90.
However, there is no obvious structural explanation for the unusual down-field shifts of the ¹⁵N and ¹H signals of Gln40 and Asn50.

Figure 4.3. Stereospecific assignments of the Gln and Asn NH2 resonances of GABPα³⁵⁻¹²¹
The peaks corresponding to the sidechain NH2 groups are shown in red in the (a) ¹H-¹⁵N HSQC spectrum and the (b) EZ-HMQC-NH2 spectrum (axes: ¹H (ppm) versus ¹⁵N; peaks are labelled for N45, N50, N105, N109, Q40, Q60, Q71, Q74, Q84, Q93, Q97 and Q102). In the EZ-HMQC-NH2 spectrum (b), the Z protons (Hδ22 or Hε22) appear with greater intensities than the E protons (Hδ21 or Hε21). In GABPα³⁵⁻¹²¹, most of the peaks corresponding to the Z proton in a sidechain are located up-field of those for the corresponding E proton. However, for Q84, the reversed pattern is observed.

4.1.3 Assigning resonances from aromatic residues

In many cases, aromatic rings are important in the formation of a protein's structure through favourable interactions with other core residues. To assign the side chain resonances of Tyr, Phe, His and Trp, the (Hβ)Cβ(CγCδ)Hδ and (Hβ)Cβ(CγCδCε)Hε experiments (Yamazaki et al., 1993) (Fig.
4.1 and 4.4) were used to correlate the ¹Hδ and ¹Hε of the aromatic ring spin system directly to the previously identified ¹³Cβ of these amino acids. The remaining ¹H and ¹³C signals from the aromatic rings were identified in a constant time ¹H-¹³C HSQC spectrum. The distinct chemical shifts of each ring type facilitated this process. The assignments were confirmed by using a ¹³C-edited NOESY spectrum to identify intraresidue ¹H-¹H NOE interactions. The ¹⁵Nε1H of the indole ring was readily identified in the ¹H-¹⁵N HSQC spectrum of GABPα³⁵⁻¹²¹ by its downfield chemical shifts. This was also confirmed using a ¹⁵N-edited NOESY spectrum to provide NOE correlations to the adjacent ¹Hδ1 and ¹Hζ2 protons.

A histidine imidazole ring can be protonated or in one of two possible deprotonated tautomeric states, depending on the pH of the environment and its location in the protein. Using a long range HMBC experiment (Bax and Marion, 1988; Pelton et al., 1993), it is clear that His66 is in the neutral Nε2H tautomeric state based on the large difference between the chemical shifts of its two imidazole nitrogens (160 ppm vs 250 ppm) and the pattern of scalar couplings with the carbon-bonded ring protons (Fig. 4.5).

4.2 Secondary structure

[Figure 4.4 panels: (a) (Hβ)Cβ(CγCδ)Hδ, (b) (Hβ)Cβ(CγCδCε)Hε and (c) aromatic ¹H-¹³C HSQC spectra, with peaks labelled for H66, F82, Y101 and W122; horizontal axis ¹H (ppm).]
Figure 4.4.
Assignment of resonances from aromatic sidechains in GABPα³⁵⁻¹²¹
The resonances of the ¹Hδ and ¹Hε nuclei in the aromatic residues (H66, F82, Y101, W122) are assigned from the (Hβ)Cβ(CγCδ)Hδ (a) and (Hβ)Cβ(CγCδCε)Hε (b) spectra, respectively, based on correlation to their ¹³Cβ resonances. The remaining aromatic side-chain ¹H and ¹³C signals are assigned using a constant time ¹H-¹³C HSQC (c). Due to rapid ring-flipping, the Hδ's and Hε's of F82 and Y101 are degenerate and identified with an asterisk. The negative peaks (red) arise from ring carbons of Trp and His in the constant time experiment. The four strong unassigned peaks in the upfield portion of (a) and (b) and the two negative peaks in (c) are likely from the contaminating His6-tag.

Figure 4.5. His66 adopts a neutral Nε2H tautomeric form
This conclusion is based on the large difference in the ¹⁵N shifts of its two ¹⁵N nuclei and the coupling patterns to the adjacent carbon-bonded protons (axes: ¹H (ppm) versus ¹⁵N). Note that a proton directly bonded to ¹⁵Nε2 is not observed due to rapid exchange with water. Additional peaks in the spectrum arise from the contaminating His6-tag.

The secondary structural elements of GABPα³⁵⁻¹²¹ were predicted based on measured ³JHN-Hα coupling constants combined with the chemical shift values of backbone nuclei. These coupling constants are dependent on the Φ dihedral angles, whereas chemical shifts are reflective of both the Φ and Ψ dihedral angles. First, ³JHN-Hα coupling constants were determined from the ratio of HN vs. Hα peak intensities in an HNHA experiment (Kuboniwa et al., 1994). As depicted in Fig.
4.6, most residues in GABPα³⁵⁻¹²¹ show coupling constants > 8 Hz, indicative of an extended conformation such as that in a β-strand. However, some regions (residues 48-56, 80-83) have helical or turn conformations as suggested by coupling constants < 6 Hz.

Second, the secondary structural components of this protein were predicted based on the chemical shifts of the backbone nuclei (¹HN, ¹⁵N, ¹³C', ¹³Cα, ¹Hα, ¹³Cβ). In one approach, the observed ¹³Cα-¹³Cβ chemical shift differences were compared to those for a random coil polypeptide (i.e. Δ(¹³Cα-¹³Cβ) = (¹³Cα-¹³Cβ)obs - (¹³Cα-¹³Cβ)ref). Negative and positive values correspond to β-strands and α-helices, respectively. In the chemical shift index (CSI) approach (Wishart et al., 1992), mainchain nuclei are assigned values of +1, 0, or -1, depending upon their chemical shifts relative to those of the corresponding amino acid in a random coil polypeptide. A consensus of each comparison is made, along with structure-based smoothing rules, to yield a final CSI of -1 for a residue in an α-helix and +1 for those in a β-strand. Finally, in the TALOS approach (Cornilescu et al., 1999), (Φ, Ψ) angles for each residue are determined by a comparison of mainchain chemical shifts with those in a database of known protein structures. To facilitate comparison to the CSI method, a value of +1 was assigned to residues with (Φ, Ψ) angles indicative of a β-strand conformation and -1 to those with a helical conformation.

Using these approaches, it is clear that GABPα³⁵⁻¹²¹ is composed of three large and at least one small β-strand, as well as one helix. Overall, this agrees well with the final tertiary

Figure 4.6. Secondary structure prediction of GABPα³⁵⁻¹²¹
The secondary structure of GABPα³⁵⁻¹²¹ was calculated by four different methods.
For the TALOS and CSI approaches, the secondary structure is based on the chemical shifts of the backbone nuclei (¹HN, ¹⁵N, ¹³Cα, ¹³Cβ, ¹³C') versus those in a database of known protein structures or for a random coil polypeptide, respectively. For these two methods, values of +1, 0, and -1 correspond to β-strand, random coil, and helix conformations, respectively. Similarly, the differences of the ¹³Cα-¹³Cβ shifts are compared to those of a random coil polypeptide, given as Δ(δCα-δCβ)obs-ref. Negative values correspond to a β-strand conformation, whereas positive values correspond to a helical conformation. Finally, residues with helical conformations have ³JHN-Hα coupling constants lower than 6 Hz (dashed line). The predicted secondary structure agrees well with that found in the final tertiary structure ensemble of GABPα³⁵⁻¹²¹, as shown in the schematic diagram of the five β-strands and one helix at the top. All panels are plotted against residue number.

structure of GABPα³⁵⁻¹²¹ determined in Chapter 5 using all available NMR data. Several sequence-based database searches, such as PROF (Ouali and King, 2000), Jpred (Cuff et al., 1998), and PSIpred (Jones, 1999), also predicted that GABPα³⁵⁻¹²¹ is mainly composed of β-strands.

Chapter 5 - The solution structure of GABPα³⁵⁻¹²¹

Determination of the tertiary structure of "unknown" proteins or protein domains is one method to unveil their function.
This approach, known as reverse genetics, is based on comparative studies with the structures of proteins of known function. In an attempt to discover the biological role of the N-terminal region of GABPα, the solution structure of GABPα³⁵⁻¹²¹ was determined using NMR spectroscopy. GABPα³⁵⁻¹²¹ adopts a novel fold with five β-strands and a distorted helix. Despite its unique topology, structural database searches suggested that the fold of this domain is related to that of ubiquitin or ubiquitin-like proteins. In addition, GABPα³⁵⁻¹²¹ exhibits several interesting structural features, such as a putative sumoylation site and a patch of acidic residues on its surface. Although the structure determination of GABPα³⁵⁻¹²¹ has provided several important insights into the properties of this protein domain, its function in the native transcription factor is still unclear.

5.1 Obtaining dihedral angle information (Φ, Ψ, χ₁) from NMR spectra

5.1.1 Φ and Ψ angles

The Φ and Ψ angles were based on an analysis of mainchain ¹H, ¹³C, and ¹⁵N chemical shifts using the program TALOS (Cornilescu et al., 1999). The predicted dihedral angles were only used as dihedral restraints when they agreed with the values expected from the measured ³JHN-Hα coupling constants according to a parameterized Karplus equation.

5.1.2 χ₁ angles of residues with Hβ protons

The χ₁ angles for residues with prochiral Hβ protons were determined in concert with the stereospecific assignment(s) of the proton resonances. The angles were restrained to ±60° or 180° according to a staggered rotamer model.

5.1.3 χ₁ angles of Val, Thr, and Ile

The χ₁ restraints for Val, Thr, and Ile were determined from measured ³JNCγ and ³JC'Cγ coupling constants using N-Cγ and C'-Cγ spin echo experiments (Grzesiek et al., 1993; Vuister et al., 1993). The angles were set to ±60° or 180° based on a staggered rotamer model.
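The Karplus consistency check mentioned in Section 5.1.1 can be illustrated with a minimal sketch. It assumes the usual form ³JHN-Hα = A cos²(Φ - 60°) + B cos(Φ - 60°) + C and one commonly used set of coefficients (A = 6.51, B = -1.76, C = 1.60 Hz); the thesis does not state which parameterization was applied, so these coefficients, the function names, and the example angles are assumptions. The 6 Hz and 8 Hz thresholds are the ones quoted in Chapter 4.

```python
import math

# Assumed Karplus coefficients in Hz (not stated in the thesis).
A, B, C = 6.51, -1.76, 1.60


def karplus_3j_hn_ha(phi_deg):
    """Predict the 3J(HN-Ha) coupling (Hz) for a backbone phi angle in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def classify_coupling(j_hz):
    """Apply the thresholds used in the text: < 6 Hz helical/turn, > 8 Hz extended."""
    if j_hz < 6.0:
        return "helix/turn"
    if j_hz > 8.0:
        return "extended"
    return "intermediate"


# Hypothetical phi angles for a helical and an extended residue.
for phi in (-57.0, -120.0):
    j = karplus_3j_hn_ha(phi)
    print(f"phi = {phi:7.1f} deg  3J = {j:5.2f} Hz  {classify_coupling(j)}")
```

A TALOS-derived Φ value would be kept as a restraint only if the coupling predicted this way falls in the same class as the measured one.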
χ₁ restraints were not used for residues showing ³JNCγ and ³JC'Cγ couplings indicative of rotamer averaging.

5.2 Residual dipolar coupling constants

Residual dipolar coupling (RDC) constants are a powerful parameter for solving protein structures, yielding valuable information about the orientational relationships between selected bond vectors in the protein. Unlike NOE restraints, RDCs are entirely independent of the distances among different bond vectors and thus provide more long-range or global structural restraints. For GABPα³⁵⁻¹²¹, ¹³Cα-¹³C' and ¹HN-¹⁵N RDCs were measured using protein weakly aligned in a stretched polyacrylamide gel (Ishii et al., 2001). The SANI restraints in ARIA were used to convert these couplings into structural restraints. For this, the values of the alignment tensor (R = 0.3, Da = 5.3 Hz) were determined by a combination of a histogram method and a grid search to find a global energy minimum by ARIA/CNS (Clore et al., 1998a).

5.3 Structure calculation

Calculation of the tertiary structure of GABPα³⁵⁻¹²¹ mainly relied on assigning NOE peaks to obtain through-space distance restraints between protons. These data were obtained from several NOESY spectra, including 3D ¹⁵N-HSQC-NOESY (Marion et al., 1989b), simultaneous ¹³C- and ¹⁵N-NOESY-HSQC (Pascal et al., 1994), aromatic ¹³C-NOESY (Slupsky et al., 1998), and constant time methyl-methyl and amide-methyl NOESY spectra (Zwahlen et al., 1998; Zwahlen et al., 1997). The ¹H, ¹³C, and ¹⁵N spectral assignments of GABPα³⁵⁻¹²¹ from the previous chapter were required for this process. The structure of GABPα³⁵⁻¹²¹ was calculated by ARIA (v1.2) (Linge et al., 2001), which automatically and iteratively interprets and assigns NOE restraints.
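To make the role of the alignment tensor values in Section 5.2 concrete, the sketch below evaluates the standard expression D(θ, φ) = Da[(3cos²θ - 1) + (3/2) R sin²θ cos 2φ] for a bond vector with polar angles (θ, φ) in the alignment frame. Da = 5.3 Hz and R = 0.3 are taken from the text; the function name and the example orientations are hypothetical, and no scaling between different bond types is applied. It illustrates the relationship only and is not the ARIA/CNS SANI implementation.

```python
import math


def rdc_hz(theta_deg, phi_deg, da_hz=5.3, rhombicity=0.3):
    """Residual dipolar coupling (Hz) for a bond vector at polar angles
    (theta, phi), in degrees, within the alignment tensor frame."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return da_hz * (axial + rhombic)


# Hypothetical orientations along the three principal axes of the tensor.
for name, theta, phi in (("z", 0.0, 0.0), ("x", 90.0, 0.0), ("y", 90.0, 90.0)):
    print(f"bond along {name}: D = {rdc_hz(theta, phi):6.2f} Hz")
```

Because the coupling depends only on orientation relative to a common frame, two bond vectors far apart in the structure are restrained relative to each other, which is the long-range character noted above.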
Dihedral angle restraints (Φ, Ψ, χ₁) and residual dipolar coupling restraints (¹⁵N-¹HN, ¹³C'-¹³Cα) were also used in the structure calculations. It is important to note that no hydrogen bond restraints were included. In this protocol, ARIA iteratively generates nine complete sets of GABPα³⁵⁻¹²¹ structures, obtaining the lowest energy structures in the final iteration (Fig. 5.1). The ten lowest energy structures in this set are then used to generate a final ensemble of ten water-refined structures of GABPα³⁵⁻¹²¹ (details of the structural statistics are reported in Table 5.1).

5.4 Structure overview

GABPα³⁵⁻¹²¹ is comprised of a five-stranded β-sheet (S1: 36-43, S2: 67-70, S3: 73-74, S4: 91-99, S5: 107-115), connected in both antiparallel and parallel manners, and a distorted helix (48-59) lying across this sheet (Fig. 5.2 (a)). The secondary structure of GABPα³⁵⁻¹²¹ was defined based on the consensus results for the ten water-refined structures using the programs Promotif (Hutchinson and Thornton, 1996) and Vadar (Willard et al., 2003). The superimposition of the structural ensemble of GABPα³⁵⁻¹²¹ shows good agreement with each

Figure 5.1. Iterative assignment of ambiguous restraints by ARIA
The iterative ARIA protocol assigns NOEs and calculates the lowest-energy structures in combination with several structural restraints, including dihedral angles (Φ, Ψ, χ₁) and residual dipolar couplings (¹⁵N-¹HN, ¹³C'-¹³Cα).

Table 5.1. NMR restraints and statistics for the ensemble of ten structures calculated for GABPα³⁵⁻¹²¹.

A. Summary of restraints
NOEs
  Unambiguous    2745
  Ambiguous    923
  Total    3668
Dihedral angles (Φ, Ψ, χ₁)    51, 51, 34
Residual dipolar couplings (¹H-¹⁵N)    52

B. Deviation from restraints
NOE restraints (Å)    0.027±0.001
Dihedral angle restraints (deg.)
0.85±0.06
Residual dipolar coupling restraints (Hz)    1.81±0.02
Residues in allowed region of Ramachandran plot (%)    98.1

Deviation from idealized geometry
Bond lengths (Å)    0.005±0.000
Bond angles (deg.)    0.73±0.01
Improper angles (deg.)    2.29±0.08

Mean energies (a), kcal·mol⁻¹
Evdw    167±10
Ebonds    35±2
Eangles    488±6
Eimpr    133±7
ENOE    126±9
Ecdih    7±1
Esani    170±5

rmsd from average structure (Å)    Backbone    Heavy atoms
Structured residues (b)    0.15±0.02    0.41±0.02
All residues    0.59±0.09    0.97±0.06

(a) Final ARIA/CNS energies for van der Waals (vdw), bonds, angles, NOE restraints, dihedral restraints (cdih), and residual dipolar coupling restraints (SANI).
(b) Structured residues: 36-43, 48-59, 67-70, 73-74, 91-99, 107-115.

Figure 5.2. Solution structure of GABPα³⁵⁻¹²¹
(a) Ribbon diagram of the lowest energy structure of GABPα³⁵⁻¹²¹. The protein is comprised of a five-stranded β-sheet (orange; S1: 36-43, S2: 67-70, S3: 73-74, S4: 91-99, S5: 107-115), connected in both antiparallel and parallel manners, and a distorted helix (48-59; red). The two prominent S3/S4 (75-90) and S4/S5 (100-106) loops are shown in cyan. (b) Superimposition of the Cα traces of the ten water-refined structures, showing low rms deviation in the structured region. The N- and C-terminal tails and the S3/S4 loop exhibit more disorder. (c) The superimposed methyl-containing residues (Ala, Ile, Leu, Val, Met, Thr) of the ten water-refined structures are shown in the same view as (b). The methyls are well defined in the middle of the protein, whereas those on the surface of the protein are disordered (colors correspond to the secondary structures as in (a)).

other (Fig. 5.2 (b)), especially in the regions of regular secondary structure.
This is reflected by the low root-mean-square deviations of 0.15 Å and 0.41 Å for all backbone and heavy atoms, respectively, in the structured region. Additionally, excellent agreement was found between the measured RDCs and those predicted from the final structure of GABPα³⁵⁻¹²¹ (Fig. 5.3). Several interesting aspects of the structure of GABPα³⁵⁻¹²¹ are discussed as follows.

5.4.1 β-strand structure

The β-sheet of GABPα³⁵⁻¹²¹ contains a distinct bulge in the middle of S5. As shown in Fig. 5.4, the discontinuous linkage of S4 and S5 at the bulge results in an altered hydrogen bond pattern between Ser95 in S4 and Leu111 in S5. That is, the NH of Ser95 in S4 is hydrogen-bonded to the CO of Glu112 in S5 instead of the CO of Leu111, as would be expected without a bulge (Fig. 5.4, top). The N-terminal half (91-94) of S4 interacts with the C-terminal half (112-115) of the "opposite side" of S5 due to the resulting 180° rotation of the C-terminal end of S5 after the bulge. The change in the register of the hydrogen bond pattern between S4 and S5 at the bulge is confirmed by a manual inspection of the mainchain NOE patterns in this region, as shown in Fig. 5.4 (bottom).

5.4.2 Loop structure

Two distinct loops, between S3 and S4 (75-90) and between S4 and S5 (100-106), highlighted in cyan (Fig. 5.2 (a)), protrude from the surface of the protein. Consistent with this exposure, these loop regions showed increased rms deviations in the GABPα³⁵⁻¹²¹ structural ensemble of the ten lowest energy structures.

5.4.3 Distorted helix

Figure 5.3. Residual dipolar coupling constants (calculated vs. experimental)
[Horizontal axis: experimental dipolar coupling constants (Hz).] The 52 experimental ¹H-¹⁵N residual dipolar coupling constants were measured and compared to those calculated from the final structure of GABPα³⁵⁻¹²¹.
The excellent agreement between these values, with low R-factor (0.182) and Q-factor (0.176) values calculated using the "RDCnrs" program (Skrynnikov et al., 2000), is evidence of a high-quality structure.

Figure 5.4. Identification of a bulge between β-strands S4 and S5 in GABPα³⁵⁻¹²¹
[Top panel: hydrogen bonding patterns between S4 and S5. Bottom panel: manually assigned NOE patterns between S4 and S5.] The hydrogen bond pattern (top panel) between S4 and S5 observed in the solution structure of GABPα³⁵⁻¹²¹ is confirmed by manual examination of the mainchain NOE patterns (bottom panel) in this region. S5 contains a bulge at L111, causing an altered hydrogen bond pattern to residues in S4. Note that hydrogen bond restraints were not used in the structure calculations.

An examination of the helix in GABPα³⁵⁻¹²¹ revealed two interesting features. First, the N-terminal portion of the helix adopts a 3₁₀ helical conformation, as shown clearly in the ribbon diagram of Fig. 5.2. This helical conformation was confirmed by examination of the hydrogen bond and NOE patterns. Specifically, hydrogen bonds were observed between P47/N50 and I48/L51, corresponding to the hydrogen bond pattern of a 3₁₀ helix (HN(i), CO(i-3)). Furthermore, Hα(i)-HN(i+4) NOE interactions, diagnostic of a regular α-helix, appear only after N50 (Fig. 5.5).

Second, it is interesting that a proline residue, P57, exists in the middle of this helix, causing a kink along its axis (Fig. 5.2 (a) and Fig. 5.5). A proline residue is generally known as a helix breaker as it is not able to form a hydrogen bond to the corresponding i-4 residue (Lys53). As seen in Fig. 5.5, due to the presence of this proline, the residues preceding (Glu56) and following (Arg58) the proline also cannot hydrogen bond to Lys52 and Leu54, respectively.
However, the regular geometry of the helix is regained after Arg58 as indicated, with Arg58/Leu55 and Leu59/Glu56 forming hydrogen bonds. Complementing these hydrogen bonding patterns, the conclusion that the helix spans residues 48-59 is based on several additional criteria: (1) (Φ, Ψ) angles predicted by TALOS (Cornilescu et al., 1999) define this region as a continuous helix; (2) patterns of (¹³Cα-¹³Cβ) shifts relative to a random coil are indicative of a helix; (3) NOE patterns between sequential HN and Hα demonstrate the helical nature of the residues. However, a discontinuity near Pro57 is observed in the mainchain ³JHN-Hα couplings (Fig. 5.5).

5.5 Dynamics from amide ¹⁵N relaxation

Complementing the structural analysis of GABPα³⁵⁻¹²¹, the global and internal backbone dynamics of the protein were studied by ¹⁵N heteronuclear relaxation (¹⁵N-T1, ¹⁵N-T2, ¹H-{¹⁵N} NOE) experiments. Of the 87 amides in the protein, 11 were not included as part

Figure 5.5. Interactions within the proline-containing α-helix in GABPα³⁵⁻¹²¹
The distorted helix, due to the existence of Pro57, in GABPα³⁵⁻¹²¹ is identified with several secondary structure determination methods: TALOS, CSI, ³JHN-Hα, Δ(δCα-δCβ)obs-ref and NOE-derived parameters. The helix is illustrated with hydrogen bonds (dashed line, < 3.2 Å) in Rasmol. The 3₁₀ helical conformation and a kink were observed as noted. In TALOS, helical residues (H) are predicted based on estimated Φ and Ψ angles within the ranges of -57 ± 20° and -47 ± 20°, respectively, in nine out of ten database matches. For the CSI analysis, ¹Hα, ¹³Cα and ¹³Cβ chemical shifts are used to predict β-strand, random coil and α-helix conformations, as +1, 0 and -1, respectively.
Coupling constants are grouped as 6 (³JHN-Hα < 6 Hz), 7 (6 Hz < ³JHN-Hα < 7 Hz), and 8 (³JHN-Hα > 8 Hz), with the small coupling constants indicative of a helical conformation. Δ(δCα-δCβ)obs-ref is > 0 for a helix. The NOE patterns from the Hα of residue i to the HN of residues i+2, i+3 and i+4 are also shown. Note that dαN(i,i+3) and dαN(i,i+4) correspond to 3₁₀ and α-helical conformations, respectively.

of the analysis due to the absence of ¹HN resonances (N-terminus or prolines) or significant overlap (L59, S95, Q97, E105, I113, E118) of peaks in the ¹H-¹⁵N HSQC spectrum. Using Tensor2 (Dosset et al., 2000), the isotropic global tumbling time, τc, of the protein was calculated as 4.82 ± 0.04 ns at 30 °C from the T1 and T2 relaxation values of amides in the well-ordered region of the protein. This global correlation time confirms that the 10.2 kDa GABPα³⁵⁻¹²¹ is a monomer in solution (Daragan and Mayo, 1997).

The internal dynamics of the protein were then calculated by the anisotropic Lipari-Szabo model-free approach (Lipari and Szabo, 1982; Kay et al., 1989; Mandel et al., 1995). In this approach, the internal mobility of individual backbone amides on a psec-nsec timescale is cast as a general order parameter, S², along with a characteristic correlation time of the local internal motion, τi. In addition to these two parameters, a chemical shift exchange contribution term, Rex, was introduced for some sites where the R2 (1/T2) results cannot be fit by the simple Lipari-Szabo approach because of the fast (μsec-msec) interconversion of the chemical environment of a residue.

The relaxation properties of GABPα³⁵⁻¹²¹ (Figure 5.6) show that the structural core of the protein, residues 36-115, is relatively rigid with generally high S² values.
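The two model-free parameters described above enter the relaxation rates through a spectral density function. The sketch below evaluates the simple Lipari-Szabo form, J(ω) = (2/5)[S²τc/(1 + ω²τc²) + (1 - S²)τ/(1 + ω²τ²)] with 1/τ = 1/τc + 1/τi, assuming isotropic tumbling. Only τc = 4.82 ns comes from the text; the S² values, the internal correlation time, and the ¹⁵N Larmor frequency are hypothetical, and this is not the fitting procedure implemented in Tensor2.

```python
import math

TAU_C = 4.82e-9                    # global tumbling time from the text (s)
OMEGA_N = 2.0 * math.pi * 50.7e6   # hypothetical 15N Larmor frequency (rad/s)


def spectral_density(omega, s2, tau_c=TAU_C, tau_i=50e-12):
    """Simple Lipari-Szabo spectral density J(omega), in seconds.

    s2 is the generalized order parameter and tau_i the (assumed)
    correlation time of the fast internal motion.
    """
    tau = 1.0 / (1.0 / tau_c + 1.0 / tau_i)
    global_term = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    internal_term = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (global_term + internal_term)


# Hypothetical order parameters for a rigid core residue and a loop residue.
for label, s2 in (("core", 0.90), ("loop", 0.60)):
    print(f"{label}: J(0) = {spectral_density(0.0, s2):.3e} s, "
          f"J(wN) = {spectral_density(OMEGA_N, s2):.3e} s")
```

A lower S² shifts weight from the slow global term to the fast internal term, which lowers J(0) and is why flexible residues show longer T2 values and reduced heteronuclear NOEs.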
This rigidity provides additional evidence for a well-folded core. The bulge in the middle of S5, shown in Fig. 5.4, is also rigid, as seen by S95 and L111 having high heteronuclear ¹⁵N-NOE and S² values. Similarly, although the hydrogen-bond pattern for the helix is disrupted at Pro57 (Fig. 5.5), the relaxation analysis demonstrated no changes in dynamics through the helix on this fast timescale.

Figure 5.6. ¹⁵N-relaxation analysis of GABPα³⁵⁻¹²¹
The order parameter, S², and the chemical shift exchange contribution term, Rex, for the analysis of the internal mobility of the protein were determined from the ¹⁵N T1 and T2 lifetimes and steady-state ¹H-{¹⁵N} NOE values, plotted against residue number. The regular structured region is well-defined with low flexibility (high NOE and S² values), whereas local motion is observed in the S3/S4 and S4/S5 loops. Prolines and residues with poorly resolved amide chemical shifts are shown with a star and a triangle, respectively.

However, several local differences in the dynamics are observed. First, besides the N- and C-terminal tails, there are three relatively flexible regions in the protein, namely the N-terminal region of the helix/S2 loop (residues 61-63), the C-terminal region of the S3/S4 loop (residues 87-91), and the S4/S5 loop (residues 101-104). Among these, the S4/S5 loop demonstrates the most striking flexibility on the psec-nsec timescale, with low heteronuclear NOE and S² values. In particular, Y101 exhibits the least rigidity in this loop and high surface accessibility as measured by Vadar (Willard et al., 2003). The mobility of the H/S2 loop is not as distinct as that of the other two flexible regions in the protein core.
However, the high Rex terms of S62 and L63 still suggest that this loop undergoes intermediate time scale (msec-μsec) motions. The flexibility of the C-terminal end of the S3/S4 loop is interesting for two reasons. First, it is part of the longest loop (75-90) in GABPα³⁵⁻¹²¹. However, only the C-terminal end of the loop exhibits enhanced flexibility compared to the rest of the protein. Second, and most interestingly, a putative sumoylation site, K87 (Fig. 5.7 (c)), is located in this flexible region, suggesting accessibility for the sumoylation enzymes.

5.6 Surface features

The surface properties of a protein dictate its function, and inspection of these properties provides clues to its possible biological roles. As expected, most of the non-polar residues in GABPα³⁵⁻¹²¹ are located in the middle of its tertiary structure, forming a stable hydrophobic core. This is seen in Figure 5.2 (c), where the internal methyls of GABPα³⁵⁻¹²¹ superimpose especially well among the ten water-refined structures, indicative of a well-folded protein core. Although most of the hydrophobic residues are located in the core of the protein, some are located on its surface. The existence of an obvious hydrophobic patch is often indicative of a site for binding other molecules. However, in the case of GABPα³⁵⁻¹²¹, these surface hydrophobic residues are well scattered among polar residues. Thus the protein lacks any distinct hydrophobic patch (Fig. 5.7 (a)).

Figure 5.7. Surface properties of GABPα³⁵⁻¹²¹
(a) The hydrophobic residues (cyan) are scattered on the surface of the protein without forming any obvious patch. (b) The secondary structures are highlighted on the surface of the structure, with the prominent S4/S5 loop. A shallow groove is indicated.
(S1: orange, S2: yellow, S3: light green, S4: blue, S5: dark green, H: brown, N-terminal residue: purple, C-terminal residue: dark grey) (c) A cluster of acidic residues is observed in the loops of S3/S4 (D76, D78, D89), S1/H (D43, E46), and S4/S5 (E105), the C-terminal sides of S2 and S5 (E67, E112), and the C-terminal end (E118). These form a strip of negative charges along the protein surface. It is also noted that the putative sumoylation site (K87) is located on the surface of the protein near this negative patch. (d) A ribbon diagram showing the secondary structure of GABPα³⁵⁻¹²¹. The same color scheme is used as defined in (b).

In contrast, groups of acidic residues are observed in the loops of S3/S4 (Asp76, Asp78, Asp89), S1/H (Asp43, Glu46), and S4/S5 (Glu105), the C-terminal sides of S2 and S5 (Glu67, Glu112), and the C-terminal end (Glu118). Together these form a predominant region of negative electrostatic potential, which could be an interface for interaction with positively charged molecules (Fig. 5.7 (c)).

Along with the types of residues, grooves on the surface are also a good indication of binding to other molecules. Although no distinct cleft, such as that typical for the active site of an enzyme, was observed, a shallow groove does exist near the helix of GABPα³⁵⁻¹²¹ (Fig. 5.7 (b)). Alternatively, the exposed, flexible S4/S5 loop could present residues to bind within a complementary cleft in a potential partner macromolecule.

Potential consensus post-translational modification sites on GABPα³⁵⁻¹²¹ were also examined. The sequence (-VK⁸⁷TD-) matches a consensus sumoylation site (ψKxD/E; ψ: hydrophobic residue, x: any residue, D/E: acidic residue) (Schwartz and Hochstrasser, 2003). Furthermore, Lys87 (Fig. 5.7 (c)) is located at the C-terminal end of the long loop between S3 and S4, and is exposed on the surface of the protein.
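The consensus match described above can be reproduced with a short pattern search. This is a minimal sketch, assuming ψ may be any residue in the hydrophobic set defined below; that set, the function name, and the eight-residue test fragment are hypothetical, and only the ψKxD/E pattern and the V-K-T-D match around Lys87 come from the text.

```python
import re

# Assumption: residues accepted at the hydrophobic (psi) position.
HYDROPHOBIC = "AVLIMFWC"

# A lookahead is used so that overlapping motifs are all reported.
MOTIF = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z][DE]))")


def find_sumoylation_sites(sequence, first_residue_number=1):
    """Return (lysine residue number, motif) for each psi-K-x-D/E match."""
    sites = []
    for match in MOTIF.finditer(sequence.upper()):
        lysine_number = first_residue_number + match.start() + 1
        sites.append((lysine_number, match.group(1)))
    return sites


# Hypothetical fragment, numbered so that its first residue is 84.
print(find_sumoylation_sites("GSVKTDLA", first_residue_number=84))
```

A sequence match alone only flags a candidate; as the text notes, surface exposure and local flexibility are what make the lysine plausibly accessible to the modifying enzymes.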
Thus, Lys87 could be accessible to the sumoylating enzymes. Since Lys87 is located next to the patch of acidic residues, sumoylation of this site could physically interfere with any molecule binding to this negatively charged region of GABPα³⁵⁻¹²¹.

5.7 Structural comparisons

Although the structural information described above provides general clues to the biological role of this new domain, this approach did not yield a specific function for GABPα³⁵⁻¹²¹. Structure similarity searches were therefore performed by comparison with the 3D protein structures in the Protein Data Bank (PDB). Although no structure exactly matched the fold of GABPα³⁵⁻¹²¹, a structural similarity search with DALI (Distance Matrix Alignment; Dietmann et al., 2001) revealed that this domain resembles ubiquitin and the ubiquitin-like proteins, with a Z-score of 4.2. Ubiquitin is a well-studied protein that is crucial for the degradation of proteins by the 26S proteasome (for review see Pickart, 2004). As part of this biological process, the covalent attachment of the C-terminal Gly-Gly residues of ubiquitin to lysine residues on the target protein, known as ubiquitinylation, is required. Although GABPα³⁵⁻¹²¹ and ubiquitin have no significant sequence similarity, the tertiary structure of ubiquitin is also composed of a β-sheet with five β-strands and an α-helix. However, there are several significant differences between the two protein structures (Fig. 5.8). First, the linear orders of the secondary structure elements differ: ubiquitin has an additional β-strand at its beginning, whereas GABPα has it at the end (Fig. 5.8 (c)). Second, S5 in ubiquitin runs in the opposite direction to that in GABPα. Finally, the large flexible S4/S5 loop in GABPα is missing in ubiquitin (Fig. 5.8 (a)). Thus, GABPα³⁵⁻¹²¹ has a novel fold.
Although exciting, this does not provide any obvious clues to its function.

Given their global structural similarity, additional properties of GABPα³⁵⁻¹²¹ that may resemble those of ubiquitin were examined. First, the general surface properties of GABPα³⁵⁻¹²¹ were compared to those of ubiquitin. While GABPα³⁵⁻¹²¹ has a notable strip of negative charges, ubiquitin does not have any obvious patch of predominant charge on its surface (Fig. 5.9). One face of the molecule carries a mixture of negatively and positively charged groups, whereas the opposite face contains few charged residues. Perhaps the most interesting feature of the surface of ubiquitin is the existence of a hydrophobic patch. This non-polar patch, surrounded by several positive charges, is known to be a binding interface for the ubiquitin-interacting motif (UIM) (Fisher et al., 2003; Polo et al., 2002; Raiborg et al., 2002; Shih et al., 2002). The UIM is a ~30-residue sequence motif containing a hydrophobic sequence composed of alternating large and small residues (Leu-Ala-Leu-Ala-Leu), which interacts with the hydrophobic patch on ubiquitin. However, unlike ubiquitin, the surface of GABPα³⁵⁻¹²¹ does not have such an obvious hydrophobic patch.

Figure 5.8. Secondary structure comparisons of GABPα³⁵⁻¹²¹ and ubiquitin (1UBQ). (a) The global folds of GABPα³⁵⁻¹²¹ and ubiquitin are clearly similar. (b) Topology diagrams of these two proteins show that the direction of β-strand S5 in GABPα³⁵⁻¹²¹ is opposite to that of the corresponding strand, S5, in ubiquitin. (c) The order of the secondary structure elements differs in these two proteins, as GABPα³⁵⁻¹²¹ contains an additional strand, S4, between S3 and S5, while the corresponding strand, S1, is located at the N-terminal end of ubiquitin.

Figure 5.9. The surface properties of ubiquitin (1UBQ). The hydrophobic, acidic, and basic residues are shown in yellow, red, and blue, respectively. The protein in the left panel is oriented as in the ribbon diagram shown in Fig. 5.8. Note that the front side of the protein contains a notable hydrophobic patch for binding ubiquitin-interacting motifs (ref). (Panel label: binding interface for ubiquitin-interacting motif.)

As part of the general ubiquitinylation process, ubiquitin covalently linked to the target molecule undergoes further ubiquitinylation to form polyubiquitin chains. Therefore, ubiquitin itself has target lysine residues on its surface, including a common site (K48) and several alternative sites (K6, K29, K63) (Pickart, 2004). As shown in Fig. 5.10, comparison of these residues with the lysines on the surface of GABPα³⁵⁻¹²¹ reveals that K52/K53 and K115 of GABPα³⁵⁻¹²¹ are located in positions corresponding to K29 and K63 of ubiquitin, respectively. It is also interesting to note that K115 of GABPα³⁵⁻¹²¹ is located near the S3/S4 loop, where the predicted sumoylation site (K87) resides.

5.8 Conclusion

The primary goal of determining the structure of the new domain in GABPα was to gain some insight into its function based on structural comparisons with other proteins of known function. The solution structure of this new domain, solved by NMR spectroscopy, has been discussed in this chapter. Although the protein adopts a novel fold, the structure did not provide immediate functional clues. Although ubiquitin was the closest match in the structure-based similarity search, the domain did not share any of the distinctive properties of ubiquitin, such as a hydrophobic patch for binding the ubiquitin-interacting motif.
However, the structural information on this domain will be indispensable for addressing the overall goal of this project, especially in building a model containing structural details of full-length GABP in complex with other potential protein partners.

Figure 5.10. Comparison of surface lysines of GABPα³⁵⁻¹²¹ and ubiquitin. Ubiquitin contains a common ubiquitination site (K48) and three alternative sites (K6, K29, K63). GABPα³⁵⁻¹²¹ (a) also contains three lysines (K52, K53, K115) located at positions relatively similar to those (K29, K63) on ubiquitin (b). The putative sumoylation site, K87, is labelled in red in GABPα³⁵⁻¹²¹.

Chapter 6 - Concluding remarks and future directions

The goals of this thesis were:
I. to identify a new domain in the N-terminal region of GABPα.
II. to determine its tertiary structure.
III. to dissect its biological function based on its structural information.

The identification of the N-terminal domain of GABPα and the determination of its structure provided two significant advances. First, the analysis of GABPα³⁵⁻¹²¹ completed the structural characterization of full-length GABPα as three domains joined by flexible linker sequences (Fig. 6.1). The structural description of the full-length protein is critical for understanding its function in a biological context. Second, the determination of the tertiary structure of GABPα³⁵⁻¹²¹ has revealed a novel protein fold, confirming an initial hypothesis based on the lack of any sequence similarity to entries in the protein databases. Ironically, however, this precludes identification of the function of GABPα³⁵⁻¹²¹ based on comparison to proteins of known function.
Although the first and second goals were accomplished successfully, the third, defining the function of GABPα³⁵⁻¹²¹, remains to be met. Given that "structure" did not yield "function", this question is currently being investigated through biochemical approaches. GABPα³⁵⁻¹²¹ is too small to be an enzyme and thus probably serves a role in the assembly of transcription complexes through protein-protein interactions with other components of the transcriptional machinery. This is consistent with the various partnerships described for GABPα in Chapter 1. However, it is also possible that GABPα³⁵⁻¹²¹ binds to peptides or small-molecule ligands.

Figure 6.1. The structure of full-length GABPα. The model of GABPα is based on the structures of the three domains (N-terminal, PNT, and ETS domains) joined by flexible linker sequences. A MAP kinase phosphorylation site (T280) is shown as a red sphere. The linker regions are flexible, as evident from their random coil chemical shifts in constructs such as GABPα¹⁻³²⁰.

6.1 Domain interactions within GABP

As a first step toward defining the role of the N-terminal domain, it was of interest to search for interactions of this domain with other portions of GABP. The GA-binding protein is a heterotetrameric complex (α₂β₂) comprising the DNA-binding subunit GABPα and the transactivation subunit GABPβ. Each subunit is composed of three domains: the ETS, PNT, and N-terminal domains for GABPα, and the AR, TAD, and LZ domains for GABPβ. Although the physical association of the ETS domain and the AR has been examined at a molecular level, the possibility of other interactions has only been vaguely addressed. In the course of studying the formation of multimeric complexes of the GABP subunits, Chinenov et al.
(Chinenov et al., 2000) have observed no α₂β₂ heterotetramer formation in the absence of binding to tandem target sites positioned on the same side of the DNA. Thus, isolated GABP may exist as an αβ heterodimer. However, deletion of the N-terminal portion of GABPα, including the N-terminal domain and the PNT domain, allowed the formation of the heterotetrameric complex in solution. This suggests that the N-terminal sequence may interfere with the association of αβ dimers to form tetramers.

To investigate whether the N-terminal domain or the PNT domain is physically associated with any other GABP domain, various fragments of GABP were purified and analyzed by native gel electrophoresis (Fig. 6.2). This preliminary native gel analysis did not provide any obvious evidence of binding beyond that expected for full-length GABPα and GABPβ. Neither the mixtures of the N-terminal domain nor those of the PNT domain with other parts of GABP resulted in the formation of a new complex.

Figure 6.2. Preliminary binding studies of GABP domains by native gel. The purified GABP domains were run on both native (a) and SDS (b) gels, either individually or as mixtures following 4 h of incubation at room temperature. The individual samples are labeled as follows: a, GABPα¹⁻³²⁰ (residues 1-320); b, GABPα¹⁶⁸⁻²⁵⁴ (prep. 1, PNT domain of GABPα, residues 168-254); c, GABPα¹⁶⁸⁻²⁵⁴ (prep. 2, PNT domain of GABPα, residues 168-254); d, GABPα³⁵⁻¹²¹; e, full-length GABPα (residues 1-454); f, GABPβ (residues 1-382); g, complex of the ETS domain (GABPα, residues 311-430) and the ankyrin repeats (GABPβ2-1, residues 1-157); M, protein marker with the molecular weights (kDa) labeled. Sample c is simply a repeat preparation of sample b. The band corresponding to the GABPα/GABPβ complex is marked with a box.

[Figure 6.2 gel images: (a) native gel; (b) SDS-PAGE, with marker bands at 116.0, 66.2, 45.0, 35.0, 25.0, 18.4, and 14.4 kDa. Lanes a, M, b, c, d, e, f, g are followed by lanes containing mixtures of the samples.]

Despite these initial negative results, there are still several combinations involving the PNT domain and the N-terminal domain that might form physical interactions, such as with the ETS domain or with individual GABPβ domains. Alternatively, chemical cross-linking can be used to identify possible partnerships. Because cross-linked species can be resolved on SDS-PAGE gels, any cross-linked protein can be analyzed more confidently. Lastly, any possible partnership can be confirmed by chemical shift perturbation mapping, using NMR spectroscopy to monitor the titration of a ¹⁵N-labeled protein with an unlabeled partner. Note that this method has already revealed that the N-terminal and PNT domains of GABPα do not interact, as the spectrum of GABPα¹⁻³²⁰ is essentially the sum of the spectra of GABPα¹⁻¹⁶⁹ and GABPα¹⁶⁸⁻²⁵⁶.

6.2 Potential protein partners reported in the literature

Not surprisingly, GABP has been reported to associate with many transcription factors. Understanding the binding of GABP to these proteins can provide insights into its roles in various biological contexts. This understanding includes identifying which domains of GABP are involved in these potential associations. Currently, such binding studies are underway for three major GABP partners: ATF-1, p300, and Sp1/Sp3.

6.2.1 ATF-1

As discussed in Chapter 1, the early 4 (E4) promoter of adenovirus type 5 is known to be regulated by hGABP (E4TF1) and the adenovirus E1A gene product (Watanabe et al., 1988). In 1999, Sawada et al.
(Sawada et al., 1999) showed that activating transcription factor 1 (ATF-1) and the cAMP response element-binding protein (CREB) are associated with this promoter as well. In their report, synergistic transactivation was achieved with the combination of hGABP and selected members of the ATF/CREB family. Furthermore, they showed that the DNA-binding bZip domain interacts directly with the N-terminal region (the N-terminal domain and the PNT domain) of GABPα. Several clones encoding ATF have been obtained, and proteins are being prepared for in vitro binding studies with GABPα³⁵⁻¹²¹ and with the GABPα PNT domain.

6.2.2 CBP/p300

Interleukin 16 (IL-16) is a chemotactic cytokine that binds the CD4 receptor and affects the activation of T cells and the replication of HIV. In the course of studying the expression of IL-16, Bannert et al. (Bannert et al., 1999) found that the promoter region of human IL-16 contains three purine-rich sequences indicative of Ets binding sites. Subsequently, they demonstrated that GABP binds these sites in concert with the co-activators CREB binding protein (CBP)/p300. Furthermore, they identified a direct interaction of the C-terminal region of p300 with GABPα. Clones encoding CBP or p300, as required to test for binding to GABPα³⁵⁻¹²¹ in vitro, are currently being obtained.

6.2.3 Sp1 and Sp3

The utrophin gene codes for a large cytoskeletal protein. The promoter region of this gene contains a functional GABP binding site and three functional GC elements that are recognized by the Sp1 and Sp3 factors. Galvagni et al. (Galvagni et al., 2001) have reported that Sp1 and Sp3 can cooperate with GABP for the activation of the utrophin promoter. More interestingly, they observed a synergistic transactivation similar to that seen with ATF-1/hGABP. Additionally, they showed that the DNA-binding zinc finger domains of both Sp1 and Sp3 physically interact with GABPα.
Clones of these proteins for binding studies with GABPα³⁵⁻¹²¹ are also currently being obtained.

Following the preparation of these proteins, binding studies will be carried out by various methods, including chemical cross-linking, native gel electrophoresis, gel filtration chromatography, and NMR spectroscopy. In the event that binding is detected, further deletion studies of the protein partners will be carried out to identify a minimal GABPα³⁵⁻¹²¹-binding region.

6.3 Unbiased screens for interacting proteins: "fishing" for protein partners

As an alternative strategy to testing for binding of GABPα³⁵⁻¹²¹ to known transcription partners, the identification of interacting proteins by affinity methods will also be attempted. With this method, in addition to the previously reported protein partners, other protein partners for each domain can be identified from cell extracts. GABP is known for its widespread expression, being especially abundant in liver, muscle, and hematopoietic cells (Rosmarin et al., 2004). Although several protein tags are available for this method, GST tags will be used because they are already attached to some of the GABPα constructs. Four different constructs of GABPα will be used individually as bait: full-length GABPα, the N-terminal domain, the PNT domain, and the N-terminal and PNT domains together. Once purified, each construct will be loaded onto a separate GST column in parallel. Next, calf thymus cell extract will be introduced onto each construct-bound GST column. After several washing steps, the GABPα constructs will be eluted together with their bound protein partners. The four elutions from the different GABPα constructs will be treated separately, and individual protein partners will be resolved by SDS-PAGE. Initially, the band patterns will be compared with one another across the four parallel runs.
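Identifying a protein from a gel band generally rests on comparing the peptides observed by mass spectrometry with those predicted from database sequences. As a hedged illustration of the prediction side only, the sketch below performs an in-silico tryptic digest using the common simplified rule that trypsin cleaves after Lys or Arg unless the next residue is Pro. The function name `tryptic_digest` and the example sequence are hypothetical and are not taken from this work; real search software also handles missed cleavages and modifications.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    Missed cleavages and modified residues are deliberately ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal remainder if the sequence does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence used purely to exercise the rule.
example = "AKRPGKLL"
for peptide in tryptic_digest(example):
    print(peptide)
```

The predicted peptide list for each database entry is what the observed fragments would be matched against.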
The bands of interest will then be excised and digested with a known protease (such as trypsin or chymotrypsin). Finally, the fragments will be identified by mass spectrometry and database searching.

6.4 Future directions

The ultimate goal of this project is not only to understand the structural and functional basis for the N-terminal domain of GABPα but also to gain better insights into the roles of GABP as both a transcriptional regulator and a component of certain signal transduction pathways. Among Ets family members, one of the prominent characteristics of the GABP complex is that the ETS DNA-binding domain and the transactivation domain are located in different proteins, GABPα and GABPβ, respectively. However, there has been insufficient structural information on GABP to understand how this structural uniqueness contributes to its function. Although the individual structures of all the domains in GABPα are now available, the structural and functional relationships of these domains are not well characterized. Combining the domain interaction and protein partnership studies from the previous sections with the earlier structural work will provide better insights into the functional and biological roles of the individual components of GABP. As the final stage of this project, I will attempt to solve the structure of the full GABP complex, possibly with some other factors, using X-ray crystallography.

References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology. 215:403-10.
Archer, S. J., Ikura, M., Torchia, D. A., Bax, A. (1991). An alternative 3D-NMR technique for correlating backbone ¹⁵N with side-chain Hβ resonances in larger proteins. Journal of Magnetic Resonance. 95:636-41.
Bannert, N., Avots, A., Baier, M., Serfling, E., Kurth, R. (1999). GA-binding protein factors, in concert with the coactivator CREB binding protein/p300, control the induction of the interleukin 16 promoter in T lymphocytes. Proceedings of the National Academy of Sciences of the United States of America. 96:1541-6.
Bassuk, A. G., Leiden, J. M. (1997). The role of Ets transcription factors in the development and function of the mammalian immune system. Advances in Immunology, Vol 64. 64:65-104.
Batchelor, A. H., Piper, D. E., de la Brousse, F. C., McKnight, S. L., Wolberger, C. (1998). The structure of GABP alpha/beta: An ETS domain ankyrin repeat heterodimer bound to DNA. Science. 279:1037-41.
Bax, A., Clore, G., Gronenborn, A. (1990). ¹H-¹H correlation via isotropic mixing of ¹³C magnetization, a new three-dimensional approach for assigning ¹H and ¹³C spectra of ¹³C-enriched proteins. Journal of Magnetic Resonance. 88:425-31.
Bax, A., Marion, D. (1988). Improved Resolution and Sensitivity in ¹H-Detected Multiple-Bond Correlation Spectroscopy. Journal of Magnetic Resonance. 78:186-91.
Buratowski, S. (1994). The Basics of Basal Transcription by RNA-Polymerase-II. Cell. 77:1-3.
Chinenov, Y., Henzl, M., Martin, M. E. (2000). The alpha and beta subunits of the GA-binding protein form a stable heterodimer in solution - Revised model of heterotetrameric complex assembly. Journal of Biological Chemistry. 275:7749-56.
Chou, J. J., Gaemers, S., Howder, B., Louis, J. M., Bax, A. (2001). A simple apparatus for generating stretched polyacrylamide gels, yielding uniform alignment of proteins and detergent micelles. Journal of Biomolecular NMR. 21:377-82.
Clore, G. M., Gronenborn, A. M., Bax, A. (1998a). A robust method for determining the magnitude of the fully asymmetric alignment tensor of oriented macromolecules in the absence of structural information. Journal of Magnetic Resonance. 133:216-21.
Clore, G. M., Gronenborn, A. M., Tjandra, N. (1998b). Direct structure refinement against residual dipolar couplings in the presence of rhombicity of unknown magnitude. Journal of Magnetic Resonance. 131:159-62.
Cornilescu, G., Delaglio, F., Bax, A. (1999). Protein backbone angle restraints from searching a database for chemical shift and sequence homology. Journal of Biomolecular NMR. 13:289-302.
Cuff, J. A., Clamp, M. E., Siddiqui, A. S., Finlay, M., Barton, G. J. (1998). JPred: a consensus secondary structure prediction server. Bioinformatics. 14:892-3.
Daragan, V. A., Mayo, K. H. (1997). Motional model analyses of protein and peptide dynamics using C-13 and N-15 NMR relaxation. Progress in Nuclear Magnetic Resonance Spectroscopy. 31:63-105.
Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., Bax, A. (1995). NMRPipe - a Multidimensional Spectral Processing System Based on Unix Pipes. Journal of Biomolecular NMR. 6:277-93.
Dietmann, S., Park, J., Notredame, C., Heger, A., Lappe, M., Holm, L. (2001). A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3. Nucleic Acids Research. 29:55-7.
Dittmer, J., Nordheim, A. (1998). Ets transcription factors and human disease. Biochimica et Biophysica Acta-Reviews on Cancer. 1377:F1-F11.
Dosset, P., Hus, J. C., Blackledge, M., Marion, D. (2000). Efficient analysis of macromolecular rotational diffusion from heteronuclear relaxation data. Journal of Biomolecular NMR. 16:23-8.
Fisher, R. D., Wang, B., Alam, S. L., Higginson, D. S., Robinson, H., Sundquist, W. I., Hill, C. P. (2003). Structure and ubiquitin binding of the ubiquitin-interacting motif. Journal of Biological Chemistry. 278:28976-84.
Galvagni, F., Capo, S., Oliviero, S. (2001). Sp1 and Sp3 physically interact and co-operate with GABP for the activation of the utrophin promoter. Journal of Molecular Biology. 306:985-96.
Ghosh, A., Kolodkin, A. L. (1998). Specification of neuronal connectivity: ETS marks the spot. Cell. 95:303-6.
Graves, B. J. (1998). Transcription - Inner workings of a transcription factor partnership. Science. 279:1000-2.
Graves, B. J., Petersen, J. M. (1998). Specificity within the ets family of transcription factors. Advances in Cancer Research, Vol 75. 75:1-55.
Grzesiek, S., Anglister, J., Bax, A. (1993). Correlation of backbone amide and aliphatic side-chain resonances in ¹³C/¹⁵N-enriched proteins by isotropic mixing of ¹³C magnetization. Journal of Magnetic Resonance. 101:114-9.
Hutchinson, E. G., Thornton, J. M. (1996). PROMOTIF - A program to identify and analyze structural motifs in proteins. Protein Science. 5:212-20.
Ikura, M., Kay, L. E., Bax, A. (1990). A novel approach for sequential assignment of ¹H, ¹³C, and ¹⁵N spectra of proteins: heteronuclear triple-resonance three-dimensional NMR spectroscopy. Application to calmodulin. Biochemistry. 29:4659-67.
Ishii, Y., Markus, M. A., Tycko, R. (2001). Controlling residual dipolar couplings in high-resolution NMR of proteins by strain induced alignment in a gel. Journal of Biomolecular NMR. 21:141-51.
Jones, D. T. (1999). Protein secondary structure prediction based on position-specific scoring matrices. Journal of Molecular Biology. 292:195-202.
Kay, L., Xu, G.-Y., Singer, A., Muhandiram, D., Forman-Kay, J. (1993). A gradient-enhanced HCCH-TOCSY experiment for recording side-chain ¹H and ¹³C correlations in H₂O samples of proteins. Journal of Magnetic Resonance. 101:333-7.
Kay, L. E., Torchia, D. A., Bax, A. (1989). Backbone dynamics of proteins as studied by ¹⁵N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry. 28:8972-9.
Kim, C. A., Phillips, M. L., Kim, W., Gingery, M., Tran, H. H., Robinson, M. A., Faham, S., Bowie, J. U. (2001). Polymerization of the SAM domain of TEL in leukemogenesis and transcriptional repression. EMBO Journal. 20:4173-82.
Kuboniwa, H., Grzesiek, S., Delaglio, F., Bax, A. (1994). Measurement of HN-Hα J couplings in calcium-free calmodulin using 2D and 3D water-flip-back methods. Journal of Biomolecular NMR. 4:871-8.
Lamarco, K., Thompson, C. C., Byers, B. P., Walton, E. M., McKnight, S. L. (1991). Identification of Ets-Related and Notch-Related Subunits in GA-Binding Protein. Science. 253:789-92.
Lee, G. M., Donaldson, L. W., Pufall, M. A., Kang, H. S., Pot, I., Graves, B. J., McIntosh, L. P. (2005). The structural and dynamic basis of Ets-1 DNA binding autoinhibition. Journal of Biological Chemistry. 280:7088-99.
Lemon, B., Tjian, R. (2000). Orchestrated response: a symphony of transcription factors for gene control. Genes & Development. 14:2551-69.
Leprince, D., Gegonne, A., Coll, J., Detaisne, C., Schneeberger, A., Lagrou, C., Stehelin, D. (1983). A Putative 2nd-Cell-Derived Oncogene of the Avian Leukemia Retrovirus-E26. Nature. 306:395-7.
Liang, H., Mao, X. H., Olejniczak, E. T., Nettesheim, D. G., Yu, L. P., Meadows, R. P., Thompson, C. B., Fesik, S. W. (1994). Solution Structure of the Ets Domain of Fli-1 When Bound to DNA. Nature Structural Biology. 1:871-6.
Linge, J. P., O'Donoghue, S. I., Nilges, M. (2001). Automated assignment of ambiguous nuclear Overhauser effects with ARIA. Nuclear Magnetic Resonance of Biological Macromolecules, Pt B. 339:71-90.
Lipari, G., Szabo, A. (1982). Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. Analysis of experimental results. Journal of the American Chemical Society. 104:4559-70.
Logan, T. M., Olejniczak, E. T., Xu, R. X., Fesik, S. W. (1992). Side-Chain and Backbone Assignments in Isotopically Labeled Proteins from 2 Heteronuclear Triple Resonance Experiments. FEBS Letters. 314:413-8.
Mackereth, C. D., Scharpf, M., Gentile, L. N., MacIntosh, S. E., Slupsky, C. M., McIntosh, L. P. (2004). Diversity in structure and function of the Ets family PNT domains. Journal of Molecular Biology. 342:1249-64.
Mandel, A. M., Akke, M., Palmer, A. G. (1995). Backbone Dynamics of Escherichia-Coli Ribonuclease-HI - Correlations with Structure and Function in an Active Enzyme. Journal of Molecular Biology. 144-63.
Marion, D., Driscoll, P., Kay, L., Wingfield, P., Bax, A., Gronenborn, A., Clore, G. (1989a). HSQC-TOCSY experiment. Biochemistry. 28:6150.
Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M., Clore, G. M. (1989b). Overcoming the overlap problem in the assignment of ¹H NMR spectra of larger proteins by use of three-dimensional heteronuclear ¹H-¹⁵N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1 beta. Biochemistry. 28:6150-6.
McIntosh, L. P., Brun, E., Kay, L. E. (1997). Stereospecific assignment of the NH₂ resonances from the primary amides of asparagine and glutamine side chains in isotopically labeled proteins. Journal of Biomolecular NMR. 9:306-12.
Montelione, G. T., Lyons, B. A., Emerson, S. D., Tashiro, M. (1992). An efficient triple resonance experiment using carbon-13 isotropic mixing for determining sequence-specific resonance assignments of isotopically-enriched proteins. Journal of the American Chemical Society. 114:10974-5.
Muhandiram, D., Kay, L. (1994). Gradient-enhanced triple-resonance three-dimensional NMR experiments with improved sensitivity. Journal of Magnetic Resonance. 103:203-16.
Neri, D., Szyperski, T., Otting, G., Senn, H., Wuethrich, K. (1989). Stereospecific nuclear magnetic resonance assignments of the methyl groups of valine and leucine in the DNA-binding domain of the 434 repressor by biosynthetically directed fractional ¹³C labeling. Biochemistry. 28:7510-6.
Nunn, M. F., Seeburg, P. H., Moscovici, C., Duesberg, P. H. (1983). Tripartite Structure of the Avian Erythroblastosis Virus-E26 Transforming Gene. Nature. 306:391-5.
Ouali, M., King, R. D. (2000). Cascaded multiple classifiers for secondary structure prediction. Protein Science. 9:1162-76.
Pascal, S. M., Muhandiram, D. R., Yamazaki, T., Kay, J. D. F., Kay, L. E. (1994). Simultaneous Acquisition of ¹⁵N- and ¹³C-Edited NOE Spectra of Proteins Dissolved in H₂O. Journal of Magnetic Resonance. 103:197-201.
Pelton, J. G., Torchia, D. A., Meadow, N. D., Roseman, S. (1993). Tautomeric states of the active-site histidines of phosphorylated and unphosphorylated IIIGlc, a signal-transducing protein from Escherichia coli, using two-dimensional heteronuclear NMR techniques. Protein Science. 2:543-58.
Petersen, J. M., Skalicky, J. J., Donaldson, L. W., McIntosh, L. P., Alber, T., Graves, B. J. (1995). Modulation of transcription factor Ets-1 DNA binding: DNA-induced unfolding of an alpha helix. Science. 269:1866-9.
Pickart, C. (2004). Back to the future with ubiquitin. Cell. 23:181-90.
Poirel, H., Lopez, R. G., Lacronique, V., Della Valle, V., Mauchauffe, M., Berger, R., Ghysdael, J., Bernard, O. A. (2000). Characterization of a novel ETS gene, TELB, encoding a protein structurally and functionally related to TEL. Oncogene. 19:4802-6.
Polo, S., Sigismund, S., Faretta, M., Guidi, M., Capua, M. R., Bossi, G., Chen, H., De Camilli, P., Di Fiore, P. P. (2002). A single motif responsible for ubiquitin recognition and monoubiquitination in endocytic proteins. Nature. 416:451-5.
Potter, M. D., Buijs, A., Kreider, B., van Rompaey, L., Grosveld, G. C. (2000). Identification and characterization of a new human ETS-family transcription factor, TEL2, that is expressed in hematopoietic tissues and can associate with TEL1/ETV6. Blood. 95:3341-8.
Raiborg, C., Bache, K. G., Gillooly, D. J., Madshus, I. H., Stang, E., Stenmark, H. (2002). Hrs sorts ubiquitinated proteins into clathrin-coated microdomains of early endosomes. Nature Cell Biology. 4:394-8.
Rosmarin, A. G., Resendes, K. K., Yang, Z. F., McMillan, J. N., Fleming, S. L. (2004). GA-binding protein transcription factor: a review of GABP as an integrator of intracellular signaling and protein-protein interactions. Blood Cells Molecules and Diseases. 32:143-54.
Sattler, M., Schleucher, J., Griesinger, C. (1999). Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Progress in NMR Spectroscopy. 34:93-158.
Sawada, J., Simizu, N., Suzuki, F., Sawa, C., Goto, M., Hasegawa, M., Imai, T., Watanabe, H., Handa, H. (1999). Synergistic transcriptional activation by hGABP and select members of the activation transcription factor/cAMP response element-binding protein family. Journal of Biological Chemistry. 274:35475-82.
Schwartz, D. C., Hochstrasser, M. (2003). A superfamily of protein tags: ubiquitin, SUMO and related modifiers. Trends in Biochemical Sciences. 28:321-8.
Seidel, J. J., Graves, B. J. (2002). An ERK2 docking site in the Pointed domain distinguishes a subset of ETS transcription factors. Genes & Development. 16:127-37.
Shih, S. C., Katzmann, D. J., Schnell, J. D., Sutanto, M., Emr, S. D., Hicke, L. (2002). Epsins and Vps27p/Hrs contain ubiquitin-binding domains that function in receptor endocytosis. Nature Cell Biology. 4:389-93.
Skrynnikov, N. R., Goto, N. K., Yang, D. W., Choy, W. Y., Tolman, J. R., Mueller, G. A., Kay, L. E. (2000). Orienting domains in proteins using dipolar couplings measured by liquid-state NMR: Differences in solution and crystal forms of maltodextrin binding protein loaded with beta-cyclodextrin. Journal of Molecular Biology. 295:1265-73.
Slupsky, C. M., Gentile, L. N., McIntosh, L. P. (1998). Assigning the NMR spectra of aromatic amino acids in proteins: analysis of two Ets pointed domains. Biochemistry and Cell Biology-Biochimie et Biologie Cellulaire. 76:379-90.
Spiegelman, B. M., Heinrich, R. (2004). Biological control through regulated transcriptional coactivators. Cell. 119:157-67.
Thompson, C. C., Brown, T. A., McKnight, S. L. (1991). Convergence of Ets-Related and Notch-Related Structural Motifs in a Heteromeric DNA-Binding Complex. Science. 253:762-8.
Tootle, T. L., Rebay, I. (2005). Post-translational modifications influence transcription factor activity: a view from the ETS superfamily. Bioessays. 27:285-98.
Triezenberg, S. J., Lamarco, K. L., McKnight, S. L. (1988). Evidence of DNA-Protein Interactions That Mediate HSV-1 Immediate Early Gene Activation by VP16. Genes & Development. 2:730-42.
Virbasius, J. V., Scarpulla, R. C. (1990). The Rat Cytochrome-c-Oxidase Subunit-IV Gene Family - Tissue-Specific and Hormonal Differences in Subunit-IV and Cytochrome-c Messenger-RNA Expression. Nucleic Acids Research. 18:6581-6.
Virbasius, J. V., Scarpulla, R. C. (1991). Transcriptional Activation through Ets Domain Binding-Sites in the Cytochrome-c-Oxidase Subunit-IV Gene. Molecular and Cellular Biology. 11:5631-8.
Virbasius, J. V., Virbasius, C. M. A., Scarpulla, R. C. (1993). Identity of GABP with NRF-2, a Multisubunit Activator of Cytochrome-Oxidase Expression, Reveals a Cellular Role for an Ets Domain Activator of Viral Promoters. Genes & Development. 7:380-92.
Vuister, G. W., Wang, A. C., Bax, A. (1993). Measurement of 3-bond nitrogen carbon J-couplings in proteins uniformly enriched in ¹⁵N and ¹³C. Journal of the American Chemical Society. 115:5334-5.
Wasylyk, B., Hagman, J., Gutierrez-Hartmann, A. (1998). Ets transcription factors: nuclear effectors of the Ras-MAP-kinase signaling pathway. Trends in Biochemical Sciences. 23:213-6.
Watanabe, H., Imai, T., Sharp, P. A., Handa, H. (1988). Identification of 2 Transcription Factors That Bind to Specific Elements in the Promoter of the Adenovirus Early-Region 4. Molecular and Cellular Biology. 8:1290-300.
Watanabe, H., Wada, T., Handa, H. (1990). Transcription Factor E4TF1 Contains 2 Subunits with Different Functions. EMBO Journal. 9:841-7.
Willard, L., Ranjan, A., Zhang, H. Y., Monzavi, H., Boyko, R. F., Sykes, B. D., Wishart, D. S. (2003). VADAR: a web server for quantitative evaluation of protein structure quality. Nucleic Acids Research. 31:3316-9.
Wishart, D., Richards, F., Sykes, B. (1992). The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry. 31:1647-51.
Yamazaki, T., Forman-Kay, J. D., Kay, L. (1993). Two-Dimensional NMR Experiments for Correlating ¹³Cβ and ¹Hδ/ε Chemical Shifts of Aromatic Residues in ¹³C-Labelled Proteins via Scalar Couplings. Journal of the American Chemical Society. 115:11054-5.
Zwahlen, C., Gardner, K. H., Sarma, S. P., Horita, D. A., Byrd, R. A., Kay, L. E. (1998). An NMR experiment for measuring methyl-methyl NOEs in ¹³C-labeled proteins with high resolution. Journal of the American Chemical Society. 120:7617-25.
Zwahlen, C., Legault, P., Vincent, S. F. J., Greenblatt, J., Konrat, R., Kay, L. E. (1997). Methods for measurement of intermolecular NOEs by multi-nuclear NMR spectroscopy: application to a bacteriophage λ N-peptide/boxB RNA complex. Journal of the American Chemical Society. 119:6711-21.
