UBC Theses and Dissertations

Structural characterization of DNA binding and autoinhibition by the Ets1 transcription factor
Desjardins, Geneviève 2015


STRUCTURAL CHARACTERIZATION OF DNA BINDING AND AUTOINHIBITION BY THE ETS1 TRANSCRIPTION FACTOR

by

Geneviève Desjardins

BA, Université de Montréal, 2005
M.Sc., Université de Montréal, 2009

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate and Postdoctoral Studies

(Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

April 2015

Geneviève Desjardins, 2015

Abstract

Ets1 belongs to the ETS transcription factor family and plays key roles in regulating eukaryotic gene expression. The affinity of Ets1 for its cognate DNA sites is autoinhibited by an intrinsically disordered serine-rich region (SRR) and an appended helical inhibitory module (IM). Through transient interactions, the SRR both sterically blocks the ETS domain and allosterically stabilizes the IM to modulate DNA-binding affinity. Calmodulin-dependent kinase II phosphorylation of five serines within the SRR progressively reinforces autoinhibition in response to calcium signaling.

Using mutagenesis and quantitative DNA-binding measurements, we demonstrate that phosphorylation-enhanced autoinhibition requires the presence of phenylalanine/tyrosine (ϕ) residues adjacent to the SRR phosphoacceptor serines. The introduction of additional phosphorylated Ser-ϕ-Asp, but not Ser-Ala-Asp, repeats within the SRR dramatically reinforces autoinhibition. NMR spectroscopic studies of phosphorylated and mutated SRR variants, both within their native context and as separate trans-acting peptides, confirmed that the aromatic residues and phosphoserines contribute to the formation of a dynamic complex with the ETS domain. Complementary NMR studies also identified the SRR-interacting surface of the ETS domain, which encompasses its positively charged DNA recognition interface and an adjacent region of neutral polar and nonpolar residues.
Collectively, these studies highlight the role of aromatic residues and their synergy with phosphoserines in an intrinsically disordered regulatory sequence that integrates cellular signaling and gene expression.

We also investigated by NMR spectroscopy the interaction of Ets1 with specific and non-specific oligonucleotides. Upon binding either class of DNA, helices HI-1 and HI-2 of the IM unfold. Thus, autoinhibition does not impart DNA-binding specificity. Using amide chemical shift perturbation mapping, we also show that Ets1 binds both specific and non-specific oligonucleotides through its canonical ETS domain interface. However, the non-specific complex is formed by weak and dynamic electrostatic interactions, whereas the specific complex involves well-ordered hydrogen bonds and salt bridges. In support of this conclusion, five lysine sidechains are protected from rapid hydrogen exchange upon binding of specific DNA, whereas only one is stabilized in the non-specific complex. Overall, these data are consistent with Ets1 rapidly finding specific DNA sites within the genome via facilitated diffusion (sliding and hopping) within a vast background of non-specific sequences.

Preface

Chapter 2 is based on an extended, reformatted version of the following paper: Desjardins G, Meeker CA, Bhachech N, Currie SL, Okon M, Graves BJ, McIntosh LP. (2014) Synergy of aromatic residues and phosphoserines within the intrinsically disordered DNA-binding inhibitory elements of the Ets1 transcription factor. Proc Natl Acad Sci U S A. (PNAS) 111(30), 11019-24.

All experiments involving EMSA DNA-binding assays shown in Chapter 3 were carried out in the laboratory of Barbara Graves at the University of Utah. All mutants shown in Tables 2.1 and 2.2 were made, expressed, purified and tested for DNA binding by Charles Meeker, except for one mutant. The ΔN279 4φV mutant was cloned at UBC. Expression, purification and EMSA binding assays for this mutant were carried out by Simon Currie.
Competition DNA-binding experiments in Chapter 3 for the trans system were performed by Niraja Bhachech.

All NMR experiments pertaining to the disordered SRR were performed by myself at UBC, with the following exceptions. Desmond Lau helped to express and purify the proteins involved in the trans system NMR characterization. Mark Okon assisted in recording all NMR data. Dr. Graves, Dr. McIntosh and I wrote the paper.

Chapter 3 on DNA binding is based on experiments carried out at UBC. I expressed and purified all protein/DNA complexes shown in the chapter. Mark Okon designed new NMR pulse sequences and provided extensive help to collect the NMR data. I analyzed all the data presented in that chapter. A manuscript is in preparation.

Table of contents

Abstract
Preface
Table of contents
List of tables
List of figures
Glossary
Acknowledgements
Chapter 1: Introduction
  1.1 Regulation of gene transcription
    1.1.1 Transcription in eukaryotes
    1.1.2 Epigenetics and chromatin organisation
  1.2 ETS transcription factors
    1.2.1 ETS factors and DNA recognition
    1.2.2 Ets1
    1.2.3 Modular organization of Ets1
    1.2.4 Ets1 PNT domain and Ras-dependent signaling
    1.2.5 Autoinhibition of Ets1 DNA-binding
    1.2.6 Regulation of Ets1 autoinhibition through protein-partnership
  1.3 Transcription factors and DNA binding in vivo
    1.3.1 The specificity paradox
    1.3.2 Finding the right binding site in vivo
  1.4 Intrinsically disordered proteins
    1.4.1 Structural characterization of IDPRs
    1.4.2 IDPRs interact with other proteins
    1.4.3 From static to dynamic complexes: different levels of fuzziness
    1.4.4 Regulation of IDPRs
  1.5 Hypotheses and goals
    1.5.1 Investigating how the disordered SRR drives Ets1 autoinhibition
    1.5.2 Investigating the role of Ets1 dynamics in binding specific and non-specific DNA
Chapter 2: Synergy of aromatic residues and phosphoserines within the intrinsically disordered DNA-binding inhibitory elements of Ets1
  2.1 Introduction
  2.2 Role of charged and aromatic residues in autoinhibition
  2.3 Aromatic residues and phosphorylation mediate intramolecular interactions
    2.3.1 Chemical shift perturbation studies
    2.3.2 Paramagnetic relaxation enhancement (PRE) studies
    2.3.3 The SRR undergoes sub-nsec timescale motions that dampen slightly upon phosphorylation
  2.4 Structural characterization of the trans-system
    2.4.1 The trans-SRR system recapitulates autoinhibition
    2.4.2 The trans-SRR peptides interact with the ETS domain through phosphoserines and aromatic residues
    2.4.3 The SRR interacts in trans and in cis with the same interfaces of the ETS domain
    2.4.4 Phosphoserines and aromatic residues form the ΔN301-binding interface of the trans-SRR peptides
  2.5 Trans-SRR peptides are intrinsically disordered and form dynamic “fuzzy” complexes with ΔN301
  2.6 Coupled roles of the IM and HI-1 in autoinhibition
  2.7 Discussion
    2.7.1 Aromatic residues in the SRR are critical for phosphorylation-dependent autoinhibition
    2.7.2 Fuzzy interactions between the SRR and ETS domain
    2.7.3 Mechanisms of phosphoserine-aromatic synergy: π-cation interactions?
    2.7.4 Mechanisms of phosphoserine-aromatic synergy: intramolecular salting-out?
    2.7.5 Aromatic residues in intrinsically disordered regulatory sequences
    2.7.6 ETS factor autoinhibition
  2.8 Material and methods
    2.8.1 Unlabeled synthetic SRR peptides
    2.8.2 Protein expression and purification
    2.8.3 13C/15N-labeled SRR peptides
    2.8.4 Phosphorylation of the ΔN279 constructs and SRR peptides
    2.8.5 Electrophoretic mobility shift assays (EMSAs)
    2.8.6 NMR spectral assignments of ΔN279 4φA, ΔN279 H278/C350A/C416A and SRR peptides
    2.8.7 Backbone amide and sidechain 15N relaxation
    2.8.8 NMR-monitored peptide titrations
    2.8.9 Paramagnetic relaxation experiments (PRE)
Chapter 3: DNA binding by Ets1
  3.1 Introduction
  3.2 Preliminary characterization of Ets1/DNA complexes by NMR spectroscopy
    3.2.1 Optimizing specific Ets1/DNA complexes
    3.2.2 Optimizing non-specific Ets1/DNA complexes
  3.3 NMR spectral assignments
    3.3.1 HI-1 and HI-2 unfold upon specific DNA binding
    3.3.2 Helices HI-1 and HI-2 also unfold upon non-specific DNA binding
  3.4 Non-specific and specific DNA bind Ets1 at a similar canonical interface
  3.5 The dynamic properties of ΔN301 change upon DNA binding
  3.6 Sidechains contribute to DNA recognition
    3.6.1 Arginine and lysine sidechains at the DNA-binding interface
    3.6.2 Arginine guanidinium groups have different relaxation properties that correlate with their DNA interactions
  3.7 Discussion
    3.7.1 HI-1 and HI-2 unfold upon DNA-binding
    3.7.2 Autoinhibition does not impart DNA-binding specificity
    3.7.3 Ets1 binds specific and non-specific DNA through its canonical interface
    3.7.4 Arginine and lysine sidechains reflect differential interactions of Ets1 with specific and non-specific DNA
    3.7.5 Transitioning from non-specific to specific binding
  3.8 Material and methods
    3.8.1 Fully and partially deuterated 13C/15N-labelled ΔN301
    3.8.2 Duplex oligonucleotide preparation
    3.8.3 NMR-monitored DNA titrations
    3.8.4 NMR experiments for spectral assignments of the DNA/protein complexes
    3.8.5 Relaxation measurements
Chapter 4: Conclusions and future studies
  4.1 Autoinhibition of Ets1 through “fuzzy interactions” and aromatic/phosphoserine synergy
  4.2 Future studies on the disordered SRR
  4.3 Autoinhibition of Ets1 and the search for target site in the cell
  4.4 Future studies on the DNA binding activity of Ets1
  4.5 Ets1 as a model system for understanding intrinsically disordered regions and fuzzy regulatory interactions
Bibliography

List of tables

Table 2.1 Mutations demonstrate a critical role for SRR aromatics in ΔN279 phosphorylation-enhanced autoinhibition of DNA binding
Table 2.2 Additional Ser-φ-Asp repeats mediate higher levels of phosphorylation-dependent autoinhibition of DNA binding
Table 2.3 ΔN279 constructs made for PRE experiments with MTSL spin label
Table 2.4 Dissociation constants (KD) for the SRR peptides with ΔN301
Table 2.5 Phosphorylation and ΔN301-binding do not markedly change the X-Pro cis/trans conformational equilibria of the SRR peptides
Table 2.6 The various Ets1 constructs produced at UBC
Table 3.1 Ets1 constructs and DNA sequences used in DNA-binding studies
Table 3.2 Sequences and predicted molar absorptivities of the duplex DNAs used in Chapter 3

List of figures

Figure 1.1 Cartoon representation of transcription in eukaryotes
Figure 1.2 Epigenetic modifications determine which genes are activated or silenced
Figure 1.3 Nomenclature and domain organization of the 28 paralogous human ETS factors
Figure 1.4 Structural overview of Ets1
Figure 1.5 Autoinhibition is a regulatory mechanism that is modulated through post-translational modifications, proteolysis and protein partnerships
Figure 1.6 Dynamic model of phosphorylation-enhanced autoinhibition of Ets1 DNA binding
Figure 1.7 Autoinhibition of Ets1 is relieved by protein-partnerships
Figure 1.8 Cartoon representation of the three mechanisms to facilitate target site recognition by TFs
Figure 1.9 Examples of different types of fuzzy complexes
Figure 1.10 Sequence conservation of the disordered SRR
Figure 2.1 Schematic representation of Ets1
Figure 2.2 Overlaid 15N-HSQC spectra of ΔN279^0P, ΔN279^2P, ΔN279^0P 4ϕA and ΔN279^2P 4ϕA
Figure 2.3 Chemical shift perturbations (CSPs) define the phosphorylation-dependent intramolecular interaction surface for the wild-type and 4φA-mutant SRR within ΔN279
Figure 2.4 Electrostatic and hydrophobic/van der Waals forces contribute to the SRR-ETS domain interactions
Figure 2.5 Paramagnetic relaxation enhancement (PRE) studies define the intramolecular interaction surface within ΔN279
Figure 2.6 Phosphorylation partially dampens fast backbone motions of the dynamic SRR in ΔN279-4ϕA
Figure 2.7 Trans-inhibition of ΔN301 DNA binding by the SRR peptide requires both phosphorylation and aromatic residues
Figure 2.8 Interactions of the SRR peptides with the ETS domain measured by NMR spectroscopy
Figure 2.9 Intermolecular interactions of the trans-SRR peptides with the ETS domain are dependent upon adjacent aromatic residues and phosphoserines
Figure 2.10 Interactions of the trans-SRR peptides with ΔN301 measured by NMR spectroscopy
Figure 2.11 Assigned 15N-HSQC spectra of the (A) SRR^0P and (B) SRR^2P trans-peptides
Figure 2.12 Titration of the trans-SRR peptides with ΔN301 monitored by NMR spectroscopy
Figure 2.13 Charged and aromatic residues in the trans-SRR peptides interact with the ETS domain
Figure 2.14 The SRR is predominantly disordered, even when phosphorylated and bound as trans-peptides to ΔN301 or in the intermolecular context of ΔN279
Figure 2.15 Aromatic residues dampen fast motions in the trans SRR peptides
Figure 2.16 The inhibitory module stabilizes the folding of the ETS domain
Figure 2.17 Possible mechanisms for synergistic autoinhibition of Ets1
Figure 2.18 Positively charged arginine and lysine sidechains could interact with the SRR through π-cation interactions
Figure 3.1 Schematic view of a transcription factor interacting with a large DNA molecule in dilute solution
Figure 3.2 Formation of a specific Ets1/DNA complex
Figure 3.3 Optimization of the protein fragment and oligonucleotide size for NMR studies of the specific Ets1/DNA complex
Figure 3.4 Ets1 binds non-specific DNAs of differing length with varying affinity
Figure 3.5 Assigned amide 15N-HSQC spectrum of the specific complex (ΔN301 with SC1-12)
Figure 3.6 Assigned amide 15N-HSQC spectrum of the non-specific complex (ΔN301 with NS2-12)
Figure 3.7 Chemical shift-based prediction of the secondary structure and backbone dynamics of free ΔN301
Figure 3.8 Helices HI-1 and HI-2 unfold when ΔN301 binds SC1-12
Figure 3.9 Although predominantly unfolded, residues corresponding to helices HI-1 and HI-2 undergo conformational exchange in the ΔN301/SC1-12 complex
Figure 3.10 Helices HI-1 and HI-2 unfold when ΔN301 binds NS2-12
Figure 3.11 Ets1 binding to non-specific and specific DNAs
Figure 3.12 Amide chemical shift perturbations resulting from binding of specific and non-specific DNA to ΔN301
Figure 3.13 Mapping of ΔN301 amide CSPs due to binding of specific and non-specific DNA
Figure 3.14 Comparison of the magnitudes and signs of the ΔN301 amide CSPs resulting from binding specific versus non-specific DNA
Figure 3.15 Changes in the RCI-S2 squared order parameter of ΔN301 upon binding specific and non-specific DNA
Figure 3.16 Amide heteronuclear 15N-NOE measurements indicate that helices HI-1 and HI-2 of the inhibitory module are unfolded, but not conformationally unrestricted, in the Ets1 DNA complexes
Figure 3.17 Characterizing the interactions of ΔN301 lysine sidechains with DNA
Figure 3.18 Characterizing the interactions of ΔN301 arginine sidechains with DNA
Figure 3.19 Dynamic properties of arginine sidechains upon binding of ΔN301 to specific and non-specific DNA
Figure 3.20 The heteronuclear {1H}15N-NOE spectra of ΔN301 bound to specific DNA and non-specific DNA
Figure 4.1 Investigation of the physical mechanism driving the synergistic interaction between the aromatic residues and phosphoserines with the trans-system

Glossary

CamKII: calmodulin kinase II
CBP: CREB binding protein
CD: circular dichroism
D2O: deuterium oxide
Da: Dalton
DNA: deoxyribonucleic acid
DTT: dithiothreitol
EDTA: ethylenediaminetetraacetic acid
EMSA: electrophoretic mobility shift assay
ESI-MS: electrospray ionization mass spectrometry
ETS: E26 transforming sequence
ERK2: mitogen activated protein kinase 1
FRET: Förster resonance energy transfer
Gd(DTPA-BMA): gadolinium(III) 5,8-bis(carboxylatomethyl)-2-[2-(methylamino)-2-oxoethyl]-10-oxo-2,5,8,11-tetraazadodecane-1-carboxylate hydrate (i.e. gadodiamide or Omniscan)
HAT: histone acetylase or histone acetyl transferase
HDAC: histone deacetylase
HSQC: heteronuclear single quantum coherence
IDPR: intrinsically disordered protein region
IM: inhibitory module
IPTG: isopropyl-β-D-thiogalactopyranoside
ITC: isothermal titration calorimetry
KD: dissociation constant
MALDI-TOF: matrix-assisted laser desorption/ionization-time of flight
MES: 2-(N-morpholino)ethanesulfonic acid
MICS: motif identification from chemical shift
MTSL: S-(1-oxyl-2,2,5,5-tetramethyl-2,5-dihydro-1H-pyrrol-3-yl)methyl methanesulfonothioate
NMR: nuclear magnetic resonance
NOE: nuclear Overhauser effect
NOESY: nuclear Overhauser effect spectroscopy
PCR: polymerase chain reaction
PDB: Protein Data Bank
PNT domain: pointed domain
PTM: post-translational modification
PRE: paramagnetic relaxation enhancement
RCI-S2: chemical shift based prediction of the model-free order parameter
SAXS: small angle X-ray scattering
SDS PAGE: sodium dodecyl sulfate polyacrylamide gel electrophoresis
SRR: serine-rich region
SUMO: small ubiquitin related modifier
TF: transcription factor
TOCSY: total correlation spectroscopy
TROSY: transverse relaxation-optimized spectroscopy
UBC9: SUMO-conjugating enzyme 9
wHTH: winged helix-turn-helix motif

Acknowledgements

I would first like to thank my research supervisor, Lawrence McIntosh, for having me in his lab.
Under his direction, I have learned to tackle new challenges and have become a more accomplished scientist in general, and NMR spectroscopist in particular. I would also like to thank Mark Okon for always being willing to explain (and repeat) NMR subtleties and pulse sequences. Your help was invaluable. Furthermore, I wish to thank all the members of this lab, past and current (Hanso, Helen, Jerome, Patrick, Adrienne, Desmond, Cecilia, Soumya, Jacob, Florian, Laura and Myriam), for always being there to discuss science and other aspects of life, whether in the lab or over a beer. You all made these years in Vancouver interesting and full of laughter. So thank you again for making this a great time of my life! I would like to especially thank Eric Escobar for being such a great friend and co-worker. You really made a difference by listening to me rant about science and then showing me that I still like science after all these years.

Finally, I wish to thank Simon, my partner in life, and Stephanie, my daughter, for sharing me with science. You have both been very patient and always full of encouragement, especially when I was exasperated by science. I must mention at this point that Simon has always been supportive of my endeavours and is always there when I need a pep talk. He also reminds me regularly why I am doing a Ph.D. and tells me how awesome I am. Thank you to all the members of our family for traveling regularly to see me and for being there when it matters.

These studies, and especially this thesis, were possible because of the significant financial contributions made by various organizations. I sincerely wish to thank the Canadian Institutes of Health Research (CIHR), the Canadian Cancer Society (CCS), the Faculty of Graduate and Postdoctoral Studies (FOGS+PS), as well as the Department of Biochemistry and Molecular Biology of UBC for their financial support.
Thank you everyone for all your support and encouragement; they made all the difference.

Chapter 1: Introduction

1.1 Regulation of gene transcription

Transcription is a process that is inherent to all cells. At any given time, genes are read and transcribed into mRNA to maintain cell homeostasis and respond to external and internal stimuli. Transcription is a complex process that is tightly regulated by the interplay of a myriad of proteins known as transcription factors (TFs). The term transcription factor encompasses all proteins involved in the regulation of gene expression. Due to their critical roles in regulating specific gene expression, about 10% of all proteins expressed in eukaryotes are TFs. Many TFs bind specific genomic sites, such as promoters or enhancers, through their DNA-binding domains and interact with other components of the transcriptional machinery through protein-protein interaction (PPI) domains. TFs also respond to signalling cascades and thereby help integrate cellular cues with the activation or repression of gene expression. Transcription regulation involves many complex mechanisms that are required for normal gene expression, and the disruption of these mechanisms frequently leads to cancer.

1.1.1 Transcription in eukaryotes

Eukaryotic genes are transcribed by three different polymerases, but RNA polymerase II (RNA pol II) is responsible for transcription of most mRNAs, small nucleolar RNAs (snoRNAs), some small nuclear RNAs (snRNAs), and microRNAs. Transcription levels are modulated by protein-DNA associations, protein-protein interactions and post-translational modifications in response to signaling events in the cell.

Eukaryotic gene transcription requires the interaction of several proteins with DNA, as well as with other proteins: the general transcription factors for polymerase II (TFII), gene-specific transcription factors, co-activators and co-repressors.
Transcription is initiated when transcription factors recognize specific DNA sequences termed core promoter elements, such as the TATA box, the Initiator element (Inr) and the TFIIB recognition element (BRE) (Figure 1.1A). These core promoter elements are located near the start site (+1) of the gene being expressed. Promoter proximal elements, which are located 70-200 bp 5′ of the core promoter, generally bind different transcription factors that increase the frequency of transcription initiation events. This increase in transcription can be mediated directly by these transcription factors or by recruiting enhancer elements through long-range interactions.

Figure 1.1: Cartoon representation of transcription in eukaryotes. (A) Typical gene organization in eukaryotes includes enhancers and silencers located upstream of the promoter. The proximal promoter recruits transcription factors that modulate the initiation frequency, whereas the core promoter, which is near the start site for transcription, is used to recruit the transcription preinitiation complex (PIC). All these elements are located between insulator regions to prevent cross-activation or repression between nearby genes. (B) TBP, a subunit of TFIID, recognizes the TATA box element in the core promoter region and in turn favors sequential binding of TFIIA, TFIIB, TFIIF and RNA pol II. TFIIE and TFIIH are then recruited along with the mediator complex to form the PIC and initiate transcription. Specific transcription factors such as AP-1 and Ets1 are recruited to proximal promoters and distal enhancers. In turn, they recruit co-activators or co-repressors, such as p300/CBP, and remodeling complexes to enable lineage-specific programs. Upon hyperphosphorylation of the C-terminal domain (CTD) of RNA pol II (red circles with a P), RNA pol II clears the promoter and initiates the elongation phase of transcription. Dark green ovals encircled by a black line are histones.

Enhancer regulatory elements can be located up to 100 kb from the transcription initiation site and are instrumental in mediating gene expression in different cell types (Figure 1.1A). Long-range regulatory elements in eukaryotes include enhancers, silencers and insulators. Enhancers are around 500 bp in length and can be located upstream, downstream or within an intron. They contain about 10 binding sites for different transcription factors and increase gene promoter activity either in all tissues or in a tissue-specific manner. Silencers are similar elements that repress gene activity. Insulators, which are 300 bp to 2 kb in length, are regions that mark the boundaries between euchromatin and heterochromatin and insulate genes from one another to prevent cross-interaction between enhancers and silencers of different genes.

During initiation, TFIID recognizes core promoter elements such as the TATA box, and the resulting complex serves as a platform to sequentially recruit TFIIA, TFIIB, TFIIF, RNA polymerase II and several other transcription associated factors (TAFs), including the mediator complex (Figure 1.1B). The recruitment of these general transcription factors stabilizes the TFIID-TATA box interaction and facilitates start site selection. This, in turn, allows the recruitment of TFIIE and TFIIH; the latter phosphorylates the C-terminal domain (CTD) of RNA pol II and unwinds the DNA to separate the two strands and create the transcription bubble. Altogether, these transcription factors form the pre-initiation complex (PIC).
Upon hyperphosphorylation of several serines in the heptapeptide repeats of the CTD, RNA pol II clears the promoter and initiates the elongation phase of transcription. TFIID dissociates from RNA pol II 10 bases downstream of the start site and remains bound to DNA with most of the transcription factors in the PIC. This dissociation allows the PIC to be reassembled quickly and to initiate new rounds of transcription. The level of transcription is further modulated by post-translational modifications and the presence of co-activators or co-repressors at enhancer and distal sites. The expressed mRNAs are then exported to the cytosol and translated by ribosomes into folded, active proteins.

1.1.2 Epigenetics and chromatin organisation

Because the extensive genome (e.g., 3.3 billion bp for humans) must be compacted into a very small nucleus, DNA is wrapped around histone octamers to form nucleosomes. These in turn are supercoiled to form chromatin filaments and maximize DNA condensation. Nucleosomes are not equally distributed in chromatin and are dynamically repositioned to increase or decrease the expression of genes specific to each cell lineage program. Moreover, during elongation, nucleosomes are removed or repositioned by remodelling complexes such as SWI/SNF to avoid stalling of transcription by RNA pol II. As a result, most of the genome spends its time tightly packed as heterochromatin and is generally silenced and unavailable for transcription (Figure 1.2A).

Figure 1.2: Epigenetic modifications determine which genes are activated or silenced. (A) Chromatin is inaccessible to TFs when supercoiled into the heterochromatin state (right) and thus genes are silenced. In contrast, they are accessible and can be transcribed when less tightly coiled in the euchromatin state (left).
(B) Histone acetyltransferases (HATs) modify lysines in the disordered tails of histones, which in turn recruit remodelling complexes to modify chromatin structure. (C) Upon opening of the DNA, promoter segments, including the TATA box, become available and can be bound by specific and general TFs to initiate or repress gene transcription. (D) Pioneer transcription factors (PTFs) can also bind to condensed DNA and initiate gene transcription. Adapted by permission from Macmillan Publishers Ltd: Nature Reviews Genetics, Bell et al., 2011, 12(8), 554-564 © 2011

In contrast, a relatively small fraction exists as transcriptionally-active euchromatin. Epigenetic modifications, such as DNA methylation and histone acetylation, methylation and phosphorylation, play an important role in determining which regions become accessible in different cell lineages and in response to different signalling events. Typically, histone acetyltransferases (HATs) acetylate histones at key positions in their unstructured tails to loosen the heterochromatin (Figure 1.2B). This change in chromatin structure opens up regions that contain binding sites for various transcription factors (Figure 1.2C). In turn, these factors bind and recruit the transcription machinery. Conversely, histone deacetylases (HDACs) remove acetyl groups from lysines to compact the DNA around histones and repress transcription. Histone methyltransferases (HMTs) similarly modify lysines in the histone tails, but they can either repress or stimulate transcription depending on the sites being modified. The regulation of gene expression by such heritable modifications is termed epigenetic.

Recent studies have demonstrated that some transcription factors can bind at specific enhancer sites, even when the chromatin is condensed (Figure 1.2D). Such TFs are referred to as pioneer transcription factors and can act through passive and/or active effects.
Their passive mode of action involves the cooperative recruitment of other transcription factors to enhancers, which decreases the response time to signalling events by reducing the amount of TF necessary to create an active enhancer. In the active mode, pioneer TFs can modify histones post-translationally, destabilize higher-order chromatin structure or recruit remodelling complexes to “relax” the DNA at enhancer sites. They can also recruit specific co-activators that activate lineage-specific transcriptional programs [1] and differentiation. PU.1, a transcription factor of the ETS family, has been identified as a pioneer TF and is necessary to activate the differentiation of haematopoietic stem cells into the B-cell or macrophage lineage [2].

Transcription factors are often members of families sharing conserved DNA-binding domains such as basic leucine zippers, zinc fingers, homeodomains, and winged helix-turn-helix motifs. The ETS family, which is studied in this thesis, belongs to the latter.

1.2 ETS transcription factors

The ETS (E26 transformation specific or simply E-twenty six) family is one of the largest families of transcription factors found in higher metazoans. Humans have 28 paralogues defined by the presence of a conserved DNA-binding ETS domain [3] (Figure 1.3). ETS transcription factors are involved in regulating diverse processes including cellular proliferation, differentiation, apoptosis, angiogenesis, and haematopoiesis, and their aberrant activities are associated with metastatic invasion and the migration of tumors. Indeed, over 700 target genes contain ETS binding sites (EBS) in their regulatory regions [3]. Genome-wide analyses also demonstrated that different ETS factors have both redundant functions to control the expression of housekeeping genes and diverse functions associated with more specific cellular processes [4, 5]. This diversity is consistent with knockout mice having different phenotypes depending on which ETS factor is modified.
Indeed, 23 of the 27 murine ETS genes have been genetically altered and all, except for Elf1 and Elk1, show different phenotypes. These studies also confirmed that ETS factors are at the core of the gene regulatory networks controlling, in particular, hematopoietic specification, maintenance and differentiation [6, 7].

Figure 1.3: Nomenclature and domain organization of the 28 paralogous human ETS factors. The proteins are grouped into sub-families based on phylogenetic comparisons [8]. The ETS DNA binding domains are depicted in red and the serine rich regions (SRRs) in yellow. The PNT domains, which are present in about 1/3 of the family members, are in green, the OST domain in blue, and the B boxes in pink. Phosphorylated regions are identified schematically by the circled letter P and can contain more than one site of modification. Not shown are sites of additional PTMs such as acetylation and sumoylation. The sub-family containing Ets1, the subject of this thesis, is boxed. Ets1 and the closely related Ets2 are the only two ETS factors possessing an appended SRR. Factors shown in the figure: FEV, FLI1, ERG, ERF, ETV3 (PE1, PEP1, METS), ETV3L, ETV2 (ER71), ELF1, ELF2 (NERF), ELF4 (MEF), ELF3 (ESE1, ESX), ELF5 (ESE2), EHF (ESE3), SPDEF (PDEF), GABPA, ELK1, ELK3 (NET, SAP2, ERP), ELK4 (SAP1), SPIB, SPIC, SPI1 (PU.1, Sfpi1), ETS1, ETS2, ETV6 (TEL), ETV7 (TEL2), ETV1 (ER81), ETV4 (PEA3, E1AF) and ETV5 (ERM).

Considering the myriad of critical functions affected by ETS factors, it is not surprising that alterations in their regulation are often associated with disease, including oncogenesis and cancer progression. Aside from the pivotal role played by ETS factors in hematopoiesis, several chromosomal rearrangements that fuse the promoter region of TMPRSS2 (an androgen-responsive, prostate-specific gene) to the genes encoding the DNA-binding domain of ERG and PEA3 subfamily members (ETV1, ETV4 and ETV5) are found in over 60% of all prostate tumors.
The resulting high-level expression of these ETS factors drives oncogenesis [9-11]. Alternatively, fusions of either the PNT or ETS domain to other regulatory domains through chromosomal translocations have been identified and yield chimeric oncoproteins, as seen with ETV6-associated leukemia and Ewing's sarcoma [12-14]. The loss of proper DNA-binding regulation via mutations in auto-inhibitory domains, such as seen in viral v-ets, also leads to oncogenic activity [15]. Ultimately, all these alterations of ETS transcription factors deregulate the expression level of ETS-dependent genes. Since their aberrant expression is often involved in cell immortalization, cell cycle disruption, metastasis and tumour migration, the overexpression of ETS factors is generally linked with a poor prognosis for cancer survival [16-21].

1.2.1 ETS factors and DNA recognition

All proteins in the ETS family contain a highly conserved DNA-binding ETS domain composed of ~85 amino acids that fold into three α-helices and four antiparallel β-strands. The resulting fold is often called a winged helix-turn-helix (wHTH) as helices H2 and H3 form a classic HTH DNA-binding motif and the intervening turn and a β-hairpin provide flanking "wings." DNA footprinting indicates that ETS domains bind a region spanning 12 to 15 bp, but display sequence preference for only ∼9 bp with a central, invariant 5'-GGA(A/T)-3' core. Numerous crystal and NMR structures of ETS factors have been solved and illustrate how ETS domains bind DNA [22-28]. The DNA is recognized both by direct readout, which involves base-pair contacts by ETS domain sidechains, and indirect readout, which utilizes the sequence-dependent positioning of the phosphodiester backbone. Most importantly, direct interactions in the major groove involve hydrogen bonding between two invariant arginines in the DNA recognition helix (H3) and the two guanines of the 5'-GGA(A/T)-3' core.
Depending on the ETS factor, a conserved tyrosine or histidine in H3 also contacts the A/T base and confers specificity [29]. The loop (also known as the “turn”) between helices H2 and H3, the “wing” between strands S3 and S4, and the N-terminus of helix H1 also provide additional direct and water-mediated hydrogen bonding, hydrophobic, and electrostatic interactions with the phosphate backbone. Strikingly, there are no direct base contacts outside of the central 5'-GGA(A/T)-3' core, thus implicating indirect, sequence-dependent backbone configurations in conferring the limited specificity of different ETS domains for the DNA sequences flanking this central core. DNA specificity is also provided through many additional routes, including appended autoinhibitory modules, cell lineage-specific expression, post-translational modifications and combinatorial partnerships with other transcription factors.

1.2.2 Ets1

Ets1 is the founding member of the ETS transcription factors. It contains 440 residues and is highly conserved, with ~80-90% sequence identity between amphibians/fish and mice/humans. This similarity is conserved not only for the ordered ETS and PNT domains, but throughout the remainder of the protein, which is predominantly disordered. Such a high level of sequence similarity indicates that Ets1 is under evolutionary pressure to conserve its overall sequence and that it possesses necessary non-redundant functions relative to other ETS factors [7]. The function of Ets1 is illustrated in part by knockout mouse models, which have reduced numbers of B, T and NK cells, with partial lethality during embryogenesis [3, 6, 7, 30]. Ets1 therefore plays an important role in embryonic and post-natal development, as well as in the control of haematopoiesis, especially the differentiation of B and T cells. Ets1 is expressed at high levels during embryonic and post-natal development and, in adults, in immune tissues such as the spleen, thymus and lymph nodes.
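
As an illustration of the core-motif recognition described in section 1.2.1, the following minimal Python sketch scans both strands of a DNA sequence for the invariant 5'-GGA(A/T)-3' core. It is not part of the analyses in this thesis: the function names and the example sequence are hypothetical, and the search considers only the four-base core, ignoring the flanking bases, indirect readout and protein partnerships that shape binding in vivo.

```python
import re

# Invariant ETS core described in the text: 5'-GGA(A/T)-3'
CORE = re.compile(r"GGA[AT]")

# Translation table used to build the complementary strand
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_core_sites(seq: str):
    """List every GGA(A/T) core on either strand.

    Each hit is (strand, start, core), where start is the 0-based index of
    the leftmost top-strand base covered by the four-base core.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    # Top strand, read 5'->3' as given
    for m in CORE.finditer(seq):
        hits.append(("+", m.start(), m.group()))
    # Bottom strand: search the reverse complement, then map the
    # position back onto top-strand coordinates
    for m in CORE.finditer(reverse_complement(seq)):
        hits.append(("-", n - m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])


if __name__ == "__main__":
    # Hypothetical sequence carrying one core on each strand
    for strand, start, core in find_core_sites("ACCGGAAGTTTCCGG"):
        print(strand, start, core)
```

For the hypothetical sequence ACCGGAAGTTTCCGG, the sketch reports one core on each strand, whereas a non-consensus core such as GGAG is not matched.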

1.2.3 Modular organization of Ets1

The modular organization of Ets1 is shown schematically in Figure 1.4. In addition to the ETS domain, the protein contains a helical bundle referred to as the PNT domain that mediates protein interactions. The ETS domain is flanked on both sides by auto-inhibitory sequences that form an appended helical inhibitory module (IM) on the surface opposite the DNA-binding helix. The IM and an intrinsically disordered serine-rich region (SRR) at its N-terminus both contribute to the autoinhibition of DNA binding, as discussed in detail below. Structures have been solved for the PNT domain (Figure 1.4B) and the ETS domain with and without the IM (Figure 1.4C) [31-35]. Secondary structure prediction software indicates that the remainder of the protein is intrinsically disordered, but this has not yet been confirmed. These flexible regions contain several sites of post-translational modification, including phosphorylation, sumoylation, acetylation, ubiquitinylation and glycosylation [30, 36-42]. So-called transactivation domains have also been mapped to various regions of Ets1, although the only one studied in detail involves the N-terminal region. Along with the PNT domain, these N-terminal residues bind the co-activator CBP [31, 38, 41].

Figure 1.4: Structural overview of Ets1. (A) Ets1 contains a PNT domain (orange, green), serine-rich region (SRR; yellow), inhibitory module (IM; cyan) and ETS domain (red) that are structurally characterized. Ets1 is post-translationally modified at several sites to regulate its functions. (B) The PNT domain is composed of two α-helices (orange) appended to a core helical bundle often called a SAM domain (green; PDB ID 2JV3). In response to Ras signaling, the MAP kinase ERK2 phosphorylates T38 and S41.
This causes H0 (orange) to be displaced from the core of the PNT/SAM domain, increasing its affinity for the TAZ1 domain of the co-activator histone acetyltransferase CBP and leading to enhanced transcription of Ets1-regulated genes. (C) The IM, which contains four α-helices (HI-1/HI-2/H4/H5; cyan), is appended to the ETS domain (red). The marginally stable HI-1 and HI-2 pack on the surface distal to the DNA-binding interface and allosterically regulate DNA binding. A disordered SRR (not shown) also stabilizes the IM and sterically blocks the DNA-binding interface of the ETS domain. In response to Ca²⁺ signaling, CaMKII progressively phosphorylates S251, S270, S273, S282 and S285 to reinforce autoinhibition and repress gene expression at the level of DNA binding. The inhibition levels are indicated in (A) for each construct compared to the uninhibited construct (ΔN331), as measured by EMSA [40, 43]. In addition, K15 and K227 are sumoylated by Ubc9, resulting in transcriptional repression. Boxes represent α-helices; arrows, β-strands; P, phosphoacceptors; φ, Tyr/Phe.

1.2.4 Ets1 PNT domain and Ras-dependent signaling

Ets1 and the closely related Ets2 are both activated by a Ras-MAPK signaling cascade that culminates in phosphorylation-enhanced binding of the acetyltransferase co-activator CBP to the PNT domain [44]. In turn, CBP recruits the general transcriptional machinery and contributes to chromatin remodeling, which changes the expression level of specific sets of genes [45].
The monomeric PNT domains of Ets1 and Ets2 regulate CBP binding through two appended helices and an adjacent disordered region containing two phosphoacceptor sites (Thr38 and Ser41) (Figure 1.4B). In Ets1, helix H0, which is marginally stable and conformationally dynamic, is displaced from the core bundle upon phosphorylation of the neighboring Thr38/Ser41 by the MAPK ERK2 (extracellular signal-regulated kinase 2) [31]. This increases the affinity of Ets1 for the TAZ1 domain of CBP (Kd decreases from ~60 to ~2 µM). The two proteins interact through complementary charged surfaces encompassing H0 and the adjacent phosphoacceptors of Ets1. Thus, phosphorylation enhances CBP binding by shifting the conformational equilibrium of the PNT domain from a "closed" to an "open" state, augmented by increased electrostatic interactions with the positively-charged TAZ1 domain [46]. Preliminary studies suggest similar "phospho-switch" mechanisms for Ets2 and the Drosophila ortholog Pnt-P2 [47].

1.2.5 Autoinhibition of Ets1 DNA-binding

Autoinhibition is a common mechanism for controlling biological processes that arises through intramolecular interactions between inhibitory regions and functional domains within a protein [48]. As such, it provides effective “on-site” repression. Several mechanisms can counteract autoinhibition, including proteolysis, post-translational modifications, and protein partnerships (Figure 1.5). Autoinhibition is frequently observed in TFs, and often links the regulation of transcription with cellular signaling events. Not surprisingly, the alteration of autoinhibition is often associated with disease.

Figure 1.5: Autoinhibition is a regulatory mechanism that is modulated through post-translational modifications, proteolysis and protein partnerships. (A) An auto-inhibitory domain modulates the activity of a second, separate functional domain (center).
Autoinhibition can be counteracted or reinforced by post-translational modifications such as phosphorylation (left), by association with a second intramolecular domain or a distinct protein partner (right), or through proteolysis (below). Reprinted from Pufall et al, Annu. Rev. Cell Dev. Biol. 2002, 18, 421-462 with permission from Annual Reviews. Copyright 2014 Annual Reviews. All rights reserved.

The DNA-binding activities of several ETS factors are repressed intramolecularly by appended modules flanking their conserved ETS domains [28, 29, 49, 50]. However, the underlying molecular mechanisms are generally poorly understood [25, 28, 29, 38, 40, 43, 48-50]. Ets1 is one of the few proteins for which autoinhibition has been extensively studied both in vivo and in vitro. The minimal ETS domain of Ets1 recognizes the 5'-GGA(A/T)-3' core sequence with ~10 pM affinity. This affinity is attenuated by the presence of a flanking inhibitory module (IM) appended on the surface opposite the DNA-binding interface. The IM is formed by four α-helices (HI-1/HI-2/H4/H5), of which HI-1 and HI-2 are marginally stable and unfold upon DNA binding [51] (Figure 1.6). Thus, the energetic penalty of an allosteric conformational transition reduces the net affinity of Ets1 for DNA.

Figure 1.6: Dynamic model of phosphorylation-enhanced autoinhibition of Ets1 DNA binding. Ets1 is postulated to exist in a conformational equilibrium between an inactive rigid state (A, B) and a flexible active state (C). The latter is characterized by a disrupted inhibitory module and has a higher affinity for DNA. When HI-1 and HI-2 are folded to form an intact IM, the SRR interacts transiently with the DNA-binding interface, reducing its flexibility and stabilizing the inhibitory module. This serves to both sterically and allosterically inhibit DNA binding.
(A) Upon phosphorylation (pink circles) by CaMKII, autoinhibition is reinforced due to increased interactions of the SRR with the ETS domain. (D) Upon binding to DNA, HI-1 and HI-2 both unfold. High-resolution X-ray crystallographic and NMR spectroscopic studies have demonstrated that the structure of the core ETS domain does not change in any substantial way when complexed with oligonucleotides containing the 5'-GGA(A/T)-3' motif. The ETS domain appears in red, the IM in cyan, and dotted gold lines represent the disordered SRR in several possible (and highly schematic) conformations. Only two of the five phosphoserine acceptors in the SRR are shown.

Somewhat surprisingly, the IM itself only imparts ~2-fold autoinhibition. The full ~20-fold effect observed for wild-type Ets1 requires the presence of an intrinsically disordered region (the SRR) that contains ~60 residues N-terminal to the IM. NMR spectroscopic measurements indicate that the SRR is flexible and interacts transiently with the ETS domain to both block the DNA-binding interface and stabilize the IM [43]. The interaction surface on the IM and ETS domain was identified by paramagnetic relaxation enhancement (PRE) measurements with a Cu²⁺ ion bound to a Gly-Ser-His (ATCUN) motif at the start of the SRR [43]. In addition, disruption of the IM by mutation abrogates the effect of the SRR, indicating that they act together to modulate DNA binding [38]. Thus, autoinhibition of Ets1 has both steric and allosteric origins and exemplifies a functionally important "fuzzy complex" (see section 1.4.3).

Autoinhibition provides a mechanism to regulate Ets1 at the level of DNA binding. In response to Ca²⁺ signaling, CaM kinase II (CaMKII) can phosphorylate up to five serines within the SRR.
Progressively higher levels of multi-site phosphorylation increase autoinhibition to a total of ~500-fold [40]. This leads to the control of Ets1 DNA-binding activity by a "dimmer switch", rather than an all-or-none, mechanism [52, 53].

Another key feature of Ets1 autoinhibition is the importance of dynamics in shifting the equilibrium from the inactive to the DNA-binding competent state. As evidenced by facile amide hydrogen exchange (HX), the inhibitory helices HI-1 and HI-2 are only marginally stable and thus poised to unfold [32]. The DNA recognition helix H3 also shows limited HX protection, indicative of significant local flexibility. As hypothesized below, this flexibility may be required in order to "scan" non-specific DNA and adopt a high-affinity state upon recognition of a specific DNA sequence [54]. Using additional NMR relaxation methods, a network of residues bridging the IM to the DNA-binding interface was discovered. These residues undergo msec-µsec timescale motions, which are gradually dampened with increasing autoinhibition due to SRR phosphorylation [40]. These experiments led to the formulation of a model in which Ets1 exists in a conformational equilibrium between flexible active and rigid inactive states (Figure 1.6). This equilibrium is shifted towards the inactive state via transient phosphorylation-dependent interactions of the flexible SRR with a surface of Ets1 encompassing the dynamic network.

In addition to being controlled by Ca²⁺-dependent CaMKII phosphorylation, autoinhibition of Ets1 is regulated through alternative splicing, which eliminates the SRR-modulated conformational equilibrium. For example, the oncogenic version of Ets1 found in the p135 oncoprotein of the E26 retrovirus has the C-terminal inhibitory sequences deleted, thus relieving autoinhibition [15]. Its higher DNA-binding activity likely contributes to oncogenesis [15].
Likewise, an alternatively-spliced Ets1 isoform, which lacks the N-terminal inhibitory sequences encoded by exon VII, binds DNA with higher affinity and slightly broader specificity [55]. Thus, an extensive body of work demonstrates that the DNA-binding activity of Ets1 is regulated by post-translational modification and alternative splicing.

1.2.6 Regulation of Ets1 autoinhibition through protein-partnership

The activity of Ets1 is further modulated through protein partnerships. For example, Ets1 has been shown to interact with several TFs, including RUNX (AML-1), PAX5, Pit-1, HIF-α, and GATA-1, as well as with the co-activator CBP [30]. The interaction of Ets1 with these proteins allows its recruitment to non-consensus low-affinity sites, as well as composite binding sites. Several partnerships, including those with PAX5, USF, TF3E, RUNX1, and Ets1 itself, also relieve autoinhibition to activate different sets of genes [34, 56, 57]. This leads to higher-affinity interactions, synergistic activation, or repression of specific target genes. The mechanisms by which partners counter autoinhibition have only been characterized for Ets1, PAX5, and RUNX1.

PAX5 is a well-known example of a cooperative partner transcription factor that alters the specificity of Ets1. In the presence of PAX5, Ets1 binds a low-affinity 5'-GGAG-3' site within composite DNA sites [23]. Ets1 is unable to bind this non-consensus sequence to any measurable extent without PAX5. The crystal structure of the PAX5/Ets1 complex on an oligonucleotide corresponding to the mb-1 promoter indicated that HI-1 is unfolded (Figure 1.7 A-B). Furthermore, cooperative binding with PAX5 causes a conformational change in the sidechain of Tyr395 within the recognition helix H3 of the ETS domain, enabling binding to the non-consensus site. This conformational change is caused by Gln22 in PAX5, which hydrogen bonds with the side chains of Tyr395 and Gln336.
Taken together, this partnership changes the specificity of Ets1, allowing its recruitment to a composite site that Ets1 cannot bind on its own.

Figure 1.7: Autoinhibition of Ets1 is relieved by protein partnerships. Crystal structures of (A) the ETS domain bound to a consensus 5'-GGAA-3' DNA oligonucleotide (PDB 1K79), (B) the Ets1/PAX5 complex bound to the mb-1 promoter (PDB 1MDM), (C) the Ets1/Runx1 complex bound to the TCRα enhancer (PDB 4L0Z) and (D) the Ets1/Runx1/CBFβ complex bound to the TCRα enhancer (PDB 3WTS). The ETS domain is colored in red, the inhibitory module in cyan when present in the structure, PAX5 in purple, Runx1 in magenta, and CBFβ in orange.

The best studied example of an Ets1 partnership involves RUNX. The cooperative binding of these proteins to sites such as the TCRα enhancer relieves the autoinhibition of both proteins. Recent crystal structures of Ets1/Runx1 and Ets1/Runx1/CBFβ complexes with oligonucleotides corresponding to the TCRα composite enhancer revealed that Ets1 autoinhibition is relieved because the ETS interaction domain (EID) of Runx1 physically interacts with H1 of the ETS domain and partly occupies the space where HI-1 and HI-2 normally pack on the ETS domain (Figure 1.7 C-D) [33, 58].

1.3 Transcription factors and DNA binding in vivo

Two long-standing questions about TFs in general can be cast in terms of the ETS factors. First, how do 28 closely-related ETS factors with highly similar ETS domains that bind conserved 5'-GGA(A/T)-3' sites execute specific functions in a cell? And second, how do these factors find their target sequences within a background of almost 3.3 billion base pairs of non-specific DNA?

1.3.1 The specificity paradox

Many research groups have addressed the question of how TFs of the same family execute diverse biological functions.
As exemplified by the ETS factors, such TFs typically possess highly conserved DNA-binding domains that bind in vitro to similar consensus DNA sequences with similar affinities. Indeed, TF families are defined by their binding domains and their target sites are identified based on consensus sequences. However, in vivo, these TFs can bind at very specific (and often non-consensus) sites as required to control unique sets of genes. This disparity between the requirement for distinct activities in vivo and the similarity of DNA recognition in vitro presents a specificity paradox. Part of the solution to this paradox lies in the indirect readout of variant DNA sequences by different family members, combined with phenomena such as autoinhibition. Expression in different cell types and combinatorial recruitment to low-affinity sites by various protein partnerships, often in response to post-translational modification via cellular signalling cascades, also partially explain this paradox. However, additional factors, including differences in protein dynamics across family members, could also play a role in the specificity of gene expression.

1.3.2 Finding the right binding site in vivo

Another long-standing question in gene regulation is how TFs can “scan” a vast sea of non-specific DNA sequences (for which they usually have µM affinity) to rapidly find a small number of specific target sites with sub-nM affinity. It is generally accepted that non-specific DNA both buffers the concentrations of TFs available for specific DNA binding (thereby enhancing the specificity of transcription) and aids in the search process via "facilitated diffusion". That is, the TF undergoes 1D sliding along non-specific DNA, combined with 3D hopping and intersegment transfer to jump from one region of DNA to another proximal or distal region (Figure 1.8) [59].
The first process allows the TF to rapidly "scan" DNA in a relatively localized manner, whereas the latter allows a more coarse-scale search. Key to this model is the observation that TFs bind to non-specific DNA primarily through electrostatic interactions with its phosphodiester backbone (as required for sliding along the equipotential electrostatic surface of DNA), whereas specific DNA is recognized primarily through complementary protein-base hydrogen bonding (i.e., direct read-out).

Figure 1.8: Cartoon representation of the three mechanisms that facilitate target-site recognition by TFs. (A) Sliding, or a random walk by 1D diffusion, of a TF allows a rapid intramolecular search over a relatively short region of DNA. (B) Direct transfer of a TF from one segment of DNA to another occurs through formation of a bridged complex without dissociation into free solution (middle, top). TFs with more than one DNA binding domain can transfer to a second DNA strand while the first DNA binding domain remains attached (not shown). (C) Hopping involves dissociation of the TF from one DNA segment, followed by 3D diffusion and then re-association on a different DNA segment. Jumping and direct transfer involve intermolecular translocation events. DNA strands are colored in yellow and red to show the initial and final location on a DNA strand.

Recent experimental advancements have improved our understanding of the relative contributions of direct intersegment transfer, jumping, and sliding in the DNA-search problem [60]. Well characterized systems, such as HoxD9, Egr1, Oct1, Sox2 and the lac repressor, have been used to measure the speed and efficiency of these various mechanisms.
Indeed, elegant fluorescence and PRE-based experiments showed that the HoxD9 homeodomain diffuses in a spiral motion along the phosphate-sugar backbone of non-specific DNA and that direct intersegment transfer (i.e., without going through a free intermediate) is the predominant mechanism for jumping to different DNA sites [61-64]. In a similar fashion, Egr-1, Sox2 and Oct1 use mainly intersegment transfer, followed by sliding on the DNA to find their target sites [65-67]. Both Egr-1 and Oct1 have bipartite DNA-binding modules, which facilitate intersegment transfer as one module can remain bound to one DNA strand, while the other contacts a second strand.

As for 1D sliding, NMR studies comparing the broadening and lineshape of NMR signals from the lac repressor in complex with different sequences have also been used to estimate its diffusion constant on non-specific DNA. Contrary to previous publications, the authors of this study argue that 1D diffusion likely plays a minor role in the DNA search process [68]. Indeed, recent findings indicate that efficient sliding may only last for ~50 bp before the TF dissociates from the DNA strands. NMR and single-molecule fluorescence have been used to calculate the speed at which TFs can diffuse on DNA (D1). However, the results seem to vary significantly based on the methods used. Most estimates suggest that TFs diffuse on DNA at ~3.0 × 10⁻⁴ to 4.6 × 10⁻⁶ bp² s⁻¹ [65, 69-73], which is slower than the diffusion limit. Thus, 1D sliding does contribute to the search for specific sites on DNA, but plays a lesser role compared to intersegment transfer.

Interestingly, recent computational studies further suggest that disordered positively charged tails in TFs can interact with DNA and contribute to hopping from one strand to the other [74].
These new findings help to shed light on the physical mechanisms involved in accelerating the search in vivo for specific sites by TFs, while providing new ways to quantify the interplay between the three different mechanisms. It will be interesting to see in the future how auto-inhibitory modules and disordered regions affect the various mechanisms to find target sites in vivo.

1.4 Intrinsically disordered proteins

Over the last two decades, an exponentially growing number of papers have reported critical functions being carried out by regions of proteins, or even more surprisingly, by full-sized proteins, that do not possess any stable secondary or tertiary structures under physiological conditions [75, 76]. These intrinsically disordered proteins (IDPs) or intrinsically disordered protein regions (IDPRs) are more accurately described by an ensemble of rapidly interconverting conformations, rather than any one predominant fold. Such IDPRs can be found in all kingdoms of life, including viruses. However, they are generally longer and more prevalent in eukaryotes [77]. Indeed, it is predicted that about 50% of all eukaryotic proteins contain disordered regions of at least 30 residues and 10% of these proteins are fully disordered [78-80].

The current view of the protein structure-function paradigm now recognizes a continuum spanning from highly dynamic disordered states to highly ordered tertiary folds. Numerous experimental and computational studies have demonstrated that both ordered and disordered proteins play functional roles in cells [81]. Rigid globular proteins are mostly involved in cellular housekeeping roles including enzymatic catalysis, whereas IDPRs, with their flexibility, are more frequently involved in processes such as signalling pathways and the regulation of transcription and translation [81-83]. The presence of disordered segments is beneficial for the latter since it adds adaptability, versatility and reversibility [84].
Indeed, the observation that many IDPR sequences are conserved amongst different species indicates that they have evolved to perform such important functional roles.

1.4.1 Structural characterization of IDPRs

Detailed analyses of IDPR sequences have demonstrated that the compositions of disordered proteins differ significantly from those adopting globular folds [78]. In general, IDPRs have an overall higher content of prolines and polar/charged residues, as well as fewer hydrophobic and order-promoting residues (Cys, Trp, Tyr, Ile, Phe, Val and Leu) than do globular proteins [85]. Because of this distinct and low complexity amino acid composition, IDPRs can often be identified by their average net charge and hydropathy [86]. Numerous bioinformatic tools, including PONDR, DisEMBL, Disopred2, and IUPred, have thus been optimized to robustly detect disordered regions in protein sequences [79, 87-89]. However, since functionally important residues in IDPRs are often grouped in short linear amino acid motifs, these proteins typically show overall higher rates of mutations than do structured proteins [90]. This sequence divergence often complicates the identification of the few residues most critical to processes such as interactions with partner macromolecules. Therefore, a major experimental and computational challenge lies in determining accurate models of these dynamic polypeptide chains and explaining the roles of disorder in their various functions.

Since the behaviour of IDPRs can only be explained by an ensemble of conformations, the standard approaches of NMR spectroscopy, X-ray crystallography, and electron microscopy do not give accurate models of their structures and functions. Therefore, new methods have been developed and refined in the past decade to better describe the ensemble properties of IDPRs.
These methods include advanced NMR spectroscopic approaches, small angle X-ray and neutron scattering (SAXS/SANS), single-molecule Förster resonance energy transfer (FRET), hydrodynamic techniques, and molecular dynamics (MD) simulations [82, 91]. Importantly, no one approach is sufficient and IDPRs are best characterized by integrating information from multiple methods to obtain an ensemble view consistent with all of the available experimental data. Paramount to this view is both a description of the conformational space sampled by the IDPRs (thermodynamics), as well as the dynamics of their interconversion (kinetics).

Of all of these experimental techniques, NMR spectroscopy is best suited to characterize IDPRs at the atomic level. Amongst others, the measurements of chemical shifts, scalar coupling constants, residual dipolar couplings (RDCs), paramagnetic relaxation enhancements (PREs), nuclear Overhauser effects (NOEs), translational/rotational diffusion, and numerous additional relaxation parameters can be used to define the conformational distributions and dynamic properties of IDPRs [80]. In addition, transient structures and interactions, as well as sparsely populated states can be detected. However, it is still very challenging to extract conformational ensembles from NMR data due to many factors including non-linear averaging (e.g., the NOE and PRE are dependent upon the inverse sixth power (1/r⁶) of the separation between interacting species) and potentially complex time-scale (kinetic) dependences. Therefore, complementary approaches are critical. Of these, FRET experiments yield long-range distance information between two fluorophores that have been added at specific sites on the protein. Although providing one data point per sample, this approach complements the short-range (< 5 Å) nature of NOE measurements.
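The consequence of the 1/r⁶ averaging mentioned above can be illustrated with a minimal sketch. The two-state ensemble below (a 95% extended state at 20 Å and a 5% compact state at 5 Å) is purely hypothetical and is not taken from any measurement in this thesis; the function names are likewise illustrative. The sketch only shows that an apparent distance derived from an averaged 1/r⁶ quantity is dominated by the sparsely populated, short-distance conformers.

```python
# Hypothetical two-state ensemble; distances and populations are illustrative only.

def effective_distance(distances, populations):
    """Apparent distance r_eff = <r^-6>^(-1/6) for a population-weighted ensemble."""
    total = sum(populations)
    mean_inv_r6 = sum(p * r ** -6 for r, p in zip(distances, populations)) / total
    return mean_inv_r6 ** (-1.0 / 6.0)


def mean_distance(distances, populations):
    """Ordinary population-weighted (linear) mean distance."""
    total = sum(populations)
    return sum(p * r for r, p in zip(distances, populations)) / total


distances_angstrom = [20.0, 5.0]   # extended state, compact state (assumed values)
populations = [0.95, 0.05]         # assumed fractional populations

r_eff = effective_distance(distances_angstrom, populations)
r_mean = mean_distance(distances_angstrom, populations)

print(f"linear mean distance : {r_mean:.2f} A")
print(f"r^-6 averaged distance: {r_eff:.2f} A")
```

With these assumed numbers the linear mean is close to the extended state, whereas the r⁻⁶-averaged value lies much nearer the compact state, which is why a single apparent distance cannot be read as the typical conformation of the ensemble.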
SAXS and SANS are also useful for characterizing IDPRs because they provide information on the size (radius of gyration) and shape distribution of a protein ensemble. Like FRET, such data help provide longer range restraints necessary to define the various conformers within the ensemble adopted by an IDPR. Ultimately, the information obtained through these various methods can then be integrated in molecular dynamics simulations to yield accurate representations of IDPRs. The next step in the characterization of IDPRs is then to understand how they recruit and interact with other folded or even unfolded proteins.

1.4.2 IDPRs interact with other proteins

IDPRs exhibit numerous functions, spanning from serving as flexible linkers between structured domains in modular proteins, to intimate components of macromolecular complexes. IDPs frequently associate with protein partners through interactions involving their disordered regions. These disordered sequences are often the sites of post-translational modifications (PTMs), as they are readily accessible to the "writer" and "eraser" enzymes (e.g. kinases and phosphatases), as well as "reader" proteins. Furthermore, flexible polypeptide regions can sample and occupy a larger region of space than a well folded compact protein, which enhances their ability to rapidly recruit other proteins (e.g. using a fly-casting mechanism) [92]. With their dynamic, disordered conformations, the resulting interactions are often of low affinity. Their intrinsic conformational flexibility also allows IDPRs to bind at low energetic cost to multiple partners. This leads to promiscuous and readily reversible formation of diverse protein complexes. Collectively, the intrinsic versatility and plasticity of IDPRs make them excellent candidates to work as "hubs" in protein interaction networks, including those necessary for signalling cascades and transcriptional regulation.
IDPRs often contain regions of 5 to 25 functionally-important amino acids, termed molecular recognition motifs (MORFs), which recognize hydrophobic patches on other proteins and frequently undergo a disorder-to-order folding transition upon binding to form stable α-helical, β-strand, polyproline II helical (PPII) or irregular structures [93-95]. This folding may be induced after binding as an initial, transient complex. However, biophysical studies have demonstrated that predominantly disordered polypeptide sequences often have conformational propensities for the structure that they adopt upon binding to their specific partners. Such propensities can be identified by computational analyses and verified experimentally by approaches such as circular dichroism (CD) and NMR spectroscopy. The presence of these preformed elements facilitates binding kinetically and energetically, since fewer structural changes are necessary and the entropic penalty for ordering is reduced [94]. In contrast to an induced folding mechanism, this is often described as a population-shift conformational selection pathway. However, these are two extreme models and features of both mechanisms are often exhibited by IDPRs as they bind their partner proteins [80, 96].

1.4.3 From static to dynamic complexes: different levels of fuzziness

The possible complexes formed by IDPRs form a structural continuum that ranges from being relatively static to fully dynamic depending on the proteins involved (Figure 1.9). Proteins like p21 [97] and p53 [98] contain disordered regions that will fold into different "static" conformations upon binding to different partners. In contrast, proteins like Sic1 [99], Tcf4 and Ste5p [100] interact with their partners, but retain a significant amount of disorder [84]. These so-called "fuzzy complexes" can exhibit multiple stable conformations in the bound state or some dynamic disorder such that the protein fluctuates between a large number of conformational states.
Figure 1.9: Examples of different types of fuzzy complexes. The IDPRs are shown by orange and magenta, and fuzzy regions are represented by dotted lines. The folded binding partners are displayed as gray surfaces. Examples presented from left to right show an increasing amount of disorder, which results in increasingly dynamic complexes. (A) The WH2 domain of Wiskott–Aldrich syndrome protein (WASP) interacts with actin in alternative locations, either via an 18 residue segment (orange; PDB ID 2A3Z) or via only three residues (magenta; PDB ID 2FF3). (B) The nonsense mediated decay factor UPF2 binds to UPF1 via two structured regions and the connecting linker remains ambiguously defined in the complex (dotted line; PDB ID 2WJV). (C) DNA-binding by the transcription factor Ultrabithorax (Ubx) is strongly influenced by various disordered regions that flank the structured DNA-binding homeodomain (PDB ID 1B8I). (D) The cyclin-dependent kinase inhibitor Sic1 has nine phosphorylation sites that interchangeably contact Cdc4. The positions of two of them, Thr45 and Ser76, are shown by orange and magenta balls respectively. The phosphorylation sites are represented by spheres (Figure adapted from Fuxreiter, M., Mol. BioSyst., 2012, 8, 168–177 with permission © from the Royal Society of Chemistry).

The result of the inherent flexibility of IDPRs is their ability to structurally adapt to different partners with different binding events that can potentially have opposing or alternate functions. This flexibility is well exemplified by a disordered fragment of p53 that can bind four different proteins (Cyclin A, sirtuin, CBP and S100ββ) [101] and adopt different local structures for each complex. This capacity for alternate complexes to have different effects is termed moonlighting [101].
The higher frequency of IDPs in eukaryotes than prokaryotes could therefore easily explain how eukaryotes gained an increased phenotypic complexity without increasing their genotype complexity [102].

Figure 1.9 panel labels, ordered from static to dynamic with increasing disorder: (A) Polymorphic, (B) Clamp, (C) Flanking, (D) Random.

1.4.4 Regulation of IDPRs

Since over-expression and ectopic non-cognate interactions of IDPRs are generally detrimental for the cell, multiple control mechanisms are required during transcription and translation to tightly regulate these proteins. Such mechanisms involve regulation of mRNA levels, alternative splicing, reduced half-life of IDPRs, alteration of post-translational modification (PTM) patterns and stabilization of IDPRs through complex formation [103-107].

Alternative splicing constitutes an efficient way to regulate IDPRs. Roughly 30% to 60% of alternatively spliced mRNAs in humans are associated with exons encoding long disordered polypeptide regions [108, 109]. Thus several different isoforms can be produced with alternate disordered regions [110]. The cell's ability to produce different isoforms partly explains how IDPRs can bind multiple partners in different cell types and how cell differentiation is achieved. Recent bioinformatic studies have shown that protein interaction networks and signalling pathways can be modified through the presence of MORFs in alternatively spliced disordered regions [111, 112], thus underlining the importance of alternative splicing. Ets1, which is the focus of this thesis, can be spliced in several different ways to produce various isoforms that may or may not contain the transactivation domain and the auto-inhibitory region [113, 114]. However, the function of these isoforms and their roles in cancer are not entirely clear.

Due to their inherently disordered nature, IDPRs are subjected to post-translational modifications (PTMs) more often than structured proteins [103, 115].
Among many others, these PTMs include ubiquitination, sumoylation, phosphorylation, acetylation, glycosylation, methylation, and palmitoylation. These modifications can alter the half-life, binding partners, functions, and localization of disordered proteins, and ultimately the gene regulation in which they take part. Not surprisingly, changes in these PTMs will often be at the heart of diseases related to disordered proteins.

Ultimately, the activity and life span of most IDPRs are very tightly regulated because of their dynamic nature and their propensity to interact weakly with various proteins. Indeed, many studies in the past 10 years have outlined that the overabundance, misidentification, misregulation, or the presence of altered states of disordered segments in proteins or signalling pathways are strongly correlated with cancer, human immunodeficiency virus (HIV), malaria, neurodegenerative diseases, cardiovascular diseases and diabetes [108, 110, 116-123]. Unsurprisingly, many proteins involved in these diseases have significant disordered regions that play pivotal roles. Examples include p53, τ protein, α-synuclein, Sic1 and prion proteins, to name only a few. Their failure to adopt the required conformational state at the right location leads to misfolding, loss of function, gain of toxicity, and/or aggregation [124, 125].

1.5 Hypotheses and goals

The overarching theme of my thesis is to understand how Ets1 translates genetic information by binding at specific promoter/enhancer sequences to activate transcription. The first goal is to elucidate the physicochemical mechanisms underlying DNA-binding autoinhibition by the disordered SRR and the impact of its phosphorylation in response to calcium signaling. The second goal lies in understanding the role of protein dynamics on binding to specific and non-specific DNA sequences.
Due to its central involvement in key cellular processes, the aberrant activities of Ets1 frequently lead to dysregulated gene expression and oncogenesis. Indeed, autoinhibition has emerged as the nexus for two mechanisms that drive biological distinction among ETS factors, namely regulation by protein partnerships and post-translational modifications [8]. Accordingly, we propose that it will also be the key to understanding how the different family members become dysregulated and lead to oncogenesis.

1.5.1 Investigating how the disordered SRR drives Ets1 autoinhibition

How does the intrinsically disordered serine-rich region (SRR) interact with the ordered inhibitory module (IM) and ETS domain to regulate DNA binding? The auto-inhibitory SRR is unique to Ets1 and Ets2 and conserved in sequence amongst orthologs from different species. Sequence alignment of the disordered regions shows a repetitive pattern of both aromatic phenylalanine/tyrosine (φ) and charged aspartate/glutamate residues (S-φ-D/E) adjacent to three of the five CaMKII phosphoacceptor serines (Ser251, Ser282, and Ser285) (Figure 1.10). Phosphorylation of these three acceptors enhances Ets1 autoinhibition by an additive mechanism [40]. Additional aromatic residues are also present near these apparent repeats. This is particularly striking as IDPRs are generally depleted in such amino acids. Accordingly, we hypothesized that the aromatic residues play a central role in the transient, phosphorylation-enhanced interaction of the SRR with the IM/ETS domain. Through extensive mutational analyses, combined with quantitative DNA-binding measurements and detailed NMR spectroscopic studies, we found that Phe/Tyr (φ) residues in the SRR indeed act synergistically with nearby phosphoserines to reinforce autoinhibition.

Figure 1.10: Sequence conservation of the disordered SRR. (Upper panel) Consensus sequence of the SRR region in Ets1 and Ets2.
(Lower panel) Sequence alignment of residues 244-300 for various Ets1 and Ets2 orthologs. Serines and threonines are highlighted in green, aspartic acids and glutamic acids in red, lysines and arginines in cyan, and histidines, phenylalanines, tyrosines and tryptophans in orange. The CaMKII phosphoacceptor serines are identified with a (*). Aromatic (φ) and negatively-charged sidechains (D/E) are adjacent to Ser251, Ser282, and Ser285, which contribute to autoinhibition    Using mutagenesis and quantitative DNA-binding measurements, we have found that phosphorylation-enhanced autoinhibition requires the presence of phenylalanine or tyrosine (φ) residues adjacent to the SRR phosphoacceptor serines. Mutation of these aromatic residues to glycines, alanines or valines strongly reduces phosphorylation-enhanced autoinhibition. Furthermore, the introduction of additional phosphorylated Ser-φ-Asp, but not Ser-Ala-Asp, repeats within the SRR dramatically reinforces autoinhibition. NMR spectroscopic studies of phosphorylated and mutated SRR variants, both within their native context and as separate trans-acting peptides, confirmed that the aromatic residues and phosphoserines contribute to transient interactions with the ETS domain. Chemical shift perturbation and paramagnetic relaxation enhancement measurements were also used to identify the SRR-interacting surface of the ETS domain, which encompasses its positively-charged DNA-recognition interface and an adjacent region of neutral polar and non-polar residues. 
This suggests that phosphorylation-enhanced π/cation or hydrophobic interactions contribute to SRR-mediated autoinhibition. Collectively, these studies highlight the role of aromatic residues and their synergy with phosphoserines in an intrinsically disordered regulatory sequence and demonstrate the critical role played by dynamic regions of the Ets1 transcription factor in the integration of cellular signaling and gene expression.

Figure 1.10, lower panel (alignment of residues 244-300). In the figure, asterisks (*) mark the five CaMKII phosphoacceptor serines and four S-φ-D/E motifs are labelled beneath the alignment.

human Ets-1      GKLGGQDSFE-SIESYDSCDRLTQSWSSQSSFNSLQRVPSYDSFDSEDYPAALPNHKP
mouse Ets-1      GKLGGQDSFE-SVESYDSCDRLTQSWSSQSSFNSLQRVPSYDSFDYEDYPAALPNHKP
horse Ets-1      GKLGGQDSFE-SIESYDSCDRLTQSWSSQSSFNSLQRVPSYDSFDSEDYPAALPNHKP
chicken Ets-1    GKLGGQDSFE-SIESYDSCDRLTQSWSSQSSFQSLQRVPSYDSFDSEDYPAALPNHKP
marmoset Ets-1   GKLGGQDSFE-SIESYDSCDRLTQSWSSQSSFNSLQRVPSYDSFDSEDYPAALPNHKP
rat Ets-1        GKLGGQDSFE-SIESYDSCDRLTQSWSSQSSFNSLQRVPSYDSFDSEDYPAALPNHKP
xenopus Ets-1    GKLGGQESFE-SIESHDSCDRLTQSWSSQSSYNSLQRVPSYDSFDSEDYPPAMPSHKS
human Ets-2      GTPKDHDSPENGADSFESSDSLLQSWNSQSSLLDVQRVPSFESFEDDCSQS-LCLNKP
mouse Ets-2      GKPKDHDSPENGGDSFESSDSLLRSWNSQSSLLDVQRVPSFESFEEDCSQS-LCLSKL
horse Ets-2      GKPGDRDPPENGADSFESSDSLLQSWNSQSSLLDVQRVPSFESFEDDCSQS-LCLSKP
chicken Ets-2    GKLREHESSESGAESYESSDSMLQSWNSQSSLVDLQRVPSYESFEDDCSQS-LCMSKP
marmoset Ets-2   GTPKDHVSTENGADSFESSDSLLQSWNSQSSLLDVQRVPSFESFEDDCSQS-LCLNKP
rat Ets-2        GKPKEHDSPENGGDSFESSDSLLRSWNSQSSLLDVQRVPSFESFEEDCSQS-LCLSKP
xenopus Ets-2    GKLRDYDSGDSGTESFESTESLLQSWTSQSSLVDMQRVPSYDGFEEDGSQA-LCLNKP

1.5.2 Investigating the role of Ets1 dynamics in binding specific and non-specific DNA

How does Ets1 bind its target DNA sequences against a background of non-specific sites? The second part of my doctoral thesis attempts to further our knowledge on how TFs recognize their consensus binding sites in the cell.
Indeed, it is hard to conceptualize how these molecules can find the right binding site while surrounded by a vast sea of non-specific DNA. Even more puzzling, how can the various ETS transcription factors find the correct promoter region to bind, when they all possess a similar highly conserved DNA binding domain? The research presented in this thesis therefore strives to answer these questions by characterizing the structural and dynamic properties of the Ets1 protein bound to specific and non-specific DNA oligonucleotides.

To address this question, I undertook detailed NMR spectroscopic studies of a partially auto-inhibited Ets1 fragment in the presence of specific and non-specific DNA oligonucleotide sequences. Upon binding either DNA, helices HI-1 and HI-2 of the inhibitory module unfold and are in conformational exchange between a major unfolded state and possibly minor folded states. Thus, autoinhibition does not impart DNA-binding specificity. Using amide chemical shift perturbation mapping, I showed that Ets1 binds both specific and non-specific oligonucleotides through its canonical ETS domain interface. However, the non-specific complex is formed via relatively weak and dynamic electrostatic interactions, whereas the specific complex involves well-ordered hydrogen bonds and salt bridges. In support of this conclusion, five lysine sidechain aminium groups are protected from rapid hydrogen exchange upon binding of specific DNA, whereas only one weak signal is stabilized in the non-specific complex. Broadening of amide and sidechain signals in the non-specific complex is indicative of conformational averaging (possibly sliding) on the DNA. Additionally, backbone amide and sidechain tryptophan indole and arginine guanidinium groups show downfield shifted 1H signals in the presence of specific, but not non-specific, DNA. Such spectral perturbations are indicative of the formation of hydrogen bonds to the DNA basepairs and phosphodiester backbone.
Overall, these data are consistent with the model that transcription factors such as Ets1 rapidly find specific DNA sites within the genome via facilitated diffusion (sliding and hopping) within a vast background of weakly interacting non-specific sequences.

The studies presented in my thesis build upon the previous models established for Ets1 autoinhibition and highlight the diversity of mechanisms used for producing a graded response to cellular signalling. Moreover, this research underlines the importance of post-translational modifications and the crucial role played by disorder in transcription regulation of Ets1. The studies on Ets1 bound to different DNA further support the role of disorder in recruiting protein partners by expanding Ets1’s radii, and help expand the role of dynamics in target site recognition. The knowledge acquired through my research helps us to gain a better understanding of autoinhibition in the ETS family and of transcription regulation in general.

Chapter 2: Synergy of aromatic residues and phosphoserines within the intrinsically disordered DNA-binding inhibitory elements of Ets1

The Ets1 transcription factor is autoinhibited by a conformationally-disordered serine rich region (SRR) that transiently interacts with its DNA-binding ETS domain. In response to calcium signaling, autoinhibition is reinforced by phosphorylation of five serines within the SRR by calmodulin-dependent kinase II. Using mutagenesis and quantitative DNA-binding measurements, we demonstrate that phosphorylation-enhanced autoinhibition requires the presence of phenylalanine or tyrosine (φ) residues adjacent to the SRR phosphoacceptor serines. The introduction of additional phosphorylated Ser-φ-Asp, but not Ser-Ala-Asp, repeats within the SRR dramatically reinforces autoinhibition.
NMR spectroscopic studies of phosphorylated and mutated SRR variants, both within their native context and as separate trans-acting peptides, confirmed that the aromatic residues and phosphoserines contribute to the formation of a dynamic complex with the ETS domain. Complementary NMR studies also identified the SRR-interacting surface of the ETS domain, which encompasses its positively-charged DNA-recognition interface and an adjacent region composed of neutral polar and non-polar residues. Collectively, these studies highlight the role of aromatic residues and their synergy with phosphoserines in an intrinsically disordered regulatory sequence that integrates cellular signaling and gene expression.

2.1 Introduction

Intrinsically disordered protein regions (IDRs) are increasingly recognized for their prevalence in the eukaryotic proteome and their roles in normal biological processes, as well as in disease [81, 126, 127]. IDRs serve as flexible linker sequences between modular domains and as key components of complex protein-interaction networks. These sequences are often sites of post-translational modifications and their plasticity enables accessible and reversible interactions necessary for the integration of cellular signals.

The autoinhibition of the Ets1 transcription factor provides an illuminating example of how a flexible, disordered region can modulate the regulatable DNA-binding ETS domain, and thereby tune it for biological control by calcium-dependent phosphorylation and cooperative protein partnerships [8]. The inhibitory module (IM) of Ets1 is composed of four α-helices (HI-1, HI-2, H4 and H5) that pack onto the ETS domain distal from the DNA interface (Figure 2.1A) [34]. Helices HI-1 and HI-2 are marginally stable and unfold upon DNA binding, thus implicating an allosteric mechanism of inhibition [32, 51].
The modest 2-fold repression afforded by the IM is increased to ~ 20-fold by an intrinsically disordered serine rich region (SRR) [40, 128]. The dynamic SRR interacts transiently with both the ETS domain DNA-binding interface and the IM, and thus plays steric and allosteric inhibitory roles [43]. The transient nature of the interaction follows from the observation that the SRR is disordered, adopting no predominant conformation and exhibiting substantial sub-nsec timescale motions. Phosphorylation of five SRR serines by calmodulin-dependent kinase II (CaMKII) leads to a dramatic ~ 500-fold autoinhibition [38]. In parallel, the conformational flexibility of the IM and the ETS domain decreases [40, 43]. By analogy to the well-characterized lac repressor [54], this flexibility is hypothesized to play a central role in DNA binding (see Chapter 3).

Figure 2.1: Schematic representation of Ets1. (A) Boundaries of Ets1 including the PNT domain (green), serine-rich region (SRR; yellow), inhibitory module (IM; cyan) and ETS domain (red). (B) Representation of the trans system where ΔN279 is effectively divided into the truncated SRR peptide (residues 279 - 300) and ΔN301. The mutants with four aromatic residues (bold) in the SRR substituted by alanines are denoted as 4φA, and the asterisks (*) indicate the CaMKII phosphoacceptors Ser282 and Ser285.

Figure 2.1 labels: residue numbers marked on the domain schematic are 1, 42, 134, 244, 301, 331, 415 and 440. Sequences shown are ΔN279, (GSH)RVPSYDSFDYEDYPAALPNHKP; ΔN279 4φA, (GSH)RVPSADSADAEDAPAALPNHKP; SRR peptide, RVPSYDSFDYEDYPAALPNHKP; SRR 4φA peptide, RVPSADSADAEDAPAALPNHKP.

This study interrogates the physicochemical basis of the transient interactions underpinning Ets1 autoinhibition. Through mutational analyses, we found that Phe/Tyr (φ) residues in the SRR act synergistically with nearby phosphoserines to reinforce autoinhibition.
Complementing these biochemical assays, we used NMR spectroscopy to investigate the intermolecular interactions of peptides, corresponding to the SRR, with the IM and ETS domain. The trans-SRR peptides retained inhibitory activity and formed dynamic complexes, lacking any persistent induced conformation, via the same interface detected "in cis" by chemical shift perturbation (CSP) and paramagnetic relaxation enhancement (PRE) experiments. Intermolecular binding was also enhanced by phosphorylation and weakened upon glycine, alanine or valine substitution of the Phe/Tyr residues. These studies revealed the synergy between aromatic residues and phosphoserines in the function of an IDR.

2.2 Role of charged and aromatic residues in autoinhibition

Sequence alignment of the SRR, which is unique to Ets1 and Ets2, revealed a repetitive pattern of Phe/Tyr and Asp/Glu residues adjacent to the CaMKII phosphoacceptors (Figures 1.10 and 2.1). To understand the roles of these conserved charged and aromatic residues, the experiments presented here focus on ΔN279, the smallest fragment of Ets1 that recapitulates phosphorylation-dependent autoinhibition. This fragment contains two phosphorylation sites and four conserved aromatic residues (Figure 2.1) [32].

Initial mutagenesis studies sought to interrogate the role of negative charges in the SRR on DNA binding (Table 2.1). As reported previously [40, 43], the presence of the truncated wild-type SRR led to 18-fold inhibition of DNA binding for ΔN279 versus ΔN301. (Note that ΔN301 contains the IM, which imparts ~two-fold inhibition relative to the minimal ETS domain [128].) Phosphorylation of Ser282 and Ser285 dramatically increased the effect to 1900-fold. Substitution of the phosphorylated serines by glutamates or aspartates increased inhibition to only ~75-fold.
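The fold-inhibition values quoted above are ratios of dissociation constants relative to ΔN301, and a minimal sketch of that arithmetic is given here using the rounded KD values listed in Table 2.1. Because the inputs are rounded, the ratios (about 17.6, 1880, 72 and 76) differ slightly from the tabulated 18, 1,900, 73 and 77. The conversion to a free-energy difference uses the standard relation ΔΔG = RT ln(fold); the temperature of 298.15 K is an assumption made only for illustration, and the dictionary keys and function names are hypothetical.

```python
import math

# Rounded KD values from Table 2.1, in units of 1e-11 M.
kd_1e11_molar = {
    "dN301": 2.5,
    "dN279 0P": 44.0,
    "dN279 2P": 4700.0,
    "dN279 0P S to E": 180.0,
    "dN279 0P S to D": 190.0,
}

GAS_CONSTANT_KCAL = 1.987e-3   # kcal mol^-1 K^-1
ASSUMED_TEMPERATURE_K = 298.15  # assumption; not stated in the text


def fold_inhibition(kd_variant, kd_reference):
    """Ratio of dissociation constants (variant over reference)."""
    return kd_variant / kd_reference


def ddg_kcal_per_mol(fold, temperature_k=ASSUMED_TEMPERATURE_K):
    """Free-energy difference corresponding to a KD ratio: RT ln(fold)."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold)


reference = kd_1e11_molar["dN301"]
folds = {
    name: fold_inhibition(kd, reference)
    for name, kd in kd_1e11_molar.items()
    if name != "dN301"
}

for name, fold in folds.items():
    print(f"{name:18s} fold = {fold:7.1f}  ddG = {ddg_kcal_per_mol(fold):.2f} kcal/mol")
```

Expressed this way, each ten-fold change in KD corresponds to roughly 1.4 kcal/mol at the assumed temperature, which places the effect of phosphorylation and of the phosphomimetic substitutions on a common energetic scale.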
The incomplete recapitulation of phosphorylation may arise from differences in charge (-2 versus -1 at pH 8) and/or structure between phosphate and carboxylate moieties. In an attempt to distinguish between these interpretations, we reduced the net negative charge of the SRR by replacing Asp284, Asp287, Glu289 and Asp290 with alanines or by introducing two positively-charged arginines at positions 282 and 285. Both mutants retained the basal autoinhibitory properties of the SRR. CaMKII failed to phosphorylate the Asp/Glu to Ala mutants, and thus effects on the reinforcement of autoinhibition could not be measured. (Although little is known about CaMKII specificity, early publications indicate that it generally phosphorylates an Arg-x-x-Ser/Thr consensus motif, where x is any hydrophobic residue [129, 130].) Collectively, these experiments indicated that autoinhibition of ΔN279 DNA binding by the disordered SRR is not a simple global electrostatic effect, but rather arises through a role specific to the presence of phosphoserines.

Table 2.1: Mutations demonstrate a critical role for SRR aromatics in ΔN279 phosphorylation-enhanced autoinhibition of DNA binding.

Protein              SRR sequence a       KD (×10^-11 M) b    Fold inhibition c
ΔN301                                     2.5 ± 0.5           -
ΔN2790P              RVPS YDS FDYEDYP     44 ± 4              18 ± 3
ΔN2792P              RVPSPYDSPFDYEDYP     4,700 ± 580         1,900 ± 320
ΔN2790P S to E       RVPE YDE FDYEDYP     180 ± 50            73 ± 27
ΔN2790P S to D       RVPD YDD FDYEDYP     190 ± 8             77 ± 10
ΔN2790P DE to A      RVPS YAS FAYAAYP     71 ± 16             28 ± 9
ΔN2790P S to R       RVPR YDR FDYEDYP     45 ± 5              18 ± 3
ΔN279 4φG0P          RVPS GDS GDGEDGP     44 ± 9              18 ± 4
ΔN279 4φG2P          RVPSPGDSPGDGEDGP     130 ± 11            50 ± 7
ΔN279 4φA0P          RVPS ADS ADAEDAP     44 ± 2              18 ± 2
ΔN279 4φA2P          RVPSPADSPADAEDAP     140 ± 10            54 ± 8
ΔN279 4φV0P          RVPS VDS VDVEDVP     23 ± 2              9 ± 2
ΔN279 4φV2P          RVPSPVDSPVDVEDVP     160 ± 33            63 ± 18
FL Ets1 4φA0P        RVPS ADS ADAEDAP     44 ± 12             18 ± 6
FL Ets1 4φA5P        RVPSPADSPADAEDAP     140 ± 10            160 ± 17

a Partial sequence. Phosphoserines (SP) and mutated/modified sites are in bold.
b Measured by EMSA for a consensus Ets1 site.
c Relative to ΔN301.

Subsequent mutagenesis studies were carried out to interrogate the role of the aromatic residues in the SRR on DNA binding. Replacement of the four conserved aromatic residues in the truncated SRR (φ = Tyr283, Phe286, Tyr288, and Tyr291) with glycine or alanine did not measurably change basal inhibition, but dramatically impaired phosphorylation-dependent autoinhibition. As summarized in Table 2.1, DNA binding is only weakened by ~3-fold for ΔN279-4φG2P versus ΔN279-4φG0P, and for ΔN279-4φA2P versus ΔN279-4φA0P. These values pale in comparison to the ~100-fold phosphorylation-enhancement observed for the wild-type species. Similar effects were found in the context of both ΔN244 with a full length SRR (not shown) and native Ets1 (FL Ets1), both of which have five phosphoacceptor serines. Substitution of the four aromatics in ΔN279 to valines had comparable effects to the glycine and alanine replacements, suggesting that the aromatic character of tyrosines and phenylalanines is essential. The four aromatics were also mutated to leucines in order to test the impact of having larger hydrophobic groups within the SRR. Although ΔN279-4φL0P also retained basal inhibition, the effects of phosphorylation could not be determined as CaMKII treatment yielded a mixture of 1P and 2P proteins that could not be separated (not shown). Regardless, the reduced effect of phosphorylation observed within the 4φG, 4φA and 4φV backgrounds indicates that both the phosphoserines and aromatic residues in the SRR are required for phosphorylation-enhanced autoinhibition.

The importance of having phosphorylated serines and aromatic residues was tested further by creating ΔN279 variants containing increasing numbers of Ser-φ-Asp motifs within the SRR (Table 2.2).
Although having little effect on basal autoinhibition, phosphorylation at an increasing number of sites progressively weakened DNA binding by up to 100,000-fold relative to ΔN301. In contrast, Ser-Ala-Asp (or Ser-Gly-Asp; not shown) repeats lacking aromatics were ineffective, even with six phosphoserines present. These mutagenesis experiments clearly demonstrated the synergistic roles of phosphoserines and aromatic groups in the autoinhibition of Ets1.

Table 2.2: Additional Ser-φ-Asp repeats mediate higher levels of phosphorylation-dependent autoinhibition of DNA binding.

Protein           SRR sequence a                         KD (×10^-11 M) b     Fold inhibition c
ΔN2790P           RVP-SPYD-SPFD-YPED-YPPA-APLP-NPHK-P    44 ± 4.4             18 ± 2.7
ΔN2792P           RVP-SPYD-SPFD-YPED-YPPA-APLP-NPHK-P    4,700 ± 580          1,900 ± 320
3-unit SRR0P      RVP-SPYD-SPFD-SPYD-YPPA-APLP-NPHK-P    44 ± 3.8             18 ± 2.2
3-unit SRR2P      RVP-SPYD-SPFD-SPYD-YPPA-APLP-NPHK-P    5,900 ± 2,700        2,400 ± 1,200
4-unit SRR0P      RVP-SPYD-SPFD-SPYD-SPFD-APLP-NPHK-P    51 ± 1.7             20 ± 2.5
4-unit SRR4P      RVP-SPYD-SPFD-SPYD-SPFD-APLP-NPHK-P    33,000 ± 17,000      13,000 ± 5,700
6-unit SRR0P      RVP-SPYD-SPFD-SPYD-SPFD-SPYD-SPFD-P    58 ± 17              23 ± 7.0
6-unit SRR6P      RVP-SPYD-SPFD-SPYD-SPFD-SPYD-SPFD-P    > 200,000            > 100,000
6-unit 4φA0P      RVP-SPAD-SPAD-SPAD-SPAD-SPAD-SPAD-P    40 ± 4.4             16 ± 2.6
6-unit 4φA6P      RVP-SPAD-SPAD-SPAD-SPAD-SPAD-SPAD-P    210 ± 46             84 ± 21

a Constructs for tandem repeats, delineated by hyphenation, retained wild-type SRR length by replacing 3, 6, or 12 amino acids. Phosphoserines (SP) and mutated/modified sites are in bold.
b Measured by EMSA for a consensus Ets1 site.
c Relative to ΔN301 (Table 2.1).

2.3 Aromatic residues and phosphorylation mediate intramolecular interactions

Having identified the critical residues necessary for phosphorylation-dependent autoinhibition, we next sought to structurally investigate the roles of aromatic residues and phosphoserines through NMR studies.
2.3.1 Chemical shift perturbation studies

Amide chemical shift perturbations (CSPs) and 15N relaxation measurements were used to probe the conformation and intramolecular interactions of the 4φA version of the ΔN279 SRR in its unmodified and phosphorylated states (Figure 2.2). A comparison of the 15N-HSQC spectra of the wild-type and the 4ϕA mutant showed that the largest CSPs were localized within the SRR (Figure 2.3A). This is expected as aromatic groups influence the chemical shifts of neighboring nuclei through ring current effects, which are no longer present in the 4ϕA mutant. More importantly, amides within the ETS domain and IM also exhibited small CSPs, indicating perturbed interactions with the SRR. Phosphorylation of either wild-type or the 4ϕA mutant also led to similar patterns of CSPs (Figure 2.3B). However, these spectral perturbations were generally larger upon phosphorylation of wild-type than the 4ϕA mutant, and upon mutation of the phosphorylated rather than the non-phosphorylated protein.

Figure 2.2: Overlaid 15N-HSQC spectra of ΔN2790P, ΔN2792P, ΔN2790P 4ϕA and ΔN2792P 4ϕA. Each spectrum (ΔN2790P (blue), ΔN2792P (red), ΔN2790P 4ϕA (gold) and ΔN2792P 4ϕA (purple)) contains one signal for each mainchain or sidechain amide 1H-15N pair in the protein. Expanded regions at the bottom show that the chemical shifts of some residues are unperturbed (overlap perfectly), whereas those of others are perturbed by phosphorylation and mutations (boxed in cyan). Close inspection reveals that the chemical shift changes for these latter residues differ in magnitude and direction. The spectra were fully assigned using standard 1H-13C-15N correlation experiments.
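As a hedged illustration of how a per-residue CSP value of the kind discussed in this section can be computed and binned, the Python sketch below combines the 1H and 15N shift changes of one amide into a single number and sorts it into the categories used for structure mapping in Figure 2.4 (> 0.125 ppm, 0.08 to 0.125 ppm, < 0.08 ppm). The 15N weighting factor of 0.2, the function names and the example shift values are assumptions made for illustration; the weighting is a common convention and is not taken from this chapter.

```python
import math


def combined_csp(delta_h_ppm, delta_n_ppm, n_weight=0.2):
    """Combine 1H and 15N shift changes (ppm) into a single CSP value.

    n_weight scales the 15N change; 0.2 is an assumed, commonly used value.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def csp_category(csp_ppm, high=0.125, medium=0.08):
    """Bin a CSP using the thresholds quoted in the Figure 2.4 caption."""
    if csp_ppm > high:
        return "large"
    if csp_ppm > medium:
        return "medium"
    return "small"


# Hypothetical (1H, 15N) shift changes in ppm for three amides; not measured data.
example_shifts = {
    "res_a": (0.10, 0.50),
    "res_b": (0.03, 0.40),
    "res_c": (0.01, 0.05),
}
categories = {
    name: csp_category(combined_csp(dh, dn))
    for name, (dh, dn) in example_shifts.items()
}
```

A different 15N weighting changes the absolute CSP values, so thresholds should always be read together with the weighting that produced them.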
Figure 2.3: Chemical shift perturbations (CSPs) define the phosphorylation-dependent intramolecular interaction surface for the wild-type and 4φA-mutant SRR within ΔN279. (A) CSPs for main-chain amides and side-chain amides/indoles (inset) in ΔN279 (orange bars) and ΔN2792P (black lines) due to the 4ϕA substitutions. (B) CSPs for mainchain amides (as well as sidechain amides/indoles; inset) upon phosphorylation of ΔN279 (red lines) and ΔN279-4φA (blue histogram). Blank values correspond to prolines or residues with missing data due to spectral overlap. Regions shaded in gray designate α-helices (rectangles) or β-strands (arrows) in ΔN301.
[Figure 2.3 axes: Δδ (ppm) versus residue number (280 to 440).]

When mapped onto the ΔN301 structure, residues with the largest CSPs clustered near helix H3 and the following β-strand S4 of the ETS domain, as well as in the HI-2/H1 and S4/H4 loops (Figure 2.4A). Given that chemical shift changes are an exquisitely sensitive indicator of structural perturbations, these data coarsely define the interaction surface of ETS domain and IM for the SRR. Furthermore CSP measurements demonstrate that these interactions are altered upon phosphorylation and aromatic mutations. As also shown in Figure 2.4B, the aromatic- and phosphoserine-dependent interaction surface includes positively-charged residues near the DNA-binding interface, as well as an adjacent patch of hydrophobic and neutral polar residues.
Thus, both electrostatic and hydrophobic/van der Waals forces likely contribute to the SRR-ETS domain interactions. These studies also revealed that, in addition to stabilizing the IM to reinforce an allosteric pathway for autoinhibition [32, 40], the SRR interacts directly with the DNA-recognition helix H3 and thus sterically impedes DNA binding.

Figure 2.4: Electrostatic and hydrophobic/van der Waals forces contribute to the SRR-ETS domain interactions. (A) CSPs for residues 301–440 of ΔN2792P due to the 4ϕA mutation mapped on the structure of ΔN301 (PDB ID 1R36) with perturbed side chains in stick format (red: Δδ > 0.125 ppm, yellow: 0.125 ppm > Δδ > 0.08 ppm, gray: Δδ < 0.08 ppm or prolines). (B) Surface representation of ΔN301 showing the positively-charged (blue; Arg, Lys), negatively-charged (red; Asp, Glu), hydrophobic (green), and neutral polar (white) residues.

2.3.2 Paramagnetic relaxation enhancement (PRE) studies

To better understand the transient interactions between the SRR and the ETS domain, two distinct PRE strategies were utilized. In the first approach, Gd(DTPA-BMA), a highly inert water-soluble paramagnetic compound typically employed as an MRI contrast agent, was used to perturb the relaxation of amides nearest to the surface of the ETS domain [131, 132]. Corresponding residues in neither ΔN2792P nor ΔN2790P showed any significant solvent PRE changes relative to ΔN301 (data not shown). The absence of any measurable effect indicates that intramolecular interactions of the SRR do not markedly reduce the solvent accessible surface area of the ETS domain in ΔN279. These results are consistent with the dynamic nature of the SRR and its fuzzy interactions with the ETS domain.

The second approach involved covalently linking an MTSL nitroxide spin-label to single cysteines in ΔN279 to probe the interactions of the SRR with the ETS domain and IM [133].
Unfortunately, the necessary removal of the two native cysteines (Cys350 and Cys416) in ΔN279 significantly decreased the stability and solubility of the protein. Several combinations of serine and alanine replacements were tested to alleviate these detrimental effects, and the C350A/C416A variant proved best behaved (Table 2.3). Additional constructs were then made with a single non-native cysteine introduced at the N-terminus of the SRR (H278C), as well as at positions 383 and 400 within the wHTH motif. However, the K383C and N400C mutants were very unstable and could not be used for PRE experiments since they precipitated within hours of purification. Ultimately, only one mutant, ΔN279 H278C/C350A/C416A, with an N-terminal cysteine, could be used to probe the interactions between the SRR and ETS domain. However, mutation of the WT cysteines also caused significant NMR spectral changes and necessitated reassignment of the 15N-HSQC spectrum for this construct.

Table 2.3: ΔN279 constructs made for PRE experiments with MTSL spin label.

Mutant                      Expression level    Precipitates readily
ΔN279 H278C/C350A/C416S     normal              yes
ΔN279 H278C/C350S/C416A     normal              yes
ΔN279 H278C/C350S/C416S     normal              yes
ΔN279 H278C/C350A/C416A     normal              no
ΔN279 K383C/C350A/C416A     low                 yes
ΔN279 N400C/C350A/C416A     low                 yes

Having identified one good candidate, we carried out PRE on ΔN279 H278C/C350A/C416A with an N-terminal MTSL spin label. To avoid any intermolecular interactions, these studies were performed at a relatively low protein concentration (25-50 µM), which limited the overall signal-to-noise of the spectra. Nevertheless, significant PRE effects were observed for amides clustered in the SRR (279-300), in the loop linking the N-terminal inhibitory sequence to the ETS domain (HI-2 to H1), the DNA binding helix (H3) and the following β-strand S4 (Figure 2.5A). Thus, the terminal spin label must at least transiently localize to this region of ΔN279.
Upon phosphorylation, the PRE effects are also localized to residues along the ETS domain DNA-binding interface, and slightly less so to the N-terminal inhibitory sequence (Figure 2.5B). Similar results were seen previously with a paramagnetic Cu2+ bound to an ATCUN motif (Gly-Ser-His) at the N-terminus of ΔN279 [43]. These PRE studies are consistent with the CSP measurements of the effects of phosphorylation and 4ϕA mutation. Collectively, they demonstrate that the SRR interacts with the positively charged, polar and neutral residues located at the DNA-binding interface of the ETS domain.

Figure 2.5: Paramagnetic relaxation enhancement (PRE) studies define the intramolecular interaction surface within ΔN279. Amide 15N-HSQC intensity ratios (Ipara/Idia) for (A) ΔN2790P and (B) ΔN2792P with a paramagnetic MTSL nitroxide spin label disulfide-linked to the N-terminal Cys278 versus with a reduced diamagnetic (or cleaved) hydroxylamine label. Lower intensity ratios reflect increased PREs due to closer average proximity of the spin label and the amide proton. The interaction surface is highlighted by mapping the intensity ratios for residues 301-440 onto the structure of ΔN301 (PDB 1R36) using a color gradient from red to blue for values of 0 (maximal PRE) to 1 (negligible PRE). Blank values correspond to prolines or residues with missing data due to spectral overlap. Regions shaded in gray designate α-helices (rectangles) or β-strands (arrows) in ΔN301.
[Figure 2.5 axes: Ipara/Idia (0.00 to 1.25) versus residue number (280 to 440).]

2.3.3 The SRR undergoes sub-nsec timescale motions that dampen slightly upon phosphorylation

To further characterize the interactions of the SRR and IM/ETS domain, 1H{15N}-NOE measurements were carried out with the 4ϕA mutant of ΔN279 (Figure 2.6).
The heteronuclear NOE is particularly sensitive to motions of the 1H-15N bond vector, and decreases in value from approximately 0.8 to -4 with increasing mobility on the sub-nsec timescale [134, 135]. As expected, residues in the well-folded ETS domain gave rise to heteronuclear NOE values around 0.75. In contrast, SRR-4ϕA residues showed significantly lower NOE values, indicative of substantial, but not completely unrestricted, motions. Upon phosphorylation, the NOE values of the aromatic-free 4ϕA-SRR increased modestly, demonstrating a slight dampening of their fast time-scale motions. This is consistent with a previous 15N relaxation analysis of the wild-type ΔN279 [43]. We thus conclude that the SRR functions through transient intramolecular interactions with the ETS domain and IM, and that its dynamic character scales inversely with its phosphorylation-dependent inhibitory activity.

Figure 2.6: Phosphorylation partially dampens fast backbone motions of the dynamic SRR in ΔN279-4ϕA. Heteronuclear 1H{15N}-NOE values of ΔN279-4ϕA in its unmodified (upper panel; orange) and phosphorylated (middle; purple) states, collected on a 600 MHz spectrometer at 28°C. The lower panel shows the changes (ΔNOE) upon phosphorylation. Heteronuclear NOE values above ~ 0.6 are characteristic of ordered rigid regions, whereas decreasing values indicate increasing flexibility on the sub-nsec timescale. With heteronuclear NOE values ~ 0, the SRR is dynamic, but not completely unrestrained, in the context of ΔN279-4ϕA0P. Upon phosphorylation, the NOE values of the aromatic-free SRR increased to ~ 0.3, indicating partially dampened motions due to increased intramolecular interactions.
However, in both cases, the SRR had lower NOE values than the well-ordered ETS domain, confirming the fuzzy dynamic nature of these transient interactions. Similar behavior has been reported for the SRR in wild-type Ets1 constructs [40, 43]. Blank values correspond to prolines or residues with missing data due to spectral overlap. Regions shaded in gray designate α-helices (rectangles) or β-strands (arrows) in ΔN301.

2.4 Structural characterization of the trans-system

2.4.1 The trans-SRR system recapitulates autoinhibition

To facilitate further characterization of the disordered SRR, we developed a trans system wherein ΔN279 is divided into an SRR peptide and ΔN301, the latter containing the ETS domain and IM (Figure 2.1). This intermolecular system opens the door to new concentration-dependent studies of the SRR. By dividing the system into the trans-SRR and ΔN301, it becomes possible to monitor how the presence of one component alters the structural and dynamic properties of the other. To verify the suitability of this system, the effects of the peptides on DNA binding by ΔN301 were measured by EMSA (Figure 2.7). The phosphorylated peptide SRR2P weakened binding of ΔN301 to specific DNA, whereas peptides that either lacked phosphoserines, or had the 4φA mutations, failed to measurably inhibit in trans. This is consistent with the relative affinities of the peptides for ΔN301, as shown below. Thus, the trans system recapitulated the cis effect of the SRR, and could be used for further studies of the mechanism of autoinhibition.

Figure 2.7: Trans-inhibition of ΔN301 DNA binding by the SRR peptide requires both phosphorylation and aromatic residues. The effects of the trans-SRR peptides on DNA binding by ΔN301 were measured by EMSA. Shown are the resulting binding isotherms for ΔN301 in the absence (KD = 0.12 nM; blue) and presence of 0.58 mM SRR2P (KD = 0.73 nM; red).
The six-fold increase in fit KD values indicates trans-inhibition of DNA binding (t test P = 0.002). Consistent with their weaker affinity for ΔN301, the SRR0P, SRR-4ϕA0P, and SRR-4ϕA2P peptides did not significantly affect ΔN301 DNA binding under experimental conditions (less than 2-fold difference in fit KD values; corresponding P > 0.1). The data points and associated error bars were obtained from three replicas and fit to a simple binding isotherm.
[Figure 2.7 axes: fraction bound versus [ΔN301] (M).]

2.4.2 The trans-SRR peptides interact with the ETS domain through phosphoserines and aromatic residues

Titration of the unlabeled, and therefore “invisible”, synthetic peptides into 15N-labeled ΔN301 allowed us to identify the binding interface on the ETS domain and dissect the underlying interaction forces. The observation of CSPs for amides in ΔN301 upon addition of the trans-peptides confirmed that the two species indeed interact in an intermolecular fashion (Figure 2.8). Furthermore, the progressive change in amide shifts with increasing amounts of peptide indicates that binding is relatively weak and occurs in the fast exchange limit. This enabled us to determine the equilibrium dissociation constants (KD) for binding by fitting the resulting titration curves to a simple 1:1 isotherm (Figure 2.9 and Table 2.4).

Figure 2.8: Interactions of the SRR peptides with the ETS domain measured by NMR spectroscopy. Regions of the superimposed 15N-HSQC spectra of 15N-labeled ΔN301 with the unlabeled SRR2P added in peptide to protein molar ratios of 0 (red), 0.25 (orange), 0.50 (yellow), 0.75 (green), 1.0 (cyan), 1.5 (blue), 2.0 (purple) and 4.0 (magenta). Similar spectra were recorded for SRR0P, SRR-4φA0P and SRR-4φA2P (not shown). These data were used to generate the titration curves in Figure 2.9.

Table 2.4: Dissociation constants (KD) for the SRR peptides with ΔN301.
Peptide        50 mM NaCl            90 mM NaCl             200 mM NaCl
SRR0P          1650 µM ± 250 µM      3300 µM ± 1000 µM
SRR2P          100 µM ± 10 µM        150 µM ± 20 µM         350 µM ± 40 µM
SRR-4ϕA0P      -b
SRR-4ϕA2P      260 µM ± 30 µM

a KD values ± standard errors of the mean were obtained by averaging the individual fit values of eight perturbed nuclei (L393, R394, Y395, Y396, D398, I401, W338ε1 and W375ε1).
b Although binding weakly, a reliable KD value could not be measured.

Figure 2.9: Intermolecular interactions of the trans-SRR peptides with the ETS domain are dependent upon adjacent aromatic residues and phosphoserines. The affinities of the SRR peptides for 15N-labeled ΔN301 were obtained by fitting the combined amide chemical shift changes of perturbed nuclei (Δδ) as a function of peptide:protein ratio to a 1:1 binding isotherm. The fit KD values are reported in Table 2.4.

The SRR2P peptide showed the strongest affinity for ΔN301 with a KD value of 100 µM in 50 mM sodium chloride (Table 2.4). The absence of the two phosphate groups caused a 16-fold affinity reduction (KD ~ 1650 µM) for the ETS domain. Increasing the ionic strength from 50 to 200 mM sodium chloride reduced the affinity of both SRR0P and SRR2P for ΔN301 by 2- to 3-fold, indicating an electrostatic component to binding. Replacing the four aromatic residues in the SRR peptide with alanines (SRR-4ϕA0P) severely weakened binding to the ETS domain, precluding reliable measurement of a KD value. Phosphorylation of the mutant SRR-4ϕA2P peptide partially restored binding to ΔN301 (KD ~ 260 µM). Collectively, these NMR titrations further support that binding of the SRR to the ETS domain is dependent upon the presence of both phosphate and aromatic moieties.
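The 1:1 fast-exchange analysis behind these KD values can be sketched in a few lines of Python. In fast exchange the observed shift change is the bound fraction of ΔN301 multiplied by the shift change at saturation, and the bound fraction follows from the quadratic solution of the 1:1 binding equilibrium. The fitting details are not spelled out in this section, so the ligand-depletion form, the grid search, the function names and the synthetic data below are all assumptions made for illustration rather than the procedure actually used. As a rough consistency check, a KD of 260 µM with about 250 µM ΔN301 and 1.1 mM peptide gives roughly 78% saturation, in line with the value quoted for SRR-4ϕA2P in the Figure 2.10 caption.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of protein in a 1:1 complex; all arguments in the same units."""
    total = protein_total + ligand_total + kd
    complex_conc = (total - math.sqrt(total ** 2 - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


def fit_kd(protein_totals, ligand_totals, shifts, kd_candidates):
    """Grid-search KD; the best saturation shift for each KD is solved in closed form."""
    best = None
    for kd in kd_candidates:
        fractions = [fraction_bound(p, l, kd) for p, l in zip(protein_totals, ligand_totals)]
        denom = sum(f * f for f in fractions)
        if denom == 0.0:
            continue
        delta_max = sum(f * s for f, s in zip(fractions, shifts)) / denom
        sse = sum((s - delta_max * f) ** 2 for f, s in zip(fractions, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]


# Synthetic, noise-free titration in µM (hypothetical data, not measurements).
ratios = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 4.0]  # peptide:protein ratios as in Figure 2.8
protein = [250.0] * len(ratios)
ligand = [250.0 * r for r in ratios]
true_kd, true_delta_max = 100.0, 0.4
shifts = [true_delta_max * fraction_bound(p, l, true_kd) for p, l in zip(protein, ligand)]

kd_fit, delta_max_fit = fit_kd(protein, ligand, shifts, range(10, 1001, 5))
```

With real data, a nonlinear least-squares routine and per-nucleus fits averaged across residues (as described in footnote a of Table 2.4) would replace the coarse grid used here.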
[Figure 2.9 panels: Δδ (ppm) versus peptide:ΔN301 ratio for SRR0P, SRR2P, SRR-4ϕA0P and SRR-4ϕA2P; curves shown for D398 HN, Y395 HN and W375 Hε1.]

2.4.3 The SRR interacts in trans and in cis with the same interfaces of the ETS domain

We also used the NMR-monitored titrations to determine and compare the binding interface of the four SRR peptides on ΔN301. Addition of all four trans-peptides to the 15N-labeled protein led to similar patterns of CSPs, albeit with varying magnitudes due, at least in part, to differing affinities that precluded the formation of the fully saturated complexes (<37% for SRR-4ϕA0P; 37% for SRR0P; 78% for SRR-4ϕA2P; 84% for SRR2P) (Figures 2.10A and B). Residues of ΔN301 showing the largest CSPs clustered within the loop between helices HI-2 and H1 linking the inhibitory module to the ETS domain, as well as within H3 and the following β-strand (S3) of the ETS domain. Upon more detailed inspection, amides near the C-terminus of H3 and in the loop to S3 (including W338, L341 and K399) showed co-linear CSPs when titrated with the four peptides (Figure 2.10C). These spectral changes might arise from common local conformational perturbations linked to autoinhibition. A similar phenomenon of co-linear CSPs paralleling DNA-binding affinity was previously observed with cis-acting SRR variants [40]. In contrast, for other residues (including L337, W375 and Y395) clustering near the N-terminus of H3 and the H2-H3 turn, the CSP patterns differed for SRR-4φA0P and SRR-4φA2P relative to SRR0P and SRR2P. These differences might reflect CSPs due to aromatic ring currents, thus suggesting that the SRR tyrosine and phenylalanine residues localize to this region of ΔN301. Overall, the same surface regions were identified for the intramolecular interactions within ΔN279 by PRE and CSP studies (Figures 2.3, 2.4 and 2.5).
Thus, the interactions of the trans-peptides recapitulate the intramolecular allosteric and steric mechanisms of SRR-mediated autoinhibition.

Figure 2.10: Interactions of the trans-SRR peptides with ΔN301 measured by NMR spectroscopy. Amide and tryptophan indole (inset) chemical shift perturbations observed in the 15N-HSQC spectra of 15N-labeled ΔN301 (~ 250 µM) upon addition of SRR0P (A; 1.0 mM peptide yielding 37% saturation based on the KD values of Table 2.4; blue histogram), SRR2P (A; 640 µM peptide, 84% saturation; red line), SRR-4φA0P (B; 1.1 mM peptide, < 37% saturation; orange histogram), or SRR-4φA2P (B; 1.1 mM peptide, 78% saturation; purple line). Regions shaded in gray designate α-helices (rectangles) or β-strands (arrows) in ΔN301 (PDB: 1R36). (C) Small regions of the superimposed 15N-HSQC spectra of 15N-labeled ΔN301 in the absence (cyan contours) or presence of SRR-4φA0P (orange), SRR-4φA2P (purple), SRR0P (blue), or SRR2P (red). Residues with large (Δδ > 0.1 ppm; red) and medium (0.07 – 0.1 ppm; yellow) amide or indole CSPs upon titration with SRR2P are mapped on ΔN301 (PDB 1R36). Perturbed sidechains are labeled and shown in stick format. Other peptides were similar. Overall, these CSP patterns demonstrate that the trans-SRR peptides bind to an interface encompassing helices HI-2/H1 and H3, as well as H4 of the ΔN301 IM and ETS domain, respectively.
However, some residues (left) exhibited a pattern of increasing shift perturbations in the order SRR-4φA0P < SRR-4φA2P < SRR0P < SRR2P, suggestive of similar structural perturbations. Other residues (right) showed peptide-specific changes.

2.4.4 Phosphoserines and aromatic residues form the ΔN301-binding interface of the trans-SRR peptides

The conformational and dynamic properties of the SRR peptides were also characterized by NMR methods to determine how aromatic residues and phosphorylation contribute to autoinhibition. The SRR peptides were uniformly 13C/15N-labeled and the 15N- and 13C-HSQC spectra of their unphosphorylated and phosphorylated forms assigned (Figures 2.11 and 2.12). As often seen with peptides, the 15N-HSQC spectra of both the SRR0P and SRR2P peptides contain a subset of extra weak signals arising from minor populations of conformers with cis X-Pro peptide bonds. The cis and trans X-Pro assignments were based on the distinctive 13Cγ vs. 13Cδ chemical shifts of the proline residues [136].

Figure 2.11: Assigned 15N-HSQC spectra of the (A) SRR0P and (B) SRR2P trans-peptides. 15N-HSQC spectra of the enzymatically synthesized peptides were recorded at 28°C on a 600 MHz spectrometer. Samples were in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT and 5% D2O. The * indicate signals from minor conformers with cis X-Pro bonds.
[Figure 2.11 axes: 1H (ppm) versus 15N (ppm).]

Figure 2.12: Titration of the trans-SRR peptides with ΔN301 monitored by NMR spectroscopy.
Superimposed 15N-HSQC spectra of 15N-labeled ΔN301 with the unlabeled synthetic SRR2P peptide added in protein:peptide molar ratios of 0 (red), 0.25 (orange), 0.50 (yellow), 0.75 (green), 1.0 (cyan), 1.5 (blue), 2.0 (purple) and 4.0 (magenta). Similar spectra were recorded for SRR0P, SRR-4ϕA0P and SRR-4ϕA2P. Selected amide and indole groups showing CSPs due to protein binding are labeled. Residues with a * indicate cis X-Pro populations.
[Figure 2.12 panels: 1H-15N and 1H-13C spectral regions for SRR2P and SRR0P at increasing ΔN301:peptide ratios; labeled signals include S282, S285, F286, Y288 and E289.]

Titrations of the 13C/15N-labeled SRR peptides with unlabeled invisible ΔN301 were then performed to identify the interfacial residues involved in the interaction with the ETS domain (Figures 2.12 and 2.13). The amides of Asp284, Ser285, Phe286, Tyr288 and Glu289, as well as the aromatic group of Phe286, were perturbed upon addition of ΔN301 to SRR0P. In the case of SRR2P, the effect was more pronounced with backbone amides and side chain aromatic groups of residues 283 to 291 inclusive all showing larger CSPs. For the most part, this difference reflects the increased saturation state of the latter peptide since the lower affinity of the SRR0P precluded formation of a 1:1 complex. Together, these CSP patterns confirmed that trans binding to ΔN301 involves both phosphoserine and aromatic moieties within the SRR peptides.

Figure 2.13: Charged and aromatic residues in the trans-SRR peptides interact with the ETS domain.
Amide CSPs observed upon addition of ΔN301 to the labeled SRR0P (~25% saturation; blue bars) and SRR2P (~85% saturation; red line). The protein-binding interface is highlighted in grey, along with aromatic residues (φ) and phosphoacceptors (*). Missing data are prolines or residues with overlapped signals.
[Figure 2.13 axes: Δδ (ppm) versus residue number (280 to 295).]

The 15N-HSQC spectra of the SRR peptides also contain a subset of weaker signals arising from the four prolines adopting significant populations with cis X-Pro amide bonds (12% to 30%; Figure 2.11 and Table 2.5). Interestingly, the weaker signals from conformers with cis X-Pro amide bonds also shifted upon titration with ΔN301 (Figure 2.12). This demonstrates that the protein binds peptides with both trans and cis X-Pro conformations. This lack of any conformational preference provides further evidence for the formation of a “fuzzy complex,” as discussed below.

Table 2.5: Phosphorylation and ΔN301-binding do not markedly change the X-Pro cis/trans conformational equilibria of the SRR peptides a,b

Residue    SRR2P    SRR0P    ΔN301-bound SRR2P
V280       20       28       21
S282       12       18       16
S285       15       18       19
A293       18       17       13
A294       -c       15       -c
K299       32       36       33

a The listed % values = cis / (cis+trans) are derived from the relative intensities of the major and minor amide peaks for each residue in a 15N-HSQC spectrum, and assumed to result from conformers with cis X-Pro bonds of Pro281, Pro292, Pro296, and/or Pro300. Extracted volumes gave similar results. Residues E289, D290 and L295 also showed multiple signals, but their population could not be quantified reliably due to spectral overlap.
b Cis/trans conformational equilibria are also present in the ΔN279 fragment (not shown).
Residues R279, V280, P281, S282, Y283 and D284 showed detectable isomerization in the phosphorylated state, whereas residues R279, V280, P281, S282, E289, D290, P292, A293, A294 and L295 presented two different conformations in the unphosphorylated state. However, their relative populations could not be quantified reliably due to spectral overlap of the poorly dispersed 15N-HSQC signals of the SRR.
c Spectral overlap prevented measurement of cis X-Pro bond populations.

2.5 Trans-SRR peptides are intrinsically disordered and form dynamic “fuzzy” complexes with ΔN301

We probed the conformational and dynamic properties of the SRR peptides to investigate further how aromatic residues and phosphorylation contribute to autoinhibition. The secondary structural propensities of the SRR peptides were predicted by analysis of mainchain chemical shifts with the algorithm δ2D [137] (Figure 2.14A). Consistent with their limited spectral dispersion (Figures 2.11 and 2.12), both the unmodified SRR0P and phosphorylated SRR2P peptides showed almost identical high random coil propensities. This confirms that they are predominantly disordered and that phosphorylation does not induce any persistent secondary structure. When bound to ΔN301, the SRR2P peptide also exhibited random coil chemical shifts, with only a slightly higher propensity for residues 283 to 287 to sample extended conformations (Figure 2.14A). Therefore, the trans-peptides form dynamic complexes with the ETS domain. This is consistent with the previously described behavior of the ΔN279 construct, in which the SRR is disordered and remains disordered upon phosphorylation (Figure 2.14B) [43]. The behaviour of the SRR, both in trans and in cis, stands in sharp contrast to examples of disordered regions that fold into well-defined structures upon binding to their partner [96].
Figure 2.14: The SRR is predominantly disordered, even when phosphorylated and bound as trans-peptides to ΔN301 or in the intramolecular context of ΔN279. (A) Propensities of the SRR0P (green), SRR2P (black) and ΔN301-bound SRR2P (~85% saturation; red) to adopt extended, polyproline II helix (PPII) and random coil conformations as calculated from backbone 13C and 15N chemical shifts (trans X-Pro only) with δ2D [137]. Similar propensities were predicted for peptides with cis X-Pro conformations (not shown). (B) The SRR residues in the cis context of ΔN279 are also disordered without or with phosphorylation. The slight differences in propensities relative to the peptides arise from the lack of 1Hα and 13CO chemical shift assignments for ΔN279 [43]. The shaded grey area corresponds to the residues interacting with the ETS domain and IM.
[Figure panels: (A) SRR0P, SRR2P and SRR2P + ΔN301; (B) ΔN2790P and ΔN2792P. Axes: random coil, PPII helix and extended propensities (0 to 1.00) versus residue (278 to 299).]

The fast sub-nsec timescale dynamics of the unbound SRR peptides were investigated by 1H{15N}-NOE measurements (Figure 2.15). In the case of the SRR0P peptide, the terminal residues are very dynamic, with NOE values less than -0.75. However, the central residues, which mediate binding to ΔN301, exhibited values around zero. Thus, although lacking any predominant secondary structure, the motions of the SRR0P peptide backbone are partially dampened. The phosphorylated SRR2P peptide showed slightly higher NOE values for these central residues, suggesting a small reduction in mobility. More strikingly, the NOE values of residues throughout SRR-4φA0P decreased substantially to those more characteristic of a random coil polypeptide [138, 139].
This indicates that the presence of the four aromatic sidechains dampens the backbone flexibility of the SRR peptides, possibly through the formation of dynamic hydrophobic clusters that lack any persistent secondary structure. This also provides a possible mechanism for the synergy between phosphoserines and aromatic residues.

Figure 2.15: Aromatic residues dampen fast motions in the trans SRR peptides. 1H{15N}-NOE ratios for SRR0P (green), SRR2P (black), and SRR-4φA0P (orange). Decreasing values indicate increasing flexibility on the sub-nsec timescale. Missing data are prolines or residues with overlapped signals.
[Figure axes: 1H{15N} NOE (0.25 to -1.75) versus residue (280 to 295) for SRR-4φA0P, SRR0P and SRR2P.]

2.6 Coupled roles of the IM and HI-1 in autoinhibition

The IM only imparts ~two-fold inhibition and the SRR is required for the full ~20-fold effect seen with ΔN279 or wild-type Ets1. However, disruption of the IM with mutations in helices HI-1 (Y307P) or H5 (L429A) completely abrogates autoinhibition [38]. Furthermore, the presence of the SRR stabilizes helix HI-1 against fluctuations detectable through amide HX studies, and this stabilization increases with increasing phosphorylation [40, 43]. Thus, it is somewhat surprising that amides within the IM did not show any CSPs upon introduction of the 4ϕA mutations into ΔN279 (Figures 2.3 and 2.5) or upon addition of the trans-SRR peptides to ΔN301 (Figure 2.10). Also, in the context of ΔN279, PRE experiments did not indicate the formation of any transient interactions between the MTSL spin label at the N-terminus of the SRR and the IM (Figure 2.5). These results raise the question as to how the SRR and IM act co-operatively to drive autoinhibition. One possible mechanism is that HI-1 of the IM serves to position the SRR close to the DNA-recognition helix H3.
This would in effect increase the local concentration of the disordered SRR within this region of the ETS domain to facilitate transient steric blockage of DNA-binding. By thermodynamic linkage, the resulting interactions of the SRR with the ETS domain would favor a folded IM, thus facilitating the allosteric component of autoinhibition.

To help address this question, we compared the 15N-HSQC spectra of two uninhibited mutants (ΔN279-Y307P and ΔN280-L429A) with those of ΔN279 and ΔN301 (Figure 2.16). These mutations have been shown to disrupt the IM, thus relieving autoinhibition [38]. The intention was to determine whether interactions of the SRR with the recognition helix H3 require an intact IM. Unfortunately, both mutants proved unstable and readily aggregated, thus precluding detailed NMR analyses. Qualitatively, the presence of many intense sharp signals in the 15N-HSQC spectra of both mutants indicates that their IMs are indeed unfolded. However, the broad signals in the middle of the spectrum of ΔN279-Y307P and the numerous weak or absent signals exhibited by ΔN280-L429A are also consistent with the visible aggregation of these proteins. A sub-set of resolved peaks from the mutants showed chemical shifts intermediate between those of ΔN279 and ΔN301, whereas a second sub-set exhibited chemical shifts typical of ΔN301. More detailed comparisons of ΔN301, ΔN279, ΔN279-Y307P and ΔN280-L429A could not be made since assignment of the mutant spectra proved impossible. In the end, we must conclude that mutations in HI-1 and H5 severely disrupt the stability of Ets1 and that their behaviour no longer reflects the equilibrium observed in previous publications. Regardless, these experiments clearly show that the folded IM does stabilize the ETS domain against aggregation.
Figure 2.16: The inhibitory module stabilizes the folding of the ETS domain. Overlaid 15N-HSQC spectra of ΔN301 (orange) and ΔN279 (red) along with (A) ΔN279-Y307P (cyan) or (B) ΔN280-L429A (blue). Both mutants are unstable and readily aggregated. In panel A, black boxes highlight two examples in which signals for amides in the ΔN279-Y307P mutant exhibit intermediate chemical shifts between those of ΔN301 and ΔN279, whereas pink boxes show two examples of signals that exhibit chemical shifts typical of ΔN301. The latter lacks the SRR.
[Figure panels: (A) ΔN279 Y307P and (B) ΔN280 L429A. Axes: 1H (ppm) versus 15N (ppm).]

2.7 Discussion

2.7.1 Aromatic residues in the SRR are critical for phosphorylation-dependent autoinhibition

Previous studies demonstrated that the intrinsically disordered SRR is required for both basal and phosphorylation-enhanced autoinhibition of Ets1 DNA binding, yet the underlying molecular interactions were largely unknown. IDRs are generally depleted in hydrophobic amino acids [140], and thus it is striking that the CaMKII phosphoacceptor serines contributing to autoinhibition are adjacent to Phe/Tyr (φ) residues within a Ser-φ-Asp/Glu repeat. There are two repeats in the truncated SRR (282-SYD-SFD-287), and a third in the full SRR (251-SFE-253) [38, 40]. Mutation of these two aromatic residues and two additional ones within the truncated SRR to glycines, alanines or valines strongly reduced phosphorylation-enhanced autoinhibition. Conversely, the presence of additional Ser-φ-Asp repeats dramatically increased the effect, whereas even six phosphorylated Ser-Ala-Asp repeats did not reinforce autoinhibition. The contribution of the aromatic residues was confirmed using the trans-SRR system. Upon addition of ΔN301, NMR signals from the aromatic residues in the SRR peptides shifted, indicating that they are involved directly in the intermolecular interface.
Furthermore, the SRR-4φA2P peptide bound ΔN301 with lower affinity than did the SRR2P peptide, and the very weak interaction of the SRR-4φA0P peptide could not be quantified. Thus, both phosphoserines and adjacent Phe/Tyr residues within the SRR act synergistically to inhibit Ets1 DNA binding.

2.7.2 Fuzzy interactions between the SRR and ETS domain

The interactions between the SRR and ETS domain can be described as “fuzzy” based on their transient nature and the lack of any induced, persistent structure [96]. This conclusion, which stands in sharp contrast to examples of disordered regions that fold into well-defined structures upon partner binding, derives from several lines of evidence. Whether linked to the IM and ETS domain as in ΔN279 or as an isolated trans-peptide, the SRR residues have random coil chemical shifts. Furthermore, neither phosphorylation nor ΔN301 binding causes any chemical shift changes indicative of a predominant induced structure. Amide 1H{15N}-NOE experiments also demonstrated that the residues in the SRR are significantly more flexible on a sub-nsec timescale than those in the well-ordered ETS domain. However, the backbone motions of the SRR are not unrestricted. Based on the substantially lower NOE values of the SRR-4φA peptide relative to the wild-type peptide, this restriction might result from hydrophobic clustering of the four aromatic residues. The motions of the SRR also dampen with increased binding to the ETS domain upon phosphorylation. Thus, the SRR appears to rapidly interconvert among an ensemble of conformations that are partially restrained by transient interactions, both within the SRR itself and with the ETS domain. This ensemble may involve binding of the flexible SRR at one predominant site or at multiple sites on the ETS domain. These two possibilities were not resolved due to the coarse nature of the CSP and PRE data and the absence of detectable interproton NOEs between the SRR and ETS domain [43].
The interaction surface of Ets1 contacted by the SRR helps us understand how the dynamic complex functions in autoinhibition. This surface, which spans from the ETS domain to the IM, was broadly defined through NMR-monitored titrations of ΔN301 with the trans-SRR peptides, PREs from an N-terminal MTSL spin label on ΔN279, and CSPs of ΔN279 resulting from 4φA-mutation and phosphorylation of the SRR. Overall, this region has a net positive charge due to several Arg and Lys residues, whereas the SRR of ΔN279 contains four Asp and Glu residues, as well as two phosphoacceptor serines. Thus, electrostatic forces likely contribute to their interactions. This conclusion is further supported by the modestly weakened binding (higher KD values) of the SRR peptides to ΔN301 with increasing ionic strength. However, glutamate and aspartate are only partial mimics of phosphoserine, and removing the four carboxylates or introducing two arginines does not impair basal autoinhibition. Therefore, more than Coulombic interactions are at play. Indeed, the SRR-interacting surface of the Ets1 ETS domain overlaps with a relatively large patch of neutral polar and non-polar residues between the IM and DNA-recognition helix (Figure 2.4). With partially exposed sidechains, Tyr307 (HI-1), Tyr329 (HI-2), Trp338 (H1), Tyr395 (H3), and Tyr396 (H3) are included in this patch. Thus, hydrophobic and van der Waals interactions involving aromatic residues could also help localize the SRR to this region of Ets1.

2.7.3 Mechanisms of phosphoserine-aromatic synergy: π-cation interactions?

The observation that phosphorylation-enhanced autoinhibition requires the presence of aromatic residues suggests that the SRR functions through more than just a simple, additive collection of weak electrostatic and hydrophobic/van der Waals contacts. What then are the underlying forces by which aromatic residues integrate with phosphoserines to mediate SRR-ETS domain binding?
One possible mechanism could involve the simultaneous association of adjacent phosphoserines and aromatic residues with lysine and arginine sidechains in the DNA-recognition interface of Ets1 via ion-pair and π-cation interactions, respectively (Figure 2.17) [141, 142]. The latter, which involve the association of the negatively-charged electron "cloud" of an aromatic ring with a positively-charged group, are often found in biological systems [143, 144]. Along with salt-bridges involving the phosphoserines, this could result in cooperative, multi-valent binding of the SRR with the ETS domain.

Figure 2.17: Possible mechanisms for synergistic autoinhibition of Ets1. Highly schematic cartoon models of phosphoserine/aromatic synergy due to (left) cooperative phosphate salt bridge and aromatic π-cation interactions with Lys/Arg sidechains along the DNA-binding interface (+), or (right) phosphate-driven hydrophobic clustering. The IM/ETS domain is shown as an oval with a hydrophobic surface (green) extended to the positively-charged DNA-binding interface.

The binding interface on the ETS domain contains nine lysines and arginines that could interact with the four aromatic residues in the SRR (Figure 2.18A,B). In principle, if lysine aminium and arginine guanidinium side chains are involved in π-cation interactions with the SRR, their chemical shifts should differ in the NMR spectra of ΔN301, ΔN2790P and ΔN2792P. However, unless protected by their structural environment, the labile protons in these side chains typically undergo rapid exchange with water and thus are difficult to detect by 1H-NMR approaches. Fortunately, with reduced sample temperature, arginine 15NεH signals can be observed in the 15N-HSQC spectra of these proteins (Figure 2.18C). Although not assigned, the arginine signals overlap closely for ΔN301, ΔN2790P and ΔN2792P, with only small perturbations seen due to phosphorylation.
This could reflect slightly different sample conditions, degradation, or increased protection of an otherwise rapidly exchanging side chain. Overall, these data argue against involvement of arginine sidechains in π-cation interactions. However, we cannot rule out the possibility of π-cation interactions with lysine side chains that could not be detected due to fast hydrogen exchange.

Figure 2.18: Positively charged arginine and lysine side chains could interact with the SRR through π-cation interactions. (A) Surface and (B) cartoon representation of ΔN301 showing all lysines (blue) and arginines (cyan). Of these, six lysines and three arginines are localized close to the DNA-binding interface and could potentially interact with the SRR (PDB 1R36). (C) 15N-HSQC spectra of the arginine side chain 15NεH signals in ΔN301 (orange), ΔN2790P (blue), ΔN2792P (red) and all three overlaid (lower panel). Signals that exhibit small changes in chemical shift or intensity are indicated with an *.
[Figure panel C: arginine 15NεH regions for ΔN301, ΔN2790P, ΔN2792P and their overlay. Axes: 1H (ppm) versus 15N (ppm).]

2.7.4 Mechanisms of phosphoserine-aromatic synergy: intramolecular salting-out?

Alternatively, the negatively-charged phosphates could interact with solvent and help drive a hydrophobic clustering of the neighboring aromatic residues either within the SRR or with the hydrophobic residues on the surface of the ETS domain and the IM. In support of this hypothesis, as a "salting-out" anion in the Hofmeister series, phosphate decreases the solubility of nonpolar solutes and stabilizes proteins, possibly through the ordering of water [145-147]. Recent advances suggest that protein stabilization as a function of the Hofmeister series might involve direct ion–macromolecule interactions as well as interactions with water molecules in the first hydration shell of the macromolecule [148].
Alternatively, protein stabilization appears to be correlated with changes in surface tension, preferential weak interactions of ions with the protein backbone and repulsion between solutes and co-solutes [149, 150]. These different forces may all be involved in the stabilization of the SRR upon phosphorylation, but resolving their contributions will require further studies.

2.7.5 Aromatic residues in intrinsically disordered regulatory sequences

Functionally critical aromatic groups in IDRs are also found in the disordered activation domain (EAD) of the transcription factor EWS. This ~280-residue region is a low complexity sequence composed mainly of a repeating SYGQQS motif. The activity of the EAD in transcriptional regulation and oncogenesis is strongly dependent upon the presence of multiple tyrosines within these repeats [151, 152]. A structural contribution of the multiple tyrosines in the (G/S)Y(G/S) repeats of all related FET proteins (FUS, EWS, and TAF15) is also suggested by the reversible polymerization of these IDRs [153]. This may mediate biological regulation since the tyrosines are essential for phosphorylation-regulated binding of the EAD to the C-terminal domain (CTD) of RNA polymerase II. Although the EWS-EAD tyrosines are hypothesized to form fuzzy "polycation/π" interactions with yet unidentified positively-charged partner proteins [154], the molecular forces driving these processes, which show both similarities and differences to the autoinhibitory SRR of Ets1, remain to be established.

2.7.6 ETS factor autoinhibition

The Ets1 autoinhibitory mechanism has both common and distinct features relative to those exploited by other ETS transcription factors, and thus provides a route to specificity within this group of proteins that otherwise share highly related DNA-binding properties [8]. ERG is regulated by a dynamic N-terminal sequence that perturbs its DNA-binding helix in a manner akin to that of the Ets1 SRR [28].
However, this sequence is almost devoid of aromatic residues. Furthermore, ERG lacks an equivalent IM and is not known to be modulated at the level of DNA binding by phosphorylation. In contrast, ETV6 does not contain an equivalent SRR, and instead is autoinhibited by a C-terminal helix that sterically blocks its ETS domain interface and unfolds upon binding both specific and non-specific DNA sequences [29]. In all of these cases, additional DNA binding proteins can counteract or bypass the negative effects of autoinhibition and, thereby, provide added specificity to each ETS factor.

In closing, this chapter builds upon previous studies of the Ets1 transcription factor to better define the physicochemical basis of DNA binding autoinhibition. Mutational and structural analyses were used to investigate the transient interactions between the disordered SRR and the well-structured ETS domain in a cis and trans context. In both cases, aromatic and negatively-charged residues in the disordered region form dynamic fuzzy complexes with the DNA binding interface, thus sterically and allosterically preventing DNA binding. These findings shed light on the synergy between aromatic residues and phosphoserines in a regulatory IDPR and extend the repertoire of known auto-inhibitory mechanisms.

2.8 Material and methods

2.8.1 Unlabeled synthetic SRR peptides

The following chemically-synthesized peptides, corresponding to the truncated SRR of Ets1 (residues 279-300), were purchased from Biomatik (Canada): SRR0P, Ac-RVPSYDSFDYEDYPAALPNHKP-NH2; SRR2P, Ac-RVP(pS)YD(pS)FDYEDYPAALPNHKP-NH2 (phosphorylated on both serines); SRR-4φA0P, Ac-RVPSADSADAEDAPAALPNHKP-NH2; and SRR-4φA2P, Ac-RVP(pS)AD(pS)ADAEDAPAALPNHKP-NH2. The N-terminal acetylated and C-terminal amidated peptides were obtained in a lyophilized form after HPLC purification.
The dry peptides were resuspended in NMR sample buffer (20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT), re-adjusted to pH 6.5, and dialyzed in Float-A-Lyzer G2 (SpectrumLabs, USA) for 48 h against the same buffer. Concentrations of SRR0P and SRR2P were determined by absorbance using a predicted molar absorptivity ε280 = 4470 M-1 cm-1. Amino acid analysis (Sick Children's Hospital, Toronto) was used to quantify samples of SRR-4φA0P and SRR-4φA2P.

2.8.2 Protein expression and purification

The plasmid encoding residues 279-440 of murine Ets1 (denoted as ΔN279; Figure 2.1) with a thrombin-cleavable N-terminal His6-tag was obtained by ligating a synthetic, codon-optimized gene (Blue Heron, US) into the pET-28a vector via BamHI/NcoI restriction sites. The non-tagged Ets1 ΔN301 construct (residues 301 to 440) was then generated by PCR from this optimized sequence and inserted into pET-28a with the same restriction sites. Codon optimization led to only modest improvements in protein yields. Mutations summarized in Table 2.5 were introduced into the truncated SRR of ΔN279 via the QuikChange site-directed mutagenesis protocol (Stratagene). Genes encoding ΔN279 with tandem Ser-φ-Asp repeats (Table 2.2) were generated by engineering unique restriction sites at the 5’ (Bsu36I) and 3’ (AvrII) ends of the SRR coding sequence. Double restriction digests were used to excise the coding sequence for the SRR. The resulting parental vector was the recipient of mutant versions of the SRR generated from synthetic oligonucleotides. After successful cloning of synthetic inserts, the artificial restriction sites were reverted to the native sequence through QuikChange.

The ΔN279 and ΔN301 constructs were expressed in Escherichia coli HMS174 (λDE3).
Cultures (1 L) were grown at 37°C to OD600 = 0.8, induced with 0.4 mM isopropyl-β-D-thiogalactopyranoside (IPTG), and then grown at 30°C for an additional 2 h in the case of unlabeled LB media and 4 h with M9 minimal media. Kanamycin (40 µg/mL) was used as the selection antibiotic, and the M9 minimal media was supplemented with 1 g/L (15N, 99%)-NH4Cl for uniform 15N labeling or 1 g/L (15N, 99%)-NH4Cl and 3 g/L (13C6, 99%)-glucose for uniform 13C/15N labeling (Sigma-Aldrich or Cambridge Isotope Laboratories). Harvested cells containing ΔN301 or ΔN279 were resuspended, respectively, in lysis buffer A (50 mM sodium citrate pH 5.3, 50 mM NaCl, 0.5 mM EDTA, and 1 mM TCEP) or B (20 mM sodium phosphate pH 7.8, 500 mM NaCl, 10 mM imidazole, 0.5 mM EDTA, and 2 mM DTT) with protease inhibitor cocktail tablets (200 µL of 1 tablet resuspended in 1 mL of water) and lysed by high pressure homogenization and sonication.

The ΔN279 was then purified by Ni2+-affinity chromatography (5 mL HisTrap HP, GE Healthcare) on an AKTA prime FPLC and eluted with an imidazole gradient from 0 to 500 mM in buffer B. Fractions containing ΔN279 were identified by 15% SDS-PAGE, pooled and dialyzed overnight in cleavage buffer (25 mM Tris pH 7.9, 1 mM EDTA, 10% glycerol, 300 mM NaCl and 1 mM TCEP) with thrombin (Novagen; 1 U/mg of protein). The sample was then repurified by Ni2+-affinity chromatography to remove the cleaved affinity tag and any uncleaved protein. The protein was concentrated with a 3-kDa molecular weight cut-off (MWCO) Centricon device and loaded on a Superdex 75 gel filtration column (AKTA FPLC) equilibrated with NMR sample buffer (20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT). Fractions containing purified ΔN279 were identified by 15% SDS-PAGE, pooled, and concentrated to 500 µL at 0.1-0.45 mM for NMR analysis.
The WT ΔN279 protein, with a non-native N-terminal Gly-Ser-His remaining from the affinity tag, has a theoretical mass of 19,088 Da, and its concentration was determined by absorbance at 280 nm using a predicted ε280 = 39,880 M-1 cm-1.

The ΔN301-containing lysate was loaded on a DEAE column attached in tandem to a strong cation exchange column (SP Sepharose, GE Healthcare) and eluted with an NaCl gradient from 0 to 1 M in buffer A. The fractions containing ΔN301 were pooled and purified further by gel filtration, as described for ΔN279. The final concentration of WT ΔN301 in NMR sample buffer was determined by absorbance using a predicted ε280 = 35,410 M-1 cm-1. Molecular weights and molar absorptivities of the other constructs are listed in Table 2.6.

Table 2.6: The various Ets1 constructs produced at UBC.

Construct name             Residue range   Mutations/extra residues             Molecular weight (Da)   Molar absorptivity ε280 (M-1 cm-1)
ΔN301                      301-440         -                                    16 300                  35 410
ΔN279                      279-440         GSH at N-term                        19 088                  39 880
ΔN279 4φA                  279-440         Y283A/F286A/Y288A/Y291A              18 637                  35 410
ΔN279 H278C/C350A/C416A    279-440         GSC at N-term, C350A/C416A           18 989                  39 880
ΔN279 C350A/C416A/K383C    279-440         GSH at N-term, C350A/C416A/K383C     18 998                  39 880
ΔN279 C350A/C416A/N400C    279-440         GSH at N-term, C350A/C416A/N400C     19 012                  39 880
ΔN279 H278C/C350A/C416S    279-440         GSH at N-term, C350A/C416A           19 055                  39 880
ΔN279 H278C/C350S/C416S    279-440         GSC at N-term, H278C/C350S/C416A     19 039                  39 880
ΔN279 H278C/C350S/C416A    279-440         GSC at N-term, H278C/C350S/C416A     19 039                  39 880
ΔN279 Y307P                279-440         GSH at N-term and Y307P              19 021                  38 390
ΔN280 L429A                280-440         L429A                                18 739                  39 880

2.8.3 13C/15N-labeled SRR peptides

The genes encoding the WT and 4φA variants of the His6-tagged SRR (residues 279-300) of murine Ets1 were ligated into a pGEXW2T vector. The resulting polypeptides had tandem GST (glutathione-S-transferase) and His6 tags. The GST component increased solubility of the SRR peptides, while the His6 component was used for purification.
The gene products were expressed in E. coli BL21 (λDE3) grown in M9 minimal media containing 1 g/L (15N, 99%)-NH4Cl and 3 g/L (13C6, 99%)-glucose and purified as described above for ΔN279 with the following modifications. After the first round of Ni2+-affinity chromatography, the samples were cleaved with thrombin, and the products re-chromatographed on a Ni2+-affinity column to remove the cleaved GST-His6-tag and any uncleaved product. To obtain phosphorylated peptides, phosphorylation was performed after the first round of Ni2+-affinity chromatography, followed by purification on Mono-Q resin (5/50 GL, GE Healthcare), as described below. The flow-through containing each peptide was purified further by reversed-phase HPLC (Dionex) with a linear gradient (buffer A: water 0.1% TFA, buffer B: acetonitrile 0.1% TFA) on a C-18 column (PROTO 300Å C18 10 µm, Higgins Analytical Inc). The peak fractions in the eluate were pooled, lyophilized to dryness, resuspended in ~1 mL of NMR sample buffer, and adjusted to pH 6.5 using small amounts of 0.1 and 1 M NaOH. For NMR studies, 5% D2O was added to a 500 µL aliquot.

2.8.4 Phosphorylation of the ΔN279 constructs and SRR peptides

Protein (without tag) and GST-His6-tagged peptides were diluted to a final concentration of 25 µM in 50 mM Tris, 0.5 mM magnesium acetate, 2 mM DTT, and 10% v/v Phos-stop (Roche) at pH 7.7. Calmodulin-dependent kinase II (CaMKII, expressed via a baculovirus in Sf9 cells and activated by pre-incubation at 30°C for 10 min with calmodulin, CaCl2, and ATP) was added to the protein or peptide stock solutions. Final concentrations were 1 µM of CaMKII, 5.0 µM of calmodulin, 2.0 mM DTT, 1 mM ATP and 0.5 mM CaCl2. The resulting samples were incubated at 30°C for up to 16 h, yielding products with one or two phosphorylated sites as determined by ESI-MS or MALDI-ToF mass spectrometry.
After dialysis into 20 mM Tris pH 8.3, 0.5 mM EDTA, 10% glycerol, and 1 mM TCEP, the products were chromatographed on Mono Q anion exchange (Mono Q 5/50 GL or 10/10, GE Healthcare) using a 0 to 500 mM gradient of NaCl in this buffer. Fractions of protein with two phosphorylated sites were pooled and exchanged into NMR sample buffer with an Amicon filtration device (3 kDa MWCO). Fractions containing phosphorylated peptide were lyophilized and then dialyzed into NMR buffer with a Float-A-Lyzer G2 (1000 Da cut-off). The two phosphorylated sites were identified previously as Ser282 and Ser285 via MS sequencing and, in the case of isotopically-labeled proteins, confirmed by NMR spectroscopy. Phosphorylated GST-His6-tagged peptides were subjected to thrombin cleavage and HPLC purification, as described above.

2.8.5 Electrophoretic mobility shift assays (EMSAs)

DNA-binding assays of the Ets1 proteins were performed using 32P-radiolabeled, duplexed 27-bp oligonucleotides containing a high affinity ETS binding site:

5’-TCGACGGCCAAGCCGGAAGTGAGTGCC-3’ ("top" strand)
5’-TCGAGGCACTCACTTCCGGCTTGGCCG-3’ ("bottom" strand)

The GGAA core in the top strand is the consensus sequence motif for ETS family DNA binding. The oligonucleotides were labeled with [γ-32P] ATP using T4 polynucleotide kinase (Invitrogen), and annealed by boiling for 5 min and slowly cooling over 6 to 8 h. The DNA concentration was kept constant at 2.5 × 10^-12 M, whereas the Ets1 protein concentrations ranged over six orders of magnitude. The final binding reactions were incubated at 4°C for 3 - 4 h in a buffer containing 25 mM Tris, 0.1 mM EDTA, 60 mM KCl, 6 mM MgCl2, 200 µg/mL BSA, 10 mM DTT, 2.5 ng/µL poly(dIdC), and 10% (v/v) glycerol at pH 8.0, and then resolved on an 8% native polyacrylamide gel. Binding reactions for the trans inhibition assay were set up similarly, and each dilution of the protein was incubated with the specified SRR peptide (23 to 580 µM) at 4°C for 10 min.
Radiolabeled DNA was added, the reaction incubated for another 45 min at 4°C, and then resolved on an 8% native polyacrylamide gel. Radiolabeled DNA was quantitated on dried gels by phosphorimaging. Equilibrium dissociation constants (KD values) were determined by non-linear least squares fitting of the total protein concentration [P] versus fraction of DNA bound ([PD]/[D]t) to the equation [PD]/[D]t = 1/(1 + (KD/[P])) using KaleidaGraph (v. 3.51, Synergy Software). Due to the low concentration of total DNA, the free and total protein concentrations are effectively equal. Reported KD values represent the average of at least two independent experiments ± standard errors of the means.

2.8.6 NMR spectral assignments of ΔN279 4φA, ΔN279 H278C/C350A/C416A and SRR peptides

NMR data were recorded at 25 or 28°C on 500 MHz Varian Unity, cryoprobe-equipped 600 MHz Varian Inova, and cryoprobe-equipped 600 MHz Bruker Avance III spectrometers. Proteins and peptides were in NMR sample buffer (20 mM MES, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT at pH 6.5) with 5-10% lock D2O. Data were processed and analyzed using NMRpipe [155] and Sparky [156]. Signals from backbone and sidechain 1H, 13C, and 15N nuclei were assigned using HNCACB, CBCACONNH, CCC-TOCSY-NH (mixing time = 0.14 s), HCCC-TOCSY-NH, 15N-TOCSY and HNCO.

2.8.7 Backbone amide and sidechain 15N relaxation

Proteins, protein/DNA complexes or peptides were 100 to 450 µM in 20 mM MES, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT, and 10% D2O at pH 6.5. Heteronuclear NOE relaxation data were recorded for backbone amides on a 600 MHz spectrometer at 28°C [134]. NOE values were calculated as the ratio of peak intensities or integrated volumes in spectra recorded with a 2 sec recycle delay and 3 sec of 1H saturation versus with a 5 sec recycle delay.
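The heteronuclear NOE ratio described above can be illustrated with a minimal sketch. The function name `het_noe` and the optional noise-based error estimate are assumptions introduced here for illustration; the text does not state how NOE uncertainties were obtained, so the error term below is ordinary first-order propagation for a ratio and not necessarily the procedure used in this work.

```python
import math


def het_noe(i_sat, i_ref, noise_sat=None, noise_ref=None):
    """Return (NOE, error) for one amide.

    i_sat: peak intensity (or volume) in the spectrum with 1H saturation.
    i_ref: peak intensity (or volume) in the reference spectrum without saturation.
    noise_sat, noise_ref: optional baseline noise levels of the two spectra
        (hypothetical inputs; if either is missing, no error is returned).
    """
    if i_ref == 0:
        raise ValueError("reference intensity must be non-zero")
    noe = i_sat / i_ref
    if noise_sat is None or noise_ref is None:
        return noe, None
    # First-order error propagation for a ratio of two independent measurements.
    err = math.sqrt((noise_sat / i_ref) ** 2
                    + (i_sat * noise_ref / i_ref ** 2) ** 2)
    return noe, err


# Example with made-up intensities: a negative ratio indicates a highly
# flexible amide, as described for the terminal SRR residues.
noe, err = het_noe(-75.0, 100.0, noise_sat=5.0, noise_ref=5.0)
print(f"NOE = {noe:.2f} +/- {err:.2f}")
```

The intensities in the example are invented; real values would come from the peak lists of the saturated and reference spectra.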
Data points for the T1 (20, 40, 80, 140, 220, 320, 520, 820, 1220 and 2000 ms) and T2 (17, 32, 49, 65, 81, 97, 114, 130, 146 and 162 ms) experiments were collected in random order.

2.8.8 NMR-monitored peptide titrations

Peptide-protein interactions were monitored via sensitivity-enhanced 15N-HSQC spectra [157]. Experiments involved titrating up to 487 µL of unlabeled peptide (initially 1.8 to 2.2 mM) into 15N-labeled ΔN301 (initially 450 µL, 0.25 mM), or up to 589 µL of unlabeled ΔN301 (initially 1 to 2 mM) into 13C/15N-labeled peptide (initially 500 µL, 0.125-0.25 mM). Chemical shift perturbations (CSPs) were calculated from the combined amide 1HN and 15N shift changes as

$$\Delta\delta = \sqrt{\left(0.2 \times \Delta\delta_{N}\right)^{2} + \left(\Delta\delta_{HN}\right)^{2}}$$

Equilibrium dissociation constants (KD values) were determined by fitting, with GraphPad Prism, Δδi to the following equation for a 1:1 binding isotherm in the fast exchange limit on the NMR chemical shift timescale,

$$\Delta\delta_{i} = \Delta_{Tot}\,\frac{\left([A]_{T,i} + [B]_{T,i} + K_{D}\right) - \sqrt{\left([A]_{T,i} + [B]_{T,i} + K_{D}\right)^{2} - 4[A]_{T,i}[B]_{T,i}}}{2[A]_{T,i}}$$

where [A]T,i and [B]T,i are the total, dilution-adjusted concentrations of labeled and unlabeled species, respectively, at each titration point i. ΔTot is the total chemical shift change between the free and saturated states.

2.8.9 Paramagnetic relaxation experiments (PRE)

The QuikChange protocol (Stratagene) was used to generate the gene encoding cysteine-free ΔN279 with the mutations C350A and C416A. A single cysteine was then introduced at the non-native N-terminus (H278C) remaining after thrombin cleavage of the His6-tag. The resulting protein was expressed, purified as described above, and thoroughly exchanged into DTT-free NMR sample buffer using an Amicon ultrafiltration device. 
Free sulfhydryl groups were reacted overnight at room temperature with a 10:1 molar ratio of the nitroxide spin label S-(2,2,5,5-tetramethyl-2,5-dihydro-1H-pyrrol-3-yl)methyl methanesulfonothioate (MTSL; Toronto Research Chemicals). The chemical modification was verified by MALDI-ToF mass spectrometry. The spin-labeled ΔN279 was concentrated to 50 µM and unreacted MTSL removed by multiple buffer exchanges. Non-sensitivity enhanced 15N-HSQC spectra were recorded for the protein in the paramagnetic and then diamagnetic states, the latter being formed by reduction with 10 mM DTT for 24 h. Ideally, a full T2 relaxation set should be collected for all PRE samples in their paramagnetic and diamagnetic states to report the changes in R2 values. However, the low concentration of the samples prevented such experiments and instead the reported PRE values are the amide peak intensity ratios (Ipara/Idia) in the two states. Peak intensities in the oxidized and reduced states were fitted with Sparky in order to obtain the intensity ratios. Errors were calculated with the following formula:

$$\mathrm{Error}_{\mathrm{Ratio}} = \mathrm{Ratio} \times \sqrt{\left(\frac{1}{S/N_{DIA}}\right)^{2} + \left(\frac{1}{S/N_{PARA}}\right)^{2}}$$

where S/NDIA is the signal to noise for each residue in the diamagnetic state and S/NPARA is the signal to noise for each residue in the paramagnetic state.

Chapter 3: DNA binding by Ets1

Within a cellular context, transcription factors find their specific cognate DNA sites through facilitated diffusion involving a large background of non-specific DNA sequences. Therefore, we sought to increase our understanding of the principles underpinning DNA recognition by Ets1, including the effects of autoinhibition. NMR experiments revealed that when Ets1 is bound to both non-specific and specific 12 bp oligonucleotides, the inhibitory helices HI-1 and HI-2 are predominantly unfolded, yet also undergo conformational exchange with minor ordered states. 
Moreover, we show that Ets1 binds both specific and non-specific DNA through the same canonical binding interface. However, the structural and dynamic changes observed in the backbone and arginine/lysine sidechains of the ETS domain are much larger with specific DNA than with non-specific DNA. This is consistent with the general observation that sequence-specific TFs bind non-specific sequences via "loose" electrostatic interactions, whereas their cognate sites are recognized through high-affinity interactions including hydrogen bonding between protein sidechains and DNA basepairs. This chapter thus broadens our understanding of DNA binding autoinhibition and sequence recognition by Ets1.

3.1 Introduction

Pioneering studies in the 1970s and 1980s with systems such as the lac and λ repressors demonstrated that specific DNA-binding proteins have significant affinity for non-specific DNA. It was then also recognized that the number of non-specific DNA sites present in the cell vastly exceeds the number of specific DNA sites. For example, a recent publication estimated the effective concentration of chromatin-free 10 bp DNA sequences in a human cell to be around 1.5 mM, whereas specific sites could be as rare as one per nucleus [60]. The overwhelming presence of non-specific sites also implies that the actual in vivo concentrations of a DNA-binding protein in its free and specifically-bound states are very low. Despite this, DNA-binding proteins still find their target sites at rates faster than expected from a simple 3D random search driven by Brownian motion (diffusion) [60]. Perhaps counterintuitively, this remarkable speed actually results from non-specific DNA interactions, which facilitate diffusion by reducing the volume and dimensionality of the search process (Figure 3.1). Indeed, proteins generally bind DNA non-specifically at first and search locally for cognate sites through rapid 1D rotation-coupled sliding along the double helix. 
In parallel, more global searching is enabled by transfer between distant DNA segments, either via "hopping/jumping" (dissociation, 3D diffusion, and reassociation) or direct intersegment transfer (transiently bridging two DNA segments without dissociation into solution) [64, 67, 158-162].

Figure 3.1: Schematic view of a transcription factor interacting with a large DNA molecule in dilute solution. (The DNA molecules are well separated into “domains” under these conditions.) The (upper) expanded view shows a transcription factor bound to a segment of DNA, on which it can either “slide” or engage in intradomain dissociation-association hopping (or jumping) processes in seeking its cognate site. The (lower) expanded view shows a transcription factor bound to two DNA segments; this corresponds to direct intersegment transfer. Adapted figure from Von Hippel and Berg, JBC, 1989, 264(2), 675-678 with permission from the American Society for Biochemistry and Molecular Biology © 2014.

Although the general features of this facilitated diffusion model are well accepted, the contributions of the three key mechanisms (sliding, direct transfer, and jumping) toward the search for regulatory sites will certainly vary for each DNA-binding protein. Recent studies have helped clarify the importance of these mechanisms for several different TFs containing either one (e.g., lac repressor and the HoxD9 homeodomain) or multiple DNA-binding domains (e.g., Oct-1 and Egr-1). Specifically, NMR experiments have shown that 1D sliding likely plays a smaller role than originally thought in the search for cognate sites by proteins such as the lac repressor and Egr-1 [65, 68]. Additionally, NMR studies have shown that direct intersegment transfer significantly increases the speed at which specific sites are found, particularly for proteins such as Oct-1 and Egr-1 with multiple DNA-binding domains [61, 63]. 
For these studies, relaxation experiments were used to characterize the kinetics of intermolecular translocation events between two different non-specific DNA molecules. At DNA concentrations similar to those present in vivo, direct transfer from one non-specifically-bound DNA molecule to another occurs without going through the intermediary of a free protein. Thus, the current consensus is that TFs in vivo are mostly bound non-specifically to uncondensed DNA [68, 163] and locally “scan” the DNA until they find the right binding site. Critically, to avoid being trapped in areas that do not have cognate sites, transcription factors “hop” or directly transfer further along on the same strand wound around the histones or even from one strand to the other [61, 164]. Of course, within the in vivo context of a cell, numerous other factors, such as chromatin structure, the presence of other bound proteins, and protein-protein interactions, must also come into play. Collectively, this ensures that transcription factors quickly find their specific target sites in the cell and thus respond efficiently to signalling events.

The vast majority of biophysical studies on the DNA-binding domains of TFs have focused on their interactions with specific DNA sequences. Broadly speaking, these studies have shown that TFs recognize their cognate sites through both direct readout of the basepairs and indirect readout of the sequence-dependent shape of the phosphodiester backbone. Paramount to this readout are hydrogen bonds between the amino acid sidechains and the nucleotide bases. In contrast, a more limited set of studies on non-specific protein-DNA complexes has demonstrated that binding is driven largely by electrostatic interactions between positively-charged amino acid sidechains and the negatively-charged phosphodiester backbone [165, 166]. 
Additionally, as shown by elegant studies on the lac repressor, the backbone and sidechains of proteins in non-specific complexes are significantly more dynamic and hydrated than in specific complexes [54, 61, 67, 165, 167, 168]. This enables sliding, with rotation, of a protein along the approximately isopotential electrostatic surface of the helical DNA. Upon recognition of a specific sequence, both the protein and DNA undergo conformational changes to form a well-ordered stable complex [168-170].

In this chapter, I have addressed the related questions of how Ets1 recognizes specific versus non-specific DNA, and what roles protein dynamics and autoinhibition play in these processes. Prior studies have shown that the marginally stable HI-1 is allosterically disrupted upon DNA binding and that phosphorylation of the SRR reinforces autoinhibition by stabilizing the IM and dampening motions of the ETS domain [32, 40, 43]. In particular, the DNA recognition helix H3 undergoes fluctuations detectable by amide hydrogen exchange and these fluctuations decrease with increasing SRR phosphorylation. By analogy with the lac repressor, we speculated that the flexibility of the ETS domain is important for adopting a high affinity complex with its cognate DNA sites. This led to the model in which Ets1 exists in a phosphorylation-dependent conformational equilibrium between a rigid inactive state and a flexible active state (Figure 1.6). However, aside from HI-1 and HI-2 in the inhibitory module, the static crystal structures of the Ets1 ETS domain in its free and specifically-bound forms are highly similar. This suggests that any mean conformational changes in the ETS domain are subtle, and perhaps that flexibility may play a more important role in the interactions of Ets1 with non-specific DNA. Early circular dichroism (CD) spectroscopy and proteolysis experiments suggested that HI-1 unfolds upon binding non-specific, as well as specific, DNA [51]. 
Since then, no further studies of the interactions of Ets1 with non-specific DNA have been reported.

After optimizing the choice of oligonucleotides, I used NMR spectroscopy to probe the structural and dynamic features of Ets1 ΔN301 bound to specific and non-specific 12 bp DNAs. In both cases, the N-terminal sequences forming the inhibitory helices HI-1 and HI-2 in the free protein become predominantly unfolded upon DNA binding. However, consistent with their very modest contribution to autoinhibition (~ two-fold), the sequences are not completely disordered, but rather still transiently sample ordered conformations. Both DNAs are also bound via the same canonical ETS domain interface. However, the interactions leading to the non-specific complex are weaker relative to the specific complex. This is particularly evident when examining the lysine 15NζH3 and arginine 15NεH sidechain groups. In contrast to the relatively well-ordered amides, these sidechains exhibit a wide range of dynamic properties. In the presence of non-specific DNA, the lysines and arginines yield NMR signals that are broad and conformationally-averaged, whereas with specific DNA, they become better defined and more dispersed. Overall, these studies indicate that autoinhibition does not impact the specificity of Ets1 for its cognate sites relative to non-cognate DNA sequences. Also, as seen with other TFs, Ets1 binds non-specific and specific DNAs via the same general interface. However, the non-specific interactions are weaker and more loosely-defined, as would be required in vivo for a facilitated-diffusion search for cognate sites.

3.2 Preliminary characterization of Ets1/DNA complexes by NMR spectroscopy

DNase I and hydroxyl radical protection, as well as ethylation interference assays, indicated that ETS factors bind DNA with a "footprint" spanning ~12 to 15 bp [171]. 
However, the intermolecular contacts observed in the X-ray crystallographic structural studies of numerous ETS domain/DNA complexes involve only ∼9 bp [8]. Consistent with this observation, the high-affinity sites identified for most ETS factors correspond to a 9 bp consensus 5'ACCGGAAGT3' sequence with the invariant GGA(A/T) core flanked asymmetrically by more variable bases. Accordingly, we sought to find protein-oligonucleotide combinations that encompassed all of the salient features of the specific and non-specific Ets1/DNA complexes, while optimizing their behaviour as required for detailed NMR spectroscopic characterization. A summary of the species examined is provided in Table 3.1.

Table 3.1: Ets1 constructs and DNA sequences used in DNA-binding studies.

Name     Sequence                 Comment
ΔN279    279-440                  ETS domain with IM and SRR
ΔN301    301-440                  ETS domain with IM
SC1-9    5’GCCGGAAGT3’            9 bp specific DNA (a)
         3’CGGCCTTCA5’
SC1-12   5’CAGCCGGAAGTG3’         12 bp specific DNA (a)
         3’GTCGGCCTTCAC5’
SC1-14   5’CCAAGCCGGAAGTG3’       14 bp specific DNA (a)
         3’GGTTCGGCCTTCAC5’
NS1-9    5’GCAGTCTAG3’            9 bp non-specific DNA
         3’CGTCAGATC5’
NS2-12   5’CAAAAATTTTTG3’         12 bp non-specific DNA (palindromic)
         3’GTTTTTAAAAAC5’
NS1-15   5’GATGCAGTGTAGTCG3’      15 bp non-specific DNA
         3’CTACGTCACATCAGC5’

(a) The consensus GGAA motif is identified in bold.

3.2.1 Optimizing specific Ets1/DNA complexes

Initial NMR spectroscopic studies involving Ets1 and specific DNA were carried out using 15N-labeled ΔN279 and SC1-14 (Figure 3.2). The latter is a 14 bp oligonucleotide with a consensus Ets1-recognition site, derived from longer sequences that were characterized extensively by Graves and co-workers [171]. The titration of SC1-14 into ΔN279 was monitored with 15N-HSQC spectra (Figure 3.2). Consistent with the expected high affinity of SC1-14 for the ETS domain (KD = 2.5 ± 0.5 × 10⁻¹¹ M for a 27 bp oligonucleotide containing this sequence [40]), the complex formed in slow exchange on the chemical shift timescale. 
That is, signals from the population of free ΔN279 gradually disappeared and a new set of signals from the resulting complex concomitantly appeared. After reaching a ~1:1 DNA:protein ratio, no further changes accompanied the addition of more SC1-14, indicating saturation of the complex. As also expected, the spectra of ΔN279 in its free and bound forms were drastically different. It is particularly noteworthy that many amide 1HN-15N signals from the bound protein were dispersed and thus indicative of a well-structured complex. However, new signals with poorly dispersed random coil 1H shifts (i.e., clustering around 8 - 8.5 ppm) also appeared, thus indicating that parts of ΔN279 (beyond the truncated SRR) become disordered upon binding DNA.

Figure 3.2: Formation of a specific Ets1/DNA complex. Overlaid 15N-HSQC spectra of 15N-labeled ΔN279 in the absence (cyan) and presence of SC1-14 (red) at 28°C at 600 MHz. Signals in green and gold are aliased for ΔN279 without and with SC1-14, respectively. A final 2:1 DNA:protein molar ratio was used to ensure saturation. At sub-stoichiometric ratios, peaks from both the free and bound protein were observed, indicating binding occurs in the slow exchange limit (not shown). Samples were in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT and 5% D2O.

Due to slow exchange binding and significant spectral perturbations, any further analysis of the ΔN279/SC1-14 complex would have required the complete re-assignment of its NMR signals. Although the complex is well behaved at low ionic strength (i.e., soluble and stable), it undergoes slow rotational diffusion due to a combined molecular mass of ~27 kDa. This leads to substantial line broadening and low signal-to-noise levels, making the required 1H/13C/15N correlation experiments very challenging.  
To offset these problems, two shorter oligonucleotides, SC1-12 and SC1-9, were tested with ΔN301 (Table 3.1). In both cases, binding occurred in the slow exchange limit and led to comparable spectral changes upon saturation (Figure 3.3). Importantly, a comparison of the 15N-HSQC spectra of the ΔN279/SC1-14 and ΔN301/SC1-12 complexes revealed that removal of the disordered SRR did not induce any notable chemical shift perturbations for the remaining well dispersed signals. This demonstrates that the SRR remains disordered and does not interact with the structured ETS domain when bound to DNA. However, the resulting spectra are simpler due to the reduced number of signals with random coil chemical shifts from the SRR amides. Closer inspection of Figure 3.3 also reveals that spectra recorded with SC1-9 tended to show slightly smaller chemical shift changes than with SC1-12 or SC1-14. These differences indicate that bases flanking the minimum 9 bp ETS domain-binding site do contribute to the complex, at least at a spectroscopic level. This could, for example, result from terminal base pair fraying or weak electrostatic interactions between the protein and flanking regions of longer DNAs. Regardless, the ΔN301/SC1-12 complex was chosen for subsequent studies as a good compromise between recapitulating the behaviour of a longer oligonucleotide, minimizing the adverse relaxation effects due to molecular mass, and eliminating signals from the non-interacting SRR.

Figure 3.3: Optimization of the protein fragment and oligonucleotide size for NMR studies of the specific Ets1/DNA complex. Shown are overlaid 15N-HSQC spectra of the following saturated complexes: ΔN279 with SC1-14 (purple/green), ΔN301 with SC1-12 (red) and ΔN301 with SC1-9 (yellow). Aliased amide signals from the ΔN279 + SC1-14 complex (green) are not observed with other complexes due to different 15N spectral widths. 
Comparing the two ΔN301 complexes with ΔN279/SC1-14 shows that deletion of the disordered SRR reduces the number of signals clustering around 8 - 8.5 ppm in the 1H dimension. Samples were in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT and 5% D2O at 28°C.

3.2.2 Optimizing non-specific Ets1/DNA complexes

The binding of randomly-chosen non-specific 9 and 15 bp duplex oligonucleotides lacking a 5'GGA(A/T)3' motif to ΔN301 was also investigated (Table 3.1). In principle, NS1-9 corresponds to the minimal site size and should be entirely covered by ΔN301 when bound in either orientation. In contrast, the longer NS1-15 could be bound at seven possible 9 bp sites, each of which could be occupied in two orientations. The titrations of 15N-labeled ΔN301 with these non-specific DNAs were monitored using 15N-HSQC spectra (Figure 3.4). In the case of NS1-9, many ΔN301 amide signals shifted progressively with added DNA, while also showing broadening at partial saturation (Figure 3.4). This corresponds to the intermediate-fast exchange limit (kex ≥ Δω, where kex is the exchange rate constant and Δω is the chemical shift difference between the free and bound states), for which observed chemical shifts are approximately the population-weighted averages of the shifts of the unbound and fully bound states, albeit with substantial line broadening. Although we could not extract a KD value from the titration data, based on this behaviour, the dissociation constant KD for the ΔN301/NS1-9 complex can be estimated as ~10 µM. Similar KD values in the low µM range have been recently reported for non-specific DNA binding by ETV6, a related ETS family member [29]. In contrast, NS1-15 interacts in the slow exchange limit, with separate signals from the free and bound states observable at partial saturation. This behaviour suggests that the KD value for NS1-15 is < 10 µM. 
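The fast-exchange behaviour described for NS1-9 can be illustrated with a short numerical sketch. It combines the 1:1 binding expression of Section 2.8.8 with a population-weighted average of the free and bound chemical shifts. The function names, the 250 µM protein concentration, the DNA concentrations, and the free and bound shift values are hypothetical and chosen only for illustration; the KD of 10 µM is the rough estimate quoted in the text, not a fitted value.

```python
import math


def fraction_bound(a_total, b_total, kd):
    """Fraction of the labeled species A bound to B for 1:1 binding.

    All three arguments must share the same concentration units (here µM).
    This is the quadratic solution used for the isotherm in Section 2.8.8.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / (2.0 * a_total)


def observed_shift(delta_free, delta_bound, a_total, b_total, kd):
    """Fast-exchange chemical shift: population-weighted average of states."""
    fb = fraction_bound(a_total, b_total, kd)
    return (1.0 - fb) * delta_free + fb * delta_bound


if __name__ == "__main__":
    protein_uM = 250.0        # hypothetical 15N-labeled protein concentration
    kd_uM = 10.0              # rough estimate quoted in the text for NS1-9
    delta_free_ppm = 8.00     # hypothetical 1HN shift of the free state
    delta_bound_ppm = 8.40    # hypothetical 1HN shift of the bound state

    # Hypothetical DNA additions; dilution is ignored in this sketch.
    for dna_uM in (0.0, 62.5, 125.0, 250.0, 500.0, 750.0):
        fb = fraction_bound(protein_uM, dna_uM, kd_uM)
        shift = observed_shift(delta_free_ppm, delta_bound_ppm,
                               protein_uM, dna_uM, kd_uM)
        print(f"DNA {dna_uM:6.1f} uM  bound {fb:5.3f}  shift {shift:6.3f} ppm")
```

In this sketch the peak moves smoothly from the free toward the bound position as DNA is added, which is the signature of fast exchange. In slow exchange, separate free and bound peaks would instead change in relative intensity, and this averaging model would not apply.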
The higher overall (macroscopic) affinity of ΔN301 for NS1-15 versus NS1-9 likely results from the increased number of potential microscopic binding sites.

Figure 3.4: Ets1 binds non-specific DNAs of differing length with varying affinity. 15N-HSQC spectra of 15N-labelled ΔN301 alone (cyan) and in complex with the non-specific oligonucleotides NS1-9 (3.3:1 DNA:protein ratio, purple), NS2-12 (5:1, blue) and NS1-15 (3.8:1, orange) at 28°C. The three lower panels are expanded views of selected regions. Note that fewer signals are detected with the NS1-15 complex than with the smaller NS1-9 or palindromic NS2-12 complexes. For example, signals from the indoles of W338 and W375, and the amides of Q419 (expanded sections), L337 and Y386 (full spectrum) are broadened beyond detection in the NS1-15 complex, but not with the other complexes. This is attributed in part to conformational exchange between multiple binding sites on the longer oligonucleotide. However, the ΔN301 was 70% deuterated and 13C-labelled for studies with NS2-12, which also leads to sharper amide signals. All samples were in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3 and 5 mM DTT with 5% D2O. All 15N-HSQC spectra were collected at 850 MHz, except for that of the ΔN301/NS2-12 complex, which was collected at 500 MHz.

A comparison of the 15N-HSQC spectra of the two non-specific DNA complexes also revealed that many ΔN301 amide 1HN-15N signals were present with NS1-9, yet weak or absent with NS1-15 (Figure 3.4). This is unlikely to be a simple consequence of the higher molecular mass of the latter complex, as ΔN279/SC1-14, with a comparable size, yielded detectable signals for many amides (Figure 3.2). 
Unfortunately, it is difficult to estimate the stoichiometries of the non-specific complexes due to their exchange behaviours over the coarsely-run titrations. At initial titration points with net DNA:ΔN301 molar ratios < 1, it is plausible that an oligonucleotide molecule is bound by more than one protein molecule. However, the end-point spectra of Figure 3.4 were recorded with DNA:ΔN301 molar ratios > 3 and thus the proteins appeared to be saturated and most likely bound to a single oligonucleotide. In support of this argument, the 15N-HSQC spectra of the complexes did not change appreciably when the DNA:ΔN301 ratios were increased from ~2 to ~3 (not shown). Therefore, a likely explanation for the missing signals in the ΔN301/NS1-15 complex spectrum is conformational exchange broadening due to interconversion between binding sites within the 15 bp oligonucleotide. This could occur via diffusion along the DNA or via dissociation/reassociation. Similar length-dependent line broadening has been reported for the lac repressor bound to various non-specific DNAs [68].

A non-specific palindromic sequence NS2-12 (5’CAAAAATTTTTG3’) was therefore designed to both limit the possible adverse effects of conformational averaging and to match the size of the optimal specific SC1-12 oligonucleotide. The rationale is that the staggered 9 bp sites within NS2-12 are roughly similar and that binding in either orientation is equivalent. The 15N-HSQC spectrum of ΔN301 in complex with NS2-12 is also shown in Figure 3.4. Overall, the protein signals are sharper in the presence of NS2-12 than for both NS1 complexes. Additionally, several signals that disappear with the NS1-15 oligonucleotide are present with the palindromic sequence. 
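The design reasoning for the oligonucleotides can be checked with a few lines of code. The sketch below tests whether a top-strand sequence from Table 3.1 is palindromic (equal to its own reverse complement), counts the staggered 9 bp frames available on an oligonucleotide of a given length, and looks for the GGA(A/T) core on either strand. The function names are hypothetical, and treating every 9 bp frame as an equally valid binding site is a simplifying assumption rather than a measured property.

```python
import re

# Watson-Crick complement table for uppercase DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if both strands read identically 5'->3'."""
    return seq == reverse_complement(seq)


def count_frames(length_bp, site_bp=9):
    """Number of staggered site_bp-long frames on a length_bp duplex."""
    return max(length_bp - site_bp + 1, 0)


def has_core_motif(seq):
    """True if the GGA(A/T) core occurs on either strand."""
    pattern = re.compile("GGA[AT]")
    return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))


if __name__ == "__main__":
    # Top strands copied from Table 3.1.
    oligos = {
        "SC1-12": "CAGCCGGAAGTG",
        "NS1-9": "GCAGTCTAG",
        "NS2-12": "CAAAAATTTTTG",
        "NS1-15": "GATGCAGTGTAGTCG",
    }
    for name, top in oligos.items():
        frames = count_frames(len(top))
        # Two orientations per frame, as described for NS1-15 in the text.
        print(name, "palindromic:", is_palindromic(top),
              "core motif:", has_core_motif(top),
              "9 bp frames:", frames, "x 2 orientations =", 2 * frames)
```

For NS1-15 this gives the seven frames and fourteen frame/orientation combinations discussed in the text. For NS2-12 the two orientations of any given frame are related by the dyad symmetry of the sequence, which is the basis for expecting less conformational averaging.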
These improvements in the NMR spectra result from a combination of factors, including the reduction in conformational averaging due to the palindromic sequence and a reduction of the proton transverse relaxation obtained through partial deuteration (70%) of the ΔN301 protein.

3.3 NMR spectral assignments

The NMR signals from the mainchain and many sidechain nuclei in ΔN301 bound to SC1-12 and NS2-12 were assigned using a suite of heteronuclear correlation spectra. Initially, a fully protonated 13C/15N-labelled sample of ΔN301 with SC1-12 was examined. Although partial assignments for residues 348-440 were obtained, residues 301-347 could not be assigned because of severe spectral overlap and/or low signal-to-noise. Therefore, a sample fully (99%) deuterated on the aliphatic and aromatic carbons (with full protonation of nitrogens, oxygens, and sulfurs) was prepared to obtain sharper signals and reduce proton as well as 13C transverse relaxation. Signals from many mainchain 1H, 13C, and 15N nuclei and sidechain 13C nuclei of this 99%-2H/13C/15N-labeled ΔN301 in complex with SC1-12 were obtained using TROSY-based heteronuclear correlation experiments. Partial assignments for sidechain 1H nuclei, including those of the arginine guanidinium groups, were obtained using a combination of spectra recorded for the fully protonated and deuterated 13C/15N-ΔN301 samples with SC1-12. In the end, the various labelling schemes used in combination with conventional and TROSY-based experiments allowed the assignment of at least three resonances for ~94% of the residues in the specific complex (Figure 3.5). Most of the missing signals are located in HI-1 and H1.

Figure 3.5: Assigned amide 15N-HSQC spectrum of the specific complex (ΔN301 with SC1-12). 15N-HSQC spectrum of the saturated specific complex (1.1:1 DNA:protein ratio) at 28°C on an 850 MHz spectrometer. Aliased amide and sidechain signals are colored green. 
The sample was in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT and 5% D2O.

Spectral assignments of the non-specific ΔN301/NS2-12 complex were obtained from a single partially deuterated sample. In this case, 70%-2H/13C/15N-labeled ΔN301 was used in order to obtain mainchain and sidechain 1H, 13C, and 15N assignments from a single sample. Partial deuteration reduced proton transverse relaxation and allowed the assignment of at least one three resonances for ~86% of the residues (Figure 3.6). Most of the remaining unassigned residues are located in helices HI-1, H1 and at the DNA binding interface.

Figure 3.6: Assigned amide 15N-HSQC spectrum of the non-specific complex (ΔN301 with NS2-12). 15N-HSQC spectrum of the saturated non-specific complex (5:1 DNA:protein ratio) at 28°C on an 850 MHz spectrometer. Aliased amide and sidechain signals are colored orange. The sample was in 20 mM MES pH 6.5, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT and 5% D2O.

3.3.1 HI-1 and HI-2 unfold upon specific DNA binding

Insights into the secondary structure and dynamics of a protein can be obtained from the chemical shifts of its main chain nuclei (1HN, 15N, 13Cα, 13Cβ and 13CO). 
For example, the algorithm MICS (Motif Identification from Chemical Shifts) [172-175] uses these chemical shifts to determine the likelihood that a residue adopts an α-helix, β-strand, random coil, or one of five types of β-turns and helix cap conformations. This neural network program, which was calibrated using the chemical shifts from a library of proteins with known structures, also provides a "random coil index" squared order parameter (RCI-S2) as an indication of local backbone flexibility (0 is very mobile and 1 is rigid). As shown in Figure 3.7, a MICS analysis of the published chemical shifts of free ΔN301 (BMRB ID 5991) agrees closely with the secondary structure of the protein as determined by NMR and X-ray crystallographic approaches. The core ETS domain contains three α-helices and four β-strands, and the flanking IM folds as four additional helices. Also consistent with previous relaxation studies of the fast timescale motions of ΔN301, the helices and strands are ordered, whereas intervening loop regions are more flexible [32, 43]. (Helices HI-1 and HI-2 are marginally stable and undergo facile amide hydrogen exchange due to slower timescale motions not reflected by their chemical shifts.) 
Overall, the MICS predictions are in agreement with the well characterized structure and dynamics of the Ets1 ETS domain and IM, and thus provide a baseline for the analysis of the ΔN301 complexes formed with specific and non-specific DNAs.

Figure 3.7: Chemical shift-based prediction of the secondary structure and backbone dynamics of free ΔN301. The probabilities of forming α-helices (red histogram bars), β-strands (blue) or random coil (green) conformations according to the MICS algorithm are shown for each residue in the protein. The diagram above the graph indicates the secondary structural elements of ΔN301 determined from detailed NMR and X-ray crystallographic structure determinations (1R36.pdb; rectangles: α-helices; arrows: β-strands). Although conserved, the short helical turn between H1 and S1 is not explicitly named in the standard nomenclature used for ETS domains. The RCI-S2 order parameters are shown with the black line (1 indicates very ordered/rigid and 0 unrestrained/flexible).

The MICS analysis of ΔN301 bound to specific DNA (SC1-12) is shown in Figure 3.8. The core ETS domain (residues 335-415) is well folded and ordered, retaining the same secondary structure as in the free protein. This is consistent with the close similarity of all ETS domain structures determined in the absence and presence of specific DNAs by X-ray crystallography. In contrast, the N-terminal portion of the IM changes dramatically, with helices HI-1 and HI-2 unfolding as shown by high random coil propensities and reduced RCI-S2 values. However, fully disordered residues normally yield sharp intense signals with random coil chemical shifts. 
In contrast, the amide signals from residues corresponding to HI-1 and HI-2 are weak and their intensities decrease with increasing spectrometer magnetic field strength (Figure 3.9). This is diagnostic of conformational exchange in the intermediate to fast exchange regime. This exchange can also be observed in heteronuclear NOE measurements, as discussed below. Thus, the N-terminal inhibitory residues appear to be predominantly unfolded, but also exchanging with minor alternative conformations, the latter possibly being helical. Interestingly, residues around position 310 (between HI-1 and HI-2) are predicted by MICS to have helical propensity when ΔN301 is bound to SC1-12, but not when the protein is free.

Figure 3.8: Helices HI-1 and HI-2 unfold when ΔN301 binds SC1-12. The unfolding of helices HI-1 and HI-2 is evident from their predicted random coil conformations and reduced RCI-S2 order parameters (black line). The diagram above the graph indicates the secondary structural elements predicted for the protein (rectangles, α-helices; arrows, β-strands). See Figure 3.7 for a comparison with the DNA-free protein and an explanation of the histogram.

Figure 3.9: Although predominantly unfolded, residues corresponding to helices HI-1 and HI-2 undergo conformational exchange in the ΔN301/SC1-12 complex. Shown are 1H slices through the 15N-HSQC spectra of the ΔN301/SC1-12 complex, recorded with 850 and 600 MHz NMR spectrometers. Signals corresponding to HI-1 and HI-2 (white arrows) are weak and their intensities decrease with increasing spectrometer field strength. This is diagnostic of conformational exchange in the intermediate to fast exchange regime.

3.3.2 Helices HI-1 and HI-2 also unfold upon non-specific DNA binding

The MICS analysis of ΔN301 bound to NS2-12 is shown in Figure 3.10.
As with the specific SC1-12 oligonucleotide (Figure 3.8), the ETS domain retains the well-ordered secondary structural elements characteristic of the free protein. In addition, the N-terminal inhibitory helices HI-1 and HI-2 are also unfolded, with random coil-like chemical shifts and reduced RCI-S2 values. As with SC1-12, the signals from the N-terminal inhibitory residues in the ΔN301/NS2-12 complex were weak (and difficult to assign) due to conformational exchange broadening. Thus, these residues are not completely disordered and may be sampling possible folded conformations.

Figure 3.10: Helices HI-1 and HI-2 unfold when ΔN301 binds NS2-12. The unfolding of helices HI-1 and HI-2 is evident from their predicted random coil conformations and reduced RCI-S2 order parameters (black line). The diagram above the graph indicates the secondary structural elements predicted for the protein (rectangles, α-helices; arrows, β-strands). Several residues in HI-1 could not be assigned due to exchange broadening or spectral overlap. See Figure 3.7 for a comparison with the DNA-free protein and an explanation of the histogram.

3.4 Non-specific and specific DNA bind Ets1 at a similar canonical interface

To gain insights into the structural basis of DNA-binding by Ets1, we compared the assigned 15N-HSQC spectra of ΔN301 in its free versus oligonucleotide-saturated forms (Figure 3.11). Using these spectra, the amide CSPs for ΔN301 due to binding the specific SC1-12 DNA were calculated and are shown in Figure 3.12. Large perturbations are observed for amides at the N-terminus of helix H1, in the turn between helices H2 and H3, along the DNA binding helix H3 and in the loop between strands S3 and S4.
When mapped onto the structure of ΔN301, these residues correspond closely to the DNA-binding interface of Ets1 observed by X-ray crystallography (Figure 3.13). Amide chemical shifts are highly sensitive to their local environment, and the CSPs could arise from subtle structural perturbations (directly due to DNA contacts or indirectly due to propagated changes) or simply from the aromatic ring currents and electric fields due to the nearby bases and phosphodiester groups in the DNA, respectively. Significant CSPs are also observed for the N-terminal inhibitory residues and amides within the helices H1/H4/H5 onto which they pack. This provides further evidence that helices HI-1 and HI-2 unfold in an allosteric response to DNA binding.

Figure 3.11: Ets1 binding to non-specific and specific DNAs. (A) Full and (B-E) partial regions of 15N-HSQC spectra of 15N-labelled ΔN301 alone (cyan), and in complex with NS2-12 (5:1 DNA:protein ratio; purple) or SC1-12 (1.1:1 DNA:protein ratio; red). (B) G392 and (E) W338ε1 move in opposite directions depending on which sequence is bound, whereas the signals from (C) K364 and (D) R413 with NS2-12 are midway between those of the free and SC1-12-bound forms. (E) L337, (C) Y386 and (D) S390 are barely perturbed by the presence of non-specific DNA, yet change significantly upon binding the GGAA-containing sequence.

Figure 3.12: Amide chemical shift perturbations resulting from binding of specific and non-specific DNA to ΔN301. Combined CSPs (Δδ = {(ΔδH)^2 + (0.14 ΔδN)^2}^(1/2)) upon binding of SC1-12 (final 1.07:1 DNA:protein ratio) are shown in red and NS2-12 (5:1) in blue.
The top cartoon indicates the secondary structure of the protein (red rectangles, α-helices; blue arrows, β-strands), including the turn and wing involved in DNA binding by the winged helix-turn-helix motif. These data are mapped onto the structure of ΔN301 in Figure 3.13. Note that, with an estimated Kd ~ 10 µM, ΔN301 should be > 99% bound in the presence of a 5:1 NS2-12:protein ratio, and thus the small CSPs relative to the specific complex do not reflect incomplete saturation. Blank values correspond to prolines and unassigned amides.

Figure 3.13: Mapping of ΔN301 amide CSPs due to binding of specific and non-specific DNA. (A) and (B, 180° rotation) show the CSPs on the structure of the free IM/ETS domain (1R36.pdb) due to binding specific DNA, whereas (C) and (D, 180° rotation) show those due to binding of non-specific DNA. The extensive CSPs in the specific complex reflect both the unfolding of helices HI-1 and HI-2 and the binding of DNA via the canonical ETS domain interface including the wHTH motif and the N-terminus of H1. The non-specific complex shows comparable CSPs due to the unfolding of HI-1 and HI-2 as well as smaller CSPs due directly to DNA binding. The legend bins the CSPs (Δδ) as > 0.45, 0.35-0.45, 0.25-0.35 and < 0.25 ppm for specific DNA, and as > 0.26, 0.20-0.26, 0.15-0.20 and < 0.15 ppm for non-specific DNA. Residues in grey either correspond to prolines or residues for which 1HN-15N assignments were not obtained. Note that several such residues are located in HI-1, HI-2 and the wHTH motif of the non-specific complex, and likely undergo line broadening related to conformational exchange between binding sites along NS2-12.
Purple residues are assigned in the free and specific DNA-bound forms but could not be found in the non-specific complex. Thus, they are likely exchange broadened.

The amide CSPs for ΔN301 due to binding the non-specific NS2-12 DNA are also presented in Figures 3.12 and 3.13. The largest CSPs are exhibited by the N-terminal inhibitory region. These closely resemble the changes observed with SC1-12, and reflect the common unfolding of helices HI-1 and HI-2 that occurs when ΔN301 binds either oligonucleotide. In contrast, rather small CSPs are observed for the remainder of the ETS domain and C-terminal inhibitory region. These modest perturbations indicate that no major conformational changes take place upon binding the non-specific oligonucleotide. However, a limited set of residues, including Trp338, Trp375, Asn385, Asp398, Ile401 and Tyr410, do show above average CSPs. Additionally, residues including Tyr395, Tyr396, Arg409 and Val411 could not be detected and thus are likely broadened due to conformational exchange. These named residues all loosely map to the canonical specific DNA-binding surface of the ETS domain. Thus, we conclude that Ets1 binds both the non-specific NS2-12 and specific SC1-12 DNAs via the same general interface.

Upon closer inspection of the NMR data, it is evident that numerous important contacts with the specific DNA are absent in the lower affinity NS2-12 complex. According to several crystal structures of Ets1 (and many related ETS factors) bound to specific DNAs, residues in H1 (Leu337 and Trp338), H2 (Trp375), turn (Lys379, Lys381 and Met384), H3 (Tyr386-Gly392 and Arg394-Tyr397) and wing (Arg409, Tyr410 and Tyr412) make conserved hydrogen-bonding and salt bridging contacts with the DNA [22, 29, 35, 176]. As shown in Figures 3.11-3.13, the amides of Lys381, Lys388 and Lys404 shift partially in the non-specific complex, which likely indicates the formation of weaker or less persistent electrostatic contacts than in the specific complex.
In contrast, signals from Leu337, Trp375, Met384, Tyr386, Glu387, Ser390, Arg391 and Arg394 remain essentially unperturbed in the non-specific complex. This is especially noteworthy since the amide of Leu337 and the indole of Trp375 form conserved hydrogen bonds with the phosphodiester backbone of DNA. As seen in Figure 3.11, signals from these two residues shift downfield upon contacting SC1-12 yet show no or small perturbations upon binding NS2-12. Such downfield 1H shifts are diagnostic of hydrogen bond formation.

Additional insights into the binding of specific versus non-specific DNA by ΔN301 can be obtained by comparing the "direction" and magnitude of the CSPs for the amide 1HN and 15N nuclei, both separately and combined for a vector projection analysis [177] (Figure 3.14). Residues associated with helices HI-1 and HI-2, and those in the helices onto which they pack (H1/H4/H5), exhibit projection values ~1 and cos(θ) ~1. This pattern indicates that their corresponding amide 15N and 1HN signals shift in the same direction, with a similar magnitude (albeit often small), upon binding NS2-12 relative to SC1-12. This is further evidence that HI-1 and HI-2 unfold upon binding either DNA sequence. A second subset of peaks shows amide 15N and 1H signals that shift in the same direction (cos(θ) ~1), but with a projection < 1, upon binding non-specific versus specific DNA. Many of these residues are located in helix H1 (e.g., Leu337), the wHTH turn, and the N-terminal portion of the recognition helix H3 (e.g., Tyr386 and Ser390), which is on the edge of the DNA binding motif. This indicates that ΔN301 forms generally similar, albeit less well-defined, time-averaged interactions with NS2-12 than with SC1-12. Such interactions likely involve electrostatic contacts between the positively-charged DNA-binding interface of the ETS domain and the negatively-charged phosphodiester backbone of DNA, rather than base-specific hydrogen bonds.
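The combined CSP defined in the caption of Figure 3.12 and the vector projection comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to prepare Figure 3.14: the function names are hypothetical, the 15N weighting of 0.14 is carried over from the combined-CSP formula on the assumption that the same scaling applies to the vector components, and the projection is assumed to be normalized by the length of the specific-complex vector, so that a value of 1 means an equal shift in the same direction. The shift values in the usage lines are invented.

```python
import math

# 15N weighting factor taken from the combined-CSP formula (Figure 3.12 caption).
N_SCALE = 0.14


def combined_csp(d_h, d_n, n_scale=N_SCALE):
    """Combined amide CSP in ppm: sqrt(d_h^2 + (n_scale * d_n)^2)."""
    return math.hypot(d_h, n_scale * d_n)


def projection_and_cosine(ns, sp, n_scale=N_SCALE):
    """Compare a non-specific CSP vector `ns` with a specific one `sp`.

    Each argument is a (d_h, d_n) pair in ppm. Returns (projection, cos_theta),
    where projection is the component of `ns` along `sp` divided by |sp|
    (an assumed normalization) and cos_theta is the cosine of the angle
    between the two scaled vectors. Returns None if either vector is zero.
    """
    ns_v = (ns[0], n_scale * ns[1])
    sp_v = (sp[0], n_scale * sp[1])
    ns_len = math.hypot(*ns_v)
    sp_len = math.hypot(*sp_v)
    if ns_len == 0.0 or sp_len == 0.0:
        return None
    dot = ns_v[0] * sp_v[0] + ns_v[1] * sp_v[1]
    return dot / sp_len ** 2, dot / (ns_len * sp_len)


# Hypothetical residue: shifts in the same direction, half the magnitude.
same_direction = projection_and_cosine(ns=(0.1, 1.0), sp=(0.2, 2.0))
# Hypothetical residue: shifts in the opposite direction, equal magnitude.
opposite = projection_and_cosine(ns=(-0.2, -2.0), sp=(0.2, 2.0))
```

Under these assumptions, a residue whose signal moves along the same path but only part of the way has cos(θ) near 1 with a projection below 1, whereas a residue that moves the other way gives negative values for both quantities.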
Also, smaller net CSPs could arise due to averaging of potential positive and negative chemical shift changes as the protein rapidly exchanges between binding sites along NS2-12. A third subset of amides, including Gly392, Arg394, Tyr397 and Ala406, which are located in helices H2, H3 and strand S3, show distinctly different patterns of chemical shift changes, indicating that these regions of ΔN301 interact differently with specific and non-specific DNA. These differences likely reflect the formation of base-specific hydrogen-bonding contacts with the 5'GGA(A/T)3' motif in SC1-12 that are absent in the "looser" non-specific DNA complex.

Figure 3.14: Comparison of the magnitudes and signs of the ΔN301 amide CSPs resulting from binding specific versus non-specific DNA. Magnitude and sign of the amide (A) proton (ΔδH) and (B) nitrogen (ΔδN) chemical shift changes in ΔN301 upon forming the SC1-12 (red) and NS2-12 (blue) complexes. (C) The relative magnitudes and directions of the CSPs for each residue in the two complexes can be compared by a vector projection analysis (inset diagram). The blue bars show the projections of the CSP vectors for the non-specific complex along the corresponding vectors for the specific complex. The cosines of the angles θ between the two vectors are shown by red dots. For most residues, cos(θ) ~1 but the projection is positive and ≤ 1. This indicates that their amide 15N and 1HN signals shift in the same "direction," but with similar or smaller magnitude, upon binding NS2-12 relative to SC1-12. However, some amides shift differently, including in the opposite direction (cos(θ) ~-1 and negative projection value).
Collectively, these analyses support the conclusion that Ets1 binds non-specific and specific DNA by a similar canonical interface of the ETS domain. However, the interactions are generally stronger for SC1-12. Shaded boxes indicate the primary residues involved in DNA binding. Blank values correspond to prolines or unassigned amides.

3.5 The dynamic properties of ΔN301 change upon DNA binding

The chemical shift-derived RCI-S2 values for ΔN301 in its DNA-bound versus free form are presented in Figure 3.15. The predicted increase in local disorder due to the unfolding of residues in helices HI-1 and HI-2 with both SC1-12 and NS2-12 is clearly evident. More interestingly, the analysis hints at smaller changes in backbone dynamics throughout the ETS domain. In the case of the specific complex, increased ordering is also suggested for the turn in the HTH, as well as helix H3 and strands S3 and S4. Surprisingly, however, residues in helix H1 and, to a lesser extent, the wing are predicted to become slightly more flexible. In contrast, somewhat different changes are predicted to occur upon binding NS2-12, including slightly decreased RCI-S2 values for helix H3, increased values for the wing, and no perturbations for the turn. The significance of these results is difficult to judge as the underlying algorithm was not developed using the chemical shifts of DNA-bound proteins.

Figure 3.15: Changes in the RCI-S2 squared order parameter of ΔN301 upon binding specific and non-specific DNA. Plotted are the changes in RCI-S2 values calculated with MICS for the specific SC1-12 (red) and non-specific NS2-12 (blue) complexes relative to the corresponding amides in the free protein. Positive values indicate an increase in the local rigidity, whereas negative values indicate an increase in flexibility. The diagram above the graph indicates the secondary structure for both complexes (rectangles, α-helices; arrows, β-strands).
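The difference plotted in Figure 3.15 amounts to a per-residue subtraction of the free-protein RCI-S2 value from the bound-state value, restricted to residues with a value in both states. The short Python sketch below illustrates that bookkeeping only; the function name and the example values are hypothetical and are not taken from the MICS output.

```python
def delta_rci_s2(free, bound):
    """Per-residue change in RCI-S2 (bound minus free).

    `free` and `bound` map residue numbers to RCI-S2 values. Residues
    missing from either state (e.g. prolines or unassigned amides) are
    skipped. Positive values mean increased order upon DNA binding.
    """
    shared = sorted(free.keys() & bound.keys())
    return {res: bound[res] - free[res] for res in shared}


# Hypothetical values: residue 310 loses order, residue 341 has no free-state value.
changes = delta_rci_s2(
    free={310: 0.85, 340: 0.90},
    bound={310: 0.55, 340: 0.92, 341: 0.88},
)
```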
To directly characterize the dynamics of ΔN301 in its free and bound states, amide 15N relaxation studies were carried out. Of these, the heteronuclear 1H{15N}-NOE provides a sensitive measure of backbone mobility on the sub-nsec timescale. Overall, whether free or bound to non-specific (NS2-12) or specific (SC1-12) DNA, amides throughout the ETS domain are generally rigid with high NOE values (Figure 3.16). However, residues in the turn and the wing have slightly lower NOE values in the free protein and the non-specific complex, indicative of modest flexibility. In contrast, upon binding specific DNA, they become more ordered. This is suggestive of contacts with the phosphodiester backbone that are more persistent than with non-specific DNA.

Figure 3.16: Amide heteronuclear 15N-NOE measurements indicate that helices HI-1 and HI-2 of the inhibitory module are unfolded, but not conformationally unrestricted, in the Ets1 DNA complexes. 1H{15N}-NOE values of ΔN301 in its free form (grey) and when bound to SC1-12 (red) and NS2-12 (blue). The lower panel shows the increase and decrease in flexibility upon addition of specific (SC1-12) or non-specific (NS2-12) DNA. Values near 0.8 indicate well-ordered amides, whereas decreasing values indicate increasing flexibility on the sub-nsec timescale. Missing residues correspond to prolines, and those with weak, overlapping, or unassigned signals.
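For readers less familiar with how values such as those in Figure 3.16 are obtained, the sketch below assumes the usual definition of the steady-state heteronuclear NOE as the ratio of a peak's intensity with 1H saturation to its intensity in the reference spectrum without saturation, with the uncertainty propagated from the spectral noise. The function name and the intensities are hypothetical, and the processing actually used for the figure is not described in this section.

```python
import math


def het_noe(i_sat, i_ref, noise_sat=0.0, noise_ref=0.0):
    """Steady-state heteronuclear NOE and its propagated uncertainty.

    i_sat: peak intensity with 1H saturation (may be negative).
    i_ref: peak intensity in the reference spectrum (no saturation).
    noise_sat, noise_ref: noise levels of the two spectra.
    """
    if i_ref == 0:
        raise ValueError("reference intensity must be non-zero")
    noe = i_sat / i_ref
    # First-order error propagation for a ratio of two independent values.
    err = math.sqrt((noise_sat / i_ref) ** 2
                    + (i_sat * noise_ref / i_ref ** 2) ** 2)
    return noe, err


# Hypothetical intensities: an ordered amide and a highly mobile one.
ordered = het_noe(80.0, 100.0, noise_sat=2.0, noise_ref=2.0)
mobile = het_noe(-30.0, 100.0, noise_sat=2.0, noise_ref=2.0)
```

With these invented numbers the first amide falls near the ~0.8 value expected for a well-ordered residue, whereas the second gives a negative NOE of the kind seen for flexible terminal residues.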
Focusing on HI-1 and HI-2, in free ΔN301, the residues forming these helices are also well-ordered on the sub-nsec timescale with NOE values comparable to those of the core ETS domain (Figure 3.16). In contrast, when bound to either non-specific or specific DNA, their NOE values drop substantially. This again demonstrates that the two inhibitory helices are predominantly unfolded. However, the NOE values are not highly negative (as seen with the very flexible N- and C-terminal residues), indicating that the fast timescale motions of these residues are partially restricted in both complexes. This is consistent with the msec-µsec exchange broadening observed for many of these amides in simple 15N-HSQC spectra (which, in several cases, precluded 1H{15N}-NOE measurements), and suggests that the N-terminal inhibitory residues are in conformational equilibria, perhaps with transiently formed helices.

3.6 Sidechains contribute to DNA recognition

3.6.1 Arginine and lysine sidechains at the DNA-binding interface

The recognition helix H3 of Ets1 binds along the major groove of specific DNA such that two conserved arginine sidechains (Arg391 and Arg394) hydrogen bond to the two guanine bases in the 5’-GGA(A/T)-3’ motif. Several lysines and one additional arginine (Arg409) also appear to contribute to the interface as seen in X-ray crystallographic studies. To further characterize the difference between binding specific and non-specific DNA, we used NMR spectroscopy to investigate these sidechains in the free and bound forms of ΔN301. Such studies are, however, very challenging due to the typically rapid exchange (HX) of unprotected arginine guanidinium and lysine aminium protons with water and the necessity to assign their signals via multi-step 13C and/or 1H scalar correlations with distal main chain nuclei.

The lysine sidechains of ΔN301 in its free, SC1-12-bound and NS2-12-bound forms yielded very different 15N-HSQC spectra (Figure 3.17).
Due to rapid HX, no lysine 15NζH3+ signals were observed for the free protein (pH 6.5 and 10 °C). Upon binding to non-specific DNA, one weak signal appeared. This indicates that at least one sidechain becomes protected from HX, likely due to burial at the interface and hydrogen bonding with the phosphodiester backbone of the oligonucleotide. Upon binding to specific DNA, signals from at least five lysines become visible, albeit with variable intensities. The lysines are clearly positively-charged, as shown by their diagnostic 15N chemical shifts. Unfortunately, we were unable to assign these signals due to the difficulty in detecting the needed correlations to sidechain and main chain nuclei. However, the crystal structure of ΔN301 with a specific DNA sequence (PDB ID 1K79) shows that five lysine NH3+ groups (Lys379, 381, 388, 399 and 404) interact directly with phosphodiester oxygens.

Figure 3.17: Characterizing the interactions of ΔN301 lysine sidechains with DNA. Selected 15N-HSQC spectral regions showing signals from the lysine 15NζH3+ groups of ΔN301 when (A) free, (B) bound to non-specific NS2-12, and (C) bound to specific SC1-12 (pH 6.5 and 10 °C). Due to rapid exchange with water, unprotected lysines are not seen in the absence of DNA. In contrast to the five signals observed with the specific complex, only one appears with the non-specific complex. This is indicative of weaker and less well-defined interactions with NS2-12 than SC1-12.

In contrast to lysine 15NζH3+ groups, signals from the more slowly exchanging arginine 15NεH pairs are often observable in the 15N-HSQC spectra of proteins under typical experimental conditions. (The arginines are certainly positively-charged and the terminal 15NηH2 signals are generally broad due to rotation about the Nε-Cζ and Cζ-Nη partial double bonds.)
In the spectrum of ΔN301, eight strong 15NεH peaks were observed (Figure 3.18A). The protein has nine arginines and two of them have partially overlapping signals. In the presence of SC1-12, an entirely different pattern of nine sharp arginine 15NεH signals was observed (Figure 3.18C). In addition, several 15NηH2 pairs yielded dispersed signals, indicative of well-defined DNA interactions. In particular, the Arg391 and Arg394 1Hε signals shifted downfield considerably, as expected due to the formation of hydrogen bonds with the 5’-GGA(A/T)-3’ guanines. Also, the signal of Arg309, which lies in HI-1 and distal from DNA, is perturbed relative to the free protein and somewhat broader than other signals at low temperature (data not shown). These differences likely reflect the unfolding of HI-1 and the postulated conformational exchange broadening seen also for amides in the N-terminal portion of the IM.

Figure 3.18: Characterizing the interactions of ΔN301 arginine sidechains with DNA. Selected 15N-HSQC spectral regions showing signals from the arginine sidechains of ΔN301 when (A) free, (B) bound to non-specific NS2-12, and (C) bound to specific SC1-12 (pH 6.5 and 10 °C). Many arginine 15NεH signals in the complexes have been assigned through 1H/13C/15N correlation experiments. The remaining unassigned signals from the non-specific complex are identified as Ra to Rd. Three assignments in (A) have been added for arginine sidechains that overlap closely with those in (C) and therefore do not interact with the DNA. Horizontal lines connect 15NηH2 signals that are non-degenerate due to restricted rotation about the Nε-Cζ and Cζ-Nη partial double bonds.

In contrast, upon binding the non-specific NS2-12, several arginine 15NεH and 15NηH2 signals were broadened and exhibited chemical shifts that are clearly different from those of either the free or SC1-12-bound states (Figure 3.18B).
However, the resulting spectra are hard to interpret as only eight out of the nine expected 15NεH signals were observed, and thus one is missing due to spectral overlap or exchange broadening. Furthermore, we could only assign the 15NεH signals from four arginines (Arg311, 373, 374, and 378) based on a comparison with the spectra of the specific DNA complex. Consistent with their similar chemical shifts, these four arginines are not involved in DNA interactions. The remaining crosspeaks were simply labeled Ra to Rd. Although not identified, it is noteworthy that, in the ΔN301/NS2-12 complex, Arg391 and Arg394 in the recognition helix H3 clearly do not exhibit the chemical shift dispersion induced in the ΔN301/SC1-12 complex. Overall, the behaviour of the arginine sidechains is consistent with the notion that non-specific DNA complexes involve "loose", dynamic electrostatic interactions with the phosphodiester backbone, whereas specific complexes are well-ordered due, in large part, to basepair hydrogen bonding.

3.6.2 Arginine guanidinium groups have different relaxation properties that correlate with their DNA interactions

We also used 15N relaxation to investigate the dynamic properties of the arginine sidechain 15Nε nuclei in the ΔN301 complexes. (The free protein was not characterized as its arginine signals were not assigned.) When bound to SC1-12, a wide range of arginine motions can be observed (Figures 3.19 and 3.20). For example, sidechains completely exposed to the solvent (Arg311 and 374) have negative heteronuclear NOE and low R2 values (< 5 s-1), which is consistent with fast motions on the sub-nsec timescale. Interestingly, in some (but not all) Ets1/DNA crystal structures, the sidechain of Arg409 contacts the phosphodiester backbone.
However, its relaxation behaviour indicates that it is very flexible and thus such contacts may not be persistent in solution. Arg309, 373, 378 and 413 show intermediate R2 values (10-20 s-1) and heteronuclear NOE values ~ 0.5, indicative of partially restricted sidechain motions. The latter three arginines, which are not involved in DNA binding, form potential salt bridges with aspartate/glutamate residues in some (but not all) Ets1 crystal structures. Somewhat surprisingly, Arg309 lies within the unfolded N-terminal inhibitory region, and thus its dampened motions indicate further that, although HI-1 and HI-2 are unfolded, they are not completely disordered. In contrast, Arg391 and Arg394 show fast R2 relaxation (> 20 s-1) and high heteronuclear NOE values, which is consistent with their involvement in ordered, stable hydrogen bonds with the core 5’-GGA(A/T)-3’ sequence.

Figure 3.19: Dynamic properties of arginine sidechains upon binding of ΔN301 to specific and non-specific DNA. Arginine 15Nε longitudinal R1, transverse R2 and heteronuclear NOE values for both specific (left side) and non-specific (right side) DNA are shown. Note that the unassigned arginines in the NS2-12 complex are labeled Ra through Rd. Slow (fast) R2 relaxation and low (high) NOE values are diagnostic of mobile (rigid) 15NεH groups. Unusually fast R2 relaxation results from conformational exchange broadening.

Figure 3.20: The heteronuclear {1H}15N-NOE spectra of ΔN301 bound to specific DNA and non-specific DNA. Red contours (positive) correspond to the reference experiments without 1H saturation, whereas yellow and cyan correspond to experiments with saturation. For the latter, arginine sidechain 15NεH groups that are mobile on the sub-nsec timescale give negative (yellow) signals, whereas ordered sidechains give positive (cyan) signals. Such ordered arginines include Arg391 and Arg394, which directly bind the 5’-GGA(A/T)-3’ motif.
Aliased signals from lysine sidechain 15NζH3+ groups (yellow) fall between 90 and 95 ppm.

As noted above, only four arginine 15NεH signals (Arg311, 373, 374, and 378) could be assigned in the 15N-HSQC spectra of the ΔN301/NS2-12 complex. These residues are not involved in DNA binding and exhibit similar chemical shifts and 15N relaxation behaviour in both the specific and non-specific complexes (Figures 3.19 and 3.20). All but one of the remaining unassigned arginines have at least partially restricted motions on the sub-nsec timescale. These likely include Arg391 and Arg394, which should contribute to the interface with non-specific DNA.

3.7 Discussion

In this chapter, I investigated the mechanisms by which Ets1 binds specific and non-specific DNAs and the interplay of autoinhibition with these processes. In both cases, the inhibitory module is disrupted and the same canonical interface is used for DNA binding. However, the resulting structural and dynamic changes in the backbone and arginine/lysine sidechains of the ETS domain are much larger with specific DNA than with non-specific DNA. This is consistent with the general observation that sequence-specific TFs bind non-specific sequences via "loose" electrostatic interactions, whereas their cognate sites are recognized through high-affinity interactions including hydrogen bonding between protein sidechains and DNA basepairs.

3.7.1 HI-1 and HI-2 unfold upon DNA-binding

NMR spectroscopy provided new insights into the mechanism of Ets1 autoinhibition. Upon binding both non-specific and specific oligonucleotides, residues throughout the N-terminal IM of ΔN301 exhibited similar, large amide CSPs.
Main chain chemical shift analysis with the MICS algorithm indicated that residues forming helices HI-1 and HI-2 in the free protein adopt random coil conformations in the SC1-12 and NS2-12 complexes. The unfolding of this portion of the IM was confirmed by the decreased amide heteronuclear 15N-NOE and chemical shift-derived RCI-S2 values of these residues. However, the 15N-HSQC signals and NOE values from these residues were not as sharp or low, respectively, as expected for a completely disordered polypeptide. This indicates that the sub-nsec timescale motions of the N-terminal inhibitory sequence are partially dampened. Furthermore, the NMR signals from these amides broadened with increasing magnetic field strength, which is diagnostic of conformational exchange on the msec-µsec timescale. It is reasonable to suggest that this reflects the transient refolding of helices HI-1 and HI-2 within the DNA-bound complexes.

Previously, CD spectroscopy experiments indicated a loss of helical structure when Ets1 bound both specific and non-specific DNA [51]. This was attributed to the unfolding of HI-1 because of the increased proteolytic susceptibility of residues within or adjacent to this helix. Subsequent X-ray crystallographic studies of Ets1 fragments bound to cognate DNAs (1K79.pdb) and in a ternary complex with Pax-5 on the mb-1 promoter (1MDM.pdb) also showed HI-1 to be disordered, whereas HI-2 remained folded. However, both inhibitory helices are folded in the crystal structure of a comparable Ets2/DNA complex (4BQA.pdb). Conversely, both are displaced by the ETS-interacting domain (EID) of Runx1 in a ternary complex of the two partner transcription factors bound to a composite DNA site [33]. In another ternary complex with Runx1 lacking the EID, HI-1 is unfolded, whereas HI-2 is helical but packs against the ETS domain in an alternative register that is incompatible with the native fold of the IM [58].
Furthermore, amide HX studies of DNA-free Ets1 fragments demonstrated that helices HI-1 and HI-2 are marginally stable and protected from exchange by only ~15- and ~75-fold, respectively, relative to a reference random coil polypeptide [32, 43]. Taken together, this suggests that the energetic difference between the folded and unfolded states of HI-1 and HI-2 in both free and DNA-bound Ets1 is small. In solution, DNA-binding tips the balance towards the unfolded conformation, whereas the folding of HI-2 could be favored by its environment within a crystal lattice.

In the absence of the stabilizing SRR, the IM only imparts ~ 2-fold autoinhibition of specific DNA binding by Ets1 [43]. This small effect is consistent with the apparently delicate balance between the folded and unfolded forms of the N-terminal inhibitory sequence. Furthermore, a detailed comparison of the structures of various Ets1 fragments in their free and bound states reveals no obvious differences beyond a small displacement of the flexible loop between helices HI-2 and H1. This displacement appears necessary to avoid a steric clash with the DNA backbone and to enable a critical dipole-enhanced hydrogen bond between the amide of Leu337 at the start of helix H1 and a phosphodiester oxygen [32, 43]. The lack of any clear "allosteric pathway" linking DNA binding and unfolding of the distal IM also suggests that only subtle perturbations are required to tip this balance.

3.7.2 Autoinhibition does not impart DNA-binding specificity

The effects of autoinhibition on the affinity of Ets1 for non-specific DNA have not been quantitatively determined. However, given the similar disruption of the IM as seen with specific DNA, it is likely that autoinhibition also increases the Kd value of ΔN301 for non-specific DNA by ~ 2-fold.
Furthermore, since the SRR reinforces autoinhibition by stabilizing the IM and by sterically blocking the DNA-binding interface of the ETS domain (Chapter 2), it is also likely that full-length Ets1 is inhibited comparably (~20-fold) for binding both specific and non-specific DNAs. Similar behavior has been reported for ETV6, a related family member that is autoinhibited due to steric blockage of its ETS domain by an appended C-terminal helix. This helix reduces the affinity of ETV6 for specific DNA by ~50-fold, as measured using EMSA, and for non-specific DNA by ~5-fold, as measured using ITC [29]. (The significance of the apparent 10-fold difference in autoinhibition is uncertain due to the challenge in determining the microscopic Kd value of ETV6 for a single non-specific DNA site.) Unlike the case of the HOX TF Ultrabithorax (Ubx), autoinhibition of Ets1 does not appear to contribute substantially to the specificity (i.e., relative affinity) of ETS factors for their cognate sites versus "background" DNA within the nucleus [178]. Rather, it enables regulation via routes including protein partnerships and post-translational modifications.

3.7.3 Ets1 binds specific and non-specific DNA through its canonical interface

Ets1 binds specific 5'GGA(A/T)3'-containing DNA sites with Kd values on the order of ~10 pM [32, 43, 171]. In contrast, from the NMR-monitored titrations presented in this chapter, the Kd value for 9 bp non-specific DNA can be estimated as ~10 µM. This represents an ~10^5-fold difference in affinity that must arise through structural and dynamic complementarity of the Ets1 ETS domain with the nucleotide bases (direct readout), phosphodiester backbone (with sequence-dependent conformations; indirect readout) and hydrating waters of its cognate sites [167-170, 179]. Although seemingly large, this affinity difference is offset by the vast excess of non-specific versus specific DNA sites within the human genome.
Of course, numerous additional factors, including DNA accessibility and cooperative protein partnerships, will further modulate the partitioning of Ets1 between its free, non-specifically bound, and specifically bound forms in the nucleus.

Despite this large difference in affinity, our studies show that ΔN301 binds specific and non-specific DNA through the same general interface found in all ETS domain/DNA structures. As expected, upon binding the specific oligonucleotide SC1-12, large CSPs are observed for ΔN301 amides in helix H1, throughout the turn preceding H3, the recognition helix H3, the following "wing" and strands S3 and S4. These perturbations arise mainly from the formation of key hydrogen bonds (e.g., the amide of Leu337 in helix H1 with the phosphodiester backbone and the sidechains of Arg391 and Arg394 in helix H3 with the invariant 5'GGA(A/T)3' guanines) and the unfolding of the N-terminal helices in the IM (HI-1 and HI-2). Subtle changes in the structure and dynamics of the ETS domain, combined with ring current and electric field effects from the closely juxtaposed DNA, may also induce CSPs.

In the case of non-specific DNA, upon binding NS2-12, ΔN301 amides including Trp338 (H1), Trp375 (H2), Asn385 (H3), Asp398 (loop H3 to S1), Ile401 (loop H3 to S1) and Tyr410 (turn) also show CSPs. Although generally similar in "direction", the magnitude of these CSPs is smaller than observed with SC1-12, indicating weaker time-averaged interactions than in the specific complex. Furthermore, Tyr395 (H3), Tyr396 (H3), Arg409 (turn) and Val411 (turn) are broadened beyond detection. Such exchange broadening becomes more pronounced when ΔN301 is saturated with longer oligonucleotides (NS1-15 versus NS1-9) and is somewhat alleviated with the palindromic NS2-12. Moreover, the indole 15Nε1H signals of Trp338 and Trp375 are broadened beyond detection with the longer NS1-15 DNA, but present with the palindromic and smaller non-specific sequences.
As seen with the lac repressor [68], this likely reflects the msec-µsec interconversion of ΔN301 between different possible binding positions and orientations along the non-specific oligonucleotide. (Slow exchange could also lead to apparent broadening if the protein is partitioned between many micro-states that yield slightly different amide chemical shifts.) Such interconversions could occur via rotationally-coupled sliding along the DNA or from dissociation/reassociation events with different orientations and locations on the DNA. This stands in contrast to the specific ΔN301/SC1-12 complex, which exhibits only one well-defined conformation.

3.7.4 Arginine and lysine sidechains reflect differential interactions of Ets1 with specific and non-specific DNA

X-ray crystallographic studies have revealed that two or three arginine and five lysine sidechains are present at the DNA-binding interface of Ets1. Therefore, I used NMR spectroscopy to probe the contributions of these residues in the binding of specific versus non-specific oligonucleotides. In the free protein, all lysine 15NζH3+ groups underwent rapid HX with water and were not detectable. However, upon addition of NS2-12, one weak signal was observed, whereas upon addition of the 5'GGAA3'-containing SC1-12 sequence, five distinct lysine signals appeared. This nicely reflects the "loose" versus well-defined interactions of the Ets1 lysine sidechains with non-specific versus specific DNAs.

Due to their intrinsically slower HX, the 15NεH signals from all nine arginines of Ets1 were readily detected in its free and SC1-12-bound states. These sidechains exhibited a wide range of relaxation behaviours that are generally consistent with expectations based on crystallographic studies of several Ets1/DNA complexes, as well as studies on other DNA-bound TFs such as the HoxD9 homeodomain [61].
For example, in the specific complex, solvent-exposed Arg311 and Arg374 are very mobile on the sub-nsec timescale, whereas salt-bridged Arg373, Arg378 and Arg413 are partially restricted. Arg391 and Arg394, which form hydrogen bonds with the 5'GGA(A/T)3' sequence, are the most ordered. However, Arg409 is also mobile (e.g., showing 15N-NOE values < 1) despite apparent interactions with the phosphodiester backbone of DNA seen in crystal structures. Also, the motions of Arg309 are partially restricted despite its presence in the unfolded HI-1, suggesting that the sidechains of the disrupted IM still contact the ETS domain.

Upon addition of non-specific DNA, the arginine sidechain signals became broad, which is consistent with conformational averaging of several distinct states or sliding on DNA. Only eight arginine sidechain signals are visible; the ninth is either overlapped or broadened beyond detection. Low signal-to-noise in heteronuclear correlation spectra led to the assignment of only four arginine 15NεH signals (Arg311, Arg373, Arg374, and Arg378). They were assigned based on their close overlap with the signals of the specific complex. Consistent with this overlap, these arginines are not involved in DNA binding. It is noteworthy that all but one of the remaining unassigned arginines have at least partially restricted motions on the sub-nsec timescale. These likely include Arg391 and Arg394, which should contribute to the interface with non-specific DNA. The dynamics reported for the non-specific complex are in agreement with the current models for non-specific DNA binding, which involve mostly electrostatic interactions with the backbone and dynamic sidechains. However, further NMR studies are required to assign signals from the remaining sidechains in order to better understand their roles in non-specific DNA binding.
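The sidechain mobilities described above are inferred from steady-state heteronuclear NOE values, which are defined as the ratio of peak intensities recorded with and without 1H saturation. The following minimal sketch shows that ratio together with a simple noise-based uncertainty estimate. It is an illustration only and not the processing pipeline used in this work; the function name and the example intensities are hypothetical.

```python
import math


def hetnoe(i_sat, i_ref, noise_sat, noise_ref):
    """Return (NOE, uncertainty) for one peak.

    i_sat / i_ref: peak intensities with / without 1H saturation.
    noise_sat / noise_ref: baseline noise of the two spectra.
    The uncertainty is a first-order propagation of the two noise levels.
    """
    noe = i_sat / i_ref
    sigma = math.sqrt(
        (noise_sat / i_ref) ** 2 + (i_sat * noise_ref / i_ref ** 2) ** 2
    )
    return noe, sigma


# Hypothetical intensities (arbitrary units), for illustration only.
peaks = {
    "ordered_peak": (8.0e5, 1.0e6),
    "mobile_peak": (-3.0e5, 1.0e6),
}
for name, (i_sat, i_ref) in peaks.items():
    noe, sigma = hetnoe(i_sat, i_ref, noise_sat=1.0e4, noise_ref=1.0e4)
    print(f"{name}: NOE = {noe:.2f} +/- {sigma:.2f}")
```

In this convention, higher NOE values indicate more restricted motion on the sub-nsec timescale, whereas low or negative values indicate a highly mobile group.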
3.7.5 Transitioning from non-specific to specific binding

Although the large majority of biophysical studies of TFs have focused on their specific DNA complexes, several non-specific complexes have been investigated by X-ray crystallography and NMR spectroscopy. These include bacterial repressors (lac and cro) [54, 68, 165], steroid receptors [167, 180], homeodomains (HoxD9 and MATα2) [61, 63, 181, 182], an HMG-box protein [183], a POU domain (Oct-1) [66, 162], and zinc fingers (ZNF217 and Egr-1) [67, 184]. These studies have shown that non-specific binding generally results from weak, time-averaged electrostatic interactions between the canonical DNA-binding interface of a TF and the phosphodiester backbone of the DNA. These short-lived interactions are dynamic and often involve positively-charged lysine and arginine sidechains. Upon recognition of a target site, several critical hydrogen bonds and stable electrostatic interactions are formed, leading to high affinity sequence-specific binding.

Our studies on Ets1 show that upon non-specific binding, "loose" electrostatic interactions are formed between lysine 15NζH3+ and arginine 15NεH groups and the DNA. This is reflected by the partial protection of these labile protons from rapid exchange with water. The broad linewidths observed for the complex further suggest that several distinct chemical shift environments are sampled at the DNA interface. Upon recognition of the consensus 5'GGA(A/T)3' sequence, five stable electrostatic interactions are formed by lysine sidechains and two direct hydrogen bonds are formed by arginine sidechains. Thus, specific and non-specific interactions between Ets1 and DNA are driven by distinct mechanisms, as hypothesized by the current models for DNA searching in vivo via facilitated diffusion.
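As a rough illustration of the energetics discussed in sections 3.7.1 to 3.7.3, the fold-changes in Kd quoted there can be converted to free energy differences using ΔΔG = RT ln(fold-change). The sketch below is not part of the original analysis; the function name and the choice of 28 °C (the temperature of the NMR-monitored titrations) are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold(fold_change, temp_c=28.0):
    """Free energy difference (kcal/mol) for a given fold-change in Kd.

    temp_c is an assumed temperature in degrees Celsius.
    """
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_change)


# Approximate fold-changes taken from the text of this chapter.
fold_changes = {
    "IM alone (~2-fold)": 2.0,
    "IM plus SRR (~20-fold)": 20.0,
    "specific vs non-specific DNA (~1e5-fold)": 1.0e5,
}
for label, fold in fold_changes.items():
    print(f"{label}: {ddg_from_fold(fold):.1f} kcal/mol")
```

Under these assumptions, ~2-fold and ~20-fold autoinhibition correspond to roughly 0.4 and 1.8 kcal/mol, respectively, whereas the ~10^5-fold preference for specific DNA corresponds to roughly 6.9 kcal/mol. This highlights how small the energetic contribution of autoinhibition is relative to that of sequence recognition.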
The studies presented in this chapter shed light on a missing step between the free and specific DNA-bound structures of Ets1, namely that of its non-specific DNA complexes. The conformational exchange observed in helices HI-1 and HI-2 upon binding DNA explains why crystal structures of Ets1 have been obtained with the IMs in differing conformations. Moreover, the dynamic studies performed with specific and non-specific DNA highlight the importance of lysine and arginine sidechains in DNA binding. The sidechain and amide patterns observed are clearly different even though both complexes form through the same canonical Ets1 DNA-binding interface. The structural characterization of Ets1 bound to specific and non-specific DNA also increases our understanding of DNA-binding autoinhibition in the context of the ETS transcription factor family.

3.8 Materials and methods

3.8.1 Fully and partially deuterated 13C/15N-labelled ΔN301

Samples of 15N and 15N/13C-labeled ΔN301 and ΔN279 were prepared as described in Chapter 2. Deuterated ΔN301 samples were expressed in Escherichia coli HMS174 (λDE3). For full deuteration, the M9 minimal media was prepared with 99% D2O (Cambridge Isotope Laboratories, Inc.) containing 1 g/L (15N, 99%)-NH4Cl for uniform 2H/15N-labeling and 1 g/L (15N, 99%)-NH4Cl and 3 g/L (2H7/13C6, 99%)-glucose for uniform 2H/13C/15N-labeling. The M9 salts and (15N, 99%)-NH4Cl were dissolved in a small amount of 99% D2O, lyophilized to remove residual protons, and then added to the remainder of the media in 1.1 L of 99% D2O. For partial deuteration, the M9 minimal media was prepared with 70% D2O containing 1 g/L (15N, 99%)-NH4Cl and 3 g/L (13C6, 99%)-glucose. In each case, an initial bacterial culture was grown at 37 °C in 5 mL of LB media until OD600 = 2.0, pelleted by centrifugation, resuspended in pre-warmed 0.1 L deuterated M9 media, and then grown until OD600 = 1.0. Cells were centrifuged again and resuspended in 1 L fresh pre-warmed deuterated M9 media.
The cultures were grown at 37 °C until OD600 = 0.3-0.5, induced with 0.4 mM IPTG, and incubated at 30 °C for an additional 8-16 h. Cells were centrifuged and the ΔN301 protein purified in H2O as previously described in section 2.8.2. Labile hydrogens, including those on the amide nitrogens, were exchanged from deuterons to protons during purification in H2O buffers.

3.8.2 Duplex oligonucleotide preparation

All DNA oligonucleotides were chemically synthesized by Integrated DNA Technologies (IDT). The corresponding complementary oligonucleotides were resuspended in water, mixed in a 1:1 ratio as determined by absorbance at 260 nm using predicted single strand molar absorptivities ε260, and annealed by heating to 95 °C followed by slow cooling in a PCR thermocycler (-0.5 °C/min) to favor duplex formation. The duplex DNAs were then purified and exchanged into NMR sample buffer using a Sephadex S75 gel filtration column with an ÄKTA FPLC system, followed by concentration with a 3 kDa MWCO Amicon centrifugal filtration device. Final duplex DNA concentrations were determined by absorbance at 260 nm using the predicted double strand molar absorptivities ε260 summarized in Table 3.2. The ε260 values were obtained from the "DNA thermodynamic and hybridization tool" on the IDT website.

Table 3.2: Sequences and predicted molar absorptivities of the duplex DNAs used in Chapter 3.
Name     Sequence (top strand / bottom strand)       ε260 (M^-1 cm^-1)
SC1-9    5'GCCGGAAGT3' / 3'CGGCCTTCA5'               145,234
SC1-12   5'CAGCCGGAAGTG3' / 3'GTCGGCCTTCAC5'         191,511
SC1-14   5'CCAAGCCGGAAGTG3' / 3'GGTTCGGCCTTCAC5'     222,457
NS1-9    5'GCAGTCTAG3' / 3'CGTCAGATC5'               144,675
NS2-12   5'CAAAAATTTTTG3' / 3'GTTTTTAAAAAC5'         179,188
NS1-15   5'GATGCAGTGTAGTCG3' / 3'CTACGTCACATCAGC5'   241,116

3.8.3 NMR-monitored DNA titrations

Small aliquots of the DNA oligonucleotides (initially 0.71-1.12 mM) were titrated into uniformly 15N-labeled ΔN301 (initially 0.25 mM in 450 µL) until final DNA:protein molar ratios of 1.5:1 (SC1-9 and SC1-12), 3.3:1 (NS1-9), 3.8:1 (NS1-15), and 5:1 (NS2-12) were reached. In the case of ΔN279, SC1-14 was added to a final 2:1 ratio. The titrations were monitored by recording sensitivity-enhanced 15N-HSQC spectra at 28 °C with an 850 MHz Bruker Avance III spectrometer at DNA:protein ratios of 0:1, 0.25:1, 0.5:1, 0.75:1, 1:1, 2:1 and 3:1, with the final ratio depending on the oligonucleotide. The number of scans was increased with progressive protein dilution.

3.8.4 NMR experiments for spectral assignments of the DNA/protein complexes

NMR spectral assignments of the Ets1/DNA complexes were obtained using 300 µL samples of ΔN301 (0.2-0.45 mM) with a DNA:protein ratio of 1.1:1 for SC1-12 and 5:1 for NS2-12 to ensure saturation. The complexes were in NMR sample buffer (20 mM MES, 50 mM NaCl, 0.5 mM EDTA, 0.02% NaN3, 5 mM DTT at pH 6.5). Data were recorded at 10, 28 and 31 °C on cryoprobe-equipped 500 MHz, 600 MHz and 850 MHz Bruker Avance III spectrometers. The resulting spectra were processed and analyzed using NMRPipe [155] and Sparky [156]. Signals from backbone and sidechain 1H, 13C, and 15N nuclei were assigned through extensive multi-dimensional heteronuclear experiments including TROSY and non-TROSY versions of 3D HNCO [185], 3D HNCA [186], 3D HN(CA)CO [187, 188], 3D HNCOCA, 3D HNCACB [189], 3D HNCOCACB, 3D CCC-TOCSYNH, 4D 15N HSQC-NOESY-(15N)HSQC [190] and 3D 15N-NOESY-HSQC [191].
Signals from arginine sidechain nuclei were assigned with 2D HNCD, H2CN [192], 2D CCC-TOCSY (optimized for arginines) and intramolecular 3D HCCH-TOCSY [193].

3.8.5 Relaxation measurements

Steady-state heteronuclear 1H{15N}-NOE spectra were acquired on a 500 MHz spectrometer at 31 °C with and without 3 s of 1H saturation and a total recycle delay of 5 s. Independent sets of T1, T2 and heteronuclear NOE data were collected, centered on the amide (119 ppm), arginine sidechain (80 ppm) and lysine sidechain (30 ppm) signals. Data were fitted by NMRPipe with automated lineshape fitting via nlinLS and analyzed with MATLAB.

Chapter 4: Conclusions and future studies

Transcription factors belonging to the ETS family translate genetic information by binding specific promoter/enhancer sequences, recruiting additional components of the transcriptional machinery, and modifying gene expression in response to signaling pathways. Thus, the aberrant activities of these factors, or their upstream regulators, frequently lead to dysregulated gene expression and oncogenesis.

Key to the tightly controlled function of transcription factors is their ability to bind specific promoter and enhancer sequences and thereby activate or repress gene expression. Since all ETS transcription factors recognize a similar 5'GGA(A/T)3' consensus sequence and possess a highly conserved DNA-binding domain, additional mechanisms are required for their biological specificity. These mechanisms, which include, amongst many others, appended helices on the ETS domain and additional functional domains within the full-length protein, are then used to fine tune the individual activity of each ETS transcription factor in response to different signaling events. The main objective of my thesis was to elucidate the structural and dynamic basis underlying the autoinhibition of Ets1 DNA-binding by the appended IM and SRR.
4.1 Autoinhibition of Ets1 through "fuzzy interactions" and aromatic/phosphoserine synergy

The affinity of Ets1 for DNA is allosterically repressed by four α-helices, which form the inhibitory module (IM) and pack on the surface of the ETS domain opposite its DNA-binding interface. Helices HI-1 and HI-2 of the IM are dynamic and poised to unfold upon binding DNA. The affinity of Ets1 is further modulated by the presence of the adjacent SRR. This intrinsically disordered region, which is unique to Ets1 and Ets2, both stabilizes the IM and sterically blocks the DNA-binding interface. The transient ("fuzzy") interactions of the SRR with the IM/ETS domain increase upon its phosphorylation, thus providing a link between calcium signaling and Ets1-dependent transcription (Figure 1.6).

In Chapter 2, I investigated the physicochemical mechanisms underlying DNA-binding regulation by an intrinsically disordered protein sequence. Through mutational analyses, it was found that Phe/Tyr (ϕ) residues in the SRR act synergistically with nearby phosphoserines to reinforce autoinhibition. The introduction of additional phosphorylated Ser-ϕ-Asp, but not Ser-Ala-Asp, repeats within the SRR dramatically enhanced this effect. Using NMR spectroscopy, I studied the intermolecular interactions of peptides, corresponding to the SRR, with the IM and ETS domain. The trans-SRR peptides retained inhibitory activity and formed dynamic complexes, lacking any persistent induced conformation. Intermolecular binding was also enhanced by phosphorylation and weakened with increasing ionic strength and upon alanine substitution of the Phe/Tyr residues. Through complementary NMR approaches, including chemical shift perturbation (CSP) mapping and paramagnetic relaxation enhancement (PRE) measurements, I identified the SRR-interacting surface of the ETS domain. This surface encompasses its positively-charged DNA-recognition interface and an adjacent region of neutral polar and nonpolar residues.
Collectively, these studies uncovered a previously unrecognized role of aromatic residues and their synergy with phosphoserines in an intrinsically disordered regulatory sequence that integrates cellular signaling and gene expression.

4.2 Future studies on the disordered SRR

Over the past two decades, extensive time and effort have been invested into studying the Ets1 transcription factor. Regulation of this TF is complex and involves several layers that converge to fine tune its activity in different contexts. Our research has helped to gain a better understanding of some of the regulatory mechanisms for DNA binding that involve post-translational modifications, conformational changes, and disordered segments. The knowledge thus accumulated both sheds light on the behaviour of Ets1 and is applicable to understanding other transcription factors, such as Ubx [178, 194, 195], that have disordered regulatory sequences. However, much still remains to be done in order to gain a full understanding of these complex and often subtle regulatory mechanisms.

Despite our extensive structural studies of the transient interactions between the ETS domain and SRR, we still need to understand the physical mechanism driving the synergy between the aromatic residues and phosphoserines. Following the studies presented in this thesis, I suggest that interactions between the SRR and ETS domain are mediated by two possible routes. The first might involve the cooperative association of lysine and arginine side chains located at the DNA-binding interface of the ETS domain with the SRR aromatic residues and phosphoserines via π-cation and salt-bridge interactions, respectively. The second involves an intramolecular "salting-out" mechanism in which phosphorylation-enhanced hydrophobic clustering of the aromatic residues favors their localization on the neutral polar surface of the ETS domain.
This is also augmented by electrostatic interactions between the negatively-charged phosphoserine and aspartate/glutamate residues in the SRR and the adjacent positively-charged DNA binding helix (H3) (Figure 2.16).

The two models of aromatic-phosphoserine synergy can be tested by measuring the affinities of a set of chemically-synthesized variant trans-SRR2P peptides for the ETS domain (Figure 4.1). These peptides will have the wild type Tyr/Phe residues replaced with only Trp, Tyr, Phe, 5F-Phe (5-fluoro-phenylalanine), or Leu. The rationale is that these residues differ in terms of stability in π-cation interactions (Trp > Tyr ~ Phe >> 5F-Phe > Leu) and hydrophobicity (5F-Phe > Leu ~ Phe > Trp > Tyr). Furthermore, this addresses related questions as to why the SRR contains Tyr/Phe but not Trp residues, and whether aromaticity (versus just hydrophobicity, e.g. Leu) is indeed critical for the phosphoserine synergy. If the SRR interacts through π-cation interactions, the SRR2P-4φW peptide will have a higher affinity for the ETS domain than the WT peptide, whereas the SRR2P-4φ5F-Phe peptide should bind very poorly. In contrast, higher affinity binding of the fluorinated peptide would support the hydrophobic clustering mechanism. These forthcoming experiments should help to clarify the physical mechanisms underlying phosphorylation-enhanced autoinhibition of Ets1 DNA-binding.

Figure 4.1: Investigation of the physical mechanism driving the synergistic interactions between the aromatic residues and phosphoserines using the trans-system. NMR spectroscopy can be used to measure the affinities (KD) of a set of variant SRR peptides with the ETS domain of ΔN301. The pattern of affinity in the series should indicate if the SRR-ETS domain interactions are driven by π-cation interactions or hydrophobic clustering. X indicates the four aromatic residues that will be replaced by Trp, Tyr, Phe, 5F-Phe or Leu.
To increase affinity into a range measurable by NMR-monitored titrations, the peptides will be phosphorylated. Depending on the outcomes, non-phosphorylated peptides may be examined.

[Figure 4.1 schematic: ΔN301 (residues 301-440; helices HI-1, HI-2 and H1-H5, strands S1-S4) with the trans-SRR peptide (residues 279-300), RVPpSxDpSxDxEDxP, where x = Trp, Tyr, Phe, 5F-Phe or Leu.]

Another outstanding problem regarding autoinhibition lies with understanding the coupling between the SRR and IM and identifying the "allosteric pathway" between the IM and DNA-binding interface of the ETS domain. A strikingly "visible" feature of autoinhibition is the unfolding of the inhibitory helices HI-1 and HI-2 upon DNA binding. Surprisingly, this seemingly large change only imparts ~2-fold autoinhibition, and the full ~20-fold effect also requires the disordered SRR. However, previous reports have shown that an intact IM is necessary for SRR-dependent autoinhibition. That is, mutations which disrupt HI-1 lead to full activation of DNA-binding despite the presence of the SRR [51]. Also, the SRR stabilizes helices HI-1 and HI-2 against unfolding as reflected by their increased amide HX protection factors [43].

Somewhat perplexingly, my recent studies with the trans-SRR peptides indicated that the SRR acts not by binding the IM, but rather by sterically blocking the ETS domain DNA-binding interface. To rationalize this result, we hypothesize that helix HI-1 could serve to orient the dynamic SRR towards the DNA-binding helix H3, thereby increasing its effective local concentration and favouring transient interactions that lead to autoinhibition. By thermodynamic linkage, the interaction of the SRR with helix H3 would also stabilize helix HI-1 in its folded conformation, thus strengthening the allosteric component of autoinhibition. I initially attempted to test this hypothesis by using NMR spectroscopy to study ΔN279 variants with mutations that disrupt helix HI-1.
These mutations caused extensive changes in the 15N-HSQC spectra of the mutant proteins relative to the WT species (Figure 2.16). Indeed, these changes seemed more extensive than expected for just the unfolding of HI-1 and possibly HI-2, suggesting that the IM contributes substantially to the overall structure and stability of ΔN279. Unfortunately, the proteins were also prone to rapid aggregation (as is ΔN331 lacking the entire N-terminal portion of the IM), thus precluding the needed assignment of their NMR spectra. Alternatively, it might be possible to use SAXS (small angle X-ray scattering) to investigate ΔN279 variants with an intact and disrupted IM. This could provide useful insights into the overall shapes and sizes of the proteins and the extent to which a folded HI-1 limits the conformational distribution of the SRR.

4.3 Autoinhibition of Ets1 and the search for target sites in the cell

Many research groups have addressed the question of how TFs of the same family execute specific programs of gene expression. Indeed, large families of transcription factors like HOX, which is responsible for embryonic anterior-posterior axis development, and ETS, which is responsible for blood cell maturation and differentiation programs, all possess highly conserved DNA binding domains and bind in vitro to similar consensus DNA sequences with similar affinities. However, in vivo, these TFs often control both common and unique sets of genes. The disparity between the requirement for distinct activities in vivo and the apparent conservation of DNA-binding properties in vitro raises a specificity paradox. Part of the solution to this paradox involves control of DNA specificity through post-translational modifications and protein partnerships, along with phenomena including autoinhibition. However, additional factors such as the different dynamic properties of different family members may also play a role in the specificity of gene expression.
Another related long-standing problem is to understand how TFs can find a small number of specific high-affinity target sites within a very large background of non-specific DNA. TFs typically bind non-specific DNA with µM affinity and thus are rarely present within the nucleus as free proteins. Rather, TFs undergo 1D sliding along non-specific DNA, combined with 3D hopping and intersegment transfer from one region of DNA to another nearby region, until a cognate site is found (Figure 1.8 and Figure 4.1) [59]. Importantly, non-specific DNA is bound primarily through electrostatic interactions, thus enabling facile sliding, hopping, and transfer, whereas specific DNA is recognized primarily through the formation of longer-lived and higher affinity protein-basepair hydrogen bonds. The second objective of my thesis was to investigate the binding of Ets1 to specific and non-specific DNA in order to better understand how its cognate sequences are recognized. I also probed the role of autoinhibition and protein dynamics in DNA-binding recognition by Ets1.

Based on the assigned chemical shifts of ΔN301 in both its specific and non-specific complexes, I demonstrated that helices HI-1 and HI-2 are in conformational exchange between a major unfolded state and a possible minor folded state. I also determined that Ets1 binds specific and non-specific DNA through its canonical interface. However, relative to specific binding, non-specific binding induces small CSPs and significant line broadening of many interfacial residues. The latter likely reflects sliding or dissociation/association between sites along the non-specific DNA. In contrast to the relatively well-ordered amide backbone of Ets1 in its free and bound states, the lysine and arginine sidechains show diverse NMR spectroscopic behaviours.
Formation of the specific complex induces a significant downfield shift for the sidechain signals of Arg391 and Arg394, which is consistent with the formation of hydrogen bonds by these two sidechains with the 5'GGAA3' recognition sequence. Five lysine sidechains are also protected from rapid hydrogen exchange upon binding of specific DNA, whereas only one is stabilized in the non-specific complex, indicating that electrostatic recognition of the two oligonucleotides is different. Overall, my studies highlighted the conformational changes that take place in the IM upon binding DNA and broadened our understanding of cognate site recognition by ETS factors in the cell.

4.4 Future studies on the DNA binding activity of Ets1

A logical extension to the studies presented in my thesis would involve a more detailed characterization and quantification of the different mechanisms used by TFs to search for target sites in the cell. Although several structures of Ets1 bound to specific DNAs have been solved by X-ray crystallography (and in one early case, by NMR spectroscopy), my thesis represents the first effort to characterize a non-specific complex. Although challenging due to conformational exchange, the structure of Ets1 bound to NS2-12 could be solved by NMR methods. This would require more complete assignment of the signals from the protein and DNA (ideally 13C/15N-labeled), followed by measurements of filtered interproton NOEs and residual dipolar couplings. These complete assignments would also open the door to more detailed relaxation studies to probe the mainchain and sidechain (e.g., methyl group) dynamics of Ets1 when bound to specific versus non-specific DNAs.

In parallel, the association/dissociation and sliding kinetics of ETS factors on non-specific and specific DNA could be examined using elegant PRE experiments developed by the Clore group [196].
These methods involve concentration-dependent relaxation measurements using combinations of relatively long oligonucleotides without and with paramagnetic metal ions attached at terminal positions. Stopped-flow kinetics studies could also be used to determine the length of DNA along which Ets1 can slide before dissociating or hopping to another strand of DNA [65]. Taken together, these studies will allow us to establish which mechanism(s) Ets1 uses to find its cognate site in a sea of non-specific DNA.

4.5 Ets1 as a model system for understanding intrinsically disordered regions and fuzzy regulatory interactions

DNA-binding autoinhibition provides an important pathway for the regulation of gene expression by Ets1. The studies presented here help to further elucidate the physical basis for this fascinating example involving "fuzzy interactions" and regulation of structured domains by IDRs. In addition to the SRR, the disordered region N-terminal to the PNT domain has been characterized in detail by NMR methods. MAP kinase phosphorylation of this region enhances binding of the Ets1 PNT domain to the TAZ1 domain of the co-activator CBP, thereby activating expression of Ras-responsive genes. Linking the PNT and ETS domains is a long segment of residues that are predicted to be disordered. Although often referred to as a transactivation domain, little is known about the function of these residues. Thus, it will be interesting to assess their role(s) in transcriptional regulation by Ets1. Are they simply a flexible tether between the DNA-bound ETS domain and the protein-interacting PNT domain, or are they sites of additional post-translational modifications and protein partnerships?

Essentially, Ets1 is an excellent model system to tackle the various technical issues and concepts surrounding autoinhibition and IDRs. The concepts studied in this thesis for Ets1 are also applicable to other transcription factors in the ETS family, as well as to other large TF families such as HOX.
However, much work remains to be done to fully understand how disorder contributes to function in autoinhibition, transcription factors, and regulatory networks.

Bibliography

1. Magnani, L., J. Eeckhoute, and M. Lupien, Pioneer factors: directing transcriptional regulators within the chromatin environment. Trends Genet, 2011. 27(11): p. 465-74.
2. Heinz, S., et al., Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell, 2010. 38(4): p. 576-89.
3. Findlay, V.J., et al., Understanding the role of ETS-mediated gene regulation in complex biological processes. Adv Cancer Res, 2013. 119: p. 1-61.
4. Hollenhorst, P.C., et al., Genome-wide analyses reveal properties of redundant and specific promoter occupancy within the ETS gene family. Genes Dev, 2007. 21(15): p. 1882-94.
5. Hollenhorst, P.C., et al., DNA specificity determinants associate with distinct transcription factor functions. PLoS Genet, 2009. 5(12): p. e1000778.
6. Ciau-Uitz, A., et al., ETS transcription factors in hematopoietic stem cell development. Blood Cells Mol Dis, 2013. 51(4): p. 248-55.
7. Garrett-Sinha, L.A., Review of Ets1 structure, function, and roles in immunity. Cell Mol Life Sci, 2013. 70(18): p. 3375-90.
8. Hollenhorst, P.C., L.P. McIntosh, and B.J. Graves, Genomic and biochemical insights into the specificity of ETS transcription factors. Annu Rev Biochem, 2011. 80: p. 437-71.
9. Clark, C.E., G.L. Beatty, and R.H. Vonderheide, Immunosurveillance of pancreatic adenocarcinoma: insights from genetically engineered mouse models of cancer. Cancer Lett, 2009. 279(1): p. 1-7.
10. Oh, S., S. Shin, and R. Janknecht, ETV1, 4 and 5: an oncogenic subfamily of ETS transcription factors. Biochim Biophys Acta, 2012. 1826(1): p. 1-12.
11. Rahim, S. and A. Uren, Emergence of ETS transcription factors as diagnostic tools and therapeutic targets in prostate cancer. Am J Transl Res, 2013. 5(3): p. 254-68.
12. Bohlander, S.K., ETV6: a versatile player in leukemogenesis. Semin Cancer Biol, 2005. 15(3): p. 162-74.
13. Tognon, C.E., et al., Mutations in the SAM domain of the ETV6-NTRK3 chimeric tyrosine kinase block polymerization and transformation activity. Mol Cell Biol, 2004. 24(11): p. 4636-50.
14. Janknecht, R., EWS-ETS oncoproteins: the linchpins of Ewing tumors. Gene, 2005. 363: p. 1-14.
15. Lim, F., et al., DNA binding by c-Ets-1, but not v-Ets, is repressed by an intramolecular mechanism. EMBO J, 1992. 11(2): p. 643-52.
16. Buggy, Y., et al., Overexpression of the Ets-1 transcription factor in human breast cancer. Br J Cancer, 2004. 91(7): p. 1308-15.
17. Buggy, Y., et al., Ets2 transcription factor in normal and neoplastic human breast tissue. Eur J Cancer, 2006. 42(4): p. 485-91.
18. Benz, C.C., et al., HER2/Neu and the Ets transcription activator PEA3 are coordinately upregulated in human breast cancer. Oncogene, 1997. 15(13): p. 1513-25.
19. Chang, C.H., et al., ESX: a structurally unique Ets overexpressed early during human breast tumorigenesis. Oncogene, 1997. 14(13): p. 1617-22.
20. Bosc, D.G., B.S. Goueli, and R. Janknecht, HER2/Neu-mediated activation of the ETS transcription factor ER81 and its target gene MMP-1. Oncogene, 2001. 20(43): p. 6215-24.
21. Yarden, Y. and M.X. Sliwkowski, Untangling the ErbB signalling network. Nat Rev Mol Cell Biol, 2001. 2(2): p. 127-37.
22. Garvie, C.W. and C. Wolberger, Recognition of specific DNA sequences. Mol Cell, 2001. 8(5): p. 937-46.
23. Garvie, C.W., J. Hagman, and C. Wolberger, Structural studies of Ets-1/Pax5 complex formation on DNA. Mol Cell, 2001. 8(6): p. 1267-76.
24. Mo, Y., et al., Structures of SAP-1 bound to DNA targets from the E74 and c-fos promoters: insights into DNA sequence discrimination by Ets proteins. Mol Cell, 1998. 2(2): p. 201-12.
25. Mo, Y., et al., Structure of the elk-1-DNA complex reveals how DNA-distal residues affect ETS domain recognition of DNA. Nat Struct Biol, 2000. 7(4): p. 292-7.
26. Kodandapani, R., et al., A new pattern for helix-turn-helix recognition revealed by the PU.1 ETS-domain-DNA complex. Nature, 1996. 380(6573): p. 456-60.
27. Babayeva, N.D., et al., Structural basis of Ets1 cooperative binding to palindromic sequences on stromelysin-1 promoter DNA. Cell Cycle, 2010. 9(15): p. 3054-62.
28. Regan, M.C., et al., Structural and dynamic studies of the transcription factor ERG reveal DNA binding is allosterically autoinhibited. Proc Natl Acad Sci U S A, 2013. 110(33): p. 13374-9.
29. De, S., et al., Steric mechanism of auto-inhibitory regulation of specific and non-specific DNA binding by the ETS transcriptional repressor ETV6. J Mol Biol, 2014. 426(7): p. 1390-406.
30. Dittmer, J., The biology of the Ets1 proto-oncogene. Mol Cancer, 2003. 2: p. 29.
31. Nelson, M.L., et al., Ras signaling requires dynamic properties of Ets1 for phosphorylation-enhanced binding to coactivator CBP. Proc Natl Acad Sci U S A, 2010. 107(22): p. 10026-31.
32. Lee, G.M., et al., The structural and dynamic basis of Ets-1 DNA binding autoinhibition. J Biol Chem, 2005. 280(8): p. 7088-99.
33. Shiina, M., et al., A novel allosteric mechanism on protein-DNA interactions underlying the phosphorylation-dependent regulation of Ets1 target gene expressions. J Mol Biol, 2014.
34. Garvie, C.W., et al., Structural analysis of the autoinhibition of Ets-1 and its role in protein partnerships. J Biol Chem, 2002. 277(47): p. 45529-36.
35. Werner, M.H., et al., Correction of the NMR structure of the ETS1/DNA complex. J Biomol NMR, 1997. 10(4): p. 317-28.
36. Ji, Z., et al., Regulation of the Ets-1 transcription factor by sumoylation and ubiquitinylation. Oncogene, 2007. 26(3): p. 395-406.
37. Wasylyk, C., et al., Conserved mechanisms of Ras regulation of evolutionary related transcription factors, Ets1 and Pointed P2. Oncogene, 1997. 14(8): p. 899-913.
38. Cowley, D.O. and B.J. Graves, Phosphorylation represses Ets-1 DNA binding by reinforcing autoinhibition. Genes Dev, 2000. 14(3): p. 366-76.
39. Macauley, M.S., et al., Beads-on-a-string, characterization of ETS-1 sumoylated within its flexible N-terminal sequence. J Biol Chem, 2006. 281(7): p. 4164-72.
40. Pufall, M.A., et al., Variable control of Ets-1 DNA binding by multiple phosphates in an unstructured region. Science, 2005. 309(5731): p. 142-5.
41. Jayaraman, G., et al., p300/cAMP-responsive element-binding protein interactions with ets-1 and ets-2 in the transcriptional activation of the human stromelysin promoter. J Biol Chem, 1999. 274(24): p. 17342-52.
42. Charlot, C., et al., A review of post-translational modifications and subcellular localization of Ets transcription factors: possible connection with cancer and involvement in the hypoxic response. Methods Mol Biol, 2010. 647: p. 3-30.
43. Lee, G.M., et al., The affinity of Ets-1 for DNA is modulated by phosphorylation through transient interactions of an unstructured region. J Mol Biol, 2008. 382(4): p. 1014-30.
44. Foulds, C.E., et al., Ras/mitogen-activated protein kinase signaling activates Ets-1 and Ets-2 by CBP/p300 recruitment. Mol Cell Biol, 2004. 24(24): p. 10954-64.
45. Vo, N. and R.H. Goodman, CREB-binding protein and p300 in transcriptional regulation. J Biol Chem, 2001. 276(17): p. 13505-8.
46. Bui, J.M. and J. Gsponer, Phosphorylation of an intrinsically disordered segment in Ets1 shifts conformational sampling toward binding-competent substates. Structure, 2014. 22(8): p. 1196-203.
47. Lau, D.K., M. Okon, and L.P. McIntosh, The PNT domain from Drosophila pointed-P2 contains a dynamic N-terminal helix preceded by a disordered phosphoacceptor sequence. Protein Sci, 2012. 21(11): p. 1716-25.
48. Pufall, M.A. and B.J. Graves, Autoinhibitory domains: modular effectors of cellular regulation. Annu Rev Cell Dev Biol, 2002. 18: p. 421-62.
49. Coyne, H.J., 3rd, et al., Autoinhibition of ETV6 (TEL) DNA binding: appended helices sterically block the ETS domain. J Mol Biol, 2012. 421(1): p. 67-84.
50. Skalicky, J.J., et al., Structural coupling of the inhibitory regions flanking the ETS domain of murine Ets-1. Protein Sci, 1996. 5(2): p. 296-309.
51. Petersen, J.M., et al., Modulation of transcription factor Ets-1 DNA binding: DNA-induced unfolding of an alpha helix. Science, 1995. 269(5232): p. 1866-9.
52. Liu, H. and T. Grundstrom, Calcium regulation of GM-CSF by calmodulin-dependent kinase II phosphorylation of Ets1. Mol Biol Cell, 2002. 13(12): p. 4497-507.
53. Gardner, K.H. and M. Montminy, Can you hear me now? Regulating transcriptional activators by phosphorylation. Sci STKE, 2005. 2005(301): p. pe44.
54. Kalodimos, C.G., R. Boelens, and R. Kaptein, Toward an integrated model of protein-DNA recognition as inferred from NMR studies on the Lac repressor system. Chem Rev, 2004. 104(8): p. 3567-86.
55. Lionneton, F., et al., Characterization and functional analysis of the p42Ets-1 variant of the mouse Ets-1 transcription factor. Oncogene, 2003. 22(57): p. 9156-64.
56. Baillat, D., et al., ETS-1 transcription factor binds cooperatively to the palindromic head to head ETS-binding sites of the stromelysin-1 promoter by counteracting autoinhibition. J Biol Chem, 2002. 277(33): p. 29386-98.
57. Fitzsimmons, D., et al., Highly conserved amino acids in Pax and Ets proteins are required for DNA binding and ternary complex assembly. Nucleic Acids Res, 2001. 29(20): p. 4154-65.
58. Shrivastava, T., et al., Structural basis of Ets1 activation by Runx1. Leukemia, 2014.
59. Berg, O.G., R.B. Winter, and P.H. von Hippel, Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry, 1981. 20(24): p. 6929-48.
60. Clore, G.M., Exploring translocation of proteins on DNA by NMR. J Biomol NMR, 2011. 51(3): p. 209-19.
61. Iwahara, J., M. Zweckstetter, and G.M. Clore, NMR structural and kinetic characterization of a homeodomain diffusing and hopping on nonspecific DNA. Proc Natl Acad Sci U S A, 2006. 103(41): p. 15062-7.
62. Iwahara, J. and G.M. Clore, Direct observation of enhanced translocation of a homeodomain between DNA cognate sites by NMR exchange spectroscopy. J Am Chem Soc, 2006. 128(2): p. 404-5.
63. Iwahara, J. and G.M. Clore, Detecting transient intermediates in macromolecular binding by paramagnetic NMR. Nature, 2006. 440(7088): p. 1227-30.
64. Blainey, P.C., et al., Nonspecifically bound proteins spin while diffusing along DNA. Nat Struct Mol Biol, 2009. 16(12): p. 1224-9.
65. Esadze, A. and J. Iwahara, Stopped-flow fluorescence kinetic study of protein sliding and intersegment transfer in the target DNA search process. J Mol Biol, 2014. 426(1): p. 230-44.
66. Takayama, Y. and G.M. Clore, Interplay between minor and major groove-binding transcription factors Sox2 and Oct1 in translocation on DNA studied by paramagnetic and diamagnetic NMR. J Biol Chem, 2012. 287(18): p. 14349-63.
67. Zandarashvili, L., et al., Asymmetrical roles of zinc fingers in dynamic DNA-scanning process by the inducible transcription factor Egr-1. Proc Natl Acad Sci U S A, 2012. 109(26): p. E1724-32.
68. Loth, K., et al., Sliding and target location of DNA-binding proteins: an NMR view of the lac repressor system. J Biomol NMR, 2013. 56(1): p. 41-9.
69. Blainey, P.C., et al., A base-excision DNA-repair protein finds intrahelical lesion bases by fast sliding in contact with DNA. Proc Natl Acad Sci U S A, 2006. 103(15): p. 5752-7.
70. Gorman, J., et al., Single-molecule imaging reveals target-search mechanisms during DNA mismatch repair. Proc Natl Acad Sci U S A, 2012. 109(45): p. E3074-83.
71. Rau, D.C. and N.Y. Sidorova, Diffusion of the restriction nuclease EcoRI along DNA. J Mol Biol, 2010. 395(2): p. 408-16.
72. Tafvizi, A., et al., Tumor suppressor p53 slides on DNA with low friction and high stability. Biophys J, 2008. 95(1): p. L01-3.
73. Bonnet, I., et al., Sliding and jumping of single EcoRV restriction enzymes on non-cognate DNA. Nucleic Acids Res, 2008. 36(12): p. 4118-27.
74. Vuzman, D., A. Azia, and Y. Levy, Searching DNA via a "Monkey Bar" mechanism: the significance of disordered tails. J Mol Biol, 2010. 396(3): p. 674-84.
75. Choy, W.Y. and J.D. Forman-Kay, Calculation of ensembles of structures representing the unfolded state of an SH3 domain. J Mol Biol, 2001. 308(5): p. 1011-32.
76. Huang, A. and C.M. Stultz, The effect of a DeltaK280 mutation on the unfolded state of a microtubule-binding repeat in Tau. PLoS Comput Biol, 2008. 4(8): p. e1000155.
77. Chen, J.W., et al., Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions. J Proteome Res, 2006. 5(4): p. 879-87.
78. Dunker, A.K., et al., Intrinsically disordered protein. J Mol Graph Model, 2001. 19(1): p. 26-59.
79. Ward, J.J., et al., Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol, 2004. 337(3): p. 635-45.
80. Tompa, P., Multisteric regulation by structural disorder in modular signaling proteins: an extension of the concept of allostery. Chem Rev, 2014. 114(13): p. 6715-32.
81. Tompa, P., Intrinsically disordered proteins: a 10-year recap. Trends Biochem Sci, 2012. 37(12): p. 509-16.
82. Rezaei-Ghaleh, N., M. Blackledge, and M. Zweckstetter, Intrinsically disordered proteins: from sequence and conformational properties toward drug discovery. Chembiochem, 2012. 13(7): p. 930-50.
83. Xie, H., et al., Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J Proteome Res, 2007. 6(5): p. 1882-98.
84. Tompa, P. and M. Fuxreiter, Fuzzy complexes: polymorphism and structural disorder in protein-protein interactions. Trends Biochem Sci, 2008. 33(1): p. 2-8.
85. Lise, S. and D.T. Jones, Sequence patterns associated with disordered regions in proteins. Proteins, 2005. 58(1): p. 144-50.
86. Uversky, V.N., J.R. Gillespie, and A.L. Fink, Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins, 2000. 41(3): p. 415-27.
87. Linding, R., et al., Protein disorder prediction: implications for structural proteomics. Structure, 2003. 11(11): p. 1453-9.
88. Dosztanyi, Z., et al., IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics, 2005. 21(16): p. 3433-4.
89. Romero, Obradovic, and K. Dunker, Sequence Data Analysis for Long Disordered Regions Prediction in the Calcineurin Family. Genome Inform Ser Workshop Genome Inform, 1997. 8: p. 110-124.
90. Dunker, A.K., C.J. Brown, and Z. Obradovic, Identification and functions of usefully disordered proteins. Adv Protein Chem, 2002. 62: p. 25-49.
91. Chen, J., Towards the physical basis of how intrinsic disorder mediates protein function. Arch Biochem Biophys, 2012. 524(2): p. 123-31.
92. Shoemaker, B.A., J.J. Portman, and P.G. Wolynes, Speeding molecular recognition by using the folding funnel: the fly-casting mechanism. Proc Natl Acad Sci U S A, 2000. 97(16): p. 8868-73.
93. Mohan, A., et al., Analysis of molecular recognition features (MoRFs). J Mol Biol, 2006. 362(5): p. 1043-59.
94. Fuxreiter, M., et al., Preformed structural elements feature in partner recognition by intrinsically unstructured proteins. J Mol Biol, 2004. 338(5): p. 1015-26.
95. Gould, C.M., et al., ELM: the status of the 2010 eukaryotic linear motif resource. Nucleic Acids Res, 2010. 38(Database issue): p. D167-80.
96. Fuxreiter, M., Fuzziness: linking regulation to protein dynamics. Mol Biosyst, 2012. 8(1): p. 168-77.
97. Kriwacki, R.W., et al., Structural studies of p21Waf1/Cip1/Sdi1 in the free and Cdk2-bound state: conformational disorder mediates binding diversity. Proc Natl Acad Sci U S A, 1996. 93(21): p. 11504-9.
98. Oldfield, C.J., et al., Flexible nets: disorder and induced fit in the associations of p53 and 14-3-3 with their partners. BMC Genomics, 2008. 9 Suppl 1: p. S1.
99. Mittag, T., et al., Structure/function implications in a dynamic complex of the intrinsically disordered Sic1 with the Cdc4 subunit of an SCF ubiquitin ligase. Structure, 2010. 18(4): p. 494-506.
100. Hazy, E. and P. Tompa, Limitations of induced folding in molecular recognition by intrinsically disordered proteins. Chemphyschem, 2009. 10(9-10): p. 1415-9.
101. Oldfield, C.J., et al., Coupled folding and binding with alpha-helix-forming molecular recognition elements. Biochemistry, 2005. 44(37): p. 12454-70.
102. Tompa, P., C. Szasz, and L. Buday, Structural disorder throws new light on moonlighting. Trends Biochem Sci, 2005. 30(9): p. 484-9.
103. Gsponer, J., et al., Tight regulation of unstructured proteins: from transcript synthesis to protein degradation. Science, 2008. 322(5906): p. 1365-8.
104. Edwards, Y.J., et al., Insights into the regulation of intrinsically disordered proteins in the human proteome by analyzing sequence and gene expression data. Genome Biol, 2009. 10(5): p. R50.
105. Chen, J., H. Liang, and A. Fernandez, Protein structure protection commits gene expression patterns. Genome Biol, 2008. 9(7): p. R107.
106. Babu, M.M., et al., Intrinsically disordered proteins: regulation and disease. Curr Opin Struct Biol, 2011. 21(3): p. 432-40.
107. Tsvetkov, P., N. Reuven, and Y. Shaul, The nanny model for IDPs. Nat Chem Biol, 2009. 5(11): p. 778-81.
108. Dunker, A.K., et al., The unfoldomics decade: an update on intrinsically disordered proteins. BMC Genomics, 2008. 9 Suppl 2: p. S1.
109. Romero, P.R., et al., Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proc Natl Acad Sci U S A, 2006. 103(22): p. 8390-5.
110. Uversky, V.N., et al., Pathological unfoldomics of uncontrolled chaos: intrinsically disordered proteins and human diseases. Chem Rev, 2014. 114(13): p. 6844-79.
111. Perez-Ortin, J.E., P.M. Alepuz, and J. Moreno, Genomics and gene transcription kinetics in yeast. Trends Genet, 2007. 23(5): p. 250-7.
112. O'Dea, E.L., et al., A homeostatic model of IkappaB metabolism to control constitutive NF-kappaB activity. Mol Syst Biol, 2007. 3: p. 111.
113. Choul-Li, S., et al., Caspase cleavage of Ets-1 p51 generates fragments with transcriptional dominant-negative function. Biochem J, 2010. 426(2): p. 229-41.
114. Shaikhibrahim, Z., et al., Novel identification of the ETS-1 splice variants p42 and p27 in prostate cancer cell lines. Oncol Rep, 2012. 27(5): p. 1321-4.
115. Iakoucheva, L.M., et al., The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res, 2004. 32(3): p. 1037-49.
116. Gioeli, D. and B.M. Paschal, Post-translational modification of the androgen receptor. Mol Cell Endocrinol, 2012. 352(1-2): p. 70-8.
117. Campbell, M.J. and B.M. Turner, Altered histone modifications in cancer. Adv Exp Med Biol, 2013. 754: p. 81-107.
118. Di Gennaro, E., et al., Acetylation of proteins as novel target for antitumor therapy: review article. Amino Acids, 2004. 26(4): p. 435-41.
119. Gargalionis, A.N., et al., Histone modifications as a pathogenic mechanism of colorectal tumorigenesis. Int J Biochem Cell Biol, 2012. 44(8): p. 1276-89.
120. Hegyi, H., L. Buday, and P. Tompa, Intrinsic structural disorder confers cellular viability on oncogenic fusion proteins. PLoS Comput Biol, 2009. 5(10): p. e1000552.
121. Adamia, S., et al., A genome-wide aberrant RNA splicing in patients with acute myeloid leukemia identifies novel potential disease markers and therapeutic targets. Clin Cancer Res, 2014. 20(5): p. 1135-45.
122. Peng, Z., et al., Resilience of death: intrinsic disorder in proteins involved in the programmed cell death. Cell Death Differ, 2013. 20(9): p. 1257-67.
123. Philips, A.V. and T.A. Cooper, RNA processing and human disease. Cell Mol Life Sci, 2000. 57(2): p. 235-49.
124. Uversky, V.N., et al., Unfoldomics of human diseases: linking protein intrinsic disorder with diseases. BMC Genomics, 2009. 10 Suppl 1: p. S7.
125. Uversky, V.N., Targeting intrinsically disordered proteins in neurodegenerative and protein dysfunction diseases: another illustration of the D(2) concept. Expert Rev Proteomics, 2010. 7(4): p. 543-64.
126. Dyson, H.J. and P.E. Wright, Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology, 2005. 6(3): p. 197-208.
127. Dunker, A.K., et al., Function and structure of inherently disordered proteins. Current Opinion in Structural Biology, 2008. 18(6): p. 756-764.
128. Jonsen, M.D., et al., Characterization of the cooperative function of inhibitory sequences in Ets-1. Mol Cell Biol, 1996. 16(5): p. 2065-73.
129. Ubersax, J.A. and J.E. Ferrell, Jr., Mechanisms of specificity in protein phosphorylation. Nat Rev Mol Cell Biol, 2007. 8(7): p. 530-41.
130. Songyang, Z., et al., A structural basis for substrate specificities of protein Ser/Thr kinases: primary sequence preference of casein kinases I and II, NIMA, phosphorylase kinase, calmodulin-dependent kinase II, CDK5, and Erk1. Mol Cell Biol, 1996. 16(11): p. 6486-93.
131. Pintacuda, G. and G. Otting, Identification of protein surfaces by NMR measurements with a paramagnetic Gd(III) chelate. J Am Chem Soc, 2002. 124(3): p. 372-3.
132. Respondek, M., et al., Mapping the orientation of helices in micelle-bound peptides by paramagnetic relaxation waves. J Am Chem Soc, 2007. 129(16): p. 5228-34.
133. Battiste, J.L. and G. Wagner, Utilization of site-directed spin labeling and high-resolution heteronuclear nuclear magnetic resonance for global fold determination of large proteins with limited nuclear overhauser effect data. Biochemistry, 2000. 39(18): p. 5355-65.
134. Farrow, N.A., et al., Backbone dynamics of a free and phosphopeptide-complexed Src homology 2 domain studied by 15N NMR relaxation. Biochemistry, 1994. 33(19): p. 5984-6003.
135. Farrow, N.A., et al., A heteronuclear correlation experiment for simultaneous determination of 15N longitudinal decay and chemical exchange rates of systems in slow equilibrium. J Biomol NMR, 1994. 4(5): p. 727-34.
136. Shen, Y. and A. Bax, Prediction of Xaa-Pro peptide bond conformation from sequence and chemical shifts. J Biomol NMR, 2010. 46(3): p. 199-204.
137. Camilloni, C., et al., Determination of secondary structure populations in disordered states of proteins using nuclear magnetic resonance chemical shifts. Biochemistry, 2012. 51(11): p. 2224-31.
138. Campbell, A.P., et al., Backbone dynamics of a bacterially expressed peptide from the receptor binding domain of Pseudomonas aeruginosa pilin strain PAK from heteronuclear 1H-15N NMR spectroscopy. J Biomol NMR, 2000. 17(3): p. 239-55.
139. Renner, C., et al., Practical aspects of the 2D 15N-{1H}-NOE experiment. J Biomol NMR, 2002. 23(1): p. 23-33.
140. Uversky, V.N. and A.K. Dunker, Multiparametric analysis of intrinsically disordered proteins: looking at intrinsic disorder through compound eyes. Anal Chem, 2012. 84(5): p. 2096-104.
141. Crowley, P.B. and A. Golovin, Cation-pi interactions in protein-protein interfaces. Proteins, 2005. 59(2): p. 231-9.
142. Dougherty, D.A., The cation-pi interaction. Acc Chem Res, 2013. 46(4): p. 885-93.
143. Witte, K., et al., Structure and dynamics of the two amphipathic arginine-rich peptides RW9 and RL9 in a lipid environment investigated by solid-state NMR and MD simulations. Biochim Biophys Acta, 2013. 1828(2): p. 824-33.
144. Mahadevi, A.S. and G.N. Sastry, Cation-pi interaction: its role and relevance in chemistry, biology, and material science. Chem Rev, 2013. 113(3): p. 2100-38.
145. von Hippel, P.H. and T. Schleich, Ion effects on the solution structure of biological macromolecules. Acc Chem Res, 1968. 2(9): p. 257-265.
146. Baldwin, R.L., How Hofmeister ion interactions affect protein stability. Biophysical Journal, 1996. 71(4): p. 2056-2063.
147. Dill, K.A., et al., Modeling water, the hydrophobic effect, and ion solvation. Annual Review of Biophysics and Biomolecular Structure, 2005. 34: p. 173-199.
148. Zhang, Y. and P.S. Cremer, Interactions between macromolecules and ions: The Hofmeister series. Curr Opin Chem Biol, 2006. 10(6): p. 658-63.
149. Tadeo, X., et al., Protein stabilization and the Hofmeister effect: the role of hydrophobic solvation. Biophys J, 2009. 97(9): p. 2595-603.
150. Tadeo, X., M. Pons, and O. Millet, Influence of the Hofmeister anions on protein stability as studied by thermal denaturation and chemical shift perturbation. Biochemistry, 2007. 46(3): p. 917-23.
151. Ng, K.P., et al., Multiple aromatic side chains within a disordered structure are critical for transcription and transforming activity of EWS family oncoproteins. Proc Natl Acad Sci U S A, 2007. 104(2): p. 479-484.
152. Lee, K.A., Molecular recognition by the EWS transcriptional activation domain. Adv Exp Med Biol, 2012. 725: p. 106-25.
153. Kwon, I., et al., Phosphorylation-Regulated Binding of RNA Polymerase II to Fibrous Polymers of Low-Complexity Domains. Cell, 2013. 155(5): p. 1049-60.
154. Song, J., et al., Polycation-pi Interactions Are a Driving Force for Molecular Recognition by an Intrinsically Disordered Oncoprotein Family. PLoS Comput Biol, 2013. 9(9): p. e1003239.
155. Delaglio, F., et al., NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR, 1995. 6(3): p. 277-93.
156. Goddard, T.D. and D.G. Kneller, Sparky 3. University of California, San Francisco, United States.
157. Kay, L.E., P. Keifer, and T. Saarinen, Pure absorption gradient enhanced heteronuclear single quantum correlation spectroscopy with improved sensitivity. J Am Chem Soc, 1992. 114(26): p. 10663-10665.
158. Givaty, O. and Y. Levy, Protein sliding along DNA: dynamics and structural characterization. J Mol Biol, 2009. 385(4): p. 1087-97.
159. von Hippel, P.H., et al., Protein-nucleic acid interactions in transcription: a molecular analysis. Annu Rev Biochem, 1984. 53: p. 389-446.
160. Berg, O.G. and C. Blomberg, Association kinetics with coupled diffusion III. Ionic-strength dependence of the lac repressor-operator association. Biophys Chem, 1978. 8(4): p. 271-80.
161. Ohlendorf, D.H. and J.B. Matthew, Electrostatics and flexibility in protein-DNA interactions. Adv Biophys, 1985. 20: p. 137-51.
162. Takayama, Y. and G.M. Clore, Intra- and intermolecular translocation of the bi-domain transcription factor Oct1 characterized by liquid crystal and paramagnetic NMR. Proc Natl Acad Sci U S A, 2011. 108(22): p. E169-76.
163. Wang, Y.M., R.H. Austin, and E.C. Cox, Single molecule measurements of repressor protein 1D diffusion on DNA. Phys Rev Lett, 2006. 97(4): p. 048302.
164. Esadze, A. and J. Iwahara, Stopped-Flow Fluorescence Kinetic Study of Protein Sliding and Intersegment Transfer in the Target DNA Search Process. J Mol Biol, 2013.
165. Kalodimos, C.G., et al., Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science, 2004. 305(5682): p. 386-9.
166. Viadiu, H. and A.K. Aggarwal, Structure of BamHI bound to nonspecific DNA: a model for DNA sliding. Mol Cell, 2000. 5(5): p. 889-95.
167. Gewirth, D.T. and P.B. Sigler, The basis for half-site specificity explored through a non-cognate steroid receptor-DNA complex. Nat Struct Biol, 1995. 2(5): p. 386-94.
168. Marcovitz, A. and Y. Levy, Frustration in protein-DNA binding influences conformational switching and target search kinetics. Proc Natl Acad Sci U S A, 2011. 108(44): p. 17957-62.
169. Sanchez, I.E., et al., Experimental snapshots of a protein-DNA binding landscape. Proc Natl Acad Sci U S A, 2010. 107(17): p. 7751-6.
170. Zhou, H.X., Rapid search for specific sites on DNA through conformational switch of nonspecifically bound proteins. Proc Natl Acad Sci U S A, 2011. 108(21): p. 8651-6.
171. Nye, J.A., et al., Interaction of murine ets-1 with GGA-binding sites establishes the ETS domain as a new DNA-binding motif. Genes Dev, 1992. 6(6): p. 975-90.
172. Shen, Y. and A. Bax, Identification of helix capping and β-turn motifs from NMR chemical shifts. J Biomol NMR, 2012. 52(3): p. 211-32.
173. Berjanskii, M. and D.S. Wishart, NMR: prediction of protein flexibility. Nat Protoc, 2006. 1(2): p. 683-8.
174. Berjanskii, M.V. and D.S. Wishart, A simple method to predict protein flexibility using secondary chemical shifts. J Am Chem Soc, 2005. 127(43): p. 14970-1.
175. Wishart, D.S. and B.D. Sykes, The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J Biomol NMR, 1994. 4(2): p. 171-80.
176. Grishin, A.V., et al., [Conserved structural features of ETS domain--DNA complexes]. Mol Biol (Mosk), 2009. 43(4): p. 666-74.
177. Selvaratnam, R., et al., The auto-inhibitory role of the EPAC hinge helix as mapped by NMR. PLoS One, 2012. 7(11): p. e48707.
178. Bondos, S.E. and H.C. Hsiao, Roles for intrinsic disorder and fuzziness in generating context-specific function in Ultrabithorax, a Hox transcription factor. Adv Exp Med Biol, 2012. 725: p. 86-105.
179. Sidorova, N.Y. and D.C. Rau, Differences in water release for the binding of EcoRI to specific and nonspecific DNA sequences. Proc Natl Acad Sci U S A, 1996. 93(22): p. 12272-7.
180. Luisi, B.F., et al., Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature, 1991. 352(6335): p. 497-505.
181. Anderson, K.M., et al., Direct observation of the ion-pair dynamics at a protein-DNA interface by NMR spectroscopy. J Am Chem Soc, 2013. 135(9): p. 3613-9.
182. Aishima, J. and C. Wolberger, Insights into nonspecific binding of homeodomains from a structure of MATalpha2 bound to DNA. Proteins, 2003. 51(4): p. 544-51.
183. Iwahara, J., C.D. Schwieters, and G.M. Clore, Characterization of nonspecific protein-DNA interactions by 1H paramagnetic relaxation enhancement. J Am Chem Soc, 2004. 126(40): p. 12800-8.
184. Nunez, N., et al., The multi-zinc finger protein ZNF217 contacts DNA through a two-finger domain. J Biol Chem, 2011. 286(44): p. 38190-201.
185. Salzmann, M., et al., TROSY in triple-resonance experiments: new perspectives for sequential NMR assignment of large proteins. Proc Natl Acad Sci U S A, 1998. 95(23): p. 13585-90.
186. Salzmann, M., et al., [13C]-constant-time [15N,1H]-TROSY-HNCA for sequential assignments of large proteins. J Biomol NMR, 1999. 14(1): p. 85-8.
187. Matsuo, H., et al., Use of selective C alpha pulses for improvement of HN(CA)CO-D and HN(COCA)NH-D experiments. J Magn Reson B, 1996. 111(2): p. 194-8.
188. Matsuo, H., H. Li, and G. Wagner, A sensitive HN(CA)CO experiment for deuterated proteins. J Magn Reson B, 1996. 110(1): p. 112-5.
189. Eletsky, A., A. Kienhofer, and K. Pervushin, TROSY NMR with partially deuterated proteins. J Biomol NMR, 2001. 20(2): p. 177-80.
190. Diercks, T., M. Coles, and H. Kessler, An efficient strategy for assignment of cross-peaks in 3D heteronuclear NOESY experiments. J Biomol NMR, 1999. 15(2): p. 177-80.
191. Gardner, K.H. and L.E. Kay, The use of 2H, 13C, 15N multidimensional NMR to study the structure and dynamics of proteins. Annu Rev Biophys Biomol Struct, 1998. 27: p. 357-406.
192. Andre, I., S. Linse, and F.A. Mulder, Residue-specific pKa determination of lysine and arginine side chains by indirect 15N and 13C NMR spectroscopy: application to apo calmodulin. J Am Chem Soc, 2007. 129(51): p. 15805-13.
193. Xu, Q., G.C. Johnston, and R.A. Singer, The Saccharomyces cerevisiae Cdc68 transcription activator is antagonized by San1, a protein implicated in transcriptional silencing. Mol Cell Biol, 1993. 13(12): p. 7553-65.
194. Liu, Y., K.S. Matthews, and S.E. Bondos, Multiple intrinsically disordered sequences alter DNA binding by the homeodomain of the Drosophila hox protein ultrabithorax. J Biol Chem, 2008. 283(30): p. 20874-87.
195. Liu, Y., K.S. Matthews, and S.E. Bondos, Internal regulatory interactions determine DNA binding specificity by a Hox transcription factor. J Mol Biol, 2009. 390(4): p. 760-74.
196. Clore, G.M., Seeing the invisible by paramagnetic and diamagnetic NMR. Biochem Soc Trans, 2013. 41(6): p. 1343-54.
