UBC Theses and Dissertations

Insights into the inhibition of ETV6 PNT domain polymerization Gerak, Chloe Ann Nolan 2020

INSIGHTS INTO THE INHIBITION OF ETV6 PNT DOMAIN POLYMERIZATION

by

Chloe Ann Nolan Gerak

B.Sc., Simon Fraser University, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

August 2020

© Chloe Ann Nolan Gerak, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

Insights into the Inhibition of ETV6 PNT Domain Polymerization

submitted by Chloe Ann Nolan Gerak in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biochemistry and Molecular Biology

Examining Committee:

Lawrence McIntosh, Professor, Biochemistry and Molecular Biology and Chemistry, UBC
Co-supervisor

Michel Roberge, Professor, Biochemistry and Molecular Biology, UBC
Co-supervisor

Suzana Straus, Professor, Chemistry, UBC
Supervisory Committee Member

Alice Mui, Associate Professor, Biochemistry and Molecular Biology and Surgery, UBC
University Examiner

Calvin Roskelley, Professor, Cellular and Physiological Sciences, UBC
University Examiner

Additional Supervisory Committee Members:

LeAnn Howe, Professor, Biochemistry and Molecular Biology, UBC
Supervisory Committee Member

Abstract

ETV6 is a modular transcriptional repressor for which head-to-tail polymerization of its PNT (or SAM) domain facilitates cooperative binding to tandem DNA sites by its ETS domain. Chromosomal translocations frequently fuse the ETV6 PNT domain to the catalytic domain of one of several receptor protein tyrosine kinases. The resulting chimeric oncoproteins undergo ligand-independent self-association, auto-phosphorylation, and aberrant stimulation of downstream signaling pathways, thereby leading to a diverse range of cancers.
Inhibition of PNT domain polymerization through mutations renders the chimeric proteins non-oncogenic. This indicates that a small molecule inhibitor of polymerization could be a viable therapeutic against many ETV6-linked cancers.

Protein-protein interactions are challenging to disrupt with small molecules, and thus I followed a multi-pronged approach for lead compound discovery. In Chapter 2, I describe work to obtain structural, dynamic, and thermodynamic insights into the PNT domain and its self-association properties. To this end, I characterized monomeric and heterodimeric forms of the PNT domain using nuclear magnetic resonance spectroscopy, X-ray crystallography, molecular dynamics simulations, amide hydrogen exchange, and alanine scanning mutagenesis in conjunction with surface plasmon resonance binding studies. Collectively, these studies defined “hot spot” regions critical to the PNT domain self-association interface. In Chapter 3, I discuss efforts undertaken to find inhibitors of ETV6 PNT domain polymerization using two high-throughput cellular approaches – a split luciferase reporter assay and a modified yeast two-hybrid assay – and a computational approach with the Bristol University Docking Engine (BUDE). Over 75 lead compounds from these assays were tested for binding to the PNT domain through NMR spectroscopy. None were found to bind to the PNT domain or inhibit its self-association. However, lessons learned from these screening assays may facilitate future high-throughput screening for or rational design of therapeutics that act against ETV6 oncoproteins by disrupting PNT domain polymerization.

Lay Summary

I studied a protein segment called the PNT domain, encoded by the ETV6 gene, that can improperly fuse to another gene fragment encoding a protein tyrosine kinase (PTK). This results in many tumor-causing PNT-PTK chimera oncoproteins that have been implicated in various cancers affecting diverse demographics.
These oncoproteins are activated due to PNT domain self-association. Thus, my goal was to find a compound that inhibits this self-association, thereby inactivating the oncoprotein and preventing unregulated cell growth. Initially, I investigated the molecular mechanisms of PNT domain self-association using several biophysical experiments. Armed with this knowledge, I carried out experimental and computational screening assays to extensively search large chemical libraries for compounds that might prevent self-association. Although I was unable to identify such inhibitory compounds, lessons learned may facilitate future high-throughput screening for or design of therapeutics that act against PNT-PTK oncoproteins by disrupting PNT domain polymerization.

Preface

The overall goals of my PhD research were established by myself, Dr. Lawrence McIntosh, and Dr. Michel Roberge, and extend from an early collaboration with Dr. Poul Sorenson at the BC Cancer Agency. Chapter 2 is based on research I conducted at the University of British Columbia (UBC) and University of Bristol (UoB), in collaboration with Dr. McIntosh, Dr. Roberge, Dr. Richard Sessions, Dr. Michael Murphy, Dr. Mark Okon, Mr. Maxim Kolesnikov and Ms. Sophia Cho. Chapter 3 is based on research I conducted at UBC and UoB in collaboration with Dr. Roberge, Dr. McIntosh, Dr. Sessions, Dr. Ivan Sadowski, Mrs. Aruna Balgi, Mr. Stephen Zhang and Mrs. Yoko Shimizu.

Currently, I am preparing the results of Chapter 2 for a first-author publication. I am fully responsible for the bulk of the research design, experimental work and data analysis with oversight and advice from Drs. McIntosh and Roberge. Dr. Okon provided key assistance with the NMR spectroscopy experiments and spectral assignments (Sections 2.3.1 and 2.3.3). I purified the protein and isolated crystals for the X-ray crystallographic sections (Section 2.3.2). Dr. Murphy provided knowledge of X-ray crystallography and his student, Mr.
Kolesnikov, solved and refined the final crystal structure. Ms. Cho, under my guidance, and I generated mutants for the alanine scanning studies, purified the proteins, and performed the SPR-monitored binding studies (Section 2.3.4). I ran the molecular dynamics simulations (Section 2.3.5) at the UoB with guidance and feedback from Dr. Sessions.

In addition, I am preparing a second first-author manuscript summarizing the results of Chapter 3. I carried out the bulk of the research design, experimental work and data analysis, with advice and oversight from Drs. McIntosh and Roberge. Mrs. Shimizu aided with use of the Varioskan for the high-throughput luciferase assays (Section 3.3.2). Dr. Sadowski generated the yeast strains used in the yeast screening assays (Section 3.3.3). While I helped oversee the results, the assays were performed by Mrs. Balgi and Mr. Zhang (Section 3.3.4). I executed the in silico BUDE screen at the UoB with the BlueCrystal4 High Performance Computer under the guidance of Dr. Sessions (Section 3.3.5 and Section 3.3.6). Dr. Amaurys Ibarra provided computational support. I performed all compound validations by cell assays and NMR spectroscopy at UBC.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Amino Acid Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
1.1 General overview of cancer
1.1.1 Chromosomal translocations
1.1.2 Receptor tyrosine kinases
1.2 ETS transcription factor family
1.2.1 General features and classifications
1.2.2 ETV6 (TEL) general features and biological significance
1.2.3 The ETV6 PNT domain
1.3 ETV6 and its role in cancer and disease
1.3.1 ETV6-NTRK3 fusion oncoprotein
1.4 Protein-protein interactions as candidate therapeutic targets
1.4.1 Protein-protein interactions in a cell
1.4.2 Challenges in targeting protein-protein interactions
1.5 Research questions and goals
1.5.1 Characterizing ETV6 PNT domain polymerization
1.5.2 Complementary screening assays
1.5.3 Future directions
Chapter 2: Biophysical characterization of the ETV6 PNT domain
2.1 Overview
2.2 Introduction
2.3 Results
2.3.1 NMR spectroscopic characterization of the ETV6 PNT domains
2.3.1.1 NMR spectral assignments of the monomeric PNT domains
2.3.1.2 PNT domain dimerization characterized by NMR spectroscopy
2.3.1.3 Chemical shift-based structural analysis of the monomeric and heterodimeric ETV6 PNT domains
2.3.2 Crystallographic comparison of monomeric and dimeric PNT domains
2.3.3 Amide hydrogen exchange data show increased protection of interfacial residues upon dimerization
2.3.4 Alanine scanning mutagenesis at the PNT domain PPI interface
2.3.5 Molecular dynamics simulations of the PNT monomer and dimer
2.4 Discussion
2.4.1 The PNT domain retains a similar structure in monomeric and heterodimeric states
2.4.2 Alanine scanning mutagenesis determines interfacial “hot spot” residues
2.4.3 Many of the “hot spot” residues have increased protection from amide hydrogen exchange
2.4.4 Insights into small molecule inhibition of PNT domain polymerization
2.5 Materials and Methods
2.5.1 Protein construct, expression and purification
2.5.2 NMR Spectroscopy
2.5.2.1 General NMR spectroscopy methods
2.5.2.2 Spectral assignments and chemical shift analyses
2.5.2.3 Amide hydrogen exchange (HX)
2.5.3 Experimental alanine-scanning mutagenesis methods
2.5.3.1 Site-directed mutagenesis and construct cloning
2.5.3.2 Protein expression and purification for surface plasmon resonance (SPR)
2.5.3.3 Surface plasmon resonance (SPR)
2.5.4 Structural determination of A93D-V112E PNT by X-ray crystallography
2.5.5 In silico structural comparison
2.5.5.1 Molecular dynamics (MD) simulations on monomeric and dimeric PNT domains
Chapter 3: Comprehensive screening for candidate inhibitors of ETV6 PNT polymerization using cellular, in vitro and in silico techniques
3.1 Overview
3.2 Introduction
3.2.1 Mammalian cell protein-fragment complementation assay (PCA)
3.2.2 Yeast assay two-hybrid screening assay
3.2.3 In silico screening methods
3.3 Results
3.3.1 Development of a PCA for inhibition of PNT domain association
3.3.2 High throughput screening using the PNT domain PCA
3.3.3 Development of a yeast two-hybrid PNT domain assay
3.3.4 High-throughput screening using the yeast two-hybrid assay
3.3.5 In silico screening for ETV6 PNT domain PPI inhibitors
3.3.6 In silico screening with Bristol University Docking Engine for compounds binding to the intermolecular salt bridge forming residues of the ETV6 PNT domains
3.4 Discussion
3.4.1 Caveats of the experimental screening assays performed against the ETV6 PNT domain
3.4.2 The in silico screening assay also failed to identify any compounds that bound the PNT domain
3.4.3 Potential avenues for HTS improvement
3.4.4 Identifying an inhibitor of PNT domain polymerization will likely involve complementary approaches
3.5 Materials and Methods
3.5.1 Chemicals for screening
3.5.2 Split luciferase PCA
3.5.2.1 Cloning steps for in vivo split luciferase assay
3.5.2.2 Mammalian cell culture
3.5.2.3 Transient expression validation and establishment of a stably expressing PNT domain PCA system
3.5.2.4 High-throughput split luciferase chemiluminescence assay
3.5.2.5 Secondary validation assays
3.5.3 Yeast assay
3.5.3.1 Yeast two-hybrid assay construct development
3.5.3.2 Yeast screening assay
3.5.4 In silico screening
3.5.4.1 Virtual screening utilizing BUDE
3.5.4.2 Molecular dynamics simulations on top candidates at each PNT domain interface
3.5.4.3 Candidate selection and validation
3.5.5 Testing of compound binding to the PNT domain by NMR spectroscopy
Chapter 4: Concluding remarks
4.1.1 The ETV6 PNT domain is a stable helical bundle that retains its structure upon self-association
4.1.2 Several key “hot spots” at the PNT domain PPI contribute to the strong binding interaction
4.1.3 Targeting the PNT domain PPI is challenging with small molecules
4.1.4 Future directions
Bibliography
Appendices
Appendix A - Protein sequences of described constructs
Appendix B - Chemical shift assignments
Appendix C - Amide hydrogen exchange protection factors
Appendix D - BUDE compounds tested by NMR spectroscopy

List of Tables

Table 2-1 Alanine scanning mutagenesis of the EH-surface on the A93D PNT domain
Table 2-2 Alanine scanning mutagenesis of the ML-surface on the V112E PNT domain
Table 2-3 Comparison of the alanine scanning data to HX data for the A93D PNT domain
Table 2-4 Comparison of the alanine scanning data to HX data for the V112E PNT subunit
Table 2-5 Data collection and refinement statistics for ETV6 A93D-V112E PNT domain
Table 3-1 Luminescence readings observed upon transient transfection of different constructs
Table 3-2 Comparison of transiently transfected and stably expressing PCA systems
Table 3-3 Testing of initial PNT domain PCA hits by NMR spectroscopy
Table 3-4 Compounds identified as verified hits in the yeast assay
Table 3-5 Candidate BUDE compounds tested for ETV6 PNT domain binding
Table 3-6 Compounds tested against the K99-D101 salt bridge
Table A-1 Protein sequences of described constructs
Table A-2 Chemical shift assignments (in ppm) of the monomeric V112E PNT domain
Table A-3 Chemical shift assignments (in ppm) for the monomeric A93D PNT domain
Table A-4 Chemical shift assignments (in ppm) for the V112E PNT domain as a heterodimer with the A93D PNT domain
Table A-5 Chemical shift assignments (in ppm) for the A93D PNT domain as a heterodimer with the V112E PNT domain
Table A-6 Amide HX protection factors (log(PF))
Table A-7 Chemical structures of compounds selected from BUDE

List of Figures

Figure 1.1 ETS transcription factor family.
Figure 1.2 Structures of several ETS family PNT domains.
Figure 1.3 ETV6 PNT domain interface.
Figure 1.4 Cartoon representation of ETV6 PNT domain polymerization and heterodimerization.
Figure 1.5 ETV6 is prevalent in oncogenic chromosomal translocations.
Figure 1.6 Cartoon representation of the modular ETV6-NTRK3 (EN) oncoprotein.
Figure 1.7 Outline of the ETV6-NTRK3 (EN) oncogenic signal transduction pathway.
Figure 1.8 Potential mechanisms of PPI inhibition.
Figure 2.1 15N-HSQC spectrum of the monomeric V112E PNT domain.
Figure 2.2 15N-HSQC spectrum of the monomeric A93D PNT domain.
Figure 2.3 The A93D and V112E PNT domains bind in the slow exchange limit.
Figure 2.4 15N-HSQC spectrum of the 15N-labelled V112E PNT domain bound to the unlabelled A93D PNT domain.
Figure 2.5 15N-HSQC spectrum of the 15N-labelled A93D PNT domain bound to the unlabelled V112E PNT domain.
Figure 2.6 Chemical shift-based secondary structural and dynamic analyses of the ETV6 PNT domains.
Figure 2.7 PNT domain dimerization interface identified by amide chemical shift perturbations
Figure 2.8 Comparison of the asymmetric units of the monomeric and polymeric ETV6 PNT domain variants.
Figure 2.9 Structural comparison of the monomeric PNT domain with a polymeric subunit.
Figure 2.10 Measuring PNT domain HX using 15N-HSQC spectroscopy.
Figure 2.11 PNT domain amide HX protection factors increase upon heterodimerization.
Figure 2.12 Mapping of HX protection factors on the monomeric PNT domain structures.
Figure 2.13 Mapping of HX protection factors on the structure of the heterodimeric PNT domain
Figure 2.14 SPR provides a reliable measure of the A93D and V112E PNT domain interactions.
Figure 2.15 Characterization of the PNT domain interface by alanine scanning mutagenesis.
Figure 2.16 Mapping the results of the alanine scanning mutagenesis studies on the ETV6 PNT domain heterodimer structure.
Figure 2.17 Mapping the results of the alanine scanning mutagenesis studies on the ETV6 PNT domain heterodimer structure.
Figure 2.18 Structural overlay of PNT domain structures from 100 ns MD simulations.
Figure 2.19 Analysis of PNT domain structural fluctuations from 100 ns MD simulations.
Figure 3.1 High-throughput screening in drug discovery
Figure 3.2 The design of a split Gaussia luciferase PCA for cell-based screening of inhibitors of ETV6 PNT domain heterodimerization
Figure 3.3 Principle of the yeast two-hybrid screening assay for inhibitors of ETV6 PNT domain heterodimerization
Figure 3.4 Cartoon representations of the establishment and validation of the PNT domain PCA.
Figure 3.5 Time dependent luminescence of the PNT domain PCA
Figure 3.6 Transient expression of A93D PNT domain in pooled stable transformants
Figure 3.7 Sample results of the high-throughput PNT domain PCA screening assay
Figure 3.8 Secondary validation of initial PCA screening hits
Figure 3.9 Lanatoside C does not bind the A93D PNT domain
Figure 3.10 Yeast two-hybrid assay controls
Figure 3.11 Screening compounds from six libraries with the yeast two-hybrid assay
Figure 3.12 Secondary validation of screening hits from the yeast two-hybrid assay
Figure 3.13 Tannic acid does not bind the A93D PNT domain
Figure 3.14 MD simulations of initial BUDE hits for binding to the A93D PNT domain
Figure 3.15 Predicted structures of compounds bound to the ETV6 PNT domain
Figure 3.16 Superimposed MD simulated structures for in silico screening of compounds predicted to bind proximal to the K99-D101 salt bridge
Figure 3.17 Virtual docking of compounds targeted to the K99-D101 interfacial regions of the PNT domain
Figure 3.18 MolPort-005-035-860 does not bind the A93D PNT domain
Figure 3.19 Compounds selected from the BUDE in silico screens did not inhibit PNT domain association in the mammalian split-luciferase PCA.

List of Abbreviations

2D  two-dimensional
3-AT  3-amino-1,2,4-triazole
3D  three-dimensional
Å  Angstrom
ABL  Abelson murine leukemia
AD  activating domain
ALK  anaplastic lymphoma kinase
ALL  acute lymphoblastic leukemia
AML  acute myeloid leukemia
ARNT  aryl hydrocarbon receptor nuclear translocator
ATP  adenosine triphosphate
BCR  breakpoint cluster region
BUDE  Bristol university docking engine
CBP  CREB-binding protein
CEL  chronic eosinophilic leukemia
cLogP  lipophilicity
C-Luc  C-terminal luciferase fragment
CML  chronic myeloid leukemia
CSP  chemical shift perturbation
Da  Dalton
D2O  deuterium oxide
DMSO  dimethyl sulfoxide
DNA  deoxyribonucleic acid
DSB  double-stranded break
DTT  dithiothreitol
E. coli  Escherichia coli
EDTA  ethylenediaminetetraacetic acid
EGFR  epidermal growth factor receptor
EH  end-helix
ERBB  erythroblastic oncogene B
ERK  extracellular signal-regulated kinase
ETS  E26 transformation specific
ETV6  ETS-variant gene 6
EN  ETV6-NTRK
FGFR  fibroblast growth factor receptor
FID  free induction decay
GABPA  GA-binding protein
GAFF  general amber force field
GdnHCl  guanidinium hydrochloride
H1  helix-1
H2  helix-2
H3  helix-3
H4  helix-4
HEK  human embryonic kidney
HEPES  4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
HPC  high performance computer
HSQC  heteronuclear single quantum coherence
HTS  high-throughput screening
HX  hydrogen exchange
IGF1  insulin-like growth factor 1
IPTG  isopropyl-β-D-thiogalactopyranoside
ITC  isothermal titration calorimetry
JAK2  Janus kinase 2
KD  equilibrium dissociation constant
LB  lysogeny broth
M  molar concentration
MAP  mitogen-activated protein
MD  molecular dynamics
MDM2  mouse double minute 2 homolog
MICS  motif identification from chemical shift
ML  mid-loop
M/NK cell L  myeloid/natural killer cell leukemia
MOPS  3-(N-morpholino)propanesulfonic acid
MPN  myeloproliferative neoplasm
mTOR  mammalian target of rapamycin
MWCO  molecular weight cut off
N-CoR  nuclear receptor co-repressor
N-Luc  N-terminal luciferase fragment
NMR  nuclear magnetic resonance
NOE  nuclear Overhauser effect
NOESY  nuclear Overhauser effect spectroscopy
NPM  nucleophosmin
ns  nanosecond
NTRK  neurotrophic tyrosine receptor kinase
OST  on-sight
PCA  protein-fragment complementation assay
PCR  polymerase chain reaction
PDB  protein data bank
PDGF  platelet-derived growth factor
PF  protection factor
PI3K  phosphoinositide 3-kinase
PIPE  polymerase incomplete primer extension
pKa  acid dissociation constant
PoPPI  perturbation of protein-protein interactions
PPI  protein-protein interaction
ps  picosecond
PTK  protein tyrosine kinase
PTM  posttranslational modification
ppm  parts per million
RAEB  refractory anemia with excess of blasts
RCI-S2  random coil index-squared
RMSD  root mean square deviation
RMSF  root mean square fluctuation
RNA  ribonucleic acid
RUNX  runt-related transcription factor
SAM  sterile alpha motif
SDS-PAGE  sodium dodecyl sulfate-polyacrylamide gel electrophoresis
Ship2  phosphatidylinositol 3,4,5-trisphosphate 5-phosphatase 2
SMRT  silencing mediator for retinoid or thyroid-hormone
SPPIDER  solvent accessibility-based protein-protein interface identification and recognition
SPR  surface plasmon resonance
TEAD  TEA domain
TEL  translocation ETS leukemia
TF  transcription factor
Tris  tris(hydroxymethyl)aminomethane
UBC  University of British Columbia
UCSF  University of California San Francisco
V(D)J  variable, diversity, joining
VEGFR  vascular endothelial growth factor receptor
YAP  yes-associated protein
ZINC  zinc is not commercial

List of Amino Acid Abbreviations

A  Ala  alanine
R  Arg  arginine
D  Asp  aspartic acid/aspartate
N  Asn  asparagine
C  Cys  cysteine
E  Glu  glutamic acid/glutamate
Q  Gln  glutamine
G  Gly  glycine
H  His  histidine
I  Ile  isoleucine
L  Leu  leucine
K  Lys  lysine
M  Met  methionine
F  Phe  phenylalanine
P  Pro  proline
S  Ser  serine
T  Thr  threonine
W  Trp  tryptophan
Y  Tyr  tyrosine
V  Val  valine

Acknowledgements

First and foremost, I need to thank Ramona Gerak, my mother, and Ross Bailey, my father, who have each contributed to my success in their own unique ways. Without all of the support you gave in my life, I would not have been able to achieve everything I have, and your love and encouragement have helped me on this journey. I do not believe I could have had two better supervisors and I will be eternally grateful to Dr. Lawrence McIntosh and Dr. Michel Roberge for all the encouragement, support, freedom, scientific discussions and teachings that you both have provided. You both were my academic parents, and I have been so privileged to be able to have two great scientists as my role models. Words cannot describe how lucky and thankful I feel.

I am fortunate to have had two committee members, Dr. LeAnn Howe and Dr. Suzana Straus, who have always contributed to my personal and project growth and I appreciate all the advice you have ever given me. I want to thank all the past and present members of the Roberge and McIntosh groups; you are all rock-stars and enhanced my PhD experience every day. I am extremely thankful for Mark Okon and all that you have taught me; I only wish I had more time to learn more. In particular, I want to thank Cecilia Perez Borrajero for holding my hand those first few months and becoming one of my close friends. As well, I need to especially thank Desmond Lau, Florian Heinkel, Soumya De, Aruna Balgi and Alireza Baradaran-Heravi for helping me learn basic lab skills and inspiring me to continue with my project. Karlton Scheu and Mike Ferguson, thank you both for the lab banter and insightful thoughts. Jasmine Li-Brubacher and Alexandra Krause, thank you for being good friends, support and travel companions.
Finally, Sophia Cho, you were the best undergraduate student I could have asked for and thank you for all your work and motivation on my project.

In my degree, I was fortunate enough that Lawrence and Michel supported an abroad research collaboration that led me to University of Bristol and Dr. Richard Sessions’ group. Sesh, thank you for your mentorship – you treated me as your own student and I will be forever grateful. I would also like to thank Sofia Oliveira, Debbie Shoemark, Amaurys Avila and Sukhee Bancroft for fully accepting me into the lab, tea breaks and all.

In addition, I need to thank many at Simon Fraser University for helping ignite my passion for research and start me on my academic journey. Dr. Nabyl Merbouh, who had a great influence on who I am today, thank you for taking a chance and introducing me to research. Dr. Andrew Bennet, Natalia Sannikova and Viviana Cerda, for giving me my first research projects and for your patience in answering all of my questions. Dr. Gerhard Gries and Catherine Scott, who allowed me to conquer my fears of spiders. Dr. Andrew Lewis, who has an infectious curiosity, and Colin Zhang, for trusting me to “play” with the NMRs and for teaching me an aspect of NMR many people do not get to see. You all played a huge role in my beginnings and I will eternally be thankful.

Personally, I have to thank my extended family for their support. While I collectively thank everyone, I really felt the influence from Uncle Ron, Auntie Anne-Marie and my Grampy. My friends are the family I chose and did I ever pick an amazing family. Sharon Tsui, no matter the time difference you have always been there for me to talk about absolutely everything and I know this friendship is once in a lifetime. As well, Kayla Walker, your support has meant everything to me and I am so privileged to have you as my friend. “Natoleg”, thank you both for literally becoming my family and for your constant support and much needed adventures.
Justin Stubbs, thank you for always being a voice of reason and compassion; whether you realize it or not, you really have helped me grow as a person and your support, along with the confidence you have given me, has meant the world. I would like to acknowledge the following friends and colleagues for all the encouragement throughout this journey; I could not have done it without you: Marija Jovanovic, Matt Courtemanche, Maxim Kolesnikov, Shams Bhuiyan, Hilda Doan, Angelé Arrieta, Irvin Wason, Fabian Meili, Israel Matos, Etienne Melese, Julien Bergeron, Kristy Dockstader, Kristina McBurney, Blaine Betzold, Vickie Loosemore, Eugene Kuatsjah, Fred Rosell, Claire Shanna, Nicole Morgan, Kele Elliott and Tony Ferreira, Desireé and Graham Shea, Harvir Dhupar, Fabian Garces, Elisa Wong, Karolina Lapinskaite, Ali Jay, Will Wong and many others, as well as the volleyball community that has been so great to me. I would like to add a special thank you to Nathan De Ruiter and Jack Ward, for being my virtual teammates in Azeroth during the pandemic lockdown.

Finally, I would like to thank UBC’s Department of Biochemistry and Molecular Biology for offering so many opportunities to succeed academically, personally and professionally. In particular, I do not think I could have such a positive outlook if it was not for Doris Metcalf – the department is lucky to have you. I have been privileged to have received many financial scholarships in my degree, and I would like to thank the Zbarsky Family, Ronnie Miller, Canadian Cancer Society, NSERC, CIHR and UBC for funding of my project.

Dedication

For my mother, Ramona Gerak.

Chapter 1: Introduction

1.1 General overview of cancer

Cancer broadly refers to a group of diseases characterized by uncontrolled cell proliferation with the potential to invade other tissues or organs.
Although great advances in diagnosis and treatment have been made over the past decades, cancer remains a leading cause of death worldwide (Hassanpour and Dehghani, 2017). Cancer typically occurs when an accumulation of mutations disrupts the processes that regulate cellular proliferation, either through stimulating cell growth pathways or inhibiting cell death or cell cycle arrest (DeBerardinis et al., 2008; Vermeulen et al., 2003; Vogelstein and Kinzler, 2004). In general, these mutations can be sporadic or inherited, with the sporadic mutations caused by a variety of factors including DNA damage, genomic instability, oncoviruses and carcinogenic bacteria (Ferguson et al., 2015; Parkin, 2006).

Hanahan and Weinberg have described several of the cancer ‘hallmarks’ that enable tumor growth and the transition to metastasis. These hallmarks include sustained proliferative signaling, evading growth suppressors, activating invasion and metastasis, enabling replicative immortality, inducing angiogenesis, and resisting cell death (Hanahan and Weinberg, 2000). Additional traits leading to tumorigenesis and the creation of a favorable tumor microenvironment include avoiding immune destruction, deregulating cellular energetics, genome instability and mutation, and tumor-promoting inflammation (Hanahan and Weinberg, 2011). The resulting tumors can be clinically defined as liquid – present in blood and bone marrow – or solid. These mainly correspond to leukemias or sarcomas and carcinomas, respectively, although lymphomas may be liquid or solid (Vogelstein and Kinzler, 2004).

Mutations in genes that cause cancer are primarily found in two categories of genes known as proto-oncogenes and tumor suppressor genes (Vermeulen et al., 2003).
Simply, cancers arising from oncogenes are a result of gain-of-function or activating mutations, whereas cancers arising from tumor suppressor genes are from recessive loss-of-function or reduced activity mutations (Hanahan and Weinberg, 2000; Vogelstein and Kinzler, 2004). In general, carcinogenesis driven by tumor suppressor genes involves loss-of-function mutations in both alleles (Knudson, 2001). By contrast, oncogene mutations may act dominantly and only require mutation in a single allele. A third class of genes, known as stability genes, can also be considered. These genes, such as those involved in DNA repair or mitotic recombination, are responsible for keeping mutations to a minimum. Their inactivation results in higher mutation rates that contribute to cancer progression (Vogelstein and Kinzler, 2004). It has been found that such mutations are acquired faster during tumorigenesis, which can be attributed to an increase in cell proliferation and damage in DNA maintenance pathways (Salk et al., 2010). Notably, mutations that cause an increase in angiogenesis, a key step prior to tumor cell invasion, have been found directly or indirectly in all oncogenic and tumor-suppressor gene pathways (Salk et al., 2010; Vogelstein and Kinzler, 2004).

Early detection of cancer and identification of cancer types are important for successful therapeutic intervention to prevent death. Death from cancer primarily occurs once a tumor has become metastatic, meaning that cancer cells have spread from a primary tumor location and established themselves in one or more secondary tissues (Chambers et al., 2002). Thus, early detection of primary tumors and the use of therapies tailored to the tumor type are crucial for a successful recovery. Targeted therapies are challenged by a tumor’s heterogeneity, which is caused by the sequential accumulation of mutations during clonal expansion.
Indeed, large-scale cancer genome-sequencing studies have revealed both significant variation in mutations and few commonly mutated genes in any type of cancer (Salk et al., 2010). This tumor heterogeneity can in turn confer differences in the responses of individuals to therapeutics despite an identical initial tumor diagnosis (Vogelstein and Kinzler, 2004). Differences in the tumor landscape are why many current cancer treatments involve a combination of approaches, including chemotherapy and radiotherapy. This further highlights the importance of identifying primary driver mutations in developing targeted therapeutics for personalized precision cancer treatment (Croce, 2008).

1.1.1 Chromosomal translocations

Eukaryotic DNA is condensed and packaged into distinct chromosomes. In cancer, the most frequent type of genomic instability is chromosomal instability, referring to a change in either chromosome structure or number (Negrini et al., 2010). Exemplifying the former, chromosomal translocations occur when one region of a chromosome is juxtaposed to another non-homologous chromosome or chromosome region (Croce, 2008; Friedberg, 2003). This generally results in aberrant functioning through oncogene activation or creation of oncogenic fusion proteins (Rabbitts, 1994; Vogelstein and Kinzler, 2004).

A chromosomal translocation was first reported in 1960, in a leukemic patient. It involved reciprocal swapping of part of chromosome 9 to chromosome 22, resulting in what is now known as the Philadelphia chromosome (Nowell and Hungerford, 1960; Rowley, 1998). Since then, many additional translocations have been associated with hematologic malignancies and childhood sarcomas (Aplan, 2006).
Chromosomal translocations are the most common genetic alteration in liquid tumors, frequently involving genes encoding tyrosine kinases and transcription factors (TFs) that are critical for hematopoietic differentiation (Aplan, 2006; Rowley, 1998; Vogelstein and Kinzler, 2004). This highlights the importance of both transcriptional control and chromosomal integrity in cancer (Rabbitts, 1994).

The generation of a chromosomal translocation requires a DNA double-stranded break (DSB) and improper repair (Aplan, 2006). It is estimated that ~50 DSBs occur per cell cycle, and thus there are plenty of opportunities for chromosomal translocation to take place during mammalian cell division (Vilenchik and Knudson, 2003). Individuals may have an inherited predisposition to acquiring chromosomal translocations if they have mutations in genes involved in the recognition and repair of DNA DSBs (Rotman and Shiloh, 1998). Aplan has summarized four possible mechanisms leading to chromosomal translocations: illegitimate V(D)J recombination (V, variable; D, diversity; J, joining), homologous recombination mediated by repetitive sequences, DNA topoisomerase II subunit exchange and error-prone non-homologous end-joining. In addition, since several types of chromosomal translocations appear to be recurrent, there may also be common genomic features that are more susceptible to rearrangements. These features include purine and pyrimidine repeat regions, scaffold and matrix attachment regions and DNA topoisomerase II cleavage sites (Aplan, 2006).

Chromosomal translocations are a type of mutation that can begin the carcinogenesis process. However, translocations that have been implicated in leukemia or lymphoma, such as BCR-ABL and ETV6-AML1, have also been found in healthy individuals (Janz et al., 2003). This suggests that, in some instances, the translocations are not the cancer-driving mutation, and a subsequent mutation may be necessary for oncogenesis.
In leukemic cells it is common to find a single balanced translocation that can create two translocations, whereby one disrupts a transcription factor and the other constitutively activates a tyrosine kinase. In contrast, solid tumor karyotypes are often complex in their abnormalities (Aplan, 2006). To treat cancers driven by chromosomal translocations, either preventative therapies for individuals afflicted with defects in their DSB machinery, or the identification of drugs that directly target the resulting fusion oncoproteins, are needed.

1.1.2 Receptor tyrosine kinases

The most common class of known oncogenes arises through chromosomal translocations and the most common oncoproteins are protein kinases (Futreal et al., 2004), which include protein serine/threonine kinases and receptor protein tyrosine kinases (PTKs) (Druker and Lydon, 2000). Humans have a total of 58 identified receptor PTKs that can be grouped into 20 subfamilies (Lemmon and Schlessinger, 2010). Receptor kinases help control a range of cellular processes by coupling the recognition of extracellular factors with intracellular signal transduction pathways. They are tightly regulated by a variety of positive and negative feedback mechanisms. Receptor PTKs typically contain an extracellular ligand-binding domain, a single-pass transmembrane region and an intracellular kinase domain. Many receptor PTKs are activated by ligand-induced dimerization, or oligomerization, that juxtaposes their catalytic kinase domains to facilitate trans auto-phosphorylation of tyrosines in the kinase activation loop and/or juxtamembrane region (Lemmon and Schlessinger, 1994). In general, this juxtaposition allows conformational changes that serve to stabilize the active state of the kinase. However, some receptors, such as the insulin receptor and the IGF1 receptor, are constitutively oligomeric and binding of the ligand induces an activating structural change (Lemmon and Schlessinger, 2010).
The intracellular PTK domain can also be phosphorylated and dephosphorylated by other kinases and phosphatases, respectively (Lemmon and Schlessinger, 1994). Although differing in detail, these modes of activation generally result in the recruitment of a plethora of downstream signaling proteins, such as enzymes and adaptor/scaffolding proteins, that either recognize the phosphotyrosines in the receptor PTK or recognize docking proteins that have been phosphorylated by the receptor PTK. Receptors may terminate their signaling and be downregulated through endocytosis (Lemmon and Schlessinger, 2010).

Aberrant receptor PTK activation is closely linked with cancer development and can result from autocrine activation, PTK overexpression, gain-of-function mutations or chromosomal translocations (Lemmon and Schlessinger, 2010). Furthermore, receptor PTKs can become oncoproteins through a range of DNA alterations, including point mutations, deletions, and over-expression by gene amplification (Hunter and Blume-Jensen, 2001). One example is gene amplification and overexpression of ErbB2, an EGFR family receptor that correlates with poor cancer prognosis (Lemmon and Schlessinger, 2010). Chromosomal rearrangements also generate chimeric fusion oncoproteins that contain aberrantly regulated PTK domains. Two examples include the PDGF-β receptor kinase fused to the ETS family transcription factor TEL (or ETV6) and the ALK tyrosine kinase fused to a nucleolar protein NPM. These result in chronic myelomonocytic leukemia and non-Hodgkin’s lymphoma, respectively (Rabbitts, 1994).

Several therapeutics have been successfully developed against receptor PTK-driven cancers. In general, these are either small molecule inhibitors that target the ATP-binding site in the kinase domain or “biologics”, such as monoclonal antibodies, that interfere with receptor PTK activation (Lemmon and Schlessinger, 2010).
One of the first examples of a successful small molecule receptor PTK inhibitor is Imatinib/Gleevec, which was developed in the late 1990s. This compound was identified as a BCR-ABL inhibitor to treat chronic myelogenous leukemia through random library screening and subsequent structure-activity optimization of lead compounds (Druker and Lydon, 2000).

Many additional receptor PTK inhibitors have since been developed. The design of these molecules takes advantage of both conserved and non-conserved residues within and around the kinase ATP-binding site. In general, the non-conserved residues do not directly participate in ATP binding and thus provide specificity of a given inhibitor to a subset of receptor PTKs (Druker and Lydon, 2000). However, there tends to be some promiscuity of inhibitors for different receptor PTK subfamilies. For example, Imatinib can also target PDGFR, KIT, and ABL, whereas another receptor PTK inhibitor, Sunitinib, can target KIT, VEGFR2, PDGFR, Flt3, and Ret (Lemmon and Schlessinger, 2010). This presents a double-edged sword as it may be beneficial to target multiple aberrant receptor PTKs in a cell with only one compound but harmful to inhibit a range of wild type receptor PTKs and thereby perturb normal signal transduction. In addition, a challenge with receptor PTK inhibitors is that selective pressure of the inhibitor allows drug-resistant mutations, frequently in the tyrosine kinase domain, to emerge and proliferate (Lemmon and Schlessinger, 2010). While receptor PTKs have proven to be important therapeutic targets, there are many possible on- and off-target effects that need to be considered.

1.2 ETS transcription factor family

1.2.1 General features and classifications

Transcription factors (TFs) are proteins that regulate the transcription of genes through interactions with DNA, RNA polymerase, and other regulatory proteins (Latchman, 1990).
TFs are important for both transcriptional activation and repression, contributing to the control of numerous processes including cellular proliferation, cellular development, cellular differentiation, apoptosis and tissue remodeling (Oikawa, 2004). The human genome encodes more than 1600 TFs, many of which can be cataloged into families based on common features such as the presence of conserved sequence-specific DNA-binding domains (Lambert et al., 2018). For example, the ETS (E26 transformation specific) transcription factor family contains 28 paralogs that all have a winged helix-turn-helix ETS domain, which is responsible for specific recognition of a common DNA sequence motif, 5′GGA(A/T)3′. The founding member of this family is ETS1, discovered as part of a fusion oncoprotein expressed by the E26 avian erythroblastosis virus (Nunn et al., 1983). ETS factors can be further divided into subfamilies based on their phylogenetic relationships, which also correlate with features including DNA-binding specificity and overlapping cellular functions (Figure 1.1) (Hollenhorst et al., 2011).

Figure 1.1 ETS transcription factor family. This cartoon figure highlights several human ETS family members containing different structural domains. Members are clustered into their phylogenetic subfamilies (in bold) including ERG (ERG, FLI1, FEV), ETS (ETS1, ETS2), TCF (ELK1, ELK3, ELK4), PEA3 (ETV1, ETV4, ETV5) and TEL (ETV6, ETV7). The ETV6 Drosophila ortholog Yan is also shown to display its similarity to ETV6. The colored boxes indicate structured domains, joined by assumed or characterized intrinsically disordered sequences (gray). The circled P's represent selected phosphorylation sites. Not identified are additional regions involved in transcriptional activation/repression, post-translational modifications (including kinase docking and sumoylation), nuclear import/export, turnover, etc. This figure is adapted from Hollenhorst et al.
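
As an illustration of the 5′GGA(A/T)3′ core motif described above, the following is a minimal sketch (not part of the original work) that locates occurrences of the four-base core on both strands of a DNA sequence. The function names and the example sequence are hypothetical, and the sketch matches only the core given in the text; real ETS binding sites are also influenced by flanking bases, which are not modeled here.

```python
import re

# Core ETS motif from the text: 5'-GGA(A/T)-3'.
# A lookahead is used so that overlapping matches are not skipped.
CORE = re.compile(r"(?=(GGA[AT]))")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_ets_core_sites(seq):
    """Return sorted (position, strand) tuples for each core motif.

    Positions are 0-based indices of the leftmost base of the site
    on the forward strand; strand is "+" or "-".
    """
    seq = seq.upper()
    n = len(seq)
    sites = [(m.start(), "+") for m in CORE.finditer(seq)]
    for m in CORE.finditer(reverse_complement(seq)):
        # Map the reverse-strand match back to forward-strand coordinates.
        sites.append((n - m.start() - 4, "-"))
    return sorted(sites)

# Hypothetical sequence with one site on each strand.
example = "TTGGAAGTCCACTTCCGG"
print(find_ets_core_sites(example))  # [(2, '+'), (12, '-')]
```

A scan of this kind only flags candidate sites; whether tandem sites are spaced and oriented suitably for cooperative binding would need to be assessed separately.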
In addition to the defining DNA-binding ETS domain, ETS factors contain a variety of structured domains and intrinsically disordered sequences that contribute to diversity and specificity in transcriptional regulation. The most common of these is the PNT domain present in 11 ETS family members (Figure 1.2). PNT domains are a subset of the larger group of sterile alpha motif (SAM) domains (Lopez et al., 1999). These small α-helical bundles are found in over 250 regulatory proteins, including members of the protein kinase family, adaptor proteins and transcription factors. They mediate a diverse range of protein-protein and protein-RNA interactions, and are known to form a wide variety of homotypic and heterotypic complexes (Kim et al., 2001). In the case of the ETS factors, the monomeric ETS1 and ETS2 PNT domains play central roles in Ras signal transduction by serving as docking domains for the MAP kinase ERK2 and by mediating phosphorylation-enhanced interactions with the general transcriptional coactivator histone acetyltransferase CBP (Foulds et al., 2004; Nelson et al., 2010). In contrast, and as will be discussed in detail below, the ETV6 PNT domain self-associates in a head-to-tail configuration to form extended helical polymers.

Less common elements are also present in some ETS transcription factors. These include the B-box in the ELK family members that is involved in interactions with serum response factor on DNA, and the CBP-interacting OST domain found only in GABPA (Hassler and Richmond, 2001; Kang et al., 2008).

Figure 1.2 Structures of several ETS family PNT domains. This figure compares PNT domains from the ETS family (ETV6 (PDB: 1LKY), GABPA (PDB: 1SXD) and ETS1 (PDB: 2JV3)). The common core of the PNT domain, consisting of four α-helices and a small α- or 3₁₀-helix, has a fold similar to that of the SAM domains (purple). Appended N-terminal helices (pink) provide functional diversity.
Although not identified for clarity, residues preceding the first helix of the ETV6 PNT domain form a conserved helical-like turn. The ETV6 PNT domain structure is from the X-ray crystallographically characterized polymer, whereas the monomeric GABPA and ETS1 PNT domain structures were determined by NMR spectroscopy.

1.2.2 ETV6 (TEL) general features and biological significance

In contrast to most ETS family factors, which are transcriptional activators, the ETS variant gene 6 (ETV6), also known as translocation ETS leukemia (TEL), and related ETV7 are transcriptional repressors (Lopez et al., 1999; Rasighaemi and Ward, 2017; Vivekanand and Rebay, 2012). The ETV6 gene was discovered through the identification of a fusion oncoprotein associated with chronic myelomonocytic leukemia and shown to be located on chromosome 12p13 (Golub et al., 1994). The 452 amino acid wild type human ETV6 protein is coded from eight exons and includes an ~70 residue N-terminal self-associating PNT domain, a central region that functions with co-repressors to recruit histone deacetylases and a C-terminal DNA-binding ETS domain (Baens et al., 1996; Bohlander, 2005). An alternative start codon corresponds to Met43. ETV6 is widely expressed in mammalian tissues and mainly localized in the nucleus, but it is also present in the cytoplasm (Bohlander, 2005; Chakrabarti et al., 2000; Golub et al., 1996).

Studies with knockout mice have shown that ETV6 is biologically important in embryonic development and hematopoietic regulation (Hock et al., 2004; Wang et al., 1997). However, the mechanisms underlying transcriptional repression by ETV6 are not well defined. The specificity of ETV6-mediated repression is in part due to the intrinsically disordered sequences between its PNT and ETS domains. These sequences have been reported to bind corepressors including mSin3A, SMRT, and N-CoR, which in turn recruit histone deacetylases (Kim et al., 2001; Lopez et al., 1999).
In addition, as will be discussed below in detail, the ETV6 PNT domain can polymerize. Impairment of polymerization results in inhibition of transcriptional repression at ETS-DNA binding sites (Lopez et al., 1999). Transcriptional repression can be restored by exchanging the PNT domain for another oligomerization domain, demonstrating that self-association is essential for ETV6’s repression function (Lopez et al., 1999). Conversely, repression activity is maintained if the ETV6 ETS domain is exchanged for that of ETS1, indicating that the exact ETS domain is not critical.

Kim et al. (2001) proposed that DNA-bound polymeric ETV6 may also cause localized chromatin compaction to block access of the transcriptional machinery to target genes. This speculative model is based on the observation that the repeat distance of the helical polymer (53 Å) formed by the head-to-tail association of ETV6 PNT domains is equivalent to the width of the nucleosome core particle. Additionally, the isolated ETV6 ETS domain binds a single consensus 5′GGA(A/T)3′ site with relatively low affinity compared to other family members (Green et al., 2010). This is due to an autoinhibitory helix that sterically blocks its DNA-recognition interface (Coyne et al., 2012; De et al., 2016). The cooperative binding of polymeric ETV6 to DNA containing multiple consensus sites may compensate for this low affinity to enable specific transcriptional repression.

1.2.3 The ETV6 PNT domain

Unlike the monomeric PNT domains of most ETS factors, the ETV6 PNT domain tightly self-associates into an open-ended, left-handed helical polymer via head-to-tail binding of two relatively flat, hydrophobic interfaces with complementary peripheral charges (Figure 1.3) (Kim and Bowie, 2003; Mackereth et al., 2004; Tran et al., 2002). The two interfaces have been defined as the mid-loop (ML) and the end-helix (EH) surfaces (Kim et al., 2001).
The ML-surface encompasses hydrophobic residues that are located on the loop at the center of one interface. The EH-surface corresponds to complementary interfacial residues along the C-terminal helix of the PNT domain. The total surface area buried at the interface between two monomers is ~1070-1250 Å² (Kim et al., 2001). The only other self-associating PNT domains in the ETS family are those of Drosophila Yan, and possibly human ETV7, as all other PNT domains lack these two interfaces due to amino acid differences or steric blockage (Hollenhorst et al., 2011; Mackereth et al., 2004; Potter et al., 2000). Yan is a polymeric transcriptional repressor with a well-characterized role in the sevenless signalling pathway for eye development (Lai and Rubin, 1992; Lopez et al., 1999).

Figure 1.3 ETV6 PNT domain interface. (a) Ribbon diagram of an ETV6 PNT domain heterodimer (from PDB: 1LKY). Mutations that inhibit polymerization are highlighted in red, with their respective wild type residues shown as darker sticks at the interface. The V112R construct retains a wild type ML surface whereas the A93D construct retains a wild type EH surface. (b) An enlarged view of the interfacial region (corresponding to the dashed box in a). Interfacial hydrophobic (yellow), negatively charged (purple) and positively charged (orange) residues are shown as sticks.

The introduction of a charged group on the ETV6 PNT domain at either binding interface (Ala93 to Asp, or Val112 to Glu or Arg, according to the human ETV6 numbering) renders the domain monomeric in solution (Figure 1.4) (Tran et al., 2002). The V112E mutant can still associate in solution at pH values below ~7, likely due to protonation of the buried glutamic acid (Kim et al., 2001). Importantly, PNT domains with mutations on opposite surfaces of the binding interface can “heterodimerize” through their remaining complementary wild type interfaces (Cetinbas et al., 2013; Kim et al., 2001).
The heterodimer strongly associates with reported equilibrium dissociation constants (KD values) in the range of 1.7 to 4.4 nM (Cetinbas et al., 2013; Kim et al., 2001). This allows the PNT domain to be studied in solution, whereas the wild type species is insoluble due to its strong propensity to polymerize.

Figure 1.4 Cartoon representation of ETV6 PNT domain polymerization and heterodimerization. (a) Wild type head-to-tail PNT domain polymer. Native alanine and valine residues are depicted at the interfaces. (b) Mixing two PNT domains with monomerizing mutations (V112E and A93D constructs) yields a heterodimer with native binding interfaces.

1.3 ETV6 and its role in cancer and disease

There is mounting evidence that the ETV6 gene acts as a tumor suppressor, and all three of its structurally defined regions (PNT domain, linker and ETS domain) have roles in disease. In particular, the ETV6 PNT domain is frequently present in chimeric oncoproteins resulting from chromosomal translocations. These chromosomal translocations fuse gene fragments encoding the self-associating PNT domain of ETV6 with those encoding the kinase domain of one of many receptor PTKs or the DNA-binding domain of one of several transcription factors (Bohlander, 2005). Known receptor PTK fusion partners of ETV6 include PDGFRβ, ABL1/2, JAK2, FGFR3, and NTRK3, and the resulting oncoproteins have been linked to over 40 human leukemias, as well as fibrosarcomas, breast carcinomas, and nephromas (De Braekeleer et al., 2012; Wai et al., 2000) (Figure 1.5).

ETV6 self-association is essential for the oncogenic properties of the chimeras (Lopez et al., 1999). In the context of a PNT-PTK fusion, polymerization of the PNT domain causes ligand-independent auto-phosphorylation of the PTK domain.
This in turn leads to constitutive activation of downstream cellular pathways, such as the PI3K/Akt and Ras-MAPK signaling cascades, and, ultimately, cellular transformation (Lannon and Sorensen, 2005; Tognon et al., 2012; Wai et al., 2000). In the case of transcription factors, the ETV6-RUNX chimera is associated with B-cell acute lymphoblastic leukemia, and the ETV6-ARNT chimera with T-cell acute lymphoblastic leukemia and acute myeloblastic leukemia (De Braekeleer et al., 2012; Mauchauffe et al., 2000; Otsubo et al., 2010).

Figure 1.5 ETV6 is prevalent in oncogenic chromosomal translocations. ETV6 occurs in numerous chromosomal translocations. A selection of these translocations linked with leukemias and their malignant phenotype(s) are shown. The cartoon representations are not to scale and represent wild type ETV6 (top) and the fusion oncoproteins. Abbreviations of the associated leukemias are as follows: acute lymphoblastic leukemia (ALL), acute myeloblastic leukemia (AML), chronic eosinophilic leukemia (CEL), chronic myeloid leukemia (CML), myeloid/natural killer cell leukemia (M/NK cell L), myeloproliferative neoplasm (MPN), refractory anemia with excess of blasts (RAEB). Adapted from De Braekeleer et al. (2012).

In these PNT-PTK chromosomal translocations leading to cancers, there is an absence of the reciprocal translocation and, frequently, the normal ETV6 allele is deleted. This suggests that wild type ETV6 may inhibit chimeric oncoproteins (Bohlander, 2005; Knezevich et al., 1998). This is supported by the observation that introducing a wild type ETV6 PNT domain causes a dominant negative effect on oncogenic cellular transformation (Lannon and Sorensen, 2005).

In addition to fusion oncoproteins, the PNT domain has been surmised to influence oncogenesis through mechanisms not related to its polymerizing properties.
For example, sumoylation of the ETV6-AML1 fusion oncoprotein results in its localization into “TEL bodies” (Chakrabarti et al., 2000). The authors of this study speculated that this localization may contribute to the tumorigenesis process by stabilizing the fusion protein against degradation or by allowing the AML1 component to interact with different proteins. Mutations in other regions of ETV6, including the ETS domain and preceding linker, that contribute to human disease have also been described. Germline missense mutations have been found to cause thrombocytopenia that can drive malignancies (Zhang et al., 2015). Additionally, a point mutation in the ETS domain that eliminates DNA binding has been found in the non-rearranged allele of ETV6 in a cell that had a fusion oncoprotein, supporting a role of ETV6 as a tumor suppressor gene (Bohlander, 2005). However, PNT-PTK fusion oncoproteins are the prominent oncogenic feature associated with ETV6 translocations.

1.3.1 ETV6-NTRK3 fusion oncoprotein

The group of our University of British Columbia (UBC) collaborator Dr. Sorensen initially identified an oncoprotein (named EN, for ETV6-NTRK3) associated with congenital fibrosarcoma. This oncoprotein has the ETV6 PNT domain fused to the PTK domain of neurotrophin tyrosine receptor kinase-3 (NTRK3) (Knezevich et al., 1998). NTRK3 is a transmembrane nerve growth factor receptor for neurotrophin-3 that is preferentially expressed in neuronal cells to regulate growth, development, and cell survival (Lamballe et al., 1991; Lannon and Sorensen, 2005).

The EN oncoprotein is the result of a chromosomal translocation (t(12;15)(p13;q25)) and was the first ETV6 fusion shown to give rise to a solid tumor in congenital fibrosarcoma (Figure 1.6).
Since then, the EN fusion oncoprotein has been implicated in secretory breast carcinoma, mammary analogue secretory carcinoma of salivary glands and skin, and radiation-associated thyroid carcinoma (Kazakov et al., 2010; Leeman-Neill et al., 2014; Skálová et al., 2010; Tognon et al., 2002). Moreover, cryptic translocations can produce EN gene fusions, thus suggesting that EN may be present in cancers lacking gross t(12;15) translocations (Watanabe et al., 2002).

Figure 1.6 Cartoon representation of the modular ETV6-NTRK3 (EN) oncoprotein. The EN oncoprotein results from a chromosomal translocation that fuses the gene fragments encoding the N-terminal region of ETV6 with the C-terminal region of NTRK3. ETV6 contains PNT (purple) and ETS (red) domains, and NTRK3 contains an extracellular protein domain (ECD, yellow), two ligand binding IG-like C2 domains (pale green), a transmembrane (TM, orange) region and a protein tyrosine kinase (PTK, green) domain.

PNT domain polymerization is directly responsible for the oncogenic properties of EN. These properties include phenotypic transformation and soft agar colony formation of several experimental cell lines, as well as tumor formation in nude mice. Deletion of the PNT domain from EN eliminated NTRK3 tyrosine phosphorylation and resulted in no observable cellular transformation (Wai et al., 2000). When the PNT domain monomerizing mutations (A93D, V112E, or V112R) were introduced into the EN fusion oncoprotein, the kinase domain was not activated and cellular transformation did not occur (Tognon et al., 2004). Subsequently, mutations that disrupt the intermolecular K99-D101 salt bridge at the ETV6 PNT domain association interfaces were also found to prevent cellular transformation (Cetinbas et al., 2013). Co-expression of the isolated wild type and monomeric mutant ETV6 PNT domains also had dominant-negative effects on transformation (Tognon et al., 2004).
It is also noteworthy that an EN variant with the PNT domain replaced by an inducible dimerization domain failed to transform cells. Thus, PNT domain-mediated polymerization, rather than simple dimerization, is important for the oncogenic properties of EN. Similar to other PNT-PTK fusion oncoproteins, EN fusion proteins activate the downstream PI3K-Akt and Ras-MAPK signaling pathways (Figure 1.7) (Cetinbas et al., 2013). The PI3K/Akt/mTOR pathway is a highly conserved and tightly controlled growth factor response pathway (DeBerardinis et al., 2008). In brief, upon growth factor binding, there is phosphorylation of phosphatidylinositol lipids that recruit and activate downstream proteins to increase metabolic activities and promote cell growth (DeBerardinis et al., 2008). The Ras-MAPK pathway functions similarly, whereby growth factor binding stimulates a phosphorylation cascade of the Raf/MEK/ERK serine/threonine kinases to regulate gene expression (McCubrey et al., 2007). EN causes constitutive activation of these pathways by bypassing the need for stimulation by growth factors.

Figure 1.7 Outline of the ETV6-NTRK3 (EN) oncogenic signal transduction pathway. The EN fusion oncoprotein becomes activated through polymerization of its PNT domain (magenta), allowing autophosphorylation of the PTK domains (dark green). These complexes interact with IRS-1 (insulin receptor substrate 1) to constitutively activate the downstream Ras-MAPK and PI3K-Akt signaling pathways and cause cellular transformation. This occurs in the absence of insulin-like growth factors (IGF in figure) that normally bind to the insulin-like growth factor receptor. Adapted from Lannon and Sorensen, 2005.

As discussed above, several small molecule inhibitors that specifically target the kinase domain’s ATP-binding site have been developed against receptor PTK-driven cancers.
In the case of EN, a compound called Larotrectinib (formerly LOXO-101) is an ATP competitive inhibitor with a 1,000-fold selectivity for NTRK3 over other PTKs (Nagasubramanian et al., 2016). Upon treatment with this drug, rapid therapeutic benefits were seen in a young pediatric patient with an EN fusion oncoprotein (Nagasubramanian et al., 2016). This clinical observation highlights the potential of early detection and EN oncoprotein inhibition for treatment of many cancers including papillary thyroid carcinomas, pancreatic adenocarcinoma, and AML (Lannon and Sorensen, 2005).

An isolated ETV6 PNT domain has a dominant negative effect on EN. Thus, ETV6-PTK fusion oncoproteins can also be inactivated by blocking PNT domain-mediated polymerization (Tognon et al., 2004). In contrast to targeting the kinase domain, this potential therapeutic strategy has two distinct advantages. First, this would avoid any possible toxicities associated with off-target inhibition of receptor PTKs linked with normal cellular growth. Second, numerous chimeric oncoproteins contain the ETV6 PNT domain, and such inhibitors would therefore be “wide spectrum” against many cancers. Thus, the overarching goal of my thesis research was to search for small molecules that prevent polymerization of the ETV6 PNT domain.

1.4 Protein-protein interactions as candidate therapeutic targets

1.4.1 Protein-protein interactions in a cell

Protein-protein interactions (PPIs) are essential for life. The human interactome, which refers to the vast network of PPIs within the cell, has been estimated to encompass between 130,000 and 650,000 interactions (Morelli et al., 2011). These interactions are involved in numerous intra- and intercellular processes including, but not limited to, cytoskeleton remodeling, vesicle transport, signal transduction, transcription, translation, immune responses, and apoptosis (Arkin et al., 2014; Bogan and Thorn, 1998).
Many human diseases result from aberrant PPIs through either loss of an essential interaction or gain of an interaction that takes place at an inappropriate time or location (Ryan and Matthews, 2005). These deleterious interactions can arise from numerous causes, ranging from mutations that alter normal protein functions to the introduction of pathogenic proteins that negatively impact cellular processes.

PPIs are generally considered to involve specific interactions between defined binding interfaces of two or more polypeptides that can lead to biological functions. PPIs are diverse and can be categorized in a variety of ways. For instance, they can be stable and long-lived or transient interactions. The former class includes those leading to formation of multi-subunit enzymes, and the latter encompasses those involved in signaling-effector and enzyme-substrate complexes (Mintseris and Weng, 2005). Transient PPIs highlight the dynamic nature of much of the interactome. PPIs can also be homotypic or heterotypic, with homocomplexes tending to be multimeric with long-lived interactions (Jones and Thornton, 1996).

While there are many different definitions of PPIs, several common themes exist. For example, Smith and Gestwicki (2012) described four general categories of PPIs and their associated interfaces as “tight and narrow”, “tight and wide”, “loose and narrow”, and “loose and wide”. In their perspective, a PPI is considered to be “tight” if the KD value is less than 200 nM and “loose” if greater than 200 nM. “Narrow” is characterized by a buried surface area of less than 2500 Å² and “wide” by one greater than 2500 Å². Two interfacial residues can be defined as being in contact with each other if the distance between any of their atoms is within 0.5 Å of the sum of their van der Waals radii, and two residues can be considered nearby if the separation of their Cα atoms is less than 6 Å (Tsai et al., 1997).
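As a rough illustration of the Smith and Gestwicki thresholds quoted above, the four categories can be expressed as a small Python function. This is only a sketch under stated assumptions: the function name `classify_ppi` is hypothetical, values falling exactly on a threshold are arbitrarily assigned to the “loose” or “wide” class because the text does not specify them, and the buried surface area in the example call is a placeholder rather than a measured value for the ETV6 PNT domain.

```python
# Thresholds as quoted in the text for the Smith and Gestwicki categories.
TIGHT_KD_LIMIT_NM = 200.0      # KD below this value is "tight"
NARROW_AREA_LIMIT_A2 = 2500.0  # buried surface area below this value is "narrow"


def classify_ppi(kd_nm: float, buried_area_a2: float) -> str:
    """Return one of the four affinity/interface-size categories.

    kd_nm          : equilibrium dissociation constant in nM
    buried_area_a2 : buried surface area in square angstroms
    Values exactly at a threshold are treated as "loose" or "wide";
    this tie-breaking choice is an assumption, not part of the source.
    """
    if kd_nm <= 0 or buried_area_a2 <= 0:
        raise ValueError("KD and buried surface area must be positive")
    affinity = "tight" if kd_nm < TIGHT_KD_LIMIT_NM else "loose"
    width = "narrow" if buried_area_a2 < NARROW_AREA_LIMIT_A2 else "wide"
    return f"{affinity} and {width}"


# Example: a ~2 nM interaction with a placeholder (not measured) buried area.
print(classify_ppi(kd_nm=2.0, buried_area_a2=1500.0))
```

With a KD in the low nM range, any interaction of this kind falls into one of the two “tight” classes; the second label depends entirely on the buried surface area supplied.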
Residues that participate in hydrogen bonding have appropriately oriented donor and acceptor atoms within ~ 3.5 Å. For the sake of simplification, electrostatic interactions are considered to occur when charged atoms are less than 4.5 Å apart (Keskin et al., 2005).

Partner proteins involved in a PPI can be further defined by complexity, with “primary” referring to a linear protein sequence, “secondary” referring to a single region of secondary structure and “tertiary” referring to multiple sequences (Arkin et al., 2014). In another perspective, the PPIs may also be classified into four structural categories (Scott et al., 2016). These consist of pairs of globular proteins that interact through discontinuous portions of their peptide chains and have no substantial structural change upon binding, globular proteins that have a structural change upon binding, a globular protein that interacts with a single peptide chain, and two interacting peptide chains (Scott et al., 2016).

The thermodynamic equilibria for PPIs reflect the free energy difference between the associated and unassociated states of the interacting proteins or peptides. Dissecting the underlying changes in enthalpy, entropy, and heat capacity can provide information about mechanisms of the interactions (Stites, 1997). Varying experimental factors such as sample pH value, buffer composition, and temperature can facilitate such studies. An overview of 69 different PPIs found 18 to be predominantly entropically driven, 31 to be predominantly enthalpically driven and the remaining 20 to be driven by contributions of both (Stites, 1997).

There are many different factors that contribute to the thermodynamics of PPIs. These include electrostatic interactions, hydrogen bond formation, and several entropic terms (Sturtevant, 1977).
Although the hydrophobic effect does not dominate the PPI to the same extent as in protein folding, it is still one of the most important factors for complex formation (Stites, 1997). Burial of a hydrophobic patch at a protein-protein interface can yield a large entropy gain due to solvent reorganization (Yan et al., 2008). The entropy of PPIs can also be affected through changes in the translational, rotational, and vibrational degrees of freedom (Stites, 1997).

PPIs generally arise from complementary physicochemical features at the protein-protein interface (Yan et al., 2008). However, not all interfacial residues contribute equally to the specificity and affinity of complex formation. Rather, certain residues or regions called “hot spots” are most critical. Alanine scanning mutagenesis is an experimental technique that can identify hot spots by systematically determining the importance of each residue at a protein-protein interface. Mutating a residue to alanine reduces its side chain to a single uncharged methyl group while maintaining a preference for backbone torsion angles typical of amino acids except glycine and proline. Although various criteria have been used, in general a hot spot residue is classified as such when an alanine substitution leads to a ΔΔG of ~ 6 kJ/mol or higher (Bogan and Thorn, 1998; Clackson and Wells, 1995). Systematic alanine scanning of both partner interfaces also allows researchers to determine if certain interactions are complementary, for instance, as would be expected with intermolecular salt bridges.

There is no particular type or group of amino acids that is implicated as critical to PPIs, although Trp, Tyr, and Arg (and to a lesser extent, Asp and His) residues are often enriched at hot spots (Bogan and Thorn, 1998; Scott et al., 2016).
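To make the ~ 6 kJ/mol hot spot criterion more concrete, the sketch below converts a change in KD upon alanine substitution into a ΔΔG value using the standard relation ΔΔG = RT ln(KD,mutant / KD,wild type). The function names are hypothetical, the temperature of 298.15 K (25 °C) is an assumption, and the KD values in the example are illustrative placeholders, not measurements reported in this thesis.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ mol^-1 K^-1


def delta_delta_g(kd_wild_type: float, kd_mutant: float,
                  temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kJ/mol) caused by a mutation.

    Positive values mean the mutant binds more weakly than wild type.
    Both KD values must be given in the same units.
    """
    if kd_wild_type <= 0 or kd_mutant <= 0:
        raise ValueError("KD values must be positive")
    return GAS_CONSTANT_KJ * temperature_k * math.log(kd_mutant / kd_wild_type)


def is_hot_spot(kd_wild_type: float, kd_mutant: float,
                threshold_kj: float = 6.0) -> bool:
    """Apply the ~6 kJ/mol criterion quoted in the text."""
    return delta_delta_g(kd_wild_type, kd_mutant) >= threshold_kj


# Illustrative (hypothetical) numbers: wild type KD of 2 nM and alanine
# mutants that weaken binding 10-fold and 100-fold.
for fold in (10, 100):
    ddg = delta_delta_g(2e-9, 2e-9 * fold)
    print(f"{fold}-fold weaker: ddG = {ddg:.1f} kJ/mol, "
          f"hot spot = {is_hot_spot(2e-9, 2e-9 * fold)}")
```

Under these assumptions, a 6 kJ/mol threshold at 25 °C corresponds to a roughly 11-fold increase in KD, so a 10-fold change falls just below the cutoff while a 100-fold change clearly exceeds it.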
These hot spot residues are likely enriched due to the multitude of different interactions they can participate in, such as π-interactions, hydrogen bonding and salt bridge formation, and hydrophobic interactions (Bogan and Thorn, 1998). Hot spot residues are typically buried and occluded from solvent contact at interfaces. However, being protected from solvent alone does not guarantee that the residue is a hot spot (Bogan and Thorn, 1998; Keskin et al., 2005). Additionally, compared with hydrophobic residues, which tend to be found in the center of the PPI interface, and hydrophilic residues, which are usually found on the periphery near the solvent-exposed surface, His, Gly, Tyr, and Trp residues are found in relatively equal interior and exterior proportions (Tsai et al., 1997).

As expected, residues that contribute most to a given PPI tend to be more evolutionarily conserved than those at the rest of the interface (Thangudu et al., 2012). It has been proposed that hot spot residues within the same interfacial region work cooperatively in the binding of partner proteins, whereas those in different regions contribute in an additive manner (Keskin et al., 2005). For PPIs that have buried surface areas of less than 2,000 Å², generally only one hot spot region contributes to binding (Scott et al., 2016). Identifying these hot spots can help researchers focus on areas of a PPI interface for targeting inhibitory molecules.

1.4.2 Challenges in targeting protein-protein interactions

Methods to therapeutically regulate PPIs have been long sought-after. Although inroads have been made, PPIs are notoriously difficult to disrupt because their interfaces are generally large, relatively flat surfaces with few grooves or pockets into which small molecules can bind (Pelay-Gimeno et al., 2015). One avenue that has been used to specifically target PPIs is through humanized monoclonal antibodies.
However, this presents a variety of complicating issues relating to solubility, stability, route of administration and a possible strong immune response (Buchwald, 2010; Ryan and Matthews, 2005).

Recently, considerable efforts into finding small molecule inhibitors of PPIs have been undertaken. However, unlike enzymes and receptors, protein-protein interfaces generally do not have associated small natural substrates or ligands that can be used to guide structure-activity relationship studies (Wells and McClendon, 2007). Thus, it is very challenging to find lead compounds that bind either directly on an interface to inhibit partner binding (orthosteric inhibitors), or that induce a conformational change in a protein to interfere with partner binding (allosteric inhibitors) (Figure 1.8) (Morelli et al., 2011).

Despite these considerable hurdles, current progress on small molecule PPI inhibition is promising as several compounds have reached various stages of therapeutic development. Increasing partner complexity is correlated with fewer known inhibitors of the PPI. Conversely, PPI inhibitors that have been taken to clinical trials tend to target an interface of a globular protein into which a single polypeptide chain binds (Arkin et al., 2014; Scott et al., 2016). Several existing examples in clinical trials are as advanced as Phase III, including Idasanutlin, which was developed by Roche to target the MDM2-p53 PPI (Scott et al., 2016). Encouragingly, it has been estimated that up to 40% of PPIs are of relatively low complexity and involve a single peptide binding in a groove of the partner protein (Petsalaki and Russell, 2008). However, the remaining 60% are of greater complexity and it is often found that one or both interfaces will be composed of non-sequential amino acids on the peptide chain (Wells and McClendon, 2007).

Figure 1.8 Potential mechanisms of PPI inhibition.
The interaction of Proteins A and B may be blocked by (a) an orthosteric inhibitor or (b) an allosteric inhibitor. The former binds directly at the interprotein interface, whereas the latter acts through an induced conformational change.

A critical step in identifying orthosteric PPI inhibitors is to consider both the size of the inhibitor and the PPI interface. Small molecules binding to a protein tend to have contact areas of 300 to 1000 Å² and PPIs that have identified inhibitors have buried surface areas of 1000 to 6000 Å² (Scott et al., 2016; Smith and Gestwicki, 2012). Inhibitors of PPIs are potentially more challenging to identify, as PPI inhibitors typically bind in 3 to 5 surface pockets with an average occupied volume of ~ 100 Å³, whereas most marketed drugs bind a single pocket with an average occupied volume of ~ 270 Å³ (Buchwald, 2010). In addition, small molecule inhibitors of PPIs tend to be relatively large, rigid, and composed of hydrophobic and aromatic groups (Thangudu et al., 2012). Thus, identifying promising PPI inhibitors presents a conundrum: the compounds likely have to be large enough to overcome the distributed protein-protein contact surface and provide sufficient binding in “shallow” pockets while also abiding by the “rule-of-five” (Wells and McClendon, 2007). Serving as a guideline, the “rule-of-five” maintains that a therapeutic compound is less likely to be absorbed when there are more than 5 hydrogen-bond donors, the molecular weight is greater than 500 Da and the lipophilicity is high (cLogP > 5) (Lipinski et al., 1997). This rule is thought to aid in avoiding an obvious challenge common to drug discovery. Namely, when a protein of interest is expressed intracellularly, the inhibitors of its PPIs must also be membrane permeable (Buchwald, 2010).
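The rule-of-five criteria listed above can be sketched as a simple filter. The class and function names below are hypothetical, and the two compounds are invented placeholders with made-up property values rather than real screening hits. The hydrogen-bond acceptor criterion (more than 10 acceptors) comes from the usual statement of Lipinski's rule and is not mentioned in the text above, so it is included here as an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    """Minimal descriptor set for a candidate molecule (hypothetical)."""
    name: str
    mol_weight_da: float
    h_bond_donors: int
    h_bond_acceptors: int
    clogp: float


def rule_of_five_violations(compound: Compound) -> List[str]:
    """List which rule-of-five criteria a compound exceeds."""
    violations = []
    if compound.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if compound.h_bond_acceptors > 10:
        # Criterion taken from the usual form of the rule, not from the text.
        violations.append("more than 10 hydrogen-bond acceptors")
    if compound.mol_weight_da > 500.0:
        violations.append("molecular weight above 500 Da")
    if compound.clogp > 5.0:
        violations.append("cLogP above 5")
    return violations


# Invented example compounds: a drug-like small molecule and a larger,
# more hydrophobic molecule of the kind often seen among PPI inhibitors.
small = Compound("placeholder-small", 320.0, 2, 5, 2.1)
large = Compound("placeholder-large", 710.0, 4, 9, 6.3)

for c in (small, large):
    print(c.name, rule_of_five_violations(c) or "no violations")
```

The second placeholder illustrates the conundrum described above: a compound large and hydrophobic enough to engage a distributed interface can exceed the molecular weight and cLogP limits even when its hydrogen-bond counts are acceptable.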
While the rule-of-five is a useful guideline for drug discovery, it cannot be readily applied to the context of PPI inhibitor screening, which must include a wide variety of compounds due to the inhibitors’ higher molecular weights and increased hydrophobicity (Morelli et al., 2011).

Encouragingly, protein-protein interfaces are not static. Indeed, the dynamic opening of surface pockets has been both modelled computationally and seen experimentally by comparing PPI interfaces with versus without bound small molecule ligands (Eyrisch and Helms, 2007; Wells and McClendon, 2007). Such comparisons have revealed that small molecule inhibitors of PPIs tend to utilize “cryptic” binding pockets on the protein not seen with the natural protein binding partner (Wells and McClendon, 2007). The dynamic pockets that form are found to be deeper when bound to a small molecule inhibitor than to a partner protein (Johnson and Karanicolas, 2013).

Progress in targeting PPIs is steadily improving as researchers learn about more appropriate options and strategies. Rather than focusing on high-throughput screening with traditional small molecule libraries, there is increasing effort to utilize both natural product libraries and cyclic or stapled peptide libraries (Smith and Gestwicki, 2012). Natural product libraries tend to have a high average complexity and are thought to be evolutionarily biased towards functions such as binding protein interfaces (Arkin et al., 2014; Smith and Gestwicki, 2012). With protein-like features, cyclic peptides and peptide mimetics are likely candidates for binding to the protein surface and are more resistant to peptidases than linear peptides (Joo, 2012). Fragment-based libraries are also being used to identify lead chemical features that bind (Arkin et al., 2014).
In parallel, as molecular docking software algorithms advance, in silico screening techniques are also being utilized for rational drug design and to identify PPI inhibitors (Bajorath, 2002).

1.5 Research questions and goals

The goal of my thesis project was to find an inhibitor of ETV6 PNT domain polymerization. As previously discussed, the widespread occurrence of ETV6 PNT domain fusions across diverse tumor types makes it an attractive target for therapeutic intervention. That is, targeting one domain has the potential to result in a therapeutic against a broad range of cancers. In the effort to achieve this goal, I undertook two parallel approaches. The first was to investigate what drives PNT domain polymerization and thereby identify molecular features that could be useful in targeted drug design. The second approach was to utilize several experimental and theoretical screening methods to identify potential inhibitors of PNT domain polymerization. The theoretical approach involved using structure-guided in silico methods to identify potential compounds that might bind the ETV6 PNT domain self-association interfaces. The experimental approach was to carry out unbiased high-throughput cellular screens to search for inhibitors that disrupt PNT domain association.

1.5.1 Characterizing ETV6 PNT domain polymerization

Structural biology has played an important role in both identifying PPI inhibitors and characterizing bound inhibitors (Scott et al., 2016). PPIs are challenging to disrupt and thus, one goal of this thesis work was to gain insight into the ETV6 PNT domain polymerization through structural and biophysical analyses. Although several PNT domain polymer structures have been reported, the structure of a monomeric species evaded elucidation due to the strong propensity of ETV6 to polymerize (Tran et al., 2002).
Using NMR spectroscopy, X-ray crystallography, and molecular dynamics simulations, I carried out detailed comparisons of the structure and dynamics of ETV6 PNT domains in their monomeric versus polymeric forms. In addition, I used amide hydrogen exchange (HX) experiments and alanine scanning mutagenesis to dissect the roles of interfacial residues in driving PNT domain polymerization. This also served to identify hot spot residues and thereby aid a targeted approach to inhibitor discovery (Arkin et al., 2014). The outcomes of this research objective are discussed in Chapter 2.

1.5.2 Complementary screening assays

High-throughput screening assays are commonly used for drug discovery. I employed three different screening assays to attempt to find an inhibitor of ETV6 PNT domain polymerization: a yeast two-hybrid assay, a mammalian cell protein-fragment complementation assay (PCA) and an in silico screening assay. The various screening assays complemented one another’s strengths and weaknesses. For example, the in silico screening assay was devised to directly target the PPI interface, whereas the unbiased yeast and mammalian cell assays allowed for screening for both orthosteric and possible allosteric inhibitors. In addition, in silico screening enabled interrogation of millions of virtual compounds whereas the number of compounds tested experimentally in the yeast and cell screening assays was necessarily much more limited. The design of the assays, their development and screening results are presented in Chapter 3.

1.5.3 Future directions

The overall results of my thesis are summarized in Chapter 4. In brief, I learned that PNT domain polymerization does not confer a structural change from the monomeric species, and the PNT domain structure is stable in its monomeric and heterodimeric states.
Hot spot residues were also identified at the protein-protein interface. This knowledge was used for the BUDE virtual screening and could be used to guide other in silico or targeted screening approaches. Unfortunately, the cell-based and virtual screening approaches produced no lead compounds that bound to either PNT domain as monitored by NMR spectroscopy. However, screening of further compounds in these assays may yield a lead compound. To conclude, I propose several future experiments that may be successful in finding an inhibitor of PNT domain polymerization, which include fragment-based drug design, helix mimetics, and disulphide tethering approaches.

Chapter 2: Biophysical characterization of the ETV6 PNT domain

2.1 Overview

The ETV6 PNT domain has been implicated in many chromosomal translocations. These frequently involve protein tyrosine kinase domains whereby PNT domain polymerization mediates constitutive oncogenic activity. In this chapter, I characterized the PNT domain and its self-association interface using a combination of NMR spectroscopy, X-ray crystallography, amide hydrogen exchange, alanine scanning mutagenesis, surface plasmon resonance (SPR), and molecular dynamics simulations. I found that the PNT domain is stably folded and does not have significant conformational fluctuations. The structures of the monomeric and heterodimeric species are the same, both in solution and in the crystalline state. Also, as confirmed through NMR spectroscopy and SPR, self-association of the PNT domain occurs with nM affinity. Alanine scanning mutagenesis and amide hydrogen exchange aided in determining “hot spot” residues at the interface between PNT domains. Key hydrophobic residues at the centres of the ML- and EH-interfaces and several flanking intermolecular salt-bridge-forming residues contribute most significantly to the tight association of the PNT domain.
Small molecules that can mimic the interaction of these hydrophobic residues or that disrupt salt bridge formation may have the greatest chance of being effective inhibitors of PNT domain polymerization.

2.2 Introduction

The ETV6 PNT domain is an attractive target for therapeutic intervention due to its presence in numerous fusion oncoproteins. However, its propensity to form long insoluble polymers via the tight head-to-tail association (KD ~ nM) of two relatively flat interfaces makes it challenging to identify suitable small molecule PPI inhibitors (Kim et al., 2001). These interfaces, termed the ML- and EH-surfaces, lie roughly on opposite sides of the globular PNT domain (Figure 1.3). Each surface is composed of a hydrophobic patch flanked by polar and charged sidechains. The introduction of an ionizable residue into either hydrophobic interface yields a monomeric PNT domain as judged by several techniques including equilibrium ultracentrifugation and native gel electrophoresis (Kim et al., 2001; Mackereth et al., 2004). Examples include the V112E or V112R mutations that disrupt the EH-surface or the A93D mutation that disrupts the ML-surface. Two mutant PNT domains with complementary wild type interfaces can form a heterodimer. The availability of monomeric and heterodimeric forms of the PNT domain facilitates studies of ETV6 self-association that are not confounded by multiple interfaces and sample precipitation.

To facilitate the identification of small molecule inhibitors of ETV6 polymerization, I undertook a detailed biophysical analysis of the monomeric and heterodimeric forms of its PNT domain. Using NMR spectroscopy, X-ray crystallography, and MD simulations, I demonstrated that the structure of the PNT domain is very stable and does not change significantly upon self-association.
NMR spectroscopy also enabled me to identify interfacial regions protected from amide HX upon heterodimerization, and subsequently to test candidate small molecules for binding to the PNT domain (to be discussed in Chapter 3). Complementary alanine scanning mutagenesis, with binding measured by SPR, revealed several hot spot residues. These residues partake in both hydrophobic and electrostatic interactions. Collectively, this information helped guide the computational and experimental efforts to discover ETV6 PNT domain inhibitors that I will present in Chapter 3.

2.3 Results

2.3.1 NMR spectroscopic characterization of the ETV6 PNT domains

2.3.1.1 NMR spectral assignments of the monomeric PNT domains

Central to protein NMR spectroscopy is the 2D 15N-HSQC (heteronuclear single quantum correlation) spectrum. The 15N-HSQC spectrum shows correlated signals between directly bonded 1H and 15N nuclei in backbone amides and Trp, Gln, Asn, Arg, Lys and His side chains. As such, this provides a “spectroscopic fingerprint” for monitoring the structure, dynamics and ligand binding properties of a protein.

For these studies, I expressed and purified samples of uniformly 13C/15N-labelled ETV6 fragments (residues 40-125) that contain either an A93D mutation or V112E mutation. These will henceforth be described as either the A93D or V112E PNT domains, respectively. Under neutral pH solution conditions, the A93D and V112E PNT domains yielded well dispersed NMR spectra indicative of stably folded structures. However, the latter showed some propensity to self-associate, and improved spectra were obtained at a sample pH value of 8. Presumably this reflects the deprotonation of E112, which may have an anomalously high pKa value when buried at the polymer interface. Unfortunately, the more alkaline conditions resulted in some loss of signal intensity due to base-catalyzed amide HX.
Regardless, using a combination of scalar and NOE correlation experiments, I assigned the NMR signals from most main chain 1H, 13C, and 15N nuclei of the monomeric V112E and A93D PNT domains (Figures 2.1 and 2.2, respectively).

Figure 2.1 15N-HSQC spectrum of the monomeric V112E PNT domain. (a) This spectrum was collected in 20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 8.0 and 25 °C. The assigned mainchain amide 1HN-15N signals are labelled, with the crowded central region (grey shading) enlarged for clarity in panel (b). Horizontal dashed lines connect the unassigned signals from Asn and Gln sidechain amides. A list of chemical shift assignments is available in Appendix B.

Figure 2.2 15N-HSQC spectrum of the monomeric A93D PNT domain. (a) This spectrum was collected in 20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 7.0 and 25 °C. Signals from the assigned mainchain amide 1HN-15N and several Gln and Asn sidechains are labelled, with the crowded central region (grey shading) enlarged for clarity in panel (b). An unassigned aliased peak from a tryptophan sidechain is denoted with an asterisk. A list of the assigned chemical shifts is provided in Appendix B.

2.3.1.2 PNT domain dimerization characterized by NMR spectroscopy

NMR spectroscopy can give insights into the thermodynamic, kinetic and structural mechanisms of protein-ligand interactions. A particularly convenient approach is to use 15N-HSQC spectra to monitor the titration of a 15N-labelled protein with an unlabelled, and hence NMR “silent”, ligand (Williamson, 2013). Amide chemical shifts are highly sensitive to even subtle environmental changes, and thus an interaction with the unlabelled species can usually be detected through chemical shift perturbations (CSPs) of the labelled protein. Amides exhibiting CSPs typically cluster around the protein-ligand interface, yet may also be distal if binding causes longer range (allosteric) structural changes (Williamson, 2013).
For a simple 1:1 binding equilibrium in the fast exchange regime, where kex = kon[L] + koff is much greater than the chemical shift difference Δω of a given reporter nucleus in the free versus bound state of the protein, the observed chemical shift for that nucleus is the population-weighted average of its free and bound shifts (Kleckner and Foster, 2011). This usually corresponds to relatively low affinity binding, with an equilibrium dissociation constant KD = koff/kon that is > 3 µM (Williamson, 2013). In contrast, in the case of binding in the slow exchange regime, where kex ≪ Δω, separate signals from nuclei in the free and bound states are observed with population-weighted intensities. This generally occurs with higher affinity binding (KD < 3 µM). In the intermediate exchange regime, where kex ~ Δω, peaks both shift and broaden over the course of a titration. In favorable situations, both thermodynamic (KD) and kinetic (kon, koff) constants can be extracted from such titration data for this exchange regime.

Following this approach, I used 15N-HSQC spectroscopy to characterize the heterodimerization of the V112E and A93D PNT domains. Upon addition of the unlabelled V112E PNT domain to the 15N-labelled A93D PNT domain, many amides exhibited CSPs in the slow exchange regime (Figure 2.3). That is, at an intermediate titration point, separate 1HN-15N peaks corresponding to the unbound (monomeric) and bound (heterodimeric) forms of the labelled protein were observed. This is consistent with the previously reported KD value of ~ 2 nM for the high-affinity binding equilibrium (Kim et al., 2001). Comparable results were observed for the reciprocal titration of the unlabelled A93D PNT domain to the 15N-labelled V112E PNT domain (shown in panel a of Figure 2.7).
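The relationships described above for a 1:1 binding equilibrium can be sketched numerically. The functions below are illustrative only: their names are hypothetical, the factor of 10 used to separate “much greater than” from “comparable to” is an arbitrary assumption, and the 100 µM labelled protein concentration in the example is a placeholder rather than the concentration used in these experiments. The KD of 2 nM and the 0.4:1 and 1.5:1 molar ratios are taken from the text and the Figure 2.3 titration.

```python
import math


def fraction_bound(protein_total: float, ligand_total: float, kd: float) -> float:
    """Fraction of labelled protein in the bound state for 1:1 binding.

    Solves the quadratic for the complex concentration; all inputs in molar.
    """
    b = protein_total + ligand_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


def fast_exchange_shift(shift_free: float, shift_bound: float, f_bound: float) -> float:
    """Population-weighted average shift observed in the fast exchange regime."""
    return (1.0 - f_bound) * shift_free + f_bound * shift_bound


def exchange_regime(kex: float, delta_omega: float, factor: float = 10.0) -> str:
    """Classify the regime by comparing kex with delta_omega (same units, s^-1).

    The factor of 10 is an assumed cutoff for "much greater/less than".
    """
    if kex > factor * delta_omega:
        return "fast"
    if kex < delta_omega / factor:
        return "slow"
    return "intermediate"


# Placeholder protein concentration (100 uM) with KD = 2 nM.
protein = 100e-6
for ratio in (0.0, 0.4, 1.5):
    fb = fraction_bound(protein, ratio * protein, 2e-9)
    print(f"ligand:protein {ratio}:1 -> fraction bound = {fb:.3f}")
```

With such tight binding, essentially all added partner is bound, so the bound fraction tracks the molar ratio until the labelled protein is saturated. In the slow exchange regime, these fractions correspond to the relative intensities of the separate free and bound peaks rather than to an averaged peak position.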
Parenthetically, although the oligomerization states of the PNT domains in these experiments were not directly determined, the results are entirely consistent with previous studies showing that the A93D and V112E PNT domains are monomeric when separated and heterodimeric when combined (Kim et al., 2001).

Figure 2.3 The A93D and V112E PNT domains bind in the slow exchange limit. (a) An overlay of the full 15N-HSQC spectra showing the titration of unlabelled (NMR silent) V112E PNT domain into 15N-labelled A93D PNT domain at 0:1 (blue), 0.4:1 (yellow) and 1.5:1 (purple) molar ratios. The region encompassed by the black dotted square is enlarged in (b). In the left panel, only 1HN-15N peaks from the unbound A93D PNT domain are present. In the middle panel, separate signals from amides in the unbound and bound protein are seen. In the right panel, only signals from the bound protein are detected. All spectra were collected in 20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 8 and 25 °C.

In the slow exchange limit, the assignment of signals from amides exhibiting CSPs cannot be easily “transferred” from the unbound to the bound state. Therefore, I prepared samples of each uniformly 13C/15N-labelled PNT domain with its unlabelled partner in a 1.1 molar excess. Again, using a combination of heteronuclear scalar and NOE correlation experiments, I assigned the NMR signals from most main chain 1H, 13C, and 15N nuclei of the two heterodimerized PNT domains (Figures 2.4 and 2.5, respectively).

Figure 2.4 15N-HSQC spectrum of the 15N-labelled V112E PNT domain bound to the unlabelled A93D PNT domain. (a) This spectrum was collected with a 1.1 molar excess of unlabelled protein in 20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 7.5 and 25 °C. Assignments of mainchain amide 1HN-15N signals from the 15N-labelled V112E PNT domain are indicated, with the crowded central region (grey shading) enlarged for clarity in panel (b).
A list of the assigned chemical shifts is provided in Appendix B.

Figure 2.5 15N-HSQC spectrum of the 15N-labelled A93D PNT domain bound to the unlabelled V112E PNT domain. (a) This spectrum was collected with a 1.1 molar excess of unlabelled protein in 20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 7.0 and 25 °C. Assignments of mainchain amide 1HN-15N signals from the 15N-labelled A93D PNT domain are indicated, with the crowded central region (grey shading) enlarged for clarity in panel (b). A list of the assigned chemical shifts is provided in Appendix B.

2.3.1.3 Chemical shift-based structural analysis of the monomeric and heterodimeric ETV6 PNT domains

In their monomeric and heterodimeric forms, the A93D and V112E PNT domains yielded well-dispersed 15N-HSQC spectra, thereby enabling chemical shift-based structural analyses. Utilizing the MICS algorithm (Shen and Bax, 2012), the secondary structural elements for the four species were predicted from their mainchain 1H, 15N, and 13C chemical shifts (Figure 2.6). In each case, four distinct helical regions were detected. These coincide well with the four α-helices (H1: R63-E76; H2: G91-L94; H3: K99-R105; H4: G110-K122) identified in the X-ray crystal structures of the self-associated ETV6 PNT domains (PDB: 1JI7 and 1LKY) by PDBsum (Laskowski et al., 2018). Matching two short N-terminal 3₁₀-helices observed in these crystal structures, residues A52-L54 and, to a lesser extent, P58-Y60 also have chemical shifts indicative of helical character. In contrast, such diagnostic chemical shifts were not seen for residues S84-T86 even though they are classified as forming a 3₁₀-helix in a subset of the monomer subunits of PDB file 1LKY. This minor discrepancy may arise because these residues lie within an extended, solvent-exposed polypeptide segment between helices H1 and H2.
Amides in this region have chemical shift-derived random coil index squared order parameter (RCI-S2) values, a proxy for backbone dynamics (Berjanskii and Wishart, 2005), that are indicative of increased flexibility relative to the well-ordered helices. Most importantly, these analyses demonstrated that the A93D and V112E PNT domains have very similar secondary structures in their monomeric and heterodimeric forms. Thus, the proteins in solution do not undergo any significant conformational changes upon head-to-tail association into the structures previously characterized by X-ray crystallography.

Figure 2.6 Chemical shift-based secondary structural and dynamic analyses of the ETV6 PNT domains. Shown are the predicted helix and strand secondary structural elements and RCI-S2 values (black lines; decreasing values from 1 to 0 indicate increasing flexibility) for the V112E PNT domain (top) and A93D PNT domain (bottom) in their monomeric (left) and complexed (right) states, calculated with the MICS algorithm. Missing data correspond to residues lacking chemical shift assignments (most pronounced for the monomeric V112E PNT domain). The locations of the four α-helices (H1-H4) and two 3₁₀-helices observed in the X-ray crystal structures of the self-associated ETV6 PNT domains are indicated above each plot.

Armed with chemical shift assignments, the amide 1HN-15N CSPs resulting from PNT domain dimerization were readily calculated (Figure 2.3 and Figure 2.7a,b). Of note, I82, T86, G91, K92, L96, R103, Y104 and R105 in the V112E PNT domain, and N75, L79, R80, H108, S109, V112, L113, E115 and L116 in the A93D PNT domain experienced the largest CSPs. These residues cluster within the ML and EH surfaces, respectively. This confirms that, as seen by X-ray crystallography, the two monomerized PNT domains indeed associate in solution through their wild type interfaces.
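As a hedged illustration of how a combined amide CSP can be computed from assigned free and bound shifts, the sketch below uses one widely used weighting scheme, in which the 15N shift change is scaled before being combined with the 1HN shift change. The exact formula and scaling factor used for the CSPs in this chapter are not stated in this section, so the scaling of 0.14, the function names, and the example shift values are all assumptions. The 0.2 ppm cutoff matches the threshold quoted in the Figure 2.7 caption.

```python
import math


def combined_csp(h_free, n_free, h_bound, n_bound, n_scale=0.14):
    """Combined 1HN-15N chemical shift perturbation in ppm.

    n_scale down-weights the 15N shift change; 0.14 is an assumed,
    commonly used value and not necessarily the one applied in this work.
    """
    d_h = h_bound - h_free
    d_n = n_bound - n_free
    return math.sqrt(d_h ** 2 + (n_scale * d_n) ** 2)


def residues_above(csps, threshold=0.2):
    """Return residue labels whose CSP exceeds the threshold (ppm)."""
    return sorted(label for label, value in csps.items() if value > threshold)


if __name__ == "__main__":
    # Made-up shifts (1HN free, 15N free, 1HN bound, 15N bound) for two
    # hypothetical amides; these are NOT values from Appendix B.
    shifts = {
        "interface_amide": (8.10, 121.0, 8.45, 122.5),
        "distal_amide": (7.90, 118.2, 7.92, 118.3),
    }
    csps = {label: combined_csp(*vals) for label, vals in shifts.items()}
    print(csps)
    print("above 0.2 ppm:", residues_above(csps))
```

Because spectra of the free and bound states were assigned at slightly different pH values, small CSPs computed this way should be interpreted cautiously; only the larger perturbations are expected to report reliably on the interface.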
Although CSP mapping provides a low-resolution identification of a protein-protein interface, it is often difficult to rationalize the origin of the shift changes for individual amides. For example, of all residues in the A93D PNT domain, V112 exhibited the largest CSP upon heterodimer formation, whereas A93 was not the most perturbed residue in the V112E PNT domain. These two residues are certainly “hot spots” for dimerization as their mutation to charged sidechains renders the PNT domain monomeric. In contrast, R103 showed the largest CSP of all amides in the V112E PNT domain. R103 is peripheral to the dimer interface and its sidechain participates in an intermolecular salt bridge.

Figure 2.7 PNT domain dimerization interface identified by amide chemical shift perturbations. (a) Overlaid 15N-HSQC spectra of the 15N-labelled V112E PNT domain in the absence (green) and presence of a 1.1 molar excess of the unlabelled A93D PNT domain (purple). As with the reciprocal titration (Figure 2.3), binding occurred in the slow exchange regime. (b) Backbone amide 1HN-15N CSPs (ppm) resulting from the heterodimerization of the V112E PNT (top) and A93D PNT (bottom) domains. Missing data correspond to prolines or residues without assigned NMR signals in either protein state. Most residues showed small CSPs which may be due in part to the difference in conditions under which spectra were assigned (V112E PNT domain monomer, pH 8.0; V112E PNT domain complexed with the A93D PNT domain, pH 7.5; A93D PNT domain monomer, pH 7.0; A93D PNT domain complexed with the V112E PNT domain, pH 7.5). (c) However, amides with CSP values > 0.2 ppm, which are highlighted in orange on a model of the V112E (green) / A93D (blue) PNT domain heterodimer, map to the interfacial regions. The cartoon is derived from PDB: 1LKY, with a V112R PNT domain.
2.3.2 Crystallographic comparison of monomeric and dimeric PNT domains

In their original studies of ETV6, the Bowie group obtained crystals of the V112E PNT domain (PDB: 1JI7, C 2 space group, 3 monomers in the asymmetric unit) (Kim et al., 2001). Despite burial of E112, the monomers assembled in the crystal lattice via their ML- and EH-surfaces to form an extended helical polymer with an approximate 6₅ screw symmetry (Figure 2.8a). Subsequently, they determined the structure of a heterodimer composed of an A93D PNT domain bound to a V112R PNT domain via their complementary wild type interfaces (PDB: 1LKY, P 1 space group, 3 heterodimers in the asymmetric unit) (Tran et al., 2002). Although no longer polymeric within the crystal lattice, a model built from PDB: 1LKY using appropriate monomer subunits with native interfaces closely matched the polymeric structure of PDB: 1JI7 with the V112E substitution. Thus, the latter serves as a reliable experimental structure of the ETV6 PNT domain polymer.

The NMR spectroscopic studies presented above indicate that the secondary structures and binding interfaces of the A93D and V112E PNT domains in solution closely resemble those observed by X-ray crystallography. However, the PNT domain in a complexed form may have subtle structural differences relative to a monomeric form, which could be advantageous for targeted small molecule screening. I reasoned that such a monomeric species could be obtained by introducing both the A93D and V112E mutations into the same PNT domain. Crystals of this double mutant formed readily in a variety of conditions and I determined the structure of the A93D-V112E PNT domain from those grown in 2.8 M sodium acetate (pH 7.0). The crystal structure was solved to 1.86 Å resolution using molecular replacement. The full data collection and refinement statistics are summarized in Table 2.5.
In contrast to crystals obtained by Bowie and co-workers, the A93D-V112E PNT domain crystallized in the P 6₅ 2 2 space group with two monomers in the asymmetric unit (Figure 2.8b). More importantly, the presence of both monomerizing mutations prevented any intermonomer interactions within the crystal lattice via the ML- and EH-surfaces (Figure 2.8b). Rather, nearest neighbor contacts were via alternative interfaces that are not functionally relevant. Consequently, I reasoned that this crystal structure would be a good model for a “free” PNT domain.

Figure 2.8 Comparison of the asymmetric units of the monomeric and polymeric ETV6 PNT domain variants. (a) The X-ray crystallographic structure of the V112E PNT domain reported by the Bowie group (PDB: 1JI7). The three subunits of the V112E PNT domain in the asymmetric unit are coloured green. These form a helical polymer with neighbouring subunits (grey). The E112 mutation is shown as red spheres and the wild type A93 as blue spheres. (b) The structure of the A93D-V112E PNT domain determined herein. The two A93D-V112E PNT domains in the asymmetric unit are highlighted in orange (E112 as red spheres, D93 as purple spheres), and neighbouring asymmetric units are in grey. The crystal contacts are unlike those in the polymeric structure and no wild type protein-protein polymerization interface is seen.

Overall, regardless of differing crystallization conditions and mutations, the structure of the A93D-V112E PNT domain closely resembles those previously determined by the Bowie group. For example, using the DALI server to compare one subunit from this structure with an A93D PNT domain subunit from PDB: 1LKY, a total of 77 residues were aligned with an RMSD value of 0.7 Å and DALI Z-score of 16.8 (Holm and Sander, 1995). However, a few subtle differences can be seen upon detailed comparison (Figure 2.9). For example, residues N-terminal to helix H1 have variable conformations.
This is consistent with their RCI-S2 scores indicating a degree of flexibility (Figure 2.6). Not unexpectedly, several surface residues adopted different sidechain rotamer conformations, whereas residues within the interior hydrophobic core of the PNT domain superimposed well. Most importantly, the local structural features of the ML (including residue 93) and EH (including residue 112) interfaces do not differ despite the presence or absence of monomerizing mutations or their association upon dimer or polymer formation. Thus, interactions of the ETV6 PNT domain do not lead to any discernible conformational changes. Unfortunately, this also indicates that the monomeric PNT domain does not have any obvious exposed binding pockets or grooves that might be suitable for small molecule targeting.

Figure 2.9 Structural comparison of the monomeric PNT domain with a polymeric subunit. Structural overlay of an A93D PNT domain subunit that was in complex with a V112R PNT domain (blue, PDB: 1LKY chain B) and the A93D-V112E PNT domain (orange), determined herein. Residues at positions 93 and 112 are highlighted. In contrast to well aligned interior side chains, variations in the rotamer conformations of surface sidechains (e.g. the lower tyrosine) and the N-terminal residues (expanded view) can be seen.

2.3.3 Amide hydrogen exchange data show increased protection of interfacial residues upon dimerization

Amide hydrogen exchange (HX) is a useful technique to characterize protein structure, stability and dynamics, as well as to identify ligand-binding interfaces (Dyson et al., 2008; Mandell et al., 1998; Paterson et al., 1990). Through a continuum of local to global conformational fluctuations, main chain amide hydrogens are constantly exchanging with the hydrogens of solvent water (Skinner et al., 2012). If a labile amide proton exchanges for a deuteron it will become silent for 1H-detected NMR, and thus its signal will disappear from a 15N-HSQC spectrum.
The rate at which it disappears is determined by its structural features (e.g. hydrogen bonding and solvent accessibility) as well as the experimental conditions (e.g. pH and temperature). To account for the latter, the observed exchange rate constant can be compared to the predicted rate constant for a random coil polypeptide with the same sequence and under the same conditions. The ratio of the predicted versus observed rate constants is called the protection factor (PF).

To gain further insights into the ETV6 PNT domain, I used NMR spectroscopy to measure the amide PFs of its monomeric and heterodimeric forms. To do so, I transferred samples of the two 15N-labelled PNT domains, in the absence and presence of their unlabelled partners, into D2O buffer and acquired a series of 15N-HSQC spectra over a three-month period. By comparing the spectra recorded after three days of exchange (Figure 2.10), it is immediately obvious that substantially more amides were protected from HX in the heterodimeric versus monomeric species. Also, after 3 months, at least 7 and 8 amides were still observed in the spectra of the A93D and V112E PNT domain heterodimers, respectively, as compared to only 4 and 2 in the corresponding monomers.

More quantitatively, the exchange rate constants for most amides in the four samples were determined by fitting their time-dependent 1HN-15N signal intensities to single exponential decays. These values were converted into the PFs shown in Figure 2.11 and tabulated in Appendix C. For amides that exchanged fully before the acquisition of the first 15N-HSQC spectrum, the PFs are bounded above by the smallest measured value of log(PF) ~ 2.7. Conversely, for amides that were highly protected from HX and showed insufficient signal decay over 3 months for reliable fitting, lower limits on their PFs were set based on the largest PFs confidently measured.
These limits were log(PF) = 7 for monomeric PNT domains, and log(PF) = 8.5 for the PNT domains in a heterocomplex.

Figure 2.10 Measuring PNT domain HX using 15N-HSQC spectroscopy. Shown are overlaid 15N-HSQC spectra for the monomeric and heterodimeric forms of the V112E and A93D PNT domains in H2O (green and blue) and after 3 days at 21 °C in D2O buffer (orange and labelled). In both cases, significantly more amides in the heterodimeric species were protected from HX. Initially the samples were in 20 mM MOPS, 50 mM NaCl and 0.5 mM EDTA at pH 7.5 (15N-labelled V112E PNT domain) or pH 7.0 (15N-labelled A93D PNT domain monomer and dimer). Under higher sample pH conditions, with faster base-catalyzed exchange, only a few signals are observed for the monomeric V112E PNT domain after 3 days. However, these differences in experimental conditions were accounted for when calculating PFs.

Figure 2.11 PNT domain amide HX protection factors increase upon heterodimerization. A summary of the PFs for the V112E and A93D PNT domains in their monomeric and complexed states. Missing data correspond to prolines, residues with unassigned or overlapping amide chemical shifts, or residues that exchanged prior to collection of the first 15N-HSQC spectrum after transfer into D2O buffer and thus have log(PF) values less than an upper limit of ~2.7. The latter include amides preceding residue 60, which were not observed for any protein after transfer into D2O buffer. Due to the higher sample pH, missing data were particularly limiting for the monomeric V112E PNT domain. For amides that had not exchanged significantly after 3 months, estimated lower limits of log(PF) values > 7 (monomeric) and > 8.5 (heterodimeric) are indicated by upwards arrows.

The monomeric A93D and V112E PNT domains exhibited very similar patterns of amide HX (Figures 2.11 and 2.12).
All of the residues N-terminal to Y60 and most of those in interhelical regions exchanged rapidly under these experimental conditions. This is consistent with their surface exposure and general lack of intramolecular hydrogen bonding interactions. Conformational flexibility of these residues was also indicated by their RCI-S2 values (Figure 2.6). Conversely, amides within or near the four α-helices showed substantial protection from HX. In particular, W69 and L70 in helix H1 and L116 and L117 in helix H4 of both proteins exchanged very slowly, with several of these residues having log(PF) values > 7. Under the commonly observed EX2 conditions, PFs reflect the residue-specific free energy changes, ΔG_HX = 2.303 RT log(PF), governing local or global conformational equilibria leading to exchange (Krishna et al., 2004). Assuming that these most protected amides exchange through global or near-global structural fluctuations, these HX data provide an estimate of the unfolding free energy for each monomeric PNT domain of > 40 kJ/mol. Such a value is consistent with the view that, even without polymerizing, the ETV6 PNT domain adopts a very stable folded conformation. This also indicates that the A93D and V112E mutations prevent polymerization without disrupting the structure or stability of the monomeric PNT domain.

Figure 2.12 Mapping of HX protection factors on the monomeric PNT domain structures. Amide PFs are displayed as spheres on the ribbon diagrams of the V112E PNT domain (upper, modified from PDB: 1LKY) and A93D PNT domain (lower, from PDB: 1LKY) according to the indicated color and size scheme. Residues without spheres are either prolines, lack an assigned or fully resolved 1HN-15N signal, or exchanged too fast for reliable HX quantitation. Both monomeric PNT domains showed similar HX profiles, with the most protected amides located in structured helical regions.
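The chain from signal decay to protection factor to free energy described above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the fitting procedure used in this work: a log-linear least-squares fit stands in for a nonlinear single-exponential fit (and assumes intensities decay to zero with no baseline offset), the predicted random coil rate constant is taken as a given input, the function names and example data are hypothetical, and the default temperature of 294.15 K corresponds to the 21 °C quoted in the Figure 2.10 caption.

```python
import math

R_J = 8.314462618  # gas constant, J/(mol K)


def fit_decay_rate(times, intensities):
    """Estimate k_obs for I(t) = I0 * exp(-k_obs * t).

    Uses a least-squares line through ln(I) versus t; all intensities
    must be positive. Returns k_obs in reciprocal time units.
    """
    logs = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


def log_protection_factor(k_pred, k_obs):
    """log10 of PF = k_pred / k_obs (same units for both rate constants)."""
    return math.log10(k_pred / k_obs)


def delta_g_hx(log_pf, temp_k=294.15):
    """EX2-limit free energy in kJ/mol: 2.303 * R * T * log(PF)."""
    return 2.303 * R_J * temp_k * log_pf / 1000.0


if __name__ == "__main__":
    # Synthetic decay (arbitrary units) generated with k = 0.05 per hour.
    times = [0.0, 10.0, 20.0, 40.0, 80.0]
    intensities = [100.0 * math.exp(-0.05 * t) for t in times]
    k_obs = fit_decay_rate(times, intensities)
    print("k_obs:", k_obs)
    # Hypothetical predicted random coil rate constant (per hour).
    print("log(PF):", log_protection_factor(5.0e4, k_obs))
    # Limits quoted in the text for monomer and heterodimer.
    print("dG at log(PF) = 7.0:", delta_g_hx(7.0))
    print("dG at log(PF) = 8.5:", delta_g_hx(8.5))
```

With the stated limits of log(PF) = 7 and 8.5, the last two lines give free energies of roughly 40 and 48 kJ/mol, in line with the values quoted in the text.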
Heterodimerization resulted in increased HX protection for many residues in both the A93D and V112E PNT domains (Figures 2.11 and 2.13). Indeed, several amides in both proteins did not exchange significantly even after 3 months in D2O buffer and thus have log(PF) > 8.5 (and ΔG_HX > 48 kJ/mol). Although global stabilization upon heterodimer formation is expected, many of the residues with at least a 100-fold increase in HX protection localized around the complementary interfacial regions of the two PNT domains. These are exemplified by amides within or near the ML-surface of the V112E PNT domain (K92, A93, L96, T98, D101, F102) and the EH-surface of the A93D PNT domain (F77, V112, L113, Y114) (Figure 2.13). Given that the structures of the A93D and V112E PNT domains do not change significantly upon heterodimerization, the increased protection of interfacial residues against HX may result from their local stabilization against conformational fluctuations allowing exchange. Alternatively, if exchange occurs predominantly through transient monomers, then the increased protection of amides in the heterodimer would reflect the equilibrium population distribution of these two species (Paterson et al., 1990). Regardless of mechanism, and not unexpectedly, the HX data are consistent with the role of these interfaces in ETV6 PNT domain polymerization.

Figure 2.13 Mapping of HX protection factors on the structure of the heterodimeric PNT domain. (a) Data from Figure 2.11 were combined to provide a comparison of the amide PFs of the V112E and A93D PNT domains when associated within a heterodimer. The helical regions are indicated as rectangles in the upper diagram. Amides that did not exchange sufficiently over 3 months in D2O buffer for quantitative fitting have log(PF) > 8.5 and are indicated by upwards arrows. Many of these occur at the dimer interface.
(b) Amide PFs are displayed as spheres on the ribbon diagram of the V112E PNT domain (green, from PDB: 1LKY) and A93D PNT domain (blue, from PDB: 1LKY) according to the indicated color and size scheme. Residues without spheres are prolines, lack an assigned or fully resolved 1HN-15N signal, or exchanged too fast for HX quantitation.

2.3.4 Alanine scanning mutagenesis at the PNT domain PPI interface

To help focus subsequent drug design efforts, I used alanine scanning mutagenesis to identify which residues at the PNT domain heterodimer interface contribute most to binding affinity. The mutation of a residue to alanine reduces its side chain to a single methyl group, thereby eliminating its contributions to intermolecular binding, while also avoiding the introduction of any additional non-native interactions. Alanine was chosen over glycine as the latter may lead to increased backbone flexibility. For these studies, I used SPR as this technique can rapidly provide both KD and the kinetic parameters (kon and koff) governing a binding interaction using only small quantities of bacterially expressed proteins.

Initially I wanted to determine whether I could reproduce the results of previously reported SPR studies of PNT domain dimerization (Kim et al., 2001). As shown above, the N-termini of the ETV6 constructs studied herein are flexible, as evidenced by reduced RCI-S2 values and facile amide HX, and do not contribute to the heterodimer interface. Thus, I expressed the A93D and V112E PNT domains with an N-terminal Avitag to enable in vivo biotinylation during expression in E. coli (Ashraf et al., 2004). I reasoned that having a specific site for biotinylation would be preferable to chemical biotinylation of random lysine residues as there is a key lysine (K99) in the PNT domain that is involved in an intermolecular salt bridge. I loaded the biotinylated PNT domain “ligand” onto a streptavidin chip for SPR studies with non-biotinylated “analyte” PNT domains.
An example is shown in Figure 2.14 where either an A93D PNT domain (positive control) or V112E PNT domain (negative control) was passed at various concentrations over the immobilized biotinylated V112E PNT domain. High affinity binding was seen for the heterodimer, whereas the identical PNT domains did not measurably interact.

Figure 2.14 SPR provides a reliable measure of the A93D and V112E PNT domain interactions. Shown are Biacore X100 SPR sensorgrams for control experiments with either the A93D PNT domain (left, blue cartoon shape) or V112E PNT domain (right, green cartoon) analyte passed over the biotinylated V112E PNT domain (green cartoon) ligand immobilized on a streptavidin chip. Different concentrations of analyte were run over the chip for 300 sec, followed by buffer only to allow dissociation. A 30 second regeneration wash with 0.2% SDS produced the response unit (RU) spike at ~ 1000 seconds (truncated at the dashed line) and returned the baseline to its starting RU value. Fitting of these concentration-dependent response curves demonstrated heterodimer formation between the A93D and V112E PNT domains via their wild type interfaces with a KD value of 7.5 nM. In contrast, the V112E PNT domain did not measurably self-associate.

Using this approach, the A93D PNT domain analyte bound the V112E PNT domain ligand with a fit KD value of 7.5 nM, and the V112E PNT domain analyte bound the A93D PNT domain ligand with a fit KD value of 5.1 nM. These results agreed well with previously reported KD values of 1.7 to 4.4 nM for the PNT domain interactions as measured by SPR (Kim et al., 2001) and ITC (Cetinbas et al., 2013). Thus, SPR was used to characterize the effects of alanine substitutions of 18 residues within or around the EH-surface of the A93D PNT domain and 14 residues within or around the ML-surface of the V112E PNT domain (Tables 2.1 and 2.2).
Table 2.1 Alanine scanning mutagenesis of the EH-surface on the A93D PNT domain a

Ala mutation   kon (1/Ms)    koff (1/s)     KD (nM)   ΔΔG (kJ/mol) b
None           2.0 × 10^5    1.5 × 10^-3    7.5       -
S47A           2.1 × 10^5    1.4 × 10^-3    6.9       -0.2
I48A           3.4 × 10^5    4.0 × 10^-3    12        1.2
E76A           6.2 × 10^5    6.2 × 10^-3    9.9       0.7
F77A           3.4 × 10^4    1.5 × 10^-2    450       10.2
S78A           2.8 × 10^6    9.3 × 10^-3    3.3       -2.0
L79A           8.1 × 10^3    1.6 × 10^-3    200       8.2
R80A           9.1 × 10^4    2.9 × 10^-3    32        3.6
K99A           2.4 × 10^3    2.3 × 10^-3    930       12.0
E100A          3.1 × 10^5    1.5 × 10^-3    5         -1.0
R103A          2.5 × 10^5    5.4 × 10^-3    21        2.6
P107A          2.6 × 10^5    2.1 × 10^-3    8.1       0.2
H108A          2.6 × 10^5    1.4 × 10^-3    5.4       -0.8
D111A          3.3 × 10^3    1.4 × 10^-3    400       9.9
V112A          2.0 × 10^4    1.1 × 10^-2    550       10.7
Y114A          3.5 × 10^4    4.5 × 10^-3    130       7.1
E115A          2.3 × 10^5    2.5 × 10^-3    11        1.0
L116A          2.3 × 10^5    7.6 × 10^-4    3.3       -2.0
H119A          2.7 × 10^5    3.4 × 10^-3    13        1.4

a All A93D PNT domain analytes were run on the streptavidin SPR chip bound with the biotinylated V112E PNT domain ligand.
b Calculated as ΔΔG = RT ln(KD,mutant/KD,wild type) where wild type is the top-listed protein with an unmodified interface.

Table 2.2 Alanine scanning mutagenesis of the ML-surface on the V112E PNT domain a

Ala mutation   kon (1/Ms)    koff (1/s)     KD (nM)   ΔΔG (kJ/mol) b
None           4.4 × 10^5    2.3 × 10^-3    5.1       -
I59A           4.3 × 10^5    2.5 × 10^-3    5.8       0.3
R63A           3.6 × 10^5    5.4 × 10^-3    15        2.7
N85A           3.5 × 10^5    3.4 × 10^-3    9.9       1.6
E88A           5.0 × 10^5    4.0 × 10^-3    8.1       1.1
M89A           9.7 × 10^4    3.3 × 10^-2    340       10.4
N90A           7.8 × 10^4    3.7 × 10^-2    480       11.3
K92A           3.5 × 10^5    2.2 × 10^-2    62        6.2
L96A           9.5 × 10^3    9.5 × 10^-3    1000      13.1
L97A           9.8 × 10^4    6.8 × 10^-2    700       12.2
T98A           6.2 × 10^5    5.4 × 10^-3    8.8       1.3
E100A          5.0 × 10^5    3.5 × 10^-3    7         0.8
D101A          6.3 × 10^3    2.5 × 10^-3    400       10.8
Y104A          7.3 × 10^5    1.5 × 10^-2    21        3.5
R105A          2.3 × 10^3    5.4 × 10^-3    2340      15.2

a All V112E PNT domain analytes were run on the streptavidin SPR chip bound with the biotinylated A93D PNT domain ligand.
b Calculated as ΔΔG = RT ln(KD,mutant/KD,wild type) where wild type is the top-listed protein with an unmodified interface.
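The relationships behind the table columns can be checked with a short Python sketch. It assumes KD = koff/kon and uses the footnoted expression ΔΔG = RT ln(KD,mutant/KD,wild type). The temperature is not given in this section, so the value of 298.15 K (25 °C) below is an assumption, and the function names are illustrative. The 6 kJ/mol cutoff is the "hot spot" threshold used in the text that follows the tables.

```python
import math

R_KJ = 8.314462618e-3  # gas constant, kJ/(mol K)
TEMP_K = 298.15        # assumed temperature (25 C); not stated in this section


def kd_from_rates(kon, koff):
    """Equilibrium dissociation constant (M) from kon (1/Ms) and koff (1/s)."""
    return koff / kon


def ddg_kj(kd_mutant, kd_wild_type, temp_k=TEMP_K):
    """Change in binding free energy (kJ/mol); KD values in the same units."""
    return R_KJ * temp_k * math.log(kd_mutant / kd_wild_type)


def is_hot_spot(ddg, threshold=6.0):
    """True when the alanine substitution weakens binding by more than threshold."""
    return ddg > threshold


if __name__ == "__main__":
    # Reference A93D row: kon = 2.0e5 1/Ms, koff = 1.5e-3 1/s.
    print("KD (nM):", kd_from_rates(2.0e5, 1.5e-3) * 1e9)
    # KD values (nM) for three rows of the tables above: (mutant, reference).
    rows = {"F77A": (450.0, 7.5), "S78A": (3.3, 7.5), "R105A": (2340.0, 5.1)}
    for name, (kd_mut, kd_wt) in rows.items():
        value = ddg_kj(kd_mut, kd_wt)
        print(name, round(value, 2), "hot spot" if is_hot_spot(value) else "")
```

For the rows checked here, the computed values fall within about 0.1 kJ/mol of the tabulated ΔΔG, with the small differences plausibly arising from rounding of the listed KD values.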
Alanine substitution of 6 of the 18 tested residues on the A93D PNT domain and 7 of the 14 tested residues on the V112E PNT domain had large detrimental effects on binding (ΔΔG > 6 kJ/mol), and these residues can be classified as “hot spots” (Figure 2.15). Although more difficult to interpret than KD values, these mutations acted to varying degrees by decreasing the association rate constants, kon, and/or increasing the dissociation rate constants, koff. In the case of the A93D PNT domain, some alanine mutations appeared to slightly enhance binding. However, these relatively small changes in KD values may be within experimental error as only single measurements were made in order to expedite this study. The higher proportion of “hot spot” interfacial residues on the V112E PNT domain, including the three most detrimental of the entire set of alanine mutants, suggests that the features of this interface contribute most significantly to PNT domain polymerization.

Figure 2.15 Characterization of the PNT domain interface by alanine scanning mutagenesis. A summary of the effects of alanine substitutions on the ML-surface of the V112E PNT domain (top) and the EH-surface of the A93D PNT domain (bottom). Each ΔΔG value corresponds to the change in binding free energy to the complementary PNT domain relative to that of the reference protein with an unmodified wild type interface.

Many of the hydrophobic residues at the centre of the heterodimer interfaces are “hot spots” for binding. In particular, the L96A mutation on the ML-surface of the V112E PNT domain resulted in an ~1000-fold decrease in affinity. As shown in Figures 2.16 and 2.17, L96 protrudes out of the V112E PNT domain as a “key” to fit into the “lock” between residues F77, V112, E115 and H119 on the EH surface of the A93D PNT domain. Thus, in addition to contributing to heterodimerization through the hydrophobic effect, L96 partakes in favorable van der Waals interactions at the PPI interface.
Other hydrophobic residues that are important include L97 and M89 on the ML-surface and F77, L79, and V112 on the EH-surface.

Surrounding the hydrophobic centre of the heterodimer interface is a ring of residues that generally contribute less to binding affinity (Figure 2.16). However, an important exception is the peripheral K99 on the A93D PNT domain EH-surface. This residue forms an intermolecular salt bridge with D101 on the ML-surface of the V112E PNT domain. Alanine mutations of K99 or D101 weakened binding, giving KD values of 930 nM or 400 nM, respectively. Thus, consistent with their salt bridge pairing, either mutation weakened binding relative to the wild type by a factor of ~ 100. Mutation of K99 has been shown to interfere with oncogenic cellular transformation, confirming the functional importance of this salt bridge (Cetinbas et al., 2013). A second intermolecular salt bridge consistently seen in the PNT domain heterodimer structures involves R105-D111. The R105A and D111A mutations also severely weakened binding to KD values of 2.3 µM and 400 nM, respectively, thus confirming the importance of this interaction. In contrast, several additional salt bridges involving K99-E100, R103-E100 and R103-D101 are seen in some, but not all, crystallographically defined interfaces. However, alanine substitutions of either E100 or R103 did not significantly impact binding. Thus, the K99-D101 and R105-D111 salt bridges play important roles in PNT domain association, whereas those involving E100 or R103 do not. It is also notable that the polar residue N90 on the ML-surface of the V112E PNT domain is a hot spot, even though it does not appear to have a potential reciprocal hydrogen bond donor or acceptor. Overall, the alanine scanning mutagenesis study illustrated that PNT domain dimerization is a result of interactions involving both hydrophobic and charged interfacial residues.
Figure 2.16 Mapping the results of the alanine scanning mutagenesis studies on the ETV6 PNT domain heterodimer structure. The effects of alanine substitutions on the KD values for heterodimer dissociation are mapped onto the structures of the V112E (green background) and A93D (blue background) PNT domains. Residues for which alanine mutations maintained a binding affinity close to the wild type KD (~ 10^-9 M) are indicated in pale yellow, whereas those weakening binding by approximately 10-fold (orange), 100-fold (red) and 1000-fold (burgundy) are in increasingly darker colors.

Figure 2.17 Mapping the results of the alanine scanning mutagenesis studies on the ETV6 PNT domain heterodimer structure. The V112E PNT domain (top) and A93D PNT domain (bottom) alanine mutation sites are shown as cartoon sticks and any residue that weakened binding by 100-fold or greater is labelled and color coded. The interfacial view and a side view are shown.

2.3.5 Molecular dynamics simulations of the PNT monomer and dimer

Molecular dynamics (MD) simulations can yield insights into the conformational flexibility and motions of a protein. Using the University of Bristol computational facilities, I ran 100 ns MD simulations on the PNT domain heterodimer (an A93D PNT domain and a V112R PNT domain subunit from PDB: 1LKY) and the A93D-V112E PNT domain monomer, reported herein. In addition, I wanted to generate alternate conformers of the PNT domain for in silico docking of virtual chemicals.

Both the heterodimer and the monomer remained structurally stable throughout the 100 ns simulation. This can be seen by the close superimposition of 1000 structures taken over the time course of each simulation (Figure 2.18). Importantly, the heterodimeric PNT domain subunits remained associated with each other, indicating that their tight association can be modelled in silico.
Another way to view these simulations is to consider the root mean square deviation (RMSD) of non-hydrogen atoms in the entire PNT domain monomer and heterodimer over the time course of the 100 ns simulation (Figure 2.19a). Both remained stable with an overall RMSD of 2 Å relative to the starting structures, typical for a globular protein of this size. Comparing the averaged root mean square fluctuation (RMSF) of each residue throughout the simulation also showed little flexibility except at the termini (Figure 2.19b). With the A93D PNT domain, there was a slight spike in the RMSF of residue E88, which is likely due to the residue being central in a loop region. Aside from these minor exceptions, there appeared to be no correlation between the RMSF of a residue and whether it is present in a helix or loop region. Collectively, these MD simulations indicated that both the PNT domain monomers and heterodimer are structurally stable and likely do not experience any significant conformational fluctuations.

Figure 2.18 Structural overlay of PNT domain structures from 100 ns MD simulations. MD simulations were run on the PNT domain heterodimer (top, PDB: 1LKY with an R112E mutation modelled in the V112E PNT domain) with V112E PNT (green) and A93D PNT (blue) subunits, as well as the A93D-V112E PNT domain monomer (lower, yellow). The 1000 overlaid structures shown represent coordinate files taken every 100 ps over the course of each 100 ns simulation. Very little fluctuation is seen for the heterodimer or the monomer.

Figure 2.19 Analysis of PNT domain structural fluctuations from 100 ns MD simulations. (a) The RMSD values of non-hydrogen atoms throughout the 100 ns MD simulation relative to the starting structures. After the initial energy minimization, both the monomer and heterodimer showed little deviation over the time courses of the simulations.
(b) The RMSF values, relative to the average side chain conformation, throughout the 100 ns MD simulation plotted for each residue. The left plot shows the A93D-V112E PNT monomer (orange) and the right plot shows the heterodimer of the V112E PNT domain (green) and A93D PNT domain (blue). With the exception of terminal residues, there is very little fluctuation throughout the simulation.

2.4 Discussion

The goal of this chapter was to determine if there are any structural and dynamic differences between the ETV6 PNT domain in its monomeric and heterodimeric forms, and to gain insights into the residues that drive polymerization. An understanding of these factors can facilitate drug screening or rational design.

2.4.1 The PNT domain retains a similar structure in monomeric and heterodimeric states

The biophysical studies of the PNT domain show that its structure is very stable in the monomeric and heterodimeric forms. There are no substantial conformational differences between the species, as seen both by X-ray crystallography and by NMR spectroscopy (secondary structural propensities and chemical shift perturbations). Furthermore, the X-ray crystallographic structure of a monomeric A93D-V112E PNT domain closely resembles that of the heterodimers, and does not provide any additional insights on surface pockets to which a small molecule could bind. Similarly, the 100 ns MD simulations do not show any evidence for surface pocket fluctuations. Thus, the well-characterized PNT domain structure is retained throughout polymerization.

2.4.2 Alanine scanning mutagenesis determines interfacial “hot spot” residues

Detailed SPR-monitored alanine scanning mutagenesis studies revealed that the ETV6 PNT domains heterodimerize with high affinity (KD ~ nM) due to residues partaking in electrostatic, van der Waals and hydrophobic interactions.
Although it has been reported that hot spots are generally not enriched in electrostatic interactions (Keskin et al., 2005), I find the opposite, with residues in two proposed intermolecular salt bridges (K99-D101 and R105-D111) being critical for high affinity binding. Previous studies have shown that a K99R substitution also resulted in weaker binding, indicating that both the charge and amino acid structure are important, at least for one of these salt bridges (Cetinbas et al., 2013). In contrast, the E100-R103 salt bridge does not contribute significantly, as alanine substitutions of these residues did not result in a reduction in binding affinity or the oncogenic properties of the EN fusion protein (Cetinbas et al., 2013). Indeed, the presence of this salt bridge, and others involving these residues, varies between the reported ETV6 PNT domain structures. This also suggests that it is dynamic and not as persistent as the two formed by K99-D101 and R105-D111.

In addition to A93 and V112 (the founding sites for monomerizing mutations), several hydrophobic residues within the EH- and ML-surfaces are also hot spots for heterodimerization. In particular, L96 on the ML-surface plays a critical role in binding, as removal of its side chain resulted in ~ 1000-fold weaker affinity. The leucine makes contacts with several residues on the reciprocal EH-interface for which alanine substitutions also reduced the binding affinity, albeit each to a lesser extent. Targeting a molecule to bind near these EH-surface residues may exclude L96 and thereby inhibit PNT polymerization.

2.4.3 Many of the “hot spot” residues have increased protection from amide hydrogen exchange

Alanine scanning mutagenesis and amide HX experiments provide complementary insights into the self-association of the ETV6 PNT domain.
That is, alanine scanning revealed the effect of removing the side chain of a given residue on the affinity for heterodimer formation, whereas HX experiments showed how dimerization changes the conformational fluctuations of backbone amide hydrogens leading to exchange with water. In general, many of the hydrophobic hot spot residues within the core regions of the EH- and ML-surfaces, including L79, L96, L97, T98, Y104, V112, Y114, and L116 also exhibited enhanced HX protection upon heterodimerization (Tables 2.3 and 2.4). This is consistent with their burial, and likely dampened dynamics, within the heterodimer relative to the monomeric species. In contrast, although partaking in important intermolecular interactions, K99 and R105 underwent fast amide HX in both the monomeric and heterodimeric PNT domains, whereas their salt bridge partners D101 and D111 exhibited increased HX protection. This is also consistent with the location of the amides of these residues at the periphery of the dimer interface.

Table 2-3 Comparison of the alanine scanning data to HX data for the A93D PNT domain
All the residues in the A93D PNT domain mutated to alanine are listed. The protection factors for the monomer and heterodimer with the V112E PNT domain are listed as log(PF). Residues with blank values exchanged too quickly for quantitation and have log(PF) < 2.7. Residues with an asterisk did not exchange significantly in three months and estimated lower limits of log(PF) values > 7 (monomeric) and > 8.5 (heterodimeric) are shown.
Residue  Alanine Mutant KD (nM)  Monomer log(PF)  Dimer log(PF)  Δlog(PF)
S47      6.9
I48      12
E76      9.9                     4.0              4.6            0.6
F77      450                     4.4              7.1            2.7
S78      3.3                     4.4              5.7            1.3
L79      200                                      5.2            > 2.5
R80      32
K99      930
E100     5
R103     21
P107     8.1
H108     5.4
D111     400                                      4.1            > 1.4
V112     550                     3.2              5.3            2.1
Y114     130                     6.0              8.5*           > 2.5
E115     11                      5.9              5.6            -0.3
L116     3.3                     7.0*             8.5*           -
H119     13

Table 2-4 Comparison of the alanine scanning data to HX data for the V112E PNT subunit
All the residues in the V112E PNT domain mutated to alanine are listed. The protection factors for the monomer and heterodimer with the A93D PNT domain are listed as log(PF). With the exception of unassigned M89, residues with blank values exchanged too quickly for quantitation and have log(PF) < 2.7. Residues with an asterisk did not exchange significantly in three months and estimated lower limits of log(PF) values > 7 (monomeric) and > 8.5 (heterodimeric) are shown.

Residue  Alanine Mutant KD (nM)  Monomer log(PF)  Dimer log(PF)  Δlog(PF) Ratio
I59      5.8
R63      15
N85      9.9
E88      8.1
M89      340                     No assignment    7.0
N90      480                     No assignment    7.3
K92      62                      4.1              7.2            3.1
L96      1000                    3.7              8.5*           > 4.8
L97      700                                      8.5*           > 5.8
T98      8.8                     4.4              7.0            2.6
E100     7
D101     400                     3.8              5.7            1.9
Y104     21                                       6.5            > 3.8
R105     2300

2.4.4 Insights into small molecule inhibition of PNT domain polymerization

Insights into the factors that drive ETV6 PNT domain polymerization may aid drug design. For example, small molecules that bind interaction surfaces of hot spot hydrophobic (e.g. L97) or salt-bridging (e.g. K99-D101 or R105-D111) residues may competitively prevent PNT domains from self-associating. However, designing or identifying such small molecules may be challenging as the polymerization occurs with high affinity and there do not appear to be any structural differences between the PNT domains in their monomeric and heterodimeric states. One strategy that could be implemented for screening assays is to utilize an alanine mutant that weakens binding.
This way, it may be easier to identify an initial compound that inhibits a micromolar affinity interaction, as opposed to the nanomolar interaction of the wild type PNT domains. Once discovered, such a lead molecule could be optimized for higher affinity binding. As previously shown, mutations of K99 weaken PNT domain polymerization and alter the oncogenic properties of EN (Cetinbas et al., 2013). This residue is not at the centre of the heterodimer interface and the K99A or K99R mutants may be well suited for such a screening strategy. Alternatively, from the alanine scanning mutagenesis, we know which residues are not hot spots and are thus tolerant to modifications. An example of a technique that would benefit from this acquired knowledge is disulphide tethering, where weakly binding chemical fragments are tethered via an introduced cysteine residue near the protein-protein interaction interface (Arkin et al., 2003). In principle, one could mutate a residue that does not affect binding and is near a hot spot to a cysteine for identifying weak binders. Helix “stapling” is another method that has been used to design molecules that disrupt PPIs. The general principle is to covalently stabilize the secondary structure of residues that normally form a helix along an interaction surface (Wilson, 2009). The PNT domain, a subset of SAM domains, is a helical bundle, with helix H4 and helices H2 and H3 forming complementary interaction surfaces. Residues D111, V112 and Y114 in helix H4 are all defined hot spots and experience substantial increases in HX protection upon dimerization (Table 2.3). Thus, a stapled helical polypeptide encompassing these residues might be sufficient to prevent polymerization. In contrast, whereas several residues on H2 and H3 are hot spots, such as D101 and R105, they are not adjoining as a single helix.
A similar methodology has been used to target the Ship2 and EphA2 SAM-SAM domain PPI, whereby a penta-amino acid motif found in EphA2 binds to the SAM domain of Ship2 with a KD in the high micromolar range (Mercurio et al., 2017). Therefore, exploration of polypeptide mimics of helix H4 for disruption of PNT domain polymerization may be a suitable option. As a closing comment, SPR experiments demonstrated that the lifetime of the PNT domain heterodimer (1/koff) is ~ 10 min. The lifetime of polymeric forms will be longer since dissociation must occur at multiple interfaces. If the dissociation of the PNT domain polymer is slow in the context of a PNT-PTK fusion oncoprotein, then, to have the greatest effect, a prospective small molecule drug may need to act on newly translated PNT-PTK fusion oncoproteins.

2.5 Materials and Methods

2.5.1 Protein construct, expression and purification

The gene encoding residues 1-125 of human ETV6 (GenBank Gene ID: 2120), preceded by a thrombin-cleavable N-terminal His6 affinity tag, was initially cloned into the pET28a vector (Invitrogen) by the Sorensen lab through standard PCR techniques (Kramer and Coen, 2001). The monomerizing A93D, V112E, and A93D-V112E substitutions were introduced through QuikChange site-directed mutagenesis (Stratagene) (Appendix A). Subsequently, it was recognized that a thrombin cleavage site is present within these constructs (residues V37-P38-R39↓A40). Coincidentally, this occurs just before an alternative start site (M43) for ETV6 expression (Bohlander, 2005). Previous studies by our group demonstrated that the first ~ 50 residues of ETV6 are intrinsically disordered, and their removal by proteolysis at this site or by gene truncation to encode residues 43-125 did not affect the structure or polymerization properties of the PNT domain (Huang-Hobbs, 2013). Furthermore, the absence of these disordered residues is advantageous for NMR spectroscopic studies of the PNT domain.
Thus, ETV6 fragments were expressed from available clones as residues 1-125 with a His6 tag, and cleaved with thrombin to yield final purified samples spanning residues 40-125.

Unlabelled proteins were expressed in E. coli BL21 (λDE3) cells grown in lysogeny broth (LB) media. Isotopically labelled proteins were expressed in M9 minimal media supplemented with either 1 g/L of 15NH4Cl and 2 g/L glucose for 15N-labelled proteins or 1 g/L of 15NH4Cl and 3 g/L 13C6-glucose for 13C/15N-labelled proteins. In all cases, 35 mg/L kanamycin was included for plasmid selection. Overnight seed cultures in LB media were used to inoculate larger volumes. In the case of isotopically enriched protein expression, the seed cultures were collected by centrifugation and the LB media discarded before inoculating the M9 media. The cells were grown at 37 °C to an OD600 ~ 0.6 and protein expression was induced with 1 mM IPTG. After continued growth at 37 °C overnight, the cells were harvested by centrifugation (Sorvall GSA rotor; 5,000 rpm) and frozen at -80 °C. The cell pellet was thawed for purification and resuspended in denaturing buffer (4 M GdnHCl, 20 mM sodium phosphate, 500 mM NaCl, 20 mM imidazole, pH 7.5) and lysed by sonication on ice. The lysate was then cleared by centrifugation (Sorvall SS34 rotor; 15,000 rpm) and the resulting supernatant was passed through either a 0.45 or 0.8 µm pore size filter and loaded onto a 5 mL Ni2+-NTA HisTrap HP column (GE Healthcare) pre-equilibrated with binding buffer (20 mM imidazole, 50 mM sodium phosphate, 500 mM NaCl, pH 7.5). After washing with several column volumes of binding buffer, the protein was eluted using a 120 mL linear gradient of elution buffer (500 mM imidazole, 50 mM sodium phosphate, 400 mM NaCl, pH 7.5).

The collected fractions were analyzed by SDS-PAGE and those containing the desired protein were pooled. In general, the ETV6 fragments eluted as two major peaks.
Previous studies demonstrated that proteins from these fractions had the same masses yet showed small differences in their 15N-HSQC spectra (Huang-Hobbs, 2013). Despite significant efforts, the origin of these differences was never elucidated. For consistency, only the fastest eluting peak was collected and dialyzed overnight at 4 °C in thrombin cleavage buffer (20 mM Tris, 0.15 mM NaCl, 2.5 mM CaCl2, 0.5 mM EDTA, 1 mM DTT, pH 8.4) with ~ 1 unit of thrombin (Millipore) per 20 mL of collected fractions. The sample was then spun to remove any precipitate and the cleaved protein was separated from any His6-tagged species by passage through the Ni2+-NTA HisTrap HP column. The flow-through was concentrated to 1 mL with an Amicon 3K MWCO centrifugal filter and purified further using size-exclusion chromatography (Superdex S75, GE Healthcare). This also served to exchange the protein into a buffer optimized for NMR experiments (noted below). The concentration of each protein sample was determined by ultraviolet absorbance at 280 nm based on its predicted molar absorptivity (Appendix A) (Gasteiger et al., 2003).

2.5.2 NMR Spectroscopy

2.5.2.1 General NMR spectroscopy methods

NMR spectra were recorded with cryoprobe-equipped Bruker Avance III 500, 600, and 850 MHz spectrometers. All data acquired were processed and analyzed with NMRPipe (Delaglio et al., 1995), NMRDraw and NMRFAM-Sparky (Goddard and Kneeler, 1999; Lee et al., 2015). In general, protein samples were 150 µM to 600 µM in ~ 450 µL of standard buffer (20 mM MOPS, 50 mM NaCl, and 0.5 mM EDTA) with D2O (5% v/v) added for signal locking.

Unless otherwise noted, experiments involving the monomeric A93D PNT domain and A93D-V112E PNT domain were conducted at pH 7.0 and those involving the monomeric V112E PNT domain were conducted at pH 8.0 due to its propensity to self-associate under lower sample pH conditions.
In the case of the dimer species, the unlabelled partner protein was added to its isotopically-labelled partner in a 1.1 molar excess. The dimer containing an isotopically-labelled V112E PNT domain was studied at pH 7.5 whereas the dimer containing an isotopically-labelled A93D PNT domain was studied at pH 7.0.

2.5.2.2 Spectral assignments and chemical shift analyses

The signals from mainchain 1H, 13C, and 15N nuclei of the 13C/15N-labelled PNT domain constructs were assigned using standard heteronuclear 1H-13C-15N scalar correlation experiments (Sattler et al., 1999), including HNCACB, HNCO, CBCA(CO)NH, HNCACO, along with HSQC-NOESY spectra, recorded on a 600 MHz spectrometer at 25 °C. Spectra of the monomeric A93D PNT domain were assigned manually, whereas those of the monomeric V112E PNT domain and the two heterodimer complexes were automatically interpreted using PINE and verified manually (Bahrami et al., 2009).

Secondary structure analyses were carried out using the MICS (Motif Identification from Chemical Shift) online server (Shen and Bax, 2012). To determine the amide CSPs due to dimerization, each unlabelled PNT domain (A93D or V112E) was titrated stepwise into a sample of its isotopically labelled partner to a small final excess molar ratio. 15N-HSQC spectra were recorded at each step, in order to verify complete dimerization. Combined CSPs for the 1HN ($\Delta\delta_H$) and 15N ($\Delta\delta_N$) signals of the corresponding amides in the dimer versus monomer species were calculated using the expression:

$\Delta\delta = \sqrt{(\Delta\delta_H)^2 + (0.14\,\Delta\delta_N)^2}$

2.5.2.3 Amide hydrogen exchange (HX)

Protium-deuterium HX experiments for the PNT domains in the absence (monomeric) and presence of their unlabelled partner (heterodimeric) were conducted on a Bruker Avance 600 MHz spectrometer at 21 °C. The initial pH values were 7.0 for the samples containing the 15N-labelled A93D PNT domains and 7.5 for those containing the 15N-labelled V112E PNT domain.
The temperature was set to 21 °C to match ambient room temperature. Initial reference 15N-HSQC spectra of the proteins in H2O buffer were recorded. Subsequently a 450 µL aliquot was lyophilized, resuspended with the same volume of D2O, and immediately put into the NMR spectrometer to begin data recording within 4-7 minutes. Initially, a series of ~ 5-minute 15N-HSQC spectra were acquired with 2 scans/FID to characterize amides exchanging on the minutes timescale. Then ~ 20-minute 15N-HSQC spectra with 8 scans/FID were collected back-to-back for ~ 3 hours, followed by a 20-minute spectrum every hour for ~ 24 hours, and then intermittent 20-minute spectra over a period up to 3 months. After the first week, the sample was removed from the spectrometer and stored at ambient room temperature between recording spectra. Upon completion of data recording, the pH* (pH meter reading uncorrected for the deuterium isotope effect) of each sample was measured as 7.3 (monomeric A93D PNT domain), 7.6 (monomeric V112E PNT domain), 7.4 (15N-labelled A93D PNT domain in complex with V112E PNT domain), and 7.5 (15N-labelled V112E PNT domain in complex with A93D PNT domain).

For each amide with measurable signals at times t after resuspension in D2O, the pseudo-first order exchange rate constant $k_{obs}$ was obtained by fitting with NMRFAM-Sparky the 1HN-15N peak intensity $I_t$, scaled for number of acquisitions/FID, to the equation for a single exponential decay:

$I_t = I_0 e^{-k_{obs} t}$

$I_0$ is the fit initial intensity extrapolated to t = 0. The protection factor ($PF = k_{pred}/k_{obs}$) for each amide was determined as the ratio of its predicted intrinsic exchange rate constant ($k_{pred}$) in an unstructured polypeptide of the same amino acid sequence versus its experimentally measured $k_{obs}$.
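The two relationships above can be illustrated with a short numerical sketch. Note that the fitting in this work was performed with NMRFAM-Sparky; the example below instead uses a simple log-linear least-squares fit with NumPy as a stand-in, and the peak intensities and the predicted rate constant are invented for illustration (in practice, the predicted value would come from a program such as Sphere).

```python
import numpy as np


def fit_kobs(times, intensities):
    """Fit I_t = I_0 * exp(-k_obs * t) by linear regression of ln(I_t) on t.

    Returns (k_obs, I_0). Assumes all intensities are positive.
    """
    t = np.asarray(times, dtype=float)
    log_i = np.log(np.asarray(intensities, dtype=float))
    slope, intercept = np.polyfit(t, log_i, 1)
    return -slope, np.exp(intercept)


def log_protection_factor(k_pred, k_obs):
    """log(PF), where PF = k_pred / k_obs (both in the same units)."""
    return np.log10(k_pred / k_obs)


# Hypothetical decay curve: times in minutes, intensities in arbitrary units.
times_min = np.array([5.0, 25.0, 45.0, 120.0, 240.0, 600.0])
intensities = 1000.0 * np.exp(-0.004 * times_min)

k_obs, i_zero = fit_kobs(times_min, intensities)

# ASSUMPTION: hypothetical intrinsic rate constant (per minute) for this amide.
k_pred = 40.0
print(f"k_obs = {k_obs:.4f} per min, log(PF) = {log_protection_factor(k_pred, k_obs):.1f}")
```

A log-linear fit weights noisy low-intensity points more heavily than a direct nonlinear fit would, so it is best regarded as a quick check and not a replacement for the fitting procedure described in the text.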
The kpred values were determined with the program Sphere (Zhang, 1995), which uses reference data based on poly-DL-alanine and corrected for amino acid type, temperature, pH and isotope effects (Bai et al., 1993; Connelly et al., 1993). For amides that had not exchanged significantly after 3 months, lower limits on their PFs were estimated based on the largest measured PFs for the given sample.

2.5.3 Experimental alanine-scanning mutagenesis methods

2.5.3.1 Site-directed mutagenesis and construct cloning

To enable site-directed biotinylation during protein expression in E. coli, a gene encoding residues 43-125 of V112E-ETV6 with an N-terminal His6-tag and Avitag was constructed in the pET28a vector using PIPE (polymerase incomplete primer extension) cloning techniques (Ashraf et al., 2004; Klock and Lesley, 2009). Due to the length of the Avitag (45 nucleotides) and the difficulty of introducing this segment as one piece, it was cloned into the plasmid in two approximately equal sections. The A93D and E112V mutations were sequentially introduced to generate the complementary A93D-PNT domain construct with the His6-tag and Avitag.

Interfacial residues present in the X-ray crystal structure of the PNT domain dimer (PDB: 1LKY) were identified using the online SPPIDER (Solvent accessibility-based Protein-Protein Interface iDEntification and Recognition) server (Porollo and Meller, 2007). Alanine substitutions were encoded at these sites in the respective A93D or V112E PNT domain clones using QuikChange site-directed mutagenesis techniques. The alanine codons were chosen primarily to minimize the number of nucleotide changes, or to optimize codon usage. All but two constructs were successfully generated in-house; genes encoding the N90A-V112E PNT domain and L116A-A93D PNT domain were purchased commercially (Biomatik).
2.5.3.2 Protein expression and purification for surface plasmon resonance (SPR)

Each plasmid encoding either the biotinylated A93D PNT or V112E PNT domain was co-transformed into E. coli BL21 (λDE3) with the pET21a-BirA plasmid, which produces biotin ligase. Selection of both plasmids was maintained by including kanamycin (35 mg/L) and ampicillin (100 mg/L) in all media. Overnight seed cultures were used to inoculate LB media, supplemented with 0.05 mM biotin, and grown at 37 °C to an OD600 ~ 0.6. Protein expression was induced with 0.5 mM IPTG and the cells were grown at 30 °C overnight. The cells were collected by centrifugation and cell pellets were frozen at -80 °C.

Protein purification was carried out as described above (Section 2.5.1). However, the final size exclusion purification step was omitted because the proteins were sufficiently pure, as judged by SDS-PAGE, after thrombin cleavage and removal of the His6-tag by passage through a Ni2+-NTA HisTrap HP column. The final protein samples were exchanged into NMR buffer (20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA, pH 8.0) and concentrated to ~ 1 mL with an Amicon 3K MWCO centrifugal filter. The protein samples were then snap frozen in liquid nitrogen and stored at -80 °C prior to SPR analysis. The final proteins were ~ 50-95% biotinylated as judged by MALDI-ToF mass spectrometry.

2.5.3.3 Surface plasmon resonance (SPR)

SPR experiments were performed on a Biacore X100 instrument using the streptavidin Sensor Chip SA to capture the biotinylated PNT domain “ligand”. The ligand was diluted to 50 µg/mL in HBS-EP+ buffer (10 mM HEPES, 150 mM NaCl, 3 mM EDTA, 0.05% v/v Tween-20, pH 7.4) and immobilized on the chip using the Biacore X100 control software immobilization wizard.
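For orientation, the kinetic and equilibrium quantities reported from the SPR measurements in this section are related in a simple way if a 1:1 binding model is assumed: KD = koff/kon, and the mean lifetime of the complex is 1/koff. The sketch below is a minimal illustration under that assumption (the fitting model is not stated in the text). The rate constants are hypothetical values chosen only to give a lifetime of about 10 minutes and a nanomolar KD, and the concentrations are those of the initial analyte dilution series used in this section.

```python
def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def lifetime_s(k_off):
    """Mean lifetime of the complex in seconds, 1/k_off."""
    return 1.0 / k_off


def fraction_bound(analyte_m, kd_m):
    """Equilibrium fraction of immobilized ligand occupied for a 1:1 interaction."""
    return analyte_m / (analyte_m + kd_m)


# ASSUMPTION: hypothetical rate constants, not measured values from this work.
K_OFF = 1.0 / 600.0   # 1/s, corresponds to a lifetime of 10 minutes
K_ON = 5.0e5          # 1/(M*s)

kd = kd_from_rates(K_ON, K_OFF)
print(f"KD = {kd * 1e9:.1f} nM, lifetime = {lifetime_s(K_OFF) / 60:.0f} min")

# Initial analyte dilution series (nM) from the SPR protocol.
for conc_nm in (0.2, 2, 20, 40, 60):
    print(f"{conc_nm} nM analyte: fraction bound = {fraction_bound(conc_nm * 1e-9, kd):.2f}")
```

A concentration series that brackets the KD, as in the two dilution ranges described in this section, samples both the low-occupancy and near-saturating parts of the binding curve.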
The kinetics/affinity assay software wizard was used to optimize conditions for measuring dimerization of the reciprocal PNT binding partner (the “analyte”) and the regeneration scouting software wizard was used to determine regeneration conditions. The latter included varying concentrations of different buffers (NaOH (1, 2, 3, 4 mM), acetonitrile (5 and 10 % CH3CN with 5 mM NaOH, and 1 and 2 % CH3CN with 1 mM NaOH), ethylene glycol (25, 50 %) and SDS (0.05, 0.1, 0.15, 0.2 %)), contact time (30, 60 s) and cycles (3, 5 cycles). The final selected regeneration condition involved flowing 0.2% SDS over the chip for 30 s. The association (kon) and dissociation (koff) rate constants and the equilibrium dissociation constant (KD) for binding of the analyte with the immobilized ligand were determined using the Biacore X100 kinetics/affinity assay software wizard. The positive and negative binding controls were analytes with the complementary wild type PNT domain interface or with the same monomerizing substitution, respectively. In general, the sample contact time was set to 180 s and the dissociation time was set to 300 s, followed by a 30 s regeneration step with 0.2 % SDS and a stabilization period of 60 s. For the experimental runs, the analyte protein sample was initially diluted in HBS-EP+ buffer to 0.2, 2, 20, 40 and 60 nM. If weakened binding (KD > 60 nM) was observed, the analyte protein sample was re-run in HBS-EP+ buffer at 20, 200, 2000, 4000 and 6000 nM. The Sensor Chip SA loaded with the biotinylated ligand was stored in HBS-EP+ buffer at 4 °C between runs. Periodically, the control PNT domain samples were run to confirm the integrity of the Sensor Chip SA.

2.5.4 Structural determination of A93D-V112E PNT by X-ray crystallography

A construct of ETV6 spanning residues 1-125 with the A93D and V112E substitutions (A93D-V112E PNT domain) was purified as described in Section 2.5.1.
Crystals were grown by sitting drop vapour diffusion at room temperature in 2 µL drops, prepared with a 1:1 mixture of ~ 9.3 mg/mL protein (20 mM MOPS, 50 mM NaCl, and 0.5 mM EDTA, pH 7.0) and reservoir solutions from the Hampton Index reagent crystallization screen (Hampton Research). A negative control plate was set up in parallel using the flow-through obtained while concentrating the sample with an Amicon 3K MWCO centrifugal filter. Potential crystals were identified after 2 days through visual examination under a microscope of the negative control and the protein sample plates. Several conditions yielded protein crystals and those grown in 2.8 M sodium acetate (pH 7.0) were used for data collection. These crystals were cryoprotected by brief soaking in reservoir buffer supplemented with 30% (v/v) glycerol followed by flash freezing in liquid nitrogen.

Diffraction data were collected at the CLS (Canadian Light Source) on beamline 08B1-1 (Fodje et al., 2014). The data were cut off at 1.85 Å based on the CC1/2 metric and processed and scaled using XDS (Kabsch, 2010). Crystals were of space group P 65 2 2 with two protein molecules in the asymmetric unit. Phase determination using molecular replacement was performed with a PNT domain monomer from PDB: 1LKY using the AutoSol program in Phenix (Adams et al., 2010). Model building was performed in Coot and refinement was executed using the Phenix software suite (Emsley and Cowtan, 2004).

Table 2-5 Data collection and refinement statistics for ETV6 A93D-V112E PNT domain

Data collection
  Space group                  P 65 2 2
  Cell dimensions
    a, b, c (Å)                59.9, 59.9, 169.4
    α, β, γ (°)                90, 90, 120
  Resolution range (Å)         44.2 – 1.9 (1.92 – 1.85)
  R-merge                      0.055
  I / σI                       2.64
  Completeness (%)             99.97 (99.87)
Refinement
  Resolution (Å)               1.86
  No. unique reflections       16166 (1576)
  Rwork                        0.193 (0.224)
  Rfree                        0.231 (0.288)
  No. of non-hydrogen atoms    1441
    Macromolecules             1328
    Ligands                    6
    Solvent                    107
  Average B-factor (Å^2)       35.6
    Macromolecules             35.1
    Ligands                    46.4
    Solvent                    41.4
  RMS deviations
    Bond lengths (Å)           0.006
    Bond angles (°)            0.87
  Ramachandran favoured (%)    99.33
  Ramachandran allowed (%)     0.67
  Ramachandran outliers (%)    0.00
  Rotamer outliers (%)         0.69
*Statistics for the highest-resolution shell are shown in parentheses*

2.5.5 In silico structural comparison

2.5.5.1 Molecular dynamics (MD) simulations on monomeric and dimeric PNT domains

MD simulations were carried out at the University of Bristol under the guidance of Dr. Richard Sessions. MD simulations were based on coordinate files from the following crystal structures: the monomeric A93D-V112E PNT domain (determined herein), an A93D-PNT domain subunit from PDB: 1LKY (with an alanine introduced at position 93 with the mutagenesis function in PyMol (Version 1.8) (Schrödinger, 2015)), and a dimer of the A93D-PNT domain and V112R-PNT domain from PDB: 1LKY.

All MD simulations were performed on the University of Bristol High Performance Computer BlueCrystal using GROMACS (Version 5.1.2) (Berendsen et al., 1995). The systems were solvated with TIP3P waters in an orthorhombic box 2 nm larger than the longest dimension of the protein. Na+ and Cl- ions were included at 50 mM to emulate experimental NMR conditions, while neutralizing the monomerizing mutations (i.e. the negatively charged A93D) to have no overall net charge. In the case of the dimer simulation, the monomerizing mutations (i.e. A93D, V112R) cancelled out the charges. The amber99sb-ildn forcefield was used to parameterize the protein simulations (Lindorff-Larsen et al., 2010). Non-bonded long-range electrostatic interactions were calculated using the Particle Mesh Ewald method with a 1.2 nm cut-off. Bonds were constrained using the LINCS default algorithm to allow the use of a 2 fs timestep for the MD integration.
The energy of the system was minimized over 5000 steps of steepest descent energy minimization. The system then underwent a position-restraint simulation over 200 ps where the protein was restrained to its initial position while heating the system to 310 K and introducing pressure at 1 bar using the Berendsen barostat (Berendsen et al., 1995). The full unconstrained MD simulations were run over 100 ns with integration step sizes of 2 fs using the leap-frog algorithm, and trajectory files were recorded every 100 ps. The temperature was maintained at 310 K using v-rescale (modified Berendsen thermostat) temperature coupling and the pressure at 1 bar using Berendsen barostat pressure coupling. Trajectories were analyzed and processed utilizing GROMACS tools. The simulations were visualized with VMD (Version 1.9.2) (Humphrey et al., 1996), gnuplot (Version 4.6) and PyMol (Version 1.7) (Schrödinger, 2015).

Chapter 3: Comprehensive screening for candidate inhibitors of ETV6 PNT polymerization using cellular, in vitro and in silico techniques

3.1 Overview

The ETV6 PNT domain is implicated in many fusion oncoproteins that result in over 40 different types of cancer. Inhibition of PNT domain polymerization has been shown to return aberrant cellular morphology back to wild type. In this chapter, I discuss three different screening approaches used in an attempt to identify compounds that either inhibit PNT domain heterodimerization or bind to the PNT domain. Initially, I set up and characterized two cell-based screening approaches, in human and yeast cells, to identify potential inhibitors of PNT domain heterodimerization. As proof-of-concept, I confirmed that the assays could detect an inhibitor of PNT domain polymerization. Then, I undertook a large-scale screening of several in-house compound libraries to search for inhibitors of PNT domain heterodimerization.
Several lead compounds were identified, but upon validation with NMR, none showed any appreciable affinity for the PNT domain. In parallel, I turned to an in silico screening approach to identify compounds that could bind either at the PNT domain interfacial residues or at the salt bridge forming residues. At the University of Bristol, I utilized their Bristol University Docking Engine (BUDE) algorithm to virtually screen over 8 million unique compounds in ~ 20 different conformations. Lead compounds were selected after subjection to short (10 ns) MD simulations. Of these, over 60 compounds were purchased for experimental testing. However, upon validation through the cell-based screening assays and NMR spectroscopy, none bound to the PNT domain or prevented PNT domain heterodimerization.

3.2 Introduction

The ETV6 PNT domain polymerizes with high affinity through two relatively flat PPI interfaces. In Chapter 2, using NMR spectroscopy and X-ray crystallography, I highlighted the stability of the interaction and showed that there is little structural change upon heterodimerization. Through SPR-monitored alanine scanning mutagenesis, I also identified key “hot spot” hydrophobic and salt-bridge forming residues at the ML- and EH-interfaces. Previous studies demonstrated that the mutation of one such residue (K99 in the K99-D101 salt bridge) prevented aberrant cellular transformation by the EN chimeric oncoprotein (Cetinbas et al., 2013). This suggested that small molecules which interact with the hot spot regions of either interface of the ETV6 PNT domain may also disrupt ETV6-oncoprotein polymerization and thereby prevent transformation. The goal of this chapter is to identify those molecules as potential lead therapeutic molecules using a comprehensive array of screening strategies. Scott et al. outline several strategies for discovery of PPI inhibitors that act through allosteric or orthosteric mechanisms.
High-throughput screening (HTS) is an experimental technique with three principles of performance management (time, cost and quality) that need to be optimized (Mayr and Bojanic, 2009). HTS is generally challenging for PPI inhibitor discovery because of low hit rates, weakly binding hits, difficulties in removing false positives and the use of available chemical libraries that may be biased for enzymes or well-defined pockets. Therefore, important considerations in the development of HTS assays also include the relative sensitivity (signal-to-noise ratio) and sources of potential false positive/negative hits. A general pipeline of drug discovery utilizing HTS assays is shown in Figure 3.1.

Figure 3.1 High-throughput screening in drug discovery
A general overview of the HTS workflow, adapted from Mayr and Bojanic, 2009. In brief, a target of interest first needs to be identified and a suitable assay developed with positive and negative controls. After optimization, the HTS commences. Counter screening removes artifacts of the assay, and validation screening demonstrates that a hit is not an artifact of the initial screen. A hit-to-lead phase may explore structure-activity relationships to further optimize a lead compound.

In contrast to experimental screening, virtual (in silico) screening offers a targeted approach to the interface. Computational approaches to screening span from the assessment of the druggability of an interface, such as the formation of surface pockets through structural fluctuations, to the docking of in silico ligands. Virtual docking has the appeal of allowing for the rapid screening of millions of compounds. However, challenges lie in accounting for possible induced conformational changes of the small compound or the protein upon binding (Scott et al., 2016). Also, targeted approaches may fail to identify allosteric inhibitors.
While not at the interface, allosteric inhibitors may take advantage of pockets that are more hydrophilic and suited for small molecule drugs. Finally, for numerous reasons a lead theoretical compound may not exhibit its desired effects in vivo. Thus, predicted lead compounds must eventually be tested experimentally for inhibitory activity. Another metric for assessing the potential benefits of experimental HTS assays is whether they can be classified as “up” or “down”. “Up” assays look for an increase in a measured readout whereas “down” assays rely on a decrease in the readout. In general, down assays inherently have more confounding factors as there are many ways to deleteriously affect a system. This is commonly problematic in cancer drug discovery as assays with a “down” readout, such as decreased cell proliferation and viability, could detect off-target or nonspecific effects (Kaelin, 2017). Rescue experiments, which show that an effect may be reversed by a perturbant-resistant version of the target, can aid in determining if an effect is on- or off-target (Kaelin, 2017).

With any HTS approach, it is also critical to have a series of positive and negative controls for assay and target hit validation. Positive controls allow negative results to be understood while negative controls allow positive results to be understood (Kaelin, 2017). Counter-screens, to remove artifacts caused by the assay, and selectivity screens, to look at the different molecular nature of the target versus other targets, may be used as controls (Mayr and Bojanic, 2009). Importantly, any initial screening hits need to be corroborated with complementary experimental approaches to form correct conclusions regarding their target specificity and affinity. The PNT domain PPI certainly presents challenges for drug targeting, but also has several encouraging characteristics.
PPI inhibition is more successful in systems that have hot spots clustered tightly in space, such as an α-helix binding cleft (Scott et al., 2016). In the PNT domain PPI, many of the hot spots are in close proximity to each other. Encouragingly, the EH-interface also is mainly comprised of a single α-helix. In addition, previously reported computational analyses have identified two small molecule binding pockets in close proximity to the K99-D101 salt bridge that could be exploited for inhibitor development (Cetinbas et al., 2013).

3.2.1 Mammalian cell protein-fragment complementation assay (PCA)

Cell-based assays can be advantageous as, in addition to screening for a desired outcome, they help identify compounds that can cross the plasma membrane, remain stable within cells and do not have toxic side-effects. One such example of a cell-based approach to identify PPI modulators involves protein-fragment complementation assays (PCAs). In these assays, two proteins of interest are fused to complementary fragments of a reporter protein. If the target proteins interact, the fragments assemble into their near-native folded structure and convey a reconstituted activity (Michnick et al., 2007).

A range of different PCA reporters have been developed, including murine dihydrofolate reductase, green fluorescent protein, TEM1 β-lactamase and various luciferases (Michnick et al., 2007). Important features of these systems are that the reporter protein fragments cannot fold or assemble spontaneously in the absence of any interaction between the target proteins. In general, the reporter fragments must also reversibly dissociate when the target proteins dissociate. However, some PCAs based on split green fluorescent protein are irreversible and this can be useful for investigating rare or transient interactions (Michnick et al., 2007).
I selected a PCA based on split Gaussia luciferase to use in cellular screening for inhibitors of the ETV6 PNT domain association (Remy and Michnick, 2006). Remy and Michnick developed this fully reversible PCA in HEK293 cells such that the greatest luminescence was observed upon leucine zipper-induced complementation of the Gaussia luciferase fragments separated between Gly93 and Glu94. Humanized Gaussia luciferase generates higher bioluminescence in live cells when compared to humanized forms of firefly or Renilla luciferases and thus is sensitive at low protein expression levels (Tannous et al., 2005). In addition, Gaussia luciferase requires no cofactors to catalyze the oxidation of its substrate, coelenterazine, and thereby emit blue light (λ = 480 nm). This light can pass through cell membranes and be detected instrumentally (Remy and Michnick, 2006).

I applied the split Gaussia luciferase PCA to the PNT domain heterodimer system (Figure 3.2). Upon PNT domain association, the luciferase fragments will have reconstituted activity, which can be measured by luminescence readout. If a small molecule inhibits the heterodimerization, either through orthosteric or allosteric mechanisms, then a reduction of luminescence will be detected. Inherently this “down” assay will screen for inhibitors of PNT domain association, as well as inhibitors of split luciferase formation/association and inhibitors of luciferase activity. In addition, any compounds that affect cell viability may cause a decrease in luminescence.

Figure 3.2 The design of a split Gaussia luciferase PCA for cell-based screening of inhibitors of ETV6 PNT domain heterodimerization
(a) In the event that the A93D and V112E PNT domains heterodimerize, the split luciferase fragments assemble into an active enzyme which, upon addition of its substrate, generates a luminescence reading.
(b) If a small molecule (star) disrupts the interaction, either by directly inhibiting the PPI or by inducing an allosteric conformational change, then the equilibrium shifts away from reconstitution of the luciferase fragments, resulting in reduced luminescence. These cartoons are schematic, especially with respect to the folding of the luciferase fragments.

3.2.2 Yeast two-hybrid screening assay

A yeast two-hybrid screening assay was devised to run in parallel to the PCA. This is desirable to aid in identifying any false negative or false positive compounds that are artifacts of either particular assay. Yeast two-hybrid screening utilizes two target proteins fused to either the separated DNA-binding domain (DBD) or transactivation domain of a reporter transcription factor. The interaction of the target proteins “reassembles” the modular transcription factor, which in turn activates expression of a downstream reporter gene (Young, 1998). In the context of investigating PNT domain polymerization, I utilized a “bait” plasmid encoding the POU DBD of a mammalian transcription factor fused with the ETV6 A93D PNT domain and a “prey” plasmid encoding an activating domain (AD) linked with the ETV6 V112E PNT domain (Herr and Cleary, 1995).

This assay has similar attributes to the PNT domain PCA mammalian screening assay. In essence, heterodimer formation results in the transcriptional activation of the HIS3 gene, which encodes an essential intermediate enzyme of the histidine biosynthetic pathway to enable yeast growth in the absence of histidine (Figure 3.3). An inhibitor of PNT domain association will prevent yeast growth in the absence of histidine due to a lack of the required expression of the HIS3 gene. Supplementation of the growth medium with histidine will circumvent growth inhibition caused by an on-target inhibitor of the PPI, but not by compounds with off-pathway cytotoxic effects.
An exception to this point is that reduced growth caused by inhibitors of the histidine biosynthetic pathway will also be rescued by addition of histidine. Furthermore, the competitive inhibitor 3-amino-1,2,4-triazole (3-AT) can be used to titrate the HIS3 activity level (Brennan and Struhl, 1980; Joung et al., 2000). This means that yeast cells require a higher level of HIS3 expression to grow on selective media in the presence of 3-AT, which can be important to eliminate growth resulting from basal or “leaky” HIS3 expression.

Figure 3.3 Principle of the yeast two-hybrid screening assay for inhibitors of ETV6 PNT domain heterodimerization
The bait plasmid contains a constitutive ALDH1 promoter that drives expression of the POU DNA-binding domain (DBD) linked to the A93D PNT domain. The prey plasmid has an inducible GAL promoter for expression of the V112E PNT domain linked to the B42 activating domain (AD). (a) The presence of galactose (+Gal) induces expression of the fusion protein from the prey plasmid. The PNT domains can heterodimerize and induce expression of the HIS3 gene to allow cell growth in medium lacking histidine (-His). In the presence of an inhibitor of PNT domain heterodimerization, there is no expression of the HIS3 gene and no yeast growth. (b) Addition of histidine (+His) to the media will enable yeast growth in the absence or presence of an inhibitor of the PNT domain PPI or the histidine biosynthetic pathway.

3.2.3 In silico screening methods

In silico screening is the use of computational algorithms to help identify molecules or compounds predicted to show bioactivity, for example, by binding to a target protein of known structure (Schneider, 2010). The most common virtual screens involve the docking of a vast array of small molecules on a target protein of known structure, with the goal of finding those that might form low energy complexes based on a given set of theoretical criteria (Bajorath, 2002).
To this end, a variety of docking programs that utilize different energetic docking parameters and scoring metrics for compound association have been developed. The accuracy of these computational approaches has steadily improved through better modeling of flexible fitting, the role of water molecules, protonation states and the entropic and enthalpic contributions underpinning complex formation (Schneider, 2010). This has led to notable successes both in identifying compounds from existing in silico libraries that bind to receptors and in the de novo design of new compounds that have subsequently been shown to bind target proteins upon their synthesis (Shoichet, 2004).

The Bristol University Docking Engine (BUDE) is an established in silico screening method. BUDE has been used to identify a compound with therapeutic benefits due to the interruption of the YAP-TEAD protein-protein interaction (Smith et al., 2019). BUDE is distinct from other in silico screening methods in that it utilizes both an empirical free-energy forcefield and an atom-based forcefield (McIntosh-Smith et al., 2015; Smith et al., 2019). BUDE incorporates evolutionary Monte Carlo energy minimization techniques. A six-dimensional search space allowing free rotation and translation is defined for ligands to bind to the protein of interest. The ligands are from the ZINC library, a free database of virtual ligands in ready-to-dock, 3D formats (Sterling and Irwin, 2015). Multiple conformers of the compounds were generated using Confort from the Sybyl Suite (Tripos). The search operates on the ligand’s “pose”, or its coordinates in the search space, which is refined over a sequence of generations, each evaluated for the binding energy with the protein (McIntosh-Smith et al., 2015). The first generation consists of uniformly random poses in the search space, and subsequent generations use randomly shifted variants of the best poses from the previous generations.
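The generational pose search described above can be illustrated with a toy example. The following Python sketch is not BUDE code and makes several assumptions: the scoring function is an arbitrary quadratic stand-in rather than a forcefield, the population size, number of retained poses, step width and search bounds are hypothetical, and retaining the parent poses between generations is a design choice made here so that the best score never worsens.

```python
import random

def toy_energy(pose):
    """Stand-in score: squared distance from an arbitrary 'ideal' pose.
    This is NOT a forcefield; it only gives the search something to minimize."""
    target = (1.0, -2.0, 0.5, 0.3, -0.1, 0.8)
    return sum((p - t) ** 2 for p, t in zip(pose, target))

def evolve_poses(energy, n_generations=20, pop_size=200, n_keep=20,
                 step=0.5, bounds=(-5.0, 5.0), seed=0):
    """Toy evolutionary Monte Carlo search over a six-dimensional pose
    (three translations and three rotations). All defaults are hypothetical."""
    rng = random.Random(seed)
    lo, hi = bounds
    # First generation: uniformly random poses across the search space.
    population = [tuple(rng.uniform(lo, hi) for _ in range(6))
                  for _ in range(pop_size)]
    best = sorted(population, key=energy)[:n_keep]
    for _ in range(n_generations - 1):
        # Later generations: randomly shifted variants of the best poses so far,
        # clamped to the search bounds.
        children = [
            tuple(min(hi, max(lo, x + rng.gauss(0.0, step)))
                  for x in rng.choice(best))
            for _ in range(pop_size)
        ]
        # Parents are retained, so the lowest energy is non-increasing.
        best = sorted(best + children, key=energy)[:n_keep]
    return best[0], energy(best[0])

pose, score = evolve_poses(toy_energy)
print(pose, score)
```

In a real docking run the stand-in score would be replaced by the forcefield-based estimate of the ligand-protein binding energy, evaluated for every pose in every generation.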
The predicted binding affinity must be calculated continuously throughout the docking algorithm using forcefields to estimate the entropic and enthalpic changes upon ligand-receptor binding. The final set of candidate compounds is chosen for experimental validation through a combination of visual examination and short ligand-protein MD simulations.

3.3 Results

3.3.1 Development of a PCA for inhibition of PNT domain association

My first goal was to establish and validate a PCA for monitoring PNT domain association. I used polymerase incomplete primer extension (PIPE) cloning techniques to generate, within the mammalian expression plasmid pcDNA3.1, combinations of genes encoding the A93D- or V112E-PNT domains covalently attached to humanized Gaussia luciferase fragments through a flexible 10 amino-acid polypeptide linker. The N-terminal luciferase fragment will be referred to as N-Luc and the C-terminal fragment as C-Luc. Various fusion proteins (Figure 3.4) were transiently expressed in HEK293 cells and relative luminescence readings were measured on the Varioskan LUX multimode microplate reader either 48 or 72 hours post-transfection (Table 3.1). Cell culture medium alone or cells exposed to the reagents needed for the transient transfections showed a luminescence reading of ~ 20. When expressed alone, V112E-CLuc, A93D-CLuc and A93D-NLuc produced a relatively low luminescence reading of ~ 600 either 48 or 72 hours post-transfection. Expressing PNT domains that can heterodimerize (A93D and V112E), but fused to the same luciferase fragment, also resulted in a low luminescence reading of ~ 70. Introduction of the complementary luciferase fragments linked to the same PNT domain (A93D) yielded a higher luminescence of ~ 2500. However, this was still low when compared to combinations of A93D-CLuc and V112E-NLuc or A93D-NLuc and V112E-CLuc, which showed readings of 320,000 and 91,000, respectively, 48 hours post-transfection.
The luminescence readings of these combinations diminished 72 hours post-transfection.

The higher luminescence readings of the A93D-CLuc/V112E-NLuc combination compared with the A93D-NLuc/V112E-CLuc combination may indicate that one orientation of the luciferase fragments is more favourable to reconstitution of the active enzyme than the other. The moderate luminescence readings of the complementary luciferase fragments paired with PNT domains that do not associate indicate that the two luciferase fragments can interact transiently or with low affinity when they are not brought together in space by PNT domain association. In addition to the PNT domain constructs, I tested leucine zipper dimerization constructs provided by Dr. Michnick. After 48 hours the luminescence reading was ~ 45,000. I postulate that the lower readings of the leucine zipper constructs compared to the PNT domain constructs reflect a lower affinity of the leucine zipper dimer or a less favourable spatial orientation or accessibility of the luciferase fragments for complementation. Overall, these experiments showed that the PCA system can provide a strong signal that is seen only upon PNT domain heterodimerization and is substantially greater than the background signal from the luciferase fragments. This indicates that a compound that inhibits PNT domain association should indeed cause a large decrease in luciferase activity.

Figure 3.4 Cartoon representations of the establishment and validation of the PNT domain PCA
A cartoon representation of different protein constructs expressed upon transient transfection in HEK293 cells. For detailed descriptions of the plasmids, see Appendix A. (a)-(c) represent single PNT domains covalently linked to a luciferase fragment; (d) represents the PNT domain heterodimer with each PNT domain attached to the same luciferase fragment; (e) represents complementary luciferase fragments attached to the same PNT domain.
These are all expected to give little to no luminescence signal. In contrast, (f) and (g) represent combinations of PNT domains and luciferase fragments that should associate and reconstitute the luciferase and show increased luminescence. Finally, (h) represents a leucine zipper construct previously reported to dimerize and reconstitute the luciferase fragments (Remy and Michnick, 2006). The luminescence readings and transfection times for these various constructs are listed in Table 3.1.

Table 3-1 Luminescence readings observed upon transient transfection of different constructs
Cartoon representations of the control experiments are shown in Figure 3.4. Plasmids were transiently transfected as outlined in Section 3.5.2.3. A93D refers to the A93D PNT domain, V112E refers to the V112E PNT domain, CLuc refers to the C-terminal fragment of luciferase and NLuc refers to the N-terminal fragment of luciferase. All luminescence readings were recorded on a Varioskan LUX multimode microplate reader. All 48-hour experiments were done in duplicate and 72-hour experiments in triplicate, with average values tabulated.

Cartoon Representation  Plasmid(s) Transfected or Conditions            Hours Post-Transfection  Luminescence Reading
NA                      DMEM, no HEK293 cells                           48                       20
NA                      HEK293 cells with a mock (no DNA) transfection  48                       25
a                       V112E-CLuc                                      48                       670
a                       V112E-CLuc                                      72                       500
b                       A93D-CLuc                                       48                       450
c                       A93D-NLuc                                       48                       690
d                       A93D-NLuc and V112E-NLuc                        48                       70
e                       A93D-NLuc and A93D-CLuc                         48                       2500
f                       A93D-CLuc and V112E-NLuc                        48                       320,000
f                       A93D-CLuc and V112E-NLuc                        72                       165,000
g                       A93D-NLuc and V112E-CLuc                        48                       91,000
g                       A93D-NLuc and V112E-CLuc                        72                       68,000
h                       Zip(1)-NLuc and Zip(2)-CLuc                     48                       45,000
h                       Zip(1)-NLuc and Zip(2)-CLuc                     72                       22,000

I carried out preliminary screens using transiently transfected cells, but found this costly in both money and time. To streamline the HTS assay, I created stable cell lines expressing the genes for the PNT domain and control leucine zipper fragments.
Note that the expression plasmid pcDNA3.1 is available with different antibiotic selection markers for mammalian tissue culture including G418, zeocin and hygromycin. The original constructs received from the Michnick group and the PNT domain constructs described above all encoded zeocin resistance. To enable selection of clones expressing two fragments, I inserted the sequence encoding the A93D PNT domain, the linker and the N-Luc fragment into an empty pcDNA3.1 with neomycin resistance using PIPE cloning. I then co-transfected this construct and a zeocin resistance construct encoding V112E-PNT domain-C-Luc into HEK293 cells. Both antibiotics were present at the minimal concentrations required to kill all untransfected HEK293 cells after one week. I also used zeocin to select for pooled stable transformants expressing the leucine zipper PCA components to be used as a control for validating initial screening hits.

Luminescence of the stable transformants was reduced compared to the transiently transfected cells (Table 3.2). This is likely due to a lower number of plasmids being stably incorporated into the genome than the number of plasmids introduced by transient transfection. In addition, a greater proportional decrease in luminescence was observed for the leucine zipper PCA compared to the PNT domain PCA. This may be due to some cells having only incorporated a single construct as the leucine zipper PCA utilized the same selection marker, zeocin, for both plasmids.

Table 3-2 Comparison of transiently transfected and stably expressing PCA systems
A comparison of the A93D-PNT/NLuc and V112E-PNT/CLuc PCA and the leucine zipper PCA 48 hours after transient transfection versus their pooled stable transformants.
Condition                                              Luminescence Reading
Transiently transfected PNT domain PCA (48 hours)      90,000
Stably expressing PNT domain PCA                       14,000
Transiently transfected leucine zipper PCA (48 hours)  45,000
Stably expressing leucine zipper PCA                   1,020

Creation of a stably expressing PNT domain PCA system abolished the need for transient transfections, thus reducing both time and cost for subsequent screening. To optimize the signal-to-noise ratio, I determined the peak luminescence reading of the PCA with the NanoFuel GLOW Assay kit. Using a 1-second luminescence integration time on the Varioskan, I determined peak luminescence to be between 18 and 34 minutes after addition of the reagents (Figure 3.5). This incubation time with the reagents was longer than the 5 minutes recommended by the manufacturer. Combined with the use of automatic pipetters and robotics, this enabled ~ ten 96-well plate scans before the luminescence readings started to fall substantially.

Figure 3.5 Time dependent luminescence of the PNT domain PCA
A 96-well plate containing stably expressing PNT domain PCA cells was measured on the Varioskan over a period of ~ 4 hours after addition of the coelenterazine and NanoFuel GLOW Assay reagents. Each luminescence reading is the average of all 96 wells and standard deviations are indicated by the dashed lines.

At the time of the assay development, there were no known inhibitors of ETV6 PNT domain polymerization to use as controls. However, introduction of a “free” PNT domain into cells expressing EN can revert cell morphology back to wild type (Cetinbas et al., 2013). Thus, I reasoned that introducing an A93D PNT domain not linked to a luciferase fragment into the PCA should reduce the luminescence reading through competition with the A93D-PNT/NLuc for the V112E-PNT/CLuc binding interface.
Accordingly, I cloned the A93D PNT domain into a mammalian vector (pcDNA3.1/Neo) and transiently transfected it into the PNT domain and leucine zipper PCA systems (Figure 3.6). As controls, I performed transient transfections with an empty plasmid vector (pcDNA3.1) and without any plasmid DNA. I measured luminescence 24 and 48 hours post-transfection and normalized the readings to the respective untreated cells. At 24 hours post-transfection, the “free” A93D PNT domain caused a significant decrease in luminescence with the PNT domain PCA but not the leucine zipper PCA. At 48 hours post-transfection, both systems were affected, but the decrease was most pronounced with the PNT domain PCA cells. The empty pcDNA3.1 vector also reduced luminescence for the two systems 48 hours post-transfection, although not to the extent observed for the “free” A93D PNT domain with the PNT domain PCA. Collectively, these controls validate the PNT domain PCA and define the expected sensitivity to inhibition of PNT domain heterodimerization.

Figure 3.6 Transient expression of A93D PNT domain in pooled stable transformants
The A93D PNT domain-expressing and empty pcDNA3.1 vectors were transiently transfected into the PNT domain and leucine zipper PCA systems. In addition, controls with only transfection reagents were run in parallel. Luminescence readings were normalized to untreated PNT domain or leucine zipper PCA cells and the standard deviations of the replicates are shown. One-way ANOVA statistical analyses were performed to determine statistically significant differences. The calculated P values are indicated by the horizontal bars.

3.3.2 High throughput screening using the PNT domain PCA

After generating, characterizing and validating the PNT domain PCA, I carried out a high throughput screen using several chemical libraries. Due to factors such as incubator space and compound incubation time, I could screen up to two batches of ten plates per day.
I utilized a robotic pinning tool to transfer compounds from 96-well stock plates to 96-well plates of cells. After 4 hours of incubation at 37 °C, the NanoFuel GLOW reagent was added to the cells and luminescence of each well was read 15 minutes later. In total I screened the entire Prestwick Chemicals, Sigma LOPAC, Microsource Spectrum, Selleck L1700 Bioactive Compound, Biomol and PoPPI libraries, as well as a collection of natural products from Raymond Andersen’s group and approximately half of the Maybridge Hitfinder collection, totaling ~ 18,000 compounds. Details of these libraries are outlined in the Materials and Methods section of this chapter. Plates were analyzed individually due to the stability of the luminescence over time and compounds of interest were identified as having a luminescence reading that was more than two standard deviations away from the average luminescence reading of a plate (Figure 3.7). By this criterion, an initial hit rate of ~ 2.5 % was observed when screening the Biomol, Microsource Spectrum, Prestwick and Sigma collections. This is a rather high hit rate for an HTS assay, and may be due to the relatively high compound concentrations of ~ 34 µM achieved by using a 0.7 mm diameter pin tool to transfer ~ 340 nL of stock solution to the cells in 50 µL of culture medium. Such high concentrations were chosen to increase the likelihood of identifying a PNT domain PPI inhibitor, albeit with an increased rate of false positives due to toxicity effects. To screen additional libraries, I used a 0.4 mm diameter pin tool that resulted in a final concentration of ~ 20 µM of compound per well. This decreased the overall hit rate to ~ 0.6 %. Of the ~ 18,000 compounds screened, an overall ~ 1.0 % hit rate was observed across the two different pinning concentrations.
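As an illustration only, the per-plate hit criterion and the pin-transfer dilution described above can be sketched in Python. This is not the analysis code used for the screen. The function names and example readings are hypothetical, the use of the sample standard deviation is an assumption, and the 5 mM stock concentration is an assumed value chosen because it reproduces the stated ~ 34 µM when ~ 340 nL is transferred into 50 µL.

```python
from statistics import mean, stdev

def final_concentration_um(stock_mm, transfer_nl, well_ul):
    """Final compound concentration (µM) after pinning transfer_nl of a
    stock_mm (mM) stock into a well holding well_ul of medium."""
    transfer_ul = transfer_nl / 1000.0
    return stock_mm * 1000.0 * transfer_ul / (well_ul + transfer_ul)

def call_hits(readings, n_sd=2.0):
    """Return indices of wells whose luminescence differs from the plate
    mean by more than n_sd sample standard deviations."""
    mu = mean(readings)
    sd = stdev(readings)
    return [i for i, r in enumerate(readings) if abs(r - mu) > n_sd * sd]

# Hypothetical plate: eleven unaffected wells and one strongly reduced well.
plate = [100.0] * 11 + [10.0]
print(call_hits(plate))                       # index of the outlying well
print(final_concentration_um(5.0, 340, 50))   # assumed 5 mM stock
```

Because the mean and standard deviation are computed per plate, a plate containing many active or toxic compounds inflates the standard deviation and can mask weaker hits; a robust statistic such as the median would be one alternative.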
Figure 3.7 Sample results of the high-throughput PNT domain PCA screening assay
Shown are the results of several drug-screening plates using the Prestwick (335, 338, 339), Biomol (342), Sigma (348, 355) and Microsource (368) compound libraries. The circles represent the luminescence reading of each well of a 96-well plate. The red bars represent the mean relative luminescence and standard deviation of that plate. Pink circles represent wells that were exposed to only solvent (DMSO). An initial screening hit is defined as being more than two standard deviations away from the mean (blue circles). Screening hits that were found to be toxic to cells upon visual examination under a microscope are represented by yellow circles.

Screening hits can arise in this assay due to inhibition of PNT domain polymerization, split luciferase reconstitution or luciferase activity, as well as due to cytotoxicity or inhibition of cellular proliferation pathways. Therefore, the leucine zipper PCA was used as a secondary assay to determine whether any reduction in luminescence was due to inhibition of PNT domain polymerization or to a confounding effect. The latter would be expected to cause a decrease in luminescence with both PCAs. Furthermore, I retested screening hits in duplicate at 1, 3, 10 and 30 µM concentrations, in both the PNT domain and leucine zipper PCA systems (Figure 3.8). I also examined the cells under a microscope to check for any toxic effects or compound precipitation. An example of a toxic compound was Z-L-Phe chloromethyl ketone, a serine protease inhibitor. In addition, fresh stocks of compounds were purchased and dissolved in DMSO for these repeat assays. Out of 179 compounds retested, 83 did not decrease the luminescence in a concentration-dependent manner and were likely artifacts of the initial screens.
Unfortunately, every compound that showed a concentration-dependent decrease of luminescence in the PNT domain PCA exhibited the same pattern with the leucine zipper PCA. Thus, changes in luminescence were likely due to factors other than inhibition of the PNT domain association.

Figure 3.8 Secondary validation of initial PCA screening hits
Compounds identified as screening hits in the initial high-throughput screen were retested for a concentration-dependent response in the PNT domain and leucine zipper PCAs. Compounds were tested in duplicate at 1, 3, 10 and 30 µM and cells were visually examined to determine toxicity or compound precipitation. Those that were not cytotoxic and elicited a similar response in both PCA systems were likely inhibitors of split-luciferase reconstitution, luciferase activity itself or another variable. Compounds 1-2 represent examples of artifacts of the original, large-scale screen whereas compounds 3-5 represent examples of concentration-dependent responses.

I ordered 13 compounds to test for binding to the PNT domain using NMR spectroscopy, based on availability and on an apparent concentration-dependent response in the PNT domain PCA system. The 15N-HSQC spectra of the 15N-labelled A93D PNT domain were recorded upon progressive titration with each compound (Table 3.3). In no case were any amide 1HN-15N chemical shift perturbations observed, indicating no detectable binding (see Figure 3.9 for an example with Lanatoside C, a cardiac glycoside). These 13 compounds were ordered without validation in the leucine zipper PCA. Subsequent validation showed that the compounds had the same effect with the leucine zipper PCA system. In summary, testing of ~ 18,000 compounds failed to identify any selective inhibitors of PNT domain association.
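The logic of the secondary validation can be summarized with a short Python sketch. It is a hedged illustration rather than the procedure actually applied: the function names, the 20 % minimum drop and the 5 % tolerance for noise are hypothetical, and luminescence is assumed to be normalized to untreated cells (1.0) at increasing compound concentrations.

```python
def dose_dependent_decrease(signal, min_drop=0.2, tol=0.05):
    """True if normalized luminescence falls with increasing concentration.
    signal: readings at increasing doses, where 1.0 = untreated cells.
    min_drop and tol are hypothetical thresholds."""
    trending_down = all(b <= a + tol for a, b in zip(signal, signal[1:]))
    return trending_down and (1.0 - signal[-1]) >= min_drop

def classify_hit(pnt_pca, zipper_pca):
    """Compare the PNT domain PCA with the leucine zipper counter-screen."""
    if not dose_dependent_decrease(pnt_pca):
        return "artifact of primary screen"
    if dose_dependent_decrease(zipper_pca):
        return "non-selective (luciferase, toxicity or other effect)"
    return "candidate selective PNT domain inhibitor"

# Hypothetical readings at 1, 3, 10 and 30 µM.
print(classify_hit([0.95, 0.80, 0.50, 0.20], [0.90, 0.75, 0.55, 0.25]))
print(classify_hit([0.90, 0.70, 0.40, 0.20], [1.00, 0.98, 1.00, 0.97]))
```

Under this scheme, every retested compound described above would fall into one of the first two classes, since none reduced luminescence in the PNT domain PCA without also doing so in the leucine zipper PCA.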
Table 3-3 Testing of initial PNT domain PCA hits by NMR spectroscopy
Thirteen compounds, initially identified to cause a concentration-dependent decrease in luminescence in the PNT domain PCA, were tested for binding to the A93D PNT domain via NMR-monitored titrations. In no case were any amide chemical shift perturbations observed, thus indicating no detectable binding.

Compound Tested                                  Molar Ratios Tested
Merbromin                                        3
Sarmentogenin                                    0.5, 5
4’-hydroxychalcone                               0.5, 5
Emetine dihydrochloride                          3
Cephaeline hydrochloride                         0.5, 5
Curcumin                                         0.5, 3
N-oleoyldopamine                                 0.5, 3
Lanatoside C                                     0.5, 5
N-p-tosyl-L-phenylalanine chloromethyl ketone    0.5, 5
Lasalocid A                                      0.5, 5
Carnosic acid                                    0.5, 5
Phorbadione                                      0.5, 2
Oridonin                                         0.5, 5

Figure 3.9 Lanatoside C does not bind the A93D PNT domain
Shown are overlaid 15N-HSQC spectra of the 15N-labelled A93D PNT domain in the absence (blue) and presence of a 5:1 molar ratio (orange) of Lanatoside C (with 5 % DMSO, final). The lack of any chemical shift perturbations indicates no detectable binding. For a list of all compounds tested in a similar manner, see Table 3.3.

3.3.3 Development of a yeast two-hybrid PNT domain assay

In collaboration with Dr. Ivan Sadowski, a yeast two-hybrid assay was developed to screen for inhibitors of ETV6 PNT domain association. Two-hybrid systems commonly utilize the GAL4 DBD. However, in preliminary tests with the A93D PNT domain covalently attached to this DBD and the V112E PNT domain covalently attached to the HIS3 activating domain, no PNT domain association was observed. The GAL4 DBD binds to DNA as a dimer, and we speculated that this may somehow preclude PNT domain heterodimerization. Thus, we switched to the POU DBD, which binds DNA as a monomer.
To characterize and validate the assay introduced earlier in Figure 3.3, we showed that the yeast strains containing the “Bait + Prey” plasmids indeed grew in the absence of histidine, whereas the “Bait + Control” strain that lacks the V112E PNT domain showed little to no growth (Figure 3.10). Growth was restored upon addition of histidine to the latter, indicating that lack of growth was specifically due to lack of HIS3 gene expression.

Figure 3.10 Yeast two-hybrid assay controls
Yeast growth was monitored by OD600. Co-expression of the “Bait” (A93D PNT domain/POU DBD) + “Prey” (V112E PNT domain/HIS3 AD) plasmids enables yeast growth with (light purple) and without (dark purple) addition of histidine. In contrast, without added histidine, no growth was seen for yeast containing the “Bait” plasmid and the empty prey plasmid (Control) that lacks the complementary PNT domain. Each condition was replicated 5 times.

3.3.4 High-throughput screening using the yeast two-hybrid assay
In total, over 8,000 compounds from the Selleck, Biomol, Sigma LOPAC, Prestwick, Microsource Spectrum and PoPPI libraries were screened in the yeast two-hybrid assay. Details of these libraries are outlined in the Materials and Methods section of this chapter. The effects of the compounds on yeast cell growth were displayed as histograms for each library (Figure 3.11). In general, the libraries showed a narrow distribution of growth percentages, with most compounds affecting growth by less than 10%. The PoPPI library contained 22 compounds that showed ≥ 20% growth inhibition at 15 µM. The Sigma library had 12 compounds that showed ≥ 20% growth inhibition at 7.5 µM. The Prestwick library had 30 compounds that showed ≥ 25% growth inhibition. The Microsource library had 47 compounds that showed ≥ 20% growth inhibition at 7.5 µM. The Biomol library had 16 compounds that showed ≥ 20% growth inhibition.
The Selleck library had a wider distribution, with 115 compounds having ≥ 50% growth inhibition.

Figure 3.11 Screening compounds from six libraries with the yeast two-hybrid assay
Compounds from six libraries were tested with yeast co-expressing the “Bait” (A93D PNT domain/POU DBD) + “Prey” (V112E PNT domain/AD) plasmids. Histograms show the number of compounds (frequency) versus growth relative to untreated cells, defined as 100%. Compounds causing reduced growth were subsequently carried on to secondary validation. Although not pursued further, it is interesting that some compounds increased cell growth. One speculative cause of this could be enhanced PNT domain heterodimerization.

Secondary screening was carried out in the “Bait + Prey” strain without and with histidine, using multiple concentrations of the hits from the primary screen. For the PoPPI library, none of the 22 compounds showed any growth inhibition (without and with histidine) upon retesting at 15 µM. The 12 hits from the Sigma library were retested at 2.5 and 7.5 µM and similarly none were found to be inhibitory. The 30 hits from the Prestwick library and the 115 from the Selleck library were retested at 5 and 15 µM. All compounds that inhibited yeast growth without histidine also inhibited growth in the presence of histidine.

Several compounds from the Microsource library, such as tannic acid, showed a decrease of cell growth in the “Bait + Prey” strain that could be restored with addition of histidine, and no cell toxicity in the “Bait + Control” strain (Figure 3.12). Although such results suggest that these compounds may inhibit PNT domain dimerization, this behaviour would also be expected of compounds that interfere with histidine biosynthesis. Indeed, one hit from the Biomol collection, acivicin, that caused growth inhibition in the absence, but not presence, of histidine is a glutamine analogue that can inhibit γ-glutamyltransferase (Winkler and Ramos-Montanez, 2009).
Inhibition of this enzymatic activity impairs histidine biosynthesis, and this is likely the reason for the selective growth reduction. Despite being an off-target effect, this result provides a validation of the assay in the sense that it can successfully identify compounds that interfere with histidine biosynthesis.

Figure 3.12 Secondary validation of screening hits from the yeast two-hybrid assay
Both tannic acid and acivicin showed concentration-dependent inhibition of the growth of yeast containing the “Bait + Prey” plasmids. This could be rescued through addition of histidine, indicating that neither compound is toxic to the cells. A similar pattern was seen with yeast harboring the “Bait + Control” plasmids, whereby histidine rescues the cell growth in the absence of a functioning HIS3 gene. Acivicin was later found to be an inhibitor of γ-glutamyltransferase, which is necessary for histidine synthesis.

Table 3-4 Compounds identified as verified hits in the yeast assay
Three compounds that were identified as selectively inhibiting yeast growth in the absence, but not presence, of histidine were purchased for testing for binding to either the A93D PNT or V112E PNT domain via NMR-monitored titrations.

Compound Tested    Molar Ratios Tested
Paromomycin        0.5, 3
Tannic Acid        0.5, 5
Sanguinarine       0.5, 5

Three compounds that passed the secondary screening and have no reported impact on the biosynthesis of histidine were purchased for final validation utilizing NMR spectroscopy (Table 3.4). As any potential interface selectivity was unknown, NMR-monitored titrations were carried out with both the A93D PNT and V112E PNT domains. However, no amide chemical shift perturbations indicative of binding to either monomeric protein were observed upon addition of any of the three compounds (Figure 3.13).
It is possible, but unlikely, that these compounds may bind the PNT domain heterodimer and induce a structural change that precludes activation of HIS3 expression. However, a more likely explanation is that the compounds identified in the two-hybrid assay impact some aspect of the histidine biosynthesis pathway. Thus, as with the PCA screening approach, no compounds that inhibit PNT domain association were identified using the complementary yeast two-hybrid system.

Figure 3.13 Tannic acid does not bind the A93D PNT domain
Shown are overlaid 15N-HSQC spectra of the 15N-labelled A93D PNT domain in the absence (blue) and presence of a 3:1 molar ratio (orange) of tannic acid (with 5 % DMSO, final). The lack of any chemical shift perturbations indicates no detectable binding. For a list of all compounds tested in a similar manner, see Table 3.4.

3.3.5 In silico screening for ETV6 PNT domain PPI inhibitors
In collaboration with Dr. Sessions at the University of Bristol, I carried out in silico screening for ETV6 PNT domain PPI inhibitors using the BUDE algorithm with their in-house high-performance computer cluster. Dr. Sessions’ group has taken the ~ 8 million commercially available compounds in the University of California San Francisco ZINC (Zinc Is Not Commercial) 8 library and generated 20 conformers of each to produce over 160 million poses. Virtual docking of these species was initially directed towards 15 × 15 × 15 Å³ search boxes centred on either one of the two interfacial regions of the PNT domain (using a monomer structure with the appropriate wild-type interfaces from PDB 1LKY). That is, for screening of potential inhibitors that bind the ML- or EH-interfaces, the search boxes were centred on A93 in the V112E PNT domain monomer or V112 in the A93D PNT domain monomer, respectively.
The ZINC8 library with generated conformers was partitioned into 362 separate folders with 25 subdirectories, each containing the ligands and their conformers, to allow for smaller-scale BUDE runs, thereby minimizing adverse effects from any failed jobs and avoiding overloading the high-performance computer. After running BUDE on each PNT domain interface, the top 500 compounds predicted to have the lowest binding energies were selected. I then ran MD simulations on the resulting complexes to determine if the compounds would theoretically remain associated with the PNT domain over a 10 ns duration. To analyze these data, I plotted the root mean square deviation (RMSD) of the ligand from the initial pose, the midpoint pose (5 ns) and the endpoint pose (10 ns) throughout the 10 ns MD simulation. The 500 ligands were scored as “excellent”, “good”, “mediocre” or “bad” based on the RMSD fluctuations (Figure 3.14). In parallel, using the visualization software Chimera, I looked at every compound in the top 500 list and noted whether it appeared to be closely associated with the EH- or ML-interface, exhibiting complementary protein-ligand interactions including fitting into small surface pockets (Figure 3.15). In addition, I noted whether the same compound with a different conformation occurred within the top 500 list and whether any common chemical motifs frequently appeared.

Compounds selected for further validation by NMR spectroscopy had excellent or good scores in MD simulations, appeared to interact closely with the PNT domain, and represented a range of chemical motifs to increase diversity. Preference was given to compounds that were predicted to bind in multiple conformers. Several additional compounds that had a BUDE ranking >500 were also selected due to possessing certain structural motifs, such as carboxylates or protonated amines that could potentially interact with complementary motifs of the PNT domain.
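The RMSD-based scoring described above can be sketched as follows. This is an illustrative reconstruction, not the script used in this work: the scoring here was done by inspecting the RMSD plots, whereas the sketch reduces each trajectory to its largest ligand RMSD relative to the start, midpoint and endpoint poses and applies the 5, 10 and 30 Å cut-offs given in the caption of Figure 3.14. It also assumes that all frames have already been superimposed on the protein, so that ligand coordinates can be compared directly; `rmsd` and `score_ligand` are hypothetical names.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]  # ligand atom coordinates for one MD frame, in Å


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root mean square deviation between two poses with identical atom order.
    No fitting is performed; frames are assumed to be pre-aligned on the protein."""
    if not frame or len(frame) != len(reference):
        raise ValueError("frames must be non-empty and of equal length")
    sq = sum((x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
             for (x, y, z), (rx, ry, rz) in zip(frame, reference))
    return math.sqrt(sq / len(frame))


def score_ligand(trajectory: Sequence[Frame], excellent: float = 5.0,
                 good: float = 10.0, bad: float = 30.0) -> str:
    """Score a ligand from its largest RMSD to the start, midpoint and end poses."""
    n = len(trajectory)
    references = (trajectory[0], trajectory[n // 2], trajectory[-1])
    worst = max(rmsd(f, ref) for ref in references for f in trajectory)
    if worst < excellent:
        return "excellent"
    if worst < good:
        return "good"
    if worst > bad:
        return "bad"       # ligand has likely dissociated from the protein
    return "mediocre"
```

A single maximum does not capture whether a trace is "stable" in the visual sense used for Figure 3.14, so such a function would at most pre-sort the 500 trajectories before manual inspection.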
Collectively, this resulted in 35 and 17 compounds predicted to bind the ML- or EH-interfaces of the V112E or A93D PNT domain monomers, respectively (Table 3.5).

Figure 3.14 MD simulations of initial BUDE hits for binding to the A93D PNT domain
Coordinates of the ligand over the course of 10 ns MD simulations were compared to the starting (red), midpoint (purple) and endpoint (blue) poses. Ligands were deemed excellent candidates if their RMSD fluctuations remained stable throughout the simulation and under 5 Å (top left). Ligands were deemed good if the RMSD fluctuations remained under 10 Å (top right). Ligands were deemed mediocre if they were not stable and their RMSD fluctuations exceeded 10 Å. Ligands were deemed bad if the fluctuations were greater than 30 Å, likely meaning that the ligand dissociated from the protein. These compounds are identified in the figure by their BUDE rank, which is BUDE’s ordering of the compounds from most to least likely to bind to the PNT domain interface.

Figure 3.15 Predicted structures of compounds bound to the ETV6 PNT domain
(a) Surface rendition of the V112E PNT domain with compounds predicted to bind the ML-interface centred around A93 (red). (b) Surface rendition of the A93D PNT domain with compounds predicted to bind the EH-interface centred around V112 (red). A range of chemical diversity and binding locations can be seen. The illustrated compounds were experimentally tested for binding by NMR spectroscopy (Table 3.5, Appendix D). Adapted from PDB 1LKY.

As with the cellular PCA and yeast two-hybrid screens, I chose to use NMR spectroscopy to validate whether the in silico screening hits indeed bound to the ETV6 PNT domains. Specifically, 15N-HSQC spectra were used to monitor the addition of the candidates to either the A93D or V112E PNT domains (Table 3.5).
Not all the compounds in the final filtered list were commercially available and thus I tested only a subset that could be purchased from MolPort (Latvia). Of these, compound MolPort-019-818-459 was a close structural analog of one of the BUDE compounds that was unavailable for purchase. Initially, I screened compounds at 0.5:1 and 5:1 molar ratios, but subsequently used a 3:1 ratio for expediency. (At 20:1 ratios of the compounds to PNT domain, DMSO effects were observed.) Unfortunately, in all cases but one, no chemical shift perturbations were observed, indicating no detectable PNT domain binding. In the case of compound MolPort-001-020-317, its titration into the V112E PNT domain caused substantial chemical shift perturbations. However, upon further investigation, this was found to result from a lowering of the sample pH value and not PNT domain binding (data not shown).

Table 3-5 Candidate BUDE compounds tested for ETV6 PNT domain binding
Based on BUDE virtual docking, 52 compounds were ordered and tested for binding to either the A93D or V112E PNT domain via NMR-monitored titrations.
BUDE Rank                  MolPort ID             Target    Molar Ratios Tested
6                          MolPort-010-780-927    A93D      0.5, 5, 20
11                         MolPort-005-019-637    A93D      3
28                         MolPort-007-725-254    A93D      3
39                         MolPort-007-701-986    A93D      3
51                         MolPort-005-140-761    A93D      0.5, 5
56                         MolPort-002-668-848    A93D      5
78                         MolPort-005-090-877    A93D      0.5, 5
86                         MolPort-005-123-375    A93D      0.5, 5
93                         MolPort-005-308-219    A93D      3
95                         MolPort-007-773-309    A93D      3
129                        MolPort-007-690-631    A93D      3
130                        MolPort-007-702-123    A93D      3
155                        MolPort-007-899-016    A93D      3
208                        MolPort-002-736-787    A93D      3
255                        MolPort-001-906-708    A93D      3
292                        MolPort-009-154-411    A93D      0.5, 5
319                        MolPort-009-723-175    A93D      5
374                        MolPort-005-299-926    A93D      0.5, 5
378                        MolPort-005-689-600    A93D      0.5, 5
421                        MolPort-009-483-070    A93D      0.5, 5, 20
428                        MolPort-009-178-639    A93D      0.5, 5
453                        MolPort-009-077-356    A93D      3
466                        MolPort-005-035-860    A93D      0.5, 5
478                        MolPort-009-113-626    A93D      0.5, 5, 20
631                        MolPort-001-930-558    A93D      3
825                        MolPort-001-930-557    A93D      3
2585                       MolPort-000-512-139    A93D      3
3186                       MolPort-004-881-711    A93D      3
3584                       MolPort-004-284-939    A93D      3
4073                       MolPort-008-311-002    A93D      3
7648                       MolPort-007-566-792    A93D      3
7994                       MolPort-004-882-201    A93D      3
8088                       MolPort-003-155-823    A93D      3
N/A (Structural Analog)    MolPort-019-818-459    A93D      3
2                          MolPort-007-690-631    V112E     3
3                          MolPort-007-690-630    V112E     0.5, 5, 22
75                         MolPort-005-970-014    V112E     0.5, 5
76                         MolPort-002-694-487    V112E     0.5, 5
89                         MolPort-001-020-317    V112E     0.5, 1, 3, 5, 10
92                         MolPort-009-696-679    V112E     0.5, 5
95                         MolPort-000-473-014    V112E     3
104                        MolPort-009-746-988    V112E     0.5, 5, 22
129                        MolPort-007-969-298    V112E     3
134                        MolPort-004-826-786    V112E     3
136                        MolPort-005-004-054    V112E     0.5, 5
196                        MolPort-044-259-270    V112E     0.5, 5
253                        MolPort-009-724-755    V112E     0.5, 5, 20
280                        MolPort-002-736-787    V112E     3
356                        MolPort-003-175-993    V112E     0.5, 5
376                        MolPort-000-839-413    V112E     0.5, 5, 20
385                        MolPort-002-571-074    V112E     3
483                        MolPort-003-109-637    V112E     0.5, 5

3.3.6 In silico screening with Bristol University Docking Engine for compounds binding
to the intermolecular salt bridge forming residues of the ETV6 PNT domains
The alanine scanning experiments were carried out in parallel with the BUDE screen and indicated that intermolecular salt bridges contribute substantially to PNT domain association. Also, previous studies suggested that potential small molecule binding pockets occur near these salt bridges (Cetinbas et al., 2013). Therefore, I decided to target these interfacial residues for a further round of in silico screening. Because protein structures are dynamic, I chose to use structures from the MD simulations discussed in Section 2.5.4 to account for possible sidechain or backbone conformational fluctuations of the PNT domain (Figure 3.16). A structure was chosen from every 10 ns step of the 100 ns MD simulation and BUDE was targeted to the K99-D101 salt bridge. To accelerate the process, we chose to screen only the top 200,000 compounds from the first PNT PPI BUDE screen and their respective conformers, totaling ~ 4,000,000 ligand poses targeted to the 10 different A93D or V112E PNT domain structures. The resulting top 500 candidates were visually inspected and 10 compounds with chemical diversity were selected to test for binding to the A93D or V112E PNT domain. Notably, these corresponded to complexes with only 3 or 2 of the 10 different starting structures for the A93D and V112E PNT domains, respectively (Figure 3.17). The virtual ligands were predicted to bind preferentially to certain PNT domain structures, suggesting that flexibility of the protein may expose conformations better suited for small molecule interactions. In addition to those chosen by visual inspection, several compounds with a BUDE ranking >500 were selected due to possessing certain desired structural motifs.
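For orientation, the sizes of the two virtual screens follow directly from the numbers quoted in the text, as the short calculation below shows. The last quantity, the number of ligand pose and receptor structure pairings per PNT domain variant, is not stated in the text and is derived here under the assumption that every pose was docked against each of the 10 MD-derived structures.

```python
CONFORMERS_PER_COMPOUND = 20

# First screen: ~8 million ZINC8 compounds, 20 conformers each.
zinc8_compounds = 8_000_000
first_screen_poses = zinc8_compounds * CONFORMERS_PER_COMPOUND

# Second screen: top 200,000 compounds from the first screen.
rescreen_compounds = 200_000
rescreen_poses = rescreen_compounds * CONFORMERS_PER_COMPOUND

# One structure per 10 ns step of a 100 ns MD trajectory.
md_structures = 100 // 10

# Assumption: every pose is docked against every MD-derived structure.
pairings_per_domain = rescreen_poses * md_structures

print(f"first screen poses:   {first_screen_poses:,}")
print(f"second screen poses:  {rescreen_poses:,}")
print(f"MD-derived structures: {md_structures}")
print(f"pose/structure pairings per domain variant: {pairings_per_domain:,}")
```

The roughly 40-fold reduction in poses is what made docking against multiple receptor conformations tractable in the second screen.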
Figure 3.16 Superimposed MD simulated structures for in silico screening of compounds predicted to bind proximal to the K99-D101 salt bridge
Overlaid are ten representative V112E PNT domain (green) and A93D PNT domain (blue) structures resulting from every 10 ns interval of a 100 ns simulation. The PNT domain structure is maintained throughout the trajectories with small conformational fluctuations. The wild-type A93 and V112 residues are highlighted (red). BUDE targeted the virtual ligands to either K99 or D101 (purple).

Figure 3.17 Virtual docking of compounds targeted to the K99-D101 interfacial regions of the PNT domain
Shown are surface models of three A93D PNT domain (blue) and two V112E PNT domain (green) structures, generated by MD simulations, with four and six docked compounds, respectively, that were experimentally tested for binding by NMR spectroscopy. Compounds tested are shown in more detail in Appendix D.

As performed previously, NMR spectroscopy was used to test whether the compounds from this virtual screen indeed bound to the PNT domain (Table 3.6). Unfortunately, at a 3:1 molar ratio, no compound produced any amide 1HN-15N chemical shift perturbations indicative of binding (Figure 3.18).

Table 3-6 Compounds tested against the K99-D101 salt bridge
Compounds were identified by BUDE as predicted to bind at the PNT domain K99-D101 salt bridge. Ten compounds were ordered and tested for binding to either the A93D PNT or V112E PNT domain with 15N-HSQC NMR-monitored titrations.
BUDE Rank    MolPort ID             Target    Molar Ratio Tested
31           MolPort-002-007-977    A93D      3
43           MolPort-005-102-430    A93D      3
2860         MolPort-002-668-848    A93D      3
3221         MolPort-009-704-070    A93D      3
1            MolPort-009-747-005    V112E     3
3            MolPort-009-746-988    V112E     3
22           MolPort-003-156-017    V112E     3
45           MolPort-009-747-002    V112E     3
75           MolPort-001-630-482    V112E     3
3580         MolPort-008-295-789    V112E     3

Figure 3.18 MolPort-005-035-860 does not bind the A93D PNT domain
Shown are overlaid 15N-HSQC spectra of the 15N-labelled A93D PNT domain in the absence (blue) and presence of a 5:1 molar ratio (orange) of a BUDE screening hit, MolPort-005-035-860 (with 5 % DMSO, final). The lack of any chemical shift perturbations indicates no detectable binding.

Collectively, 60 candidate compounds from two BUDE in silico screens were tested by NMR spectroscopy and found not to bind the ETV6 PNT domain with any detectable affinity. For further validation, the compounds were additionally screened in both the yeast two-hybrid and the mammalian cell assays. No compounds were growth inhibitory in the yeast assay, and in the PNT domain PCA, one compound was identified as a screening hit (Figure 3.19). However, this compound also reduced luminescence in the leucine zipper PCA, possibly because it inhibits luciferase or because of other confounding factors related to cellular proliferation. Thus, after exhaustive efforts with experimental and in silico screening, I was unable to identify a compound that bound the ETV6 PNT domain and inhibited its self-association.

Figure 3.19 Compounds selected from the BUDE in silico screens did not inhibit PNT domain association in the mammalian split-luciferase PCA.
Luminescence readings (BioTek Neo 2 instrument) of the 60 compounds that were ordered as potential lead compounds from the BUDE in silico screens. One compound (blue) reduced luminescence in both the PNT domain and leucine zipper PCAs.
3.4 Discussion
3.4.1 Caveats of the experimental screening assays performed against the ETV6 PNT domain
Despite using three high-throughput screening approaches, no compounds were identified that inhibited PNT domain association in the mammalian PCA or yeast two-hybrid assays, or bound to the PNT domain as measured by NMR spectroscopy. In retrospect, this disappointing result is not unexpected. A caveat of experimental high-throughput screens against PPIs is that they are often successful only after screening hundreds of thousands of compounds and carrying out SAR studies to optimize initially detected weak-binding compounds (Wells and McClendon, 2007). In contrast, the PCA and the yeast assays described herein involved only ~ 18,000 compounds. Within the constraints of an academic environment, it was not possible for me to increase the size of these screens by the desired ten- or hundred-fold. Furthermore, unless present at very high concentrations, a compound that binds to the PNT domain with even micromolar affinity would not likely disrupt PNT domain heterodimerization (KD ~ nM) sufficiently for detection in "down" cellular assays. In addition, currently available HTS libraries of small molecules may be better suited for binding protein pockets rather than large, flat interfaces.

3.4.2 The in silico screening assay also failed to identify any compounds that bound the PNT domain
In principle, virtual screening is an attractive high-throughput approach, allowing millions of compounds to be tested in silico for binding at specific locations. However, high costs in computer time and manual labor for various optimizations and trials are still required. Indeed, the computational time alone for the virtual screens presented in this chapter was well over a month, and the work required two extended visits to the University of Bristol.
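The affinity argument made in Section 3.4.1 can be explored with a simple equilibrium model. The sketch below is an idealization and not an analysis performed in this work. It assumes a 1:1 heterodimer, a purely competitive inhibitor that binds only one of the two monomers, an inhibitor present in large excess over protein (so that its free concentration equals its total concentration), and full equilibrium. Under these assumptions the inhibitor raises the apparent dissociation constant to KD(1 + [I]/KI), and the heterodimer concentration follows from the usual quadratic binding equation. The function name and all concentrations in the usage example (100 nM of each monomer, KD = 1 nM, KI = 1 µM) are hypothetical; the intracellular concentrations of the fusion proteins were not measured.

```python
import math


def heterodimer_conc(p1_total: float, p2_total: float, kd: float,
                     inhibitor: float = 0.0, ki: float = math.inf) -> float:
    """Equilibrium heterodimer concentration (same units as the inputs).

    Assumes a competitive inhibitor that binds only monomer 1 and is in
    large excess, so that Kd_app = Kd * (1 + [I]/Ki).
    """
    kd_app = kd * (1.0 + inhibitor / ki)
    s = p1_total + p2_total + kd_app
    # Physically meaningful root of D^2 - s*D + P1*P2 = 0.
    return (s - math.sqrt(s * s - 4.0 * p1_total * p2_total)) / 2.0


if __name__ == "__main__":
    # All concentrations in nM; values are hypothetical.
    for inhibitor_nM in (0.0, 1_000.0, 10_000.0, 30_000.0):
        d = heterodimer_conc(100.0, 100.0, kd=1.0,
                             inhibitor=inhibitor_nM, ki=1_000.0)
        print(f"{inhibitor_nM / 1000:5.1f} µM inhibitor: {d:5.1f} nM heterodimer")
```

The outcome depends strongly on the assumed protein concentrations and affinities, and the model ignores the slow dissociation of pre-formed heterodimers, which would further delay any loss of signal during a 4-hour incubation.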
Rigid docking of ligands to a well-defined groove or cleft on a receptor presents a relatively straightforward “lock-and-key” scenario for virtual screening. In contrast, cases such as the PNT domain with relatively large flat interfaces are significantly more challenging. Accounting for protein flexibility or surface fluctuations may help find small molecules amenable to binding. Unfortunately, current computational limits make screening with MD simulations of ligands and receptors prohibitive. This is why I chose to run 10 ns MD simulations only on the top 500 compounds that bound to each interface. Such simulations were sufficient to identify transient pockets of BCL-XL, IL-2, and MDM2 into which inhibitors have been docked (Eyrisch and Helms, 2007).

Despite identifying compounds predicted to associate with the PNT domain interface for 10 ns, when tested experimentally, none showed any binding or disruption of heterodimerization. One of the most promising compounds, which had multiple conformers associate for the full 10 ns, was not commercially available. This poses a challenge for researchers, as synthesizing compounds identified from a virtual screen may be time-consuming and costly. Thus, it is beneficial when conducting an in silico screen to focus on compounds that have commercial sources. Multiple conformers of the same ligand appearing as screening hits are desirable, as this may indicate that a certain backbone structure is being selected and shows that a similar structure gives a similar outcome. No weak or tight binding compounds were identified from the in silico screen. While efforts were made to maximize chemical diversity and interfacial area, compounds capable of binding to the ETV6 PNT domain may have been missed. We utilized the ZINC8 library, instead of a recently expanded version (ZINC15) comprising 120 million purchasable “drug-like” compounds (Sterling and Irwin, 2015).
Although this larger library could be tested, a “similarity paradox” has been described where minor chemical modifications of otherwise similar molecules can render them active or inactive (Bajorath, 2002). Thus, while certain chemical motifs and scaffolds were commonly seen from the BUDE screen, the small structural differences may have prevented binding. Detecting such differences in high-throughput in silico screens may prove challenging.

3.4.3 Potential avenues for HTS improvement
The PNT domain polymerization is tight and the interaction long-lived, and a possible way to improve the PCA approach is to insert a tunable “key” for protein expression. Using this approach, after addition of compounds to the cells, the partner PNT domain and luciferase fragment could be “turned on”. This would then allow a compound to bind to a free PNT domain interface rather than “waiting” for heterodimer dissociation. Several methods of introducing a tunable key exist and in general, gene expression can be controlled by inducible promoters (Pedone et al., 2019). However, introducing an inducible promoter adds another layer of complexity to an HTS assay. Another modification that may help in identifying a small molecule inhibitor would be to use the alanine scanning data to create a lower affinity PNT domain heterodimer, in order to initially screen with less stringency for weakly binding compounds. When traditional HTS methods have not produced good starting compounds for lead optimization, several other PPI inhibitor screening approaches may be advantageous. Success has been achieved using tethered molecule and fragment-based drug design (FBDD) approaches (Mayr and Bojanic, 2009). Tethering, or disulphide trapping, involves identifying molecules that form a covalent link upon binding weakly to a targeted surface area (Arkin et al., 2014).
FBDD libraries include relatively small molecules with representative chemical properties that lack any inherent bias towards a certain target class (Scott et al., 2016). In both cases, these techniques rely on initially identifying simple, weak binding molecules. Combinatorial chemistry is then used to build a larger, higher affinity compound. Again, insights from the alanine scanning mutagenesis studies may aid a rationalized approach of tethering near the ETV6 PNT domain interface.

3.4.4 Identifying an inhibitor of PNT domain polymerization will likely involve complementary approaches
Identifying PPI inhibitors is challenging and success is usually achieved through utilizing many techniques. Virtual and experimental screening often complement each other (Bajorath, 2002). While an inhibitor was not identified, we have established two cellular assays that can be used to rapidly screen many compounds. The in silico approach identified certain common structural motifs predicted to bind to the ETV6 PNT domain, such as a pyridazine moiety that seemed to preferentially bind the A93D PNT domain interface, and this information may be used to guide an FBDD or tethering screening approach.

While the PNT domain interface has proven to be challenging, it may be that a small molecule capable of binding is present in commercial collections that have not yet been screened. At the time of this research, only a small library of 1,534 compounds targeting PPIs was available through the Perturbation of Protein-Protein Interactions (PoPPI) collaborative program. More libraries suited for targeting PPIs are being developed and these may prove beneficial in finding potential inhibitors of PNT domain polymerization. In addition, peptides or peptidomimetics may have more success in binding to the interface.

Despite identifying no binding compound, significant progress in assay development was achieved and we have learned how challenging a target this interface is.
Thus, either a targeted approach or screening libraries thought to be better suited for PPIs, for example peptide or macrocycle libraries, may be the next direction.

3.5 Materials and Methods
3.5.1 Chemicals for screening
Screening compounds for the split luciferase PCA and the yeast two-hybrid assays came from the Canadian Chemical Biology Network (CCBN) library housed in the Roberge laboratory. This collection consists of >30,000 chemicals: 16,000 compounds from the Maybridge Hitfinder collection, 10,000 compounds from the ChemBridge DIVERset collection, 1,120 compounds from Prestwick Chemicals, 1,280 compounds from the Sigma LOPAC library, 2,000 compounds from the Microsource Spectrum collection, 2,697 compounds from the Selleck L1700 Bioactive Compound library, and 500 compounds from Biomol. The compounds were stored in 96-well plates at -25 °C as 5 mM stock solutions in DMSO. In addition, a small molecule library targeting PPIs consisting of 1,534 compounds was provided by the Perturbation of Protein-Protein Interactions (PoPPI) collaborative program (UK). A small natural products library from Dr. Raymond Andersen’s laboratory was also screened. The UCSF ZINC8 chemical compounds database was used for in silico docking and candidate compounds were purchased from MolPort for screening in both cellular assays and by NMR spectroscopy.

3.5.2 Split luciferase PCA
3.5.2.1 Cloning steps for in vivo split luciferase assay
The genes encoding a truncated fragment of ETV6 (residues 43-125), encompassing the PNT domain, were generated with either an A93D or V112E substitution and cloned into a modified mammalian expression vector pcDNA3.1/Zeo(+) at the 5’-end of a construct containing a 10 amino-acid flexible polypeptide and humanized Gaussia luciferase (hGLuc) fragments (provided by Dr. Michnick (Remy and Michnick, 2006)) using polymerase incomplete primer extension (PIPE) techniques (Klock and Lesley, 2009).
The resulting constructs encoded either the A93D or V112E PNT domain, a (GGGGS)2 flexible linker and either hGLuc(1) (residues 1-93) or hGLuc(2) (residues 94-196), described here as N-Luc or C-Luc, respectively (Figure 3.2, Appendix A). These vectors also contain a zeocin selection marker for mammalian transfections and an ampicillin resistance gene and pUC origin for selection and maintenance in E. coli. Subsequently, the A93D-PNT domain/N-Luc fusion protein was subcloned into the mammalian expression vector pcDNA3.1/Neo(+) to allow for differential antibiotic selection in co-transfections. A control set of plasmids containing leucine zippers as the dimerization domains was also provided by Dr. Michnick (Remy and Michnick, 2006).

3.5.2.2 Mammalian cell culture
Human embryonic kidney 293 (HEK293) cells were provided by Dr. Masayuki Numata (UBC) and cultured in Dulbecco’s Modified Eagle Medium (DMEM; Gibco) supplemented with 10% fetal bovine serum (FBS; Sigma) and 1% antibiotic-antimycotic (Gibco). Unless otherwise noted, this medium was used in all experiments. Stably expressing transformants were treated additionally with either or both 400 µg/mL G418 (Gibco) and 50 µg/mL zeocin (Invitrogen). Cells were incubated at 37 °C with 5% CO2, and passaged upon reaching approximately 75-85% confluency.

3.5.2.3 Transient expression validation and establishment of a stably expressing PNT domain PCA system
Prior to generation of stably expressing cells, proof-of-principle validation was conducted in transiently transfected cells. HEK293 cells were seeded in 96-well clear bottom black microplates (Corning) at 15,000 cells/well and incubated at 37 °C. After 24 hours, the medium was aspirated and 100 µL of fresh medium was added. For transfection of a single species of DNA, 20 ng/µL of plasmid was added to OPTIMEM (Gibco), and for transfection of two species of DNA, 10 ng/µL of each plasmid was added to OPTIMEM.
Lipofectamine 2000 (Invitrogen) was diluted to 8% in OPTIMEM and added to the DNA at a 1:1 v:v ratio and incubated for 5 minutes at room temperature. After incubation, 10 µL of the prepared mixture of DNA, OPTIMEM and Lipofectamine 2000 was added to each well. Cells were incubated at 37 °C for either 24, 48, or 72 hours. Prior to the luminescence reading, 50 µL of cell medium was removed and then an equal volume of NanoFuel GLOW Assay for Gaussia luciferase reagent (Nanolight Technology) was added to the cells at a 1:1 ratio. The plates were incubated in the dark at ambient temperature for 5 minutes, and luminescence output was read for one second with the Varioskan LUX multimode microplate reader.

For generation of stably expressing cells, antibiotic kill curves were performed on HEK293 cells for G418 and zeocin, corresponding to the neomycin and zeocin resistance markers, respectively, and optimal antibiotic concentrations for selection (400 µg/mL and 50 µg/mL, respectively) were determined. HEK293 cells were seeded in 6-well microtitre plates at 400,000 cells/well and grown overnight to approximately 80% confluence. Cell medium was changed and then cells were transfected with Lipofectamine 2000 utilizing 25 ng/µL DNA for single plasmid transfections and 12.5 ng/µL DNA for each plasmid in a double plasmid transfection. After 24 hours, the medium was aspirated, fresh medium was added and the cells were incubated again overnight. After 48 hours of transfection time, selection was introduced by incubating cells with medium supplemented with the corresponding antibiotic(s), replacing the medium every 2-3 days, and splitting cells when 80% confluency was achieved. The resulting stable transformants were stored in liquid nitrogen.
3.5.2.4 High-throughput split luciferase chemiluminescence assay

For screening, A93D-PNT/N-Luc(Neo) and V112E-PNT/C-Luc(Zeo) stably expressing cells were plated at 40,000 cells/well in 96-well clear bottom black microplates (Corning) and incubated overnight at 37 °C. Compound plates were thawed and compounds were added to cells using a BioRobotics BioGrid Robot Microarrayer Model equipped with a 96-Pin Tool. Either 0.7 mm or 0.4 mm diameter pins were used. After a 4-hour incubation at 37 °C, a 1:1 ratio of NanoFuel GLOW Assay for Gaussia luciferase reagent was added to the cells and plates were incubated in the dark at ambient temperature for 15 minutes. This incubation time was found to give the highest luminescence readout when the luminescence of untreated cells was monitored over several hours. Luminescence output was then read with the Varioskan LUX multimode microplate reader, exported to MS Office Excel (Microsoft), and subsequently analyzed with GraphPad Prism software. Due to restricted availability of the Varioskan reader, a BioTek Neo 2 reader was utilized in some final experiments.

3.5.2.5 Secondary validation assays

Compounds that yielded a lower luminescence in the initial PNT domain PCA screens were re-tested at different concentrations in the stable cell line PNT domain and leucine zipper PCAs. The latter served as a specificity control. Cells were seeded at 40,000 cells per well in 96-well plates and incubated at 37 °C overnight. The selected compounds were added to wells in duplicate at final concentrations of 1, 3, 10 and 30 µM. After a 4-hour incubation, cells were visually examined to determine whether the compound caused toxicity or precipitated from solution. Luminescence was read as previously described and the results from the two PCAs were compared.

3.5.3 Yeast assay

3.5.3.1 Yeast two-hybrid assay construct development

The two-hybrid assay consists of bait and prey plasmids and a yeast reporter strain.
All plasmids and yeast strains were generated by Dr. Ivan Sadowski (UBC). The bait plasmids express a fusion between the Oct1 POU DNA-binding domain and the A93D-PNT domain (pIS586) or the V112E-PNT domain (pIS587) from an ADH1 promoter. They are ARS-CEN (single copy) plasmids with a TRP1 marker. The empty vector expressing the Oct1 POU DNA-binding domain alone is pIS341.

The prey plasmids express a fusion between the NLS-B42 activation domain (B42 is a random fragment selected for its intermediate transactivation capability) and the A93D-PNT domain (pIS590) or the V112E-PNT domain (pIS591) from a GAL1 promoter. These are 2 micron (multicopy) plasmids with a LEU2 marker. The empty vector expressing the NLS-B42 activation domain alone (referred to as the HIS3 AD) is pIS580.

S. cerevisiae strain ISY361 was constructed from strain W303. It contains two integrated reporter genes. The HIS3 reporter was integrated at an ade8 disruption with plasmid pIS452. The lacZ reporter was integrated at a lys2 disruption with pIS341. The expression of the two reporter genes is controlled by 4 POU binding sites. The genotype is MATa, ade2-1, his3-11,15, leu2-3,112, trp1-1, ura3-1, can1-100, lys2::POU ops-LACZ, ade8::Pou ops-HIS3. ISY361 was transformed with bait plasmid pIS586 and prey plasmid pIS591 to generate strain ISY361+/+, which expresses the POU-A93D bait and the NLS-B42-V112E prey. A strain containing bait plasmid pIS586 and prey plasmid pIS580, which lacks the V112E domain, was also generated to serve as a control and is referred to as ISY361+/-.

3.5.3.2 Yeast screening assay

Yeast media were prepared with reagents obtained from Becton Dickinson and Sunrise Science Products. Strain ISY361+/+ was grown overnight at 30 °C with agitation in Synthetic Complete (SC) medium lacking Leu and Trp and containing 2% glucose.
Cells were harvested by centrifugation at 4,700 × g for 5 minutes, pellets were rinsed twice with sterile distilled water, and cells were suspended at OD595 = 0.01 in SC medium lacking Leu, Trp and His and containing 2% galactose instead of glucose. Cell suspension (100 µL) was distributed to wells of sterile clear flat bottom polystyrene 96-well plates (Costar # 3370) using a dispensing 8-channel pipettor. Eight wells were reserved for blanks. Chemicals were added to each well using a Biorobotics Biogrid II robot equipped with either a 0.7 mm or a 0.4 mm diameter 96-pin tool. The final compound concentrations are determined by the pin diameter, the stock compound concentration and its volume (column height), and are indicated below for each compound collection. The yeast plates were incubated at 30 °C in a humidified chamber without agitation for 48 h. The cells were suspended by gently vortexing for 1 minute and OD595 readings were obtained using an Opsys MR 96-well plate reader (Dynex Technologies). Data were exported in Excel format for analysis. OD595 readings of wells containing only medium were defined as 0% growth and OD595 readings of wells containing yeast but no screening chemicals were defined as 100% growth. The compounds from the Selleck L1700 Bioactive Compound library, Biomol, Sigma LOPAC, Prestwick Collection, and Microsource Spectrum were tested at a final concentration of 15 µM. The PoPPI library was tested at final concentrations of 15 µM and 10 µM. Compounds showing growth inhibition were typically re-tested at two different concentrations in duplicate in medium lacking His and in medium containing 20 µg/mL His. Compounds showing more growth inhibition in medium lacking His than in medium containing His were re-tested in two or more replicates over a concentration range.
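The growth normalization defined above (medium-only wells as 0% growth, untreated yeast wells as 100% growth) can be written as a short calculation. The following is a minimal sketch only, assuming that the blank and untreated wells on a plate are averaged before scaling; the function name and the example OD595 values are hypothetical and are not taken from the screening data.

```python
from statistics import mean


def percent_growth(od_well, od_blank_wells, od_untreated_wells):
    """Scale one OD595 reading to percent growth.

    Medium-only (blank) wells define 0% growth and wells containing
    yeast but no screening chemical define 100% growth. Averaging the
    control wells is an assumption made for this sketch.
    """
    blank = mean(od_blank_wells)
    untreated = mean(od_untreated_wells)
    if untreated <= blank:
        raise ValueError("untreated wells must read higher than blank wells")
    return 100.0 * (od_well - blank) / (untreated - blank)


# Hypothetical readings: blanks near 0.05, untreated yeast near 0.85.
blanks = [0.05, 0.05]
untreated = [0.85, 0.85]
growth_no_his = percent_growth(0.25, blanks, untreated)    # medium lacking His
growth_with_his = percent_growth(0.75, blanks, untreated)  # medium containing His
```

A compound of interest in this scheme would show a lower percent growth in medium lacking His than in medium containing His, as in the hypothetical pair of values above.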
3.5.4 In silico screening

3.5.4.1 Virtual screening utilizing BUDE

Virtual ligand screening was carried out on the University of Bristol’s high performance computing BlueCrystal system with the docking program BUDE (Bristol University Docking Engine) (Version 1.2.9) (McIntosh-Smith et al., 2015) utilizing the ZINC8 virtual ligand database (Sterling and Irwin, 2015). Monomeric protein coordinates were taken from the PNT domain polymeric structure (PDB: 1LKY) as representative of the wild type interfaces in the A93D PNT or V112R PNT domain backgrounds. In brief, the protein structure, known as the receptor, was placed as a mol2 file such that the origin (0, 0, 0) was located at the centre of the docking grid. The origin was set to the position of the wild type A93 or V112 residue for the V112R or A93D PNT domain, respectively. The docking grid search volume was a 15 x 15 x 15 Å cube. Members of the ZINC8 library, consisting of more than 8 million ligands (each having approximately 20 conformers generated in Dr. Sessions’ group), were docked at the predefined origin. The molecular docking program uses a genetic algorithm to search the docking grid via translations and rotations to find the lowest, or a low, energy pose of a given ligand conformer. BUDE uses its own empirical free energy forcefield with soft-core potentials to calculate the interaction energies of a ligand in its pose (McIntosh-Smith et al., 2012). A utility was used to extract and rank the top compounds based on the predicted free energy of binding. A secondary BUDE screen was set up to target the intermolecular K99-D101 salt bridge. As in the interface screen, the origin was set at the position of the reciprocal salt bridge-forming residue. However, unlike the previous BUDE screen, a series of 10 different structures, taken at 10 ns steps from the 100 ns MD simulations, were used as the receptors.
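The receptor placement described above amounts to translating the coordinates so that the targeted residue sits at the origin of the docking grid. The sketch below illustrates this step in plain Python; it is not part of BUDE or its utilities. It assumes that the residue position is taken as the centroid of its atoms, which the text does not specify, and all function names and coordinates are hypothetical.

```python
def centre_on_residue(coords, residue_atom_indices):
    """Translate all atoms so the centroid of the chosen residue is at (0, 0, 0).

    coords: list of (x, y, z) tuples in angstroms.
    residue_atom_indices: indices of the atoms of the targeted residue.
    Using the residue centroid is an assumption made for this sketch.
    """
    n = len(residue_atom_indices)
    cx = sum(coords[i][0] for i in residue_atom_indices) / n
    cy = sum(coords[i][1] for i in residue_atom_indices) / n
    cz = sum(coords[i][2] for i in residue_atom_indices) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def in_search_cube(point, edge=15.0):
    """Return True if a point lies inside a cube of the given edge length centred on the origin."""
    half = edge / 2.0
    return all(abs(c) <= half for c in point)


# Hypothetical three-atom example; atoms 0 and 1 belong to the targeted residue.
atoms = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0), (20.0, 2.0, 3.0)]
centred = centre_on_residue(atoms, [0, 1])
inside = [in_search_cube(p) for p in centred]
```

With a 15 Å edge, only positions within 7.5 Å of the origin along each axis fall inside the search volume, so the third hypothetical atom lies outside it.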
The top 200,000 compounds, and their conformers, with the lowest binding energies to the interface from the original docking were screened.

3.5.4.2 Molecular dynamics simulations on top candidates at each PNT domain interface

Ligands that were targeted to the interfacial residues of the PNT domain were ranked on their predicted free energy of binding. The top 500 compounds for each interface underwent a short 10 ns MD simulation of the ligand-protein complex to determine whether they maintained a stable interaction. In brief, the MD simulations were performed with AMBER (Version 16) (Case et al., 2005) using the FF14SB-ildn forcefield, TIP3P water and ligand parameters taken from the GAFF (General Amber Force Field) (Wang et al., 2004). Set-up and simulation conditions for the ligand and PNT domain complexes from the BUDE screen were as described for the protein-only simulations with GROMACS in Section 2.5.4. The full 10 ns simulations were run with a 2 fs integration step size while maintaining a temperature of 300 K and a pressure of 1 bar. Ligand RMSD time courses were calculated for the trajectories relative to the initial, midpoint and final poses using CPPTRAJ (Roe and Cheatham, 2013). In addition, the trajectories were visualized using VMD (Version 1.9.2) software (Humphrey et al., 1996).

3.5.4.3 Candidate selection and validation

Compounds identified by virtual docking underwent several iterations of selection. First, the top 500 ranked poses with the lowest binding energies were manually inspected in Chimera (Version 1.13.1) (Pettersen et al., 2004). Note was made of duplicate conformers and any unique functional groups present. In addition to the top 500, manual inspection was carried out on ligands containing either a carboxylate or phosphate oxygen and/or a protonated amine from the top several thousand ranked poses.
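The ligand RMSD time courses of Section 3.5.4.2 were calculated with CPPTRAJ. Purely as an illustration of the quantity involved, the sketch below computes the RMSD of each frame against a chosen reference pose in plain Python. It assumes that the frames have already been superimposed on the protein, so no fitting is performed, and the function names and coordinates are hypothetical.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two sets of ligand atom positions.

    frame, reference: equal-length lists of (x, y, z) tuples in angstroms.
    No superposition is applied; frames are assumed to be pre-aligned.
    """
    if not frame or len(frame) != len(reference):
        raise ValueError("frames must be non-empty and contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsd_time_course(trajectory, ref_index):
    """RMSD of every frame relative to the frame at ref_index (e.g. 0 or -1)."""
    reference = trajectory[ref_index]
    return [rmsd(frame, reference) for frame in trajectory]


# Hypothetical two-atom ligand over three frames.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],
    [(0.0, 4.0, 0.0), (1.0, 4.0, 0.0)],
]
to_initial = rmsd_time_course(trajectory, 0)
to_final = rmsd_time_course(trajectory, -1)
```

A ligand that remains bound would be expected to show a small, flat RMSD relative to all reference poses, whereas a dissociating ligand shows a steadily increasing RMSD relative to its initial pose.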
For the ligands targeted to the interface, those that had duplicate conformers with low RMSD fluctuation during 10 ns MD simulations (excellent or good in Figure 3.14) were preferentially selected over ligands that had high RMSDs and appeared to dissociate (mediocre or bad). The resulting list of potential compounds was further refined by considering commercial availability, and compounds were selected to give a range of chemical diversity targeting each interface of the PNT domain. In total, 50 compounds were ordered to target the core interfacial residues (16 targeted the A93 or ML interface in the V112E PNT domain structure, 32 targeted the V112 or EH interface in the A93D PNT domain structure, and 2 targeted both) and 10 compounds were ordered to target the K99-D101 salt bridge (6 targeted D101 and 4 targeted K99).

3.5.5 Testing of compound binding to the PNT domain by NMR spectroscopy

Candidate compounds were tested in vitro for binding to the PNT domain via 15N-HSQC monitored titrations recorded at 25 °C with Bruker Avance 500/600 MHz spectrometers. Compounds were purchased from various vendors and dissolved to 50 mM stock solutions in DMSO. Protein samples were prepared as described in Section 2.5.1, to a final concentration of 150 µM and a volume of 450 µL. All compounds were tested at a minimum of a 2:1 and a maximum of a 20:1 molar ratio of compound to protein. Control titrations with DMSO were also recorded.

Chapter 4: Concluding remarks

4.1.1 The ETV6 PNT domain is a stable helical bundle that retains its structure upon self-association

Previous crystal structures of the multimeric ETV6 PNT domain have shown a core four-helical bundle with a strong propensity for head-to-tail polymerization. Here, I have demonstrated that this structure is retained with the crystallized monomeric A93D/V112E PNT domain, as well as monomeric and heterodimeric forms of the PNT domain in solution.
Using NMR spectroscopy, I found that only residues at the EH- and ML-interfaces experience significant chemical shift perturbations upon heterodimer formation. Also, only amides within or near these interfaces show increased HX protection. MD simulations also show little difference in the structures and dynamics of the monomeric and heterodimeric forms of the PNT domain. Together, these observations support the conclusions that the ETV6 PNT domain is a rigid "building block" poised for polymerization, and that its self-association does not lead to any significant conformational changes that could be exploited for inhibitor design.

4.1.2 Several key “hot spots” at the PNT domain PPI contribute to the strong binding interaction

Alanine scanning mutagenesis demonstrated that both the hydrophobic core and flanking charged groups in the EH- and ML-interfaces contribute to the self-association of the ETV6 PNT domain. Within the core, L96 is a key hot spot that fits within a ring of residues on the EH-interface. These residues form a small pocket that could potentially be targeted for inhibitor design. Within the flanking regions, K99, D101, R105 and D111 are critical salt bridge-forming residues. Targeting these residues may be advantageous for identifying an inhibitory compound that is relatively hydrophilic in nature. Future work on alanine scanning mutagenesis could include examining the effects of double mutations to elucidate pairwise interactions that contribute to binding. In particular, it may be of interest to combine alanine mutations of the salt bridge-forming residues to gain further insights into the roles of additional potential salt bridges, such as E100-R103. In addition, utilizing screening assays with a weaker affinity PNT domain heterodimer may yield greater success in finding lead compounds.
4.1.3 Targeting the PNT domain PPI is challenging with small molecules

Despite a rigorous effort that spanned several years using both experimental and computational screening techniques, no compounds that inhibit PNT domain polymerization were identified. The PNT domain self-association has a KD value in the nanomolar range and thus, an inhibitor would need to compete at a similar binding affinity. An increase in screening effort with respect to the number of compounds and chemical diversity, such as cyclic peptide or natural product libraries, may be required for inhibitor identification. In addition, the “down” nature of the experimental assays may make it challenging to identify a compound, as a weakly binding compound may show a less pronounced effect than it would in an “up” assay. It would be beneficial to design an “up” assay, whereby inhibition of PNT domain polymerization restores cell vitality or growth, as there are inherently fewer confounding factors (Balgi and Roberge, 2009; Kaelin, 2017).

Computational research is advancing as better docking algorithms and high performance computing allow more advanced systems to be modelled. Recently, millisecond MD simulations of biological systems have been performed with the Anton supercomputer, a jump of ~2 orders of magnitude over previously published data (Shaw et al., 2009). This increase in computational power will certainly facilitate quicker virtual drug screens that incorporate dynamic simulations of both the proteins and ligands.

The BUDE screen, while unsuccessful, did identify several common structural motifs that may be predicted to have weak binding to the PNT domain. Unfortunately, one of the top candidate compounds was not available for experimental validation. Due to the constraint of testing a small set of commercially available compounds, I may have also been confronted with a “similarity paradox”.
Experimental validation of a wider set of compounds that possess a common structural motif but vary in minor chemical modifications may circumvent this problem (Bajorath, 2002). Overall, a combination of complementary computational design and experimental techniques will likely be the best strategy for targeting the ETV6 PNT domain.

4.1.4 Future directions

An example of designing an inhibitor against a SAM-SAM domain interaction can be found in the work of the Leone group on the Ship2 and EphA2 receptors (Mercurio et al., 2017, 2019). The two five-helix SAM bundles bind with low micromolar affinity via ML- and EH-interfaces similar to those of the ETV6 PNT domain. These researchers used a combination of computational design of cyclic peptides and helix peptide mimetics to design inhibitors that bind the Ship2 interface. In the case of the helix mimetics, they found that a repeating positively charged penta-amino acid motif, found in the EH-interface of the EphA2 SAM domain, indeed bound with high micromolar affinity to the ML-interface of the Ship2 SAM domain (Mercurio et al., 2017).

There are several critical similarities and differences between this system and the ETV6 PNT domain that are worth considering. Both PPIs involve the ML- and EH-interfaces. However, the Ship2-EphA2 system involves a positively charged EH-interface and a central negatively charged ML-interface (Mercurio et al., 2017), whereas the ETV6 PNT domain interfaces are composed of more hydrophobic or neutral residues. A promising avenue to disrupt PNT domain polymerization would be to design a peptide mimetic of helix-4, although this may require substantial modifications, perhaps with non-natural amino acids, to achieve features including stability and solubility. Similar to the BUDE in silico screen, the Leone group also explored computational docking of a virtual cyclic peptide library on the Ship2-EphA2 interface (Mercurio et al., 2019).
Cyclic peptides may be more promising candidates than small molecules for binding the protein surface as they consist of more “protein-like” elements (Joo, 2012). However, their efforts also failed to produce any experimentally validated results (Mercurio et al., 2019). They came to a similar conclusion that more in silico compounds may need to be experimentally tested to find an inhibitor.

In closing, although the inhibition of PPIs is a challenging task, steady progress towards this important goal has been made over the past years. Fragment-based drug design (FBDD) and disulphide tethering are two approaches that could be undertaken as the next steps in designing an inhibitor against PNT domain polymerization. FBDD relies on low molecular weight compounds that bind weakly to a target to identify lead chemical features and then, through combinatorial chemistry of several fragments, to produce a lead compound with tighter affinity (Arkin et al., 2014). Analysis of the top BUDE screening compounds could identify predicted chemical motifs that may show weak binding and aid in an FBDD screen. In parallel, disulphide tethering, where weakly binding chemical fragments are tethered via an introduced cysteine residue at a targeted area, could direct fragments to “hot spots” of PNT domain polymerization (Arkin et al., 2003). A targeted approach to find compounds that bind to the PNT domain with weaker affinity may yield more success in identifying an inhibitor of PNT domain polymerization.

Bibliography

Adams, P.D., Afonine, P. V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, L.W., Kapral, G.J., Grosse-Kunstleve, R.W., et al. (2010). PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 213–221.
Aplan, P.D. (2006). Causes of oncogenic chromosomal translocation. Trends Genet. 22, 46–55.
Arkin, M.R., Randal, M., DeLano, W.L., Hyde, J., Luong, T.N., Oslob, J.D., Raphael, D.R., Taylor, L., Wang, J., McDowell, R.S., et al. (2003). Binding of small molecules to an adaptive protein-protein interface. Proc. Natl. Acad. Sci. U. S. A. 100, 1603–1608.
Arkin, M.R., Tang, Y., and Wells, J.A. (2014). Small-molecule inhibitors of protein-protein interactions: Progressing toward the reality. Chem. Biol. 21, 1102–1114.
Ashraf, S.S., Benson, R.E., Payne, E.S., Halbleib, C.M., and Grøn, H. (2004). A novel multi-affinity tag system to produce high levels of soluble and biotinylated proteins in Escherichia coli. Protein Expr. Purif. 33, 238–245.
Baens, M., Peeters, P., Guo, C., Aerssens, J., and Marynen, P. (1996). Genomic organization of the TEL: the human ETS-variant gene 6. Genome Res. 6, 404–413.
Bahrami, A., Assadi, A.H., Markley, J.L., and Eghbalnia, H.R. (2009). Probabilistic interaction network of evidence algorithm and its application to complete labeling of peak lists from protein NMR spectroscopy. PLoS Comput. Biol. 5.
Bai, Y., Milne, J.S., Mayne, L., and Englander, S.W. (1993). Primary structure effects on peptide group hydrogen exchange. Proteins Struct. Funct. Bioinforma. 17, 75–86.
Bajorath, J. (2002). Integration of virtual and high-throughput screening. Nat. Rev. Drug Discov. 1, 882–894.
Balgi, A.D., and Roberge, M. (2009). Screening for Chemical Inhibitors of Heterologous Proteins Expressed in Yeast Using a Simple Growth-Restoration Assay. Cell-Based Assays High-Throughput Screen. 125–137.
Berendsen, H.J.C., van der Spoel, D., and van Drunen, R. (1995). GROMACS: A message-passing parallel molecular dynamics implementation. Comput. Phys. Commun. 91, 43–56.
Berjanskii, M. V., and Wishart, D.S. (2005). A simple method to predict protein flexibility using secondary chemical shifts. J. Am. Chem. Soc. 127, 14970–14971.
Bogan, A.A., and Thorn, K.S. (1998). Anatomy of hot spots in protein interfaces. J. Mol. Biol. 280, 1–9.
Bohlander, S.K. (2005). ETV6: a versatile player in leukemogenesis. Semin. Cancer Biol. 15, 162–174.
De Braekeleer, E., Douet-Guilbert, N., Morel, F., Le Bris, M.-J., Basinko, A., and De Braekeleer, M. (2012). ETV6 fusion genes in hematological malignancies: a review. Leuk. Res. 36, 945–961.
Brennan, M.B., and Struhl, K. (1980). Mechanisms of increasing expression of a yeast gene in Escherichia coli. J. Mol. Biol. 136, 333–338.
Buchwald, P. (2010). Small-molecule protein-protein interaction inhibitors: Therapeutic potential in light of molecular size, chemical space, and ligand binding efficiency considerations. IUBMB Life 62, 724–731.
Case, D.A., Cheatham, T.E., Darden, T., Gohlke, H., Luo, R., Merz, K.M., Onufriev, A., Simmerling, C., Wang, B., and Woods, R.J. (2005). The Amber biomolecular simulation programs. J. Comput. Chem. 26, 1668–1688.
Cetinbas, N., Huang-Hobbs, H., Tognon, C., Leprivier, G., An, J., McKinney, S., Bowden, M., Chow, C., Gleave, M., McIntosh, L.P., et al. (2013). Mutation of the salt bridge-forming residues in the ETV6-SAM domain interface blocks ETV6-NTRK3-induced cellular transformation. J. Biol. Chem. 288, 27940–27950.
Chakrabarti, S.R., Sood, R., Nandi, S., and Nucifora, G. (2000). Posttranslational modification of TEL and TEL/AML1 by SUMO-1 and cell-cycle-dependent assembly into nuclear bodies. Proc. Natl. Acad. Sci. U. S. A. 97, 13281–13285.
Chambers, A.F., Groom, A.C., and MacDonald, I.C. (2002). Dissemination and growth of cancer cells in metastatic sites. Nat. Rev. Cancer 2, 563–572.
Clackson, T., and Wells, J.A. (1995). A Hot Spot of Binding Energy in a Hormone-Receptor Interface. Science 267, 383–386.
Connelly, G.P., Bai, Y., Jeng, M.-F., and Englander, S.W. (1993). Isotope effects in peptide group hydrogen exchange. Proteins Struct. Funct. Bioinforma. 17, 87–92.
Coyne, H.J., De, S., Okon, M., Green, S.M., Bhachech, N., Graves, B.J., and McIntosh, L.P. (2012). Autoinhibition of ETV6 (TEL) DNA binding: Appended helices sterically block the ETS domain. J. Mol. Biol. 421, 67–84.
Croce, C.M. (2008). Oncogenes and cancer. N. Engl. J. Med. 358, 502–511.
De, S., Okon, M., Graves, B.J., and McIntosh, L.P. (2016). Autoinhibition of ETV6 DNA binding is established by the stability of its inhibitory helix. J. Mol. Biol. 428, 1515–1530.
DeBerardinis, R.J., Lum, J.J., Hatzivassiliou, G., and Thompson, C.B. (2008). The Biology of Cancer: Metabolic Reprogramming Fuels Cell Growth and Proliferation. Cell Metab. 7, 11–20.
Delaglio, F., Grzesiek, S., Vuister, G.W., Zhu, G., Pfeifer, J., and Bax, A. (1995). NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
Druker, B.J., and Lydon, N.B. (2000). Lessons learned from the development of an Abl tyrosine kinase inhibitor. J. Clin. Invest. 105, 3–7.
Dyson, H.J., Kostic, M., Liu, J., and Martinez-Yamout, M.A. (2008). Hydrogen-deuterium exchange strategy for delineation of contact sites in protein complexes. FEBS Lett. 582, 1495–1500.
Emsley, P., and Cowtan, K. (2004). Coot: Model-building tools for molecular graphics. Acta Crystallogr. Sect. D Biol. Crystallogr. 60, 2126–2132.
Eyrisch, S., and Helms, V. (2007). Transient pockets on protein surfaces involved in protein-protein interaction. J. Med. Chem. 50, 3457–3464.
Ferguson, L.R., Chen, H., Collins, A.R., Connell, M., Damia, G., Dasgupta, S., Malhotra, M., Meeker, A.K., Amedei, A., Amin, A., et al. (2015). Genomic instability in human cancer: Molecular insights and opportunities for therapeutic attack and prevention through diet and nutrition. Semin. Cancer Biol. 35, S5–S24.
Fodje, M., Grochulski, P., Janzen, K., Labiuk, S., Gorin, J., and Berg, R. (2014). 08B1-1: An automated beamline for macromolecular crystallography experiments at the Canadian Light Source. J. Synchrotron Radiat. 21, 633–637.
Foulds, C.E., Nelson, M.L., Blaszczak, A.G., and Graves, B.J. (2004). Ras/Mitogen-Activated Protein Kinase Signaling Activates Ets-1 and Ets-2 by CBP/p300 Recruitment. Mol. Cell. Biol. 24, 10954–10964.
Friedberg, E.C. (2003). DNA damage and repair. Nature 421, 436–440.
Futreal, P.A., Coin, L., Marshall, M., Down, T., Hubbard, T., Wooster, R., Rahman, N., and Stratton, M.R. (2004). A census of human cancer genes. Nat. Rev. Cancer 4, 177–183.
Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., and Bairoch, A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31, 3784–3788.
Goddard, T.D., and Kneller, D.G. (1999). Sparky 3rd Edition.
Golub, T.R., Barker, G.F., Love, M., and Gilliland, D.G. (1994). Fusion of PDGF Receptor β to a Novel ets-like Gene, tel, in Chronic Myelomonocytic Leukemia with t(5;12) chromosomal translocation. Cell 77, 307–316.
Golub, T.R., Goga, A., Barker, G.F., Afar, D.E., McLaughlin, J., Bohlander, S.K., Rowley, J.D., Witte, O.N., and Gilliland, D.G. (1996). Oligomerization of the ABL tyrosine kinase by the Ets protein TEL in human leukemia. Mol. Cell. Biol. 16, 4107–4116.
Green, S.M., Coyne, H.J., McIntosh, L.P., and Graves, B.J. (2010). DNA binding by the ETS protein TEL (ETV6) is regulated by autoinhibition and self-association. J. Biol. Chem. 285, 18496–18504.
Hanahan, D., and Weinberg, R.A. (2000). The Hallmarks of Cancer. Cell 100, 57–70.
Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: The next generation. Cell 144, 646–674.
Hassanpour, S.H., and Dehghani, M. (2017). Review of cancer from perspective of molecular. J. Cancer Res. Pract. 4, 127–129.
Hassler, M., and Richmond, T.J. (2001). The B-box dominates SAP-1-SRF interactions in the structure of the ternary complex. EMBO J. 20, 3018–3028.
Herr, W., and Cleary, M.A. (1995). The POU domain: Versatility in transcriptional regulation by a flexible two-in-one DNA-binding domain. Genes Dev. 9, 1679–1693.
Hock, H., Meade, E., Medeiros, S., Schindler, J.W., Valk, P.J.M., Fujiwara, Y., and Orkin, S.H. (2004). Tel/Etv6 is an essential and selective regulator of adult hematopoietic stem cell survival. Genes Dev. 18, 2336–2341.
Hollenhorst, P.C., McIntosh, L.P., and Graves, B.J. (2011). Genomic and biochemical insights into the specificity of ETS transcription factors. Annu. Rev. Biochem. 80, 437–471.
Holm, L., and Sander, C. (1995). DALI: a network tool for protein structure comparison. Trends Biochem. Sci. 20, 478–480.
Huang-Hobbs, H. (2013). Dissecting the Mechanism of ETV6 Polymerization. University of British Columbia.
Humphrey, W., Dalke, A., and Schulten, K. (1996). VMD: visual molecular dynamics. J. Mol. Graph. 14, 33–38.
Hunter, T., and Blume-Jensen, P. (2001). Oncogenic kinase signalling. Nature 411, 355.
Janz, S., Potter, M., and Rabkin, C.S. (2003). Lymphoma- and leukemia-associated chromosomal translocations in healthy individuals. Genes Chromosom. Cancer 36, 211–223.
Johnson, D.K., and Karanicolas, J. (2013). Druggable Protein Interaction Sites Are More Predisposed to Surface Pocket Formation than the Rest of the Protein Surface. PLoS Comput. Biol. 9.
Jones, S., and Thornton, J.M. (1996). Principles of protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 93, 13–20.
Joo, S.H. (2012). Cyclic peptides as therapeutic agents and biochemical tools. Biomol. Ther. 20, 19–26.
Joung, J.K., Ramm, E.I., and Pabo, C.O. (2000). A bacterial two-hybrid selection system for studying protein-DNA and protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 97, 7382–7387.
Kabsch, W. (2010). XDS. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 125–132.
Kaelin, W.G. (2017). Common pitfalls in preclinical cancer target validation. Nat. Rev. Cancer 17, 441–450.
Kang, H.S., Nelson, M.L., Mackereth, C.D., Schärpf, M., Graves, B.J., and McIntosh, L.P. (2008). Identification and Structural Characterization of a CBP/p300-Binding Domain from the ETS Family Transcription Factor GABPα. J. Mol. Biol. 377, 636–646.
Kazakov, D. V, Hantschke, M., Vanecek, T., Kacerovska, D., and Michal, M. (2010). Mammary-type secretory carcinoma of the skin. Am. J. Surg. Pathol. 34, 1226–1227.
Keskin, O., Ma, B., and Nussinov, R. (2005). Hot regions in protein-protein interactions: The organization and contribution of structurally conserved hot spot residues. J. Mol. Biol. 345, 1281–1294.
Kim, C.A., and Bowie, J.U. (2003). SAM domains: uniform structure, diversity of function. Trends Biochem. Sci. 28, 625–628.
Kim, C.A., Phillips, M.L., Kim, W., Gingery, M., Tran, H.H., Robinson, M.A., Faham, S., and Bowie, J.U. (2001). Polymerization of the SAM domain of TEL in leukemogenesis and transcriptional repression. EMBO J. 20, 4173–4182.
Kleckner, I.R., and Foster, M.P. (2011). An introduction to NMR-based approaches for measuring protein dynamics. Biochim. Biophys. Acta - Proteins Proteomics 1814, 942–968.
Klock, H.E., and Lesley, S.A. (2009). The Polymerase Incomplete Primer Extension (PIPE) Method Applied to High-Throughput Cloning and Site-Directed Mutagenesis. In High Throughput Protein Expression and Purification: Methods and Protocols, S.A. Doyle, ed. (Totowa, NJ: Humana Press), pp. 91–103.
Knezevich, S.R., McFadden, D.E., Tao, W., Lim, J.F., and Sorensen, P.H.B. (1998). A novel ETV6-NTRK3 gene fusion in congenital fibrosarcoma. Nat. Genet. 18, 184–187.
Knudson, A.G. (2001). Two genetic hits (more or less) to cancer. Nat. Rev. Cancer 1, 157–162.
Kramer, M.F., and Coen, D.M. (2001). Enzymatic Amplification of DNA by PCR: Standard Procedures and Optimization. Curr. Protoc. Cell Biol. 10, A.3F.1-A.3F.14.
Krishna, M.M.G., Hoang, L., Lin, Y., and Englander, S.W. (2004). Hydrogen exchange methods to study protein folding. Methods 34, 51–64.
Lai, Z.C., and Rubin, G.M. (1992). Negative control of photoreceptor development in Drosophila by the product of the yan gene, an ETS domain protein. Cell 70, 609–620.
Lamballe, F., Klein, R., and Barbacid, M. (1991). trkC, a new member of the trk family of tyrosine protein kinases, is a receptor for neurotrophin-3. Cell 66, 967–979.
Lambert, S.A., Jolma, A., Campitelli, L.F., Das, P.K., Yin, Y., Albu, M., Chen, X., Taipale, J., Hughes, T.R., and Weirauch, M.T. (2018). The Human Transcription Factors. Cell 172, 650–665.
Lannon, C.L., and Sorensen, P.H.B. (2005). ETV6-NTRK3: a chimeric protein tyrosine kinase with transformation activity in multiple cell lineages. Semin. Cancer Biol. 15, 215–223.
Laskowski, R.A., Jabłońska, J., Pravda, L., Vařeková, R.S., and Thornton, J.M. (2018). PDBsum: Structural summaries of PDB entries. Protein Sci. 27, 129–134.
Latchman, D.S. (1990). Eukaryotic Transcription Factors. Biochem. J. 270, 281–289.
Lee, W., Tonelli, M., and Markley, J.L. (2015). NMRFAM-SPARKY: Enhanced software for biomolecular NMR spectroscopy. Bioinformatics 31, 1325–1327.
Leeman-Neill, R.J., Kelly, L.M., Liu, P., Brenner, A. V, Little, M.P., Bogdanova, T.I., Evdokimova, V.N., Hatch, M., Zurnadzy, L.Y., Nikiforova, M.N., et al. (2014). ETV6-NTRK3 is a common chromosomal rearrangement in radiation-associated thyroid cancer. Cancer 120, 799–807.
Lemmon, M.A., and Schlessinger, J. (1994). Regulation of signal transduction and signal diversity by receptor oligomerization. Trends Biochem. Sci. 19, 459–463.
Lemmon, M.A., and Schlessinger, J. (2010). Cell signaling by receptor tyrosine kinases. Cell 141, 1117–1134.
Lindorff-Larsen, K., Piana, S., Palmo, K., Maragakis, P., Klepeis, J.L., Dror, R.O., and Shaw, D.E. (2010). Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct. Funct. Bioinforma. 78, 1950–1958.
Lipinski, C.A., Lombardo, F., Dominy, B.W., and Feeney, P.J. (1997). Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25.
Lopez, R.G., Carron, C., Oury, C., Gardellin, P., Bernard, O., and Ghysdael, J. (1999). TEL is a sequence-specific transcriptional repressor. J. Biol. Chem. 274, 30132–30138.
Mackereth, C.D., Schärpf, M., Gentile, L.N., MacIntosh, S.E., Slupsky, C.M., and McIntosh, L.P. (2004). Diversity in structure and function of the Ets family PNT domains. J. Mol. Biol. 342, 1249–1264.
Mandell, J., Falick, A., and Komives, E. (1998). Identification of protein-protein interfaces by decreased amide proton solvent accessibility. Proc. Natl. Acad. Sci. U. S. A. 95, 14705–14710.
Mauchauffe, M., Coniat, M.B., Salomon-Nguyen, F., Ghysdael, J., Berger, R., and Bernard, O.A. (2000). The t(1;12)(q21;p13) translocation of human myeloblastic leukemia results in a TEL-ARNT fusion. Proc. Natl. Acad. Sci. U. S. A. 97, 6757–6762.
Mayr, L.M., and Bojanic, D. (2009). Novel trends in high-throughput screening. Curr. Opin. Pharmacol. 9, 580–588.
McCubrey, J.A., Steelman, L.S., Chappell, W.H., Abrams, S.L., Wong, E.W.T., Chang, F., Lehmann, B., Terrian, D.M., Milella, M., Tafuri, A., et al. (2007). Roles of the Raf/MEK/ERK pathway in cell growth, malignant transformation and drug resistance. Biochim. Biophys. Acta - Mol. Cell Res. 1773, 1263–1284.
McIntosh-Smith, S., Wilson, T., Ibarra, A.Á., Crisp, J., and Sessions, R.B. (2012). Benchmarking energy efficiency, power costs and carbon emissions on heterogeneous systems. Comput. J. 55, 192–205.
McIntosh-Smith, S., Price, J., Sessions, R.B., and Ibarra, A.A. (2015). High performance in silico virtual drug screening on many-core processors. Int. J. High Perform. Comput. Appl. 29, 119–134.
Mercurio, F.A., Di Natale, C., Pirone, L., Iannitti, R., Marasco, D., Pedone, E.M., Palumbo, R., and Leone, M. (2017). The Sam-Sam interaction between Ship2 and the EphA2 receptor: Design and analysis of peptide inhibitors. Sci. Rep. 7, 1–11.
Mercurio, F.A., Di Natale, C., Pirone, L., Vincenzi, M., Marasco, D., De Luca, S., Pedone, E.M., and Leone, M. (2019). Exploring the Ability of Cyclic Peptides to Target SAM Domains: A Computational and Experimental Study. ChemBioChem 1–11.
Michnick, S.W., Ear, P.H., Manderson, E.N., Remy, I., and Stefan, E. (2007). Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat. Rev. Drug Discov. 6, 569–582.
Mintseris, J., and Weng, Z. (2005). Structure, function, and evolution of transient and obligate protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 102, 10930–10935.
Morelli, X., Bourgeas, R., and Roche, P. (2011). Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I). Curr. Opin. Chem. Biol. 15, 475–481.
Nagasubramanian, R., Wei, J., Gordon, P., Rastatter, J., Cox, M., and Pappo, A. (2016). Infantile fibrosarcoma with NTRK3-ETV6 fusion successfully treated with the tropomyosin-related kinase inhibitor LOXO-101. Pediatr. Blood Cancer 63, 1468–1470.
Negrini, S., Gorgoulis, V.G., and Halazonetis, T.D. (2010). Genomic instability - an evolving hallmark of cancer. Nat. Rev. Mol. Cell Biol. 11, 220–228.
Nelson, M.L., Kang, H.S., Lee, G.M., Blaszczak, A.G., Lau, D.K.W., McIntosh, L.P., and Graves, B.J. (2010). Ras signaling requires dynamic properties of Ets1 for phosphorylation-enhanced binding to coactivator CBP. Proc. Natl. Acad. Sci. U. S. A. 107, 10026–10031.
Nowell, P.C., and Hungerford, D.A. (1960). Chromosome Studies on Normal and Leukemic Human Leukocytes. J. Natl. Cancer Inst. 25, 85–109.
Nunn, M.F., Seeburg, P.H., Moscovici, C., and Duesberg, P.H. (1983). Tripartite structure of the avian erythroblastosis virus E26 transforming gene. Nature 306, 391–395.
Oikawa, T. (2004). ETS transcription factors: Possible targets for cancer therapy. Cancer Sci. 95, 626–633.
Otsubo, K., Kanegane, H., Eguchi, M., Eguchi-Ishimae, M., Tamura, K., Nomura, K., Abe, A., Ishii, E., and Miyawaki, T. (2010). ETV6–ARNT fusion in a patient with childhood T lymphoblastic leukemia. Cancer Genet. Cytogenet. 202, 22–26.
Parkin, D.M. (2006). The global health burden of infection-associated cancers in the year 2002. Int. J. Cancer 118, 3030–3044.
Paterson, Y., Englander, S.W., and Roder, H. (1990). An antibody binding site on cytochrome c defined by hydrogen exchange and two-dimensional NMR. Science 249, 755–759.
Pedone, E., Postiglione, L., Aulicino, F., Rocca, D.L., Montes-Olivas, S., Khazim, M., di Bernardo, D., Pia Cosma, M., and Marucci, L. (2019). A tunable dual-input system for on-demand dynamic gene expression regulation. Nat. Commun. 10, 1–13.
Pelay-Gimeno, M., Glas, A., Koch, O., and Grossmann, T.N. (2015). Structure-Based Design of Inhibitors of Protein-Protein Interactions: Mimicking Peptide Binding Epitopes. Angew. Chemie - Int. Ed. 54, 8896–8927.
Petsalaki, E., and Russell, R.B. (2008). Peptide-mediated interactions in biological systems: new discoveries and applications. Curr. Opin. Biotechnol. 19, 344–350.
Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., and Ferrin, T.E. (2004). UCSF Chimera - A visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612.
Porollo, A., and Meller, J. (2007). Prediction-based fingerprints of protein-protein interactions. Proteins Struct. Funct. Bioinforma. 66, 630–645.
Potter, M.D., Buijs, A., Kreider, B., van Rompaey, L., and Grosveld, G.C. (2000). Identification and characterization of a new human ETS-family transcription factor, TEL2, that is expressed in hematopoietic tissues and can associate with TEL1/ETV6. Blood 95, 3341–3348.
Rabbitts, T.H. (1994). Chromosomal translocations in human cancer. Nature 372, 143–149.
Rasighaemi, P., and Ward, A.C. (2017). ETV6 and ETV7: Siblings in hematopoiesis and its disruption in disease. Crit. Rev. Oncol. Hematol. 116, 106–115.
Remy, I., and Michnick, S.W. (2006). A highly sensitive protein-protein interaction assay based on Gaussia luciferase. Nat. Methods 3, 977–979.
Roe, D.R., and Cheatham, T.E. (2013). PTRAJ and CPPTRAJ: Software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 9, 3084–3095.
Rotman, G., and Shiloh, Y. (1998). ATM: From gene to function. Hum. Mol. Genet. 7, 1555–1563.
Rowley, J.D. (1998). The Critical Role of Chromosome Translocations in Human Leukemias. Annu. Rev. Genet. 32, 495–519.
Ryan, D.P., and Matthews, J.M. (2005). Protein-protein interactions in human disease. Curr. Opin. Struct. Biol. 15, 441–446.
Salk, J.J., Fox, E.J., and Loeb, L.A. (2010). Mutational Heterogeneity in Human Cancers: Origin and Consequences. Annu. Rev. Pathol. Mech. Dis. 5, 51–75.
Sattler, M., Schleucher, J., and Griesinger, C. (1999). Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog. Nucl. Magn. Reson. Spectrosc. 34, 93–158.
Schneider, G. (2010). Virtual screening: An endless staircase? Nat. Rev. Drug Discov. 9, 273–276.
Schrödinger, L. (2015). The PyMOL Molecular Graphics System, Version 1.8.
Scott, D.E., Bayly, A.R., Abell, C., and Skidmore, J. (2016). Small molecules, big targets: Drug discovery faces the protein-protein interaction challenge. Nat. Rev. Drug Discov. 15, 533–550.
Shaw, D.E., Dror, R.O., Salmon, J.K., Grossman, J.P., MacKenzie, K.M., Bank, J.A., Young, C., Deneroff, M.M., Batson, B., Bowers, K.J., et al. (2009). Millisecond-scale molecular dynamics simulations on Anton. Proc. Conf. High Perform. Comput. Networking, Storage Anal. 1–11.
Shen, Y., and Bax, A. (2012). Identification of helix capping and β-turn motifs from NMR chemical shifts. J. Biomol. NMR 52, 211–232.
Shoichet, B.K. (2004). Virtual screening of chemical libraries. Nature 432, 862–865.
Skálová, A., Vanecek, T., Sima, R., Laco, J., Weinreb, I., Perez-Ordonez, B., Starek, I., Geierova, M., Simpson, R.H.W., Passador-Santos, F., et al. (2010). Mammary analogue secretory carcinoma of salivary glands, containing the ETV6-NTRK3 fusion gene: a hitherto undescribed salivary gland tumor entity. Am. J. Surg. Pathol. 34, 599–608.
Skinner, J.J., Lim, W.K., Bédard, S., Black, B.E., and Englander, S.W. (2012). Protein hydrogen exchange: Testing current models. Protein Sci. 21, 987–995.
Smith, M.C., and Gestwicki, J.E. (2012). Features of protein–protein interactions that translate into potent inhibitors: topology, surface area and affinity. Expert Rev. Mol. Med. 14, 1–20.
Smith, S.A., Sessions, R.B., Shoemark, D.K., Williams, C., Ebrahimighaei, R., McNeill, M.C., Crump, M.P., McKay, T.R., Harris, G., Newby, A.C., et al. (2019). Antiproliferative and Antimigratory Effects of a Novel YAP-TEAD Interaction Inhibitor Identified Using in Silico Molecular Docking. J. Med. Chem. 62, 1291–1305.
Sterling, T., and Irwin, J.J. (2015). ZINC 15 - Ligand Discovery for Everyone. J. Chem. Inf. Model. 55, 2324–2337.
Stites, W.E. (1997). Protein-protein interactions: Interface structure, binding thermodynamics, and mutational analysis. Chem. Rev. 97, 1233–1250.
Sturtevant, J.M. (1977). Heat capacity and entropy changes in processes involving proteins. Proc. Natl. Acad. Sci. U. S. A. 74, 2236–2240.
Tannous, B.A., Kim, D.E., Fernandez, J.L., Weissleder, R., and Breakefield, X.O. (2005). Codon-optimized Gaussia luciferase cDNA for mammalian gene expression in culture and in vivo. Mol. Ther. 11, 435–443.
Thangudu, R.R., Bryant, S.H., Panchenko, A.R., and Madej, T. (2012). Modulating protein-protein interactions with small molecules: The importance of binding hotspots. J. Mol. Biol. 415, 443–453.
Tognon, C., Knezevich, S.R., Huntsman, D., Roskelley, C.D., Melnyk, N., Mathers, J.A., Becker, L., Carneiro, F., Macpherson, N., Horsman, D., et al. (2002). Expression of the ETV6-NTRK3 gene fusion as a primary event in human secretory breast carcinoma. Cancer Cell 2, 367–376.
Tognon, C.E., Mackereth, C.D., Somasiri, A.M., McIntosh, L.P., and Sorensen, P.H.B. (2004). Mutations in the SAM domain of the ETV6-NTRK3 chimeric tyrosine kinase block polymerization and transformation activity. Mol. Cell. Biol. 24, 4636–4650.
Tognon, C.E., Martin, M.J., Moradian, A., Trigo, G., Rotblat, B., Cheng, S.-W.G., Pollard, M., Uy, E., Chow, C., Carboni, J.M., et al. (2012). A tripartite complex composed of ETV6-NTRK3, IRS1 and IGF1R is required for ETV6-NTRK3-mediated membrane localization and transformation. Oncogene 31, 1334–1340.
Tran, H.H., Kim, C.A., Faham, S., Siddall, M., and Bowie, J.U. (2002). Native interface of the SAM domain polymer of TEL. 6, 1–6.
Tsai, C.-J., Lin, S.L., Wolfson, H.J., and Nussinov, R. (1997). Studies of protein-protein interfaces: A statistical analysis of the hydrophobic effect. Protein Sci. 6, 53–64.
Vermeulen, K., Van Bockstaele, D.R., and Berneman, Z.N. (2003). The cell cycle: A review of regulation, deregulation and therapeutic targets in cancer. Cell Prolif. 36, 131–149.
Vilenchik, M.M., and Knudson, A.G. (2003). Endogenous DNA double-strand breaks: Production, fidelity of repair, and induction of cancer. Proc. Natl. Acad. Sci. U. S. A. 100, 12871–12876.
Vivekanand, P., and Rebay, I. (2012). The SAM domain of human TEL2 can abrogate transcriptional output from TEL1 (ETV-6) and ETS1/ETS2. PLoS One 7, 5–10.
Vogelstein, B., and Kinzler, K.W. (2004). Cancer genes and the pathways they control. Nat. Med. 10, 789–799.
Wai, D., Knezevich, S., Lucas, T., Jansen, B., Kay, R., and Sorensen, P.H.B. (2000). The ETV6-NTRK3 gene fusion encodes a chimeric protein tyrosine kinase that transforms NIH3T3 cells. Oncogene 19, 906–915.
Wang, J., Wolf, R.M., Caldwell, J.W., Kollman, P.A., and Case, D.A. (2004). Development and testing of a general amber force field. J. Comput. Chem. 25, 1157–1174.
Wang, L.C., Kuo, F., Fujiwara, Y., Gilliland, D.G., Golub, T.R., and Orkin, S.H. (1997). Yolk sac angiogenic defect and intra-embryonic apoptosis in mice lacking the Ets-related factor TEL. EMBO J. 16, 4374–4383.
Watanabe, N., Kobayashi, H., Hirama, T., Kikuta, A., Koizumi, S., Tsuru, T., and Kaneko, Y. (2002). Cryptic t(12;15)(p13;q26) producing the ETV6-NTRK3 fusion gene and no loss of IGF2 imprinting in congenital mesoblastic nephroma with trisomy 11: Fluorescence in situ hybridization and IGF2 allelic expression analysis. Cancer Genet. Cytogenet. 136, 10–16.
Wells, J.A., and McClendon, C.L. (2007). Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature 450, 1001–1009.
Willard, L., Ranjan, A., Zhang, H., Monzavi, H., Boyko, R.F., Sykes, B.D., and Wishart, D.S. (2003). VADAR: A web server for quantitative evaluation of protein structure quality. Nucleic Acids Res. 31, 3316–3319.
Williamson, M.P. (2013). Using chemical shift perturbation to characterise ligand binding. Prog. Nucl. Magn. Reson. Spectrosc. 73, 1–16.
Wilson, A.J. (2009). Inhibition of protein-protein interactions using designed molecules. Chem. Soc. Rev. 38, 3289–3300.
Winkler, M.E., and Ramos-Montanez, S. (2009). Biosynthesis of histidine. EcoSal Plus 3.
Yan, C., Wu, F., Jernigan, R.L., Dobbs, D., and Honavar, V. (2008). Characterization of protein-protein interfaces. Protein J. 27, 59–70.
Young, K.H. (1998). Yeast Two-hybrid: So Many Interactions, (in) So Little Time…. Biol. Reprod. 58, 302–311.
Zhang, Y.Z. (1995). Protein and peptide structure and interactions studied by hydrogen exchange and NMR.
Zhang, M.Y., Churpek, J.E., Keel, S.B., Walsh, T., Lee, M.K., Loeb, K.R., Gulsuner, S., Pritchard, C.C., Sanchez-Bonilla, M., Delrow, J.J., et al. (2015). Germline ETV6 mutations in familial thrombocytopenia and hematologic malignancy. Nat. Genet. 47, 180–185.

Appendices

Appendix A - Protein sequences of described constructs

Table A-1 Protein sequences of described constructs

Description: His6-tagged ETV6(1-125). The His6-tag (green) precedes a thrombin cleavage site (slash) and the wild type starting Met residue for ETV6 (yellow highlight). The second thrombin cleavage site (double slash) and alternative start site (M43) are also indicated. The structured PNT domain spans residues L50-Q123 (magenta). Monomerizing mutations were introduced at A93 (cyan highlight) and V112 (green highlight).
Sequence: MGSSHHHHHHSSGLVPR/GSHMMSETPAQCSIKQERISYTPPESPVPSYASSTPLHVPVPR//ALRMEEDSIRLPAHLRLQPIYWSRDDVAQWLKWAENEFSLRPIDSNTFEMNGKALLLLTKEDFRYRSPHSGDVLYELLQHILKQRK

Description: His6-tagged, Avitag-ETV6(43-125). The His6-tag (green) precedes a thrombin cleavage site (slash) and the Avitag (black highlight) with the alternative start site (M43) as the start site. The structured PNT domain spans residues L50-Q123 (magenta). Monomerizing mutations were introduced at A93 (cyan highlight) and V112 (green highlight).
Sequence: MGSSHHHHHHSSGLVPR/GSHIGLNDIFEAQKIEWHEHMEEDSIRLPAHLRLQPIYWSRDDVAQWLKWAENEFSLRPIDSNTFEMNGKALLLLTKEDFRYRSPHSGDVLYELLQHILKQRK

Description: Leucine zipper (blue), joined by a linker (purple) to the N-terminal fragment of humanized Gaussia luciferase (orange).
Sequence: MNTEAARRSRARKLQRMKQLEDKVEELLSKNYHLENEVARLKKLVGERIDGGGGSGGGGSSGKPTENNEDFNIVAVASNFATTDLDADRGKLPGKKLPLEVLKEMEANARKAGCTRGCLICLSHIKCTPKMKKFIPGRCHTYEGDKESAQGGIG

Description: Leucine zipper (blue), joined by a linker (purple) to the C-terminal fragment of humanized Gaussia luciferase (red).
Sequence: MNTEAARRSRARKLQRMKQLEDKVEELLSKNYHLENEVARLKKLVGERIDGGGGSGGGGSSGEAIVDIPEIPGFKDLEPMEQFIAQVDLCVDCTTGCLKGLANVQCSDLLKKWLPQRCATFASKIQGQVDKIKGAGGD

Description: ETV6(43-125) with the PNT domain (magenta) and monomerizing A93D mutation (cyan highlight), joined by a linker (purple) to the N-terminal fragment of humanized Gaussia luciferase (orange).
Sequence: MEEDSIRLPAHLRLQPIYWSRDDVAQWLKWAENEFSLRPIDSNTFEMNGKDLLLLTKEDFRYRSPHSGDVLYELLQHILKQRKGGGGSGGGGSSGKPTENNEDFNIVAVASNFATTDLDADRGKLPGKKLPLEVLKEMEANARKAGCTRGCLICLSHIKCTPKMKKFIPGRCHTYEGDKESAQGGIG

Description: ETV6(43-125) with the PNT domain (magenta) and monomerizing V112E mutation (green highlight), joined by a linker (purple) to the C-terminal fragment of humanized Gaussia luciferase (red).
Sequence: MEEDSIRLPAHLRLQPIYWSRDDVAQWLKWAENEFSLRPIDSNTFEMNGKALLLLTKEDFRYRSPHSGDELYELLQHILKQRKGGGGSGGGGSSGEAIVDIPEIPGFKDLEPMEQFIAQVDLCVDCTTGCLKGLANVQCSDLLKKWLPQRCATFASKIQGQVDKIKGAGGD

Appendix B - Chemical shift assignments

Table A-2 Chemical shift assignments (in ppm) of the monomeric V112E PNT domain (20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 8.0 and 25 °C)

Residue Number  Amino Acid  1HN  15N  13Cα  13Cβ
45 E 8.54 121.4 57.6 -
46 D 8.40 119.6 54.9 -
47 S 8.06 114.9 58.9 -
48 I 8.09 122.0 61.2 -
49 R 8.75 127.2 55.0 -
50 L 8.25 124.9 53.0 -
51 P - - - -
52 A 8.57 124.1 55.7 -
53 H 8.23 112.4 57.5 -
54 L 7.09 120.1 53.9 42.8
55 R 7.31 116.9 60.5 31.1
56 L 7.93 120.6 54.6 -
57 Q - - - -
58 P - - - -
59 I 7.19 113.5 63.3 37.6
60 Y 7.89 119.8 57.4 38.9
61 W 8.03 121.5 55.6 -
62 S 10.65 125.5 57.3 66.1
63 R 9.24 120.6 59.5 -
64 D - - - -
65 D 7.85 122.0 57.8 39.5
66 V 8.16 120.0 67.7 32.3
67 A 7.78 120.9 55.8 17.9
68 Q 8.28 117.7 58.8 28.4
69 W 8.90 126.0 61.6 26.9
70 L 8.36 118.5 58.2 42.1
71 K 7.46 117.5 54.4 32.0
72 W 8.37 122.0 61.4 27.8
73 A 8.88 122.2 54.7 17.8
74 E 7.95 116.4 59.5 29.9
75 N 7.29 114.6 55.9 39.7
76 E 8.64 121.0 58.2 28.8
77 F 7.53 112.4 57.0 38.0
78 S 7.20 113.4 58.5 61.1
79 L 7.67 117.2 53.2 42.3
80 R 8.25 122.2 54.6 28.9
81 P - - - -
82 I 8.40 123.1 60.2 -
83 D 8.62 125.5 54.0 -
84 S - - - -
85 N 8.78 119.2 54.8 -
86 T 7.78 111.7 63.8 -
85 N - - - -
86 T - - - -
87 F - - - -
88 E - - - -
89 M - - - -
90 N - - - -
91 G 8.50 105.4 47.4 -
92 K 7.77 119.4 59.9 32.3
93 A 7.45 118.7 54.3 19.5
94 L - - - -
95 L - - - -
96 L 7.39 117.5 54.8 42.5
97 L - - - -
98 T 9.51 114.2 60.4 72.0
99 K 8.63 122.7 61.3 31.8
100 E 8.21 117.1 60.2 28.8
101 D 7.80 121.2 - -
102 F 8.29 120.8 63.5 39.8
103 R 8.19 118.8 59.0 30.6
104 Y 8.09 118.5 60.5 38.5
105 R 7.34 116.8 58.2 31.1
106 S 8.22 110.2 53.9 62.3
107 P - - - -
108 H - - - -
109 S 7.45 112.3 57.9 -
110 G 9.25 111.9 48.8 -
111 D 8.68 120.9 58.4 40.3
112 V - - - -
113 L - - - -
114 Y - - - -
115 E - - - -
116 L 8.48 122.6 57.9 42.1
117 L 8.41 120.8 57.9 40.6
118 Q 7.91 115.0 57.9 27.4
119 H 8.09 119.3 60.1 30.8
120 I 8.23 120.6 65.4 38.2
121 L 8.22 119.8 57.3 42.2
122 K 7.54 117.0 57.7 32.7
123 Q 7.73 117.9 56.6 29.2
124 R 7.94 120.7 55.9 -
125 K 7.87 128.0 57.7 -

Table A-3 Chemical shift assignments (in ppm) for the monomeric A93D PNT domain (20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 7.0 and 25 °C)

Residue Number  Amino Acid  1HN  15N  13Cα  13Cβ  13CO
44 E 8.53 121.6 56.6 30.2 176.8
45 E 8.56 121.6 57.7 30.1 176.6
46 D 8.40 119.7 54.9 41.0 176.5
47 S 8.11 115.0 59.0 64.0 174.5
48 I 8.02 121.9 61.3 38.2 175.7
49 R 8.63 126.6 55.2 30.5 175.5
50 L 8.20 124.5 53.0 42.0 174.8
51 P - - 62.3 32.3 177.2
52 A 8.70 124.0 55.8 18.3 179.4
53 H 8.28 112.7 57.6 28.9 175.5
54 L 7.02 119.7 54.0 42.6 175.9
55 R 7.36 116.8 56.9 31.3 176.4
56 L 8.01 120.5 54.5 43.9 177.4
57 Q - - - - -
58 P - - 63.2 30.0 176.4
59 I 7.17 113.7 63.3 37.4 176.5
60 Y 7.84 119.8 57.4 38.7 176.7
61 W 8.03 121.6 55.6 30.1 178.4
62 S 10.63 125.4 57.2 66.1 175.2
63 R 9.16 120.3 59.6 29.8 179.5
64 D - - 57.3 40.8 178.2
65 D 7.82 122.0 57.8 39.4 178.0
66 V 8.19 120.1 67.6 32.2 177.7
67 A 7.82 120.9 55.8 17.8 181.3
68 Q 8.26 117.6 58.8 28.4 179.0
69 W 8.81 126.0 61.6 26.9 176.9
70 L 8.39 118.6 58.1 42.0 179.2
71 K 7.48 117.4 58.2 31.9 178.9
72 W 8.35 122.2 61.5 27.7 177.6
73 A 8.81 122.5 54.8 17.5 179.1
74 E 8.09 116.6 59.4 29.9 178.6
75 N 7.31 114.6 55.9 39.5 177.4
76 E 8.56 121.1 58.2 28.7 177.8
77 F 7.66 112.3 57.1 37.7 174.0
78 S 7.22 114.0 58.4 61.0 174.5
79 L 8.03 118.0 53.0 42.3 177.3
80 R 8.22 121.7 54.7 28.7 174.0
81 P - - 63.7 32.1 177.1
82 I 8.47 124.4 60.4 40.0 176.1
83 D 8.61 126.4 54.2 41.0 177.7
84 S - - 61.1 63.0 175.8
85 N 8.76 119.2 54.9 37.8 176.3
86 T 7.75 111.3 63.8 69.3 173.9
87 F 7.94 121.0 57.3 40.1 174.5
88 E 8.05 123.2 55.9 27.8 174.1
89 M 8.20 121.9 54.5 34.5 172.3
90 N 8.52 114.9 51.0 38.8 175.8
91 G 8.43 105.1 47.4 - 174.4
92 K 7.77 120.1 60.0 31.6 178.7
93 D 7.83 117.8 56.9 41.0 179.7
94 L 8.34 125.1 58.0 43.6 179.5
95 L 7.73 115.9 56.6 41.7 177.4
96 L 7.35 117.4 54.8 42.3 178.5
97 L 7.4 119.6 55.5 42.1 178.6
98 T 9.54 114.5 60.4 72.0 175.6
99 K 8.64 122.7 61.2 31.7 178.4
100 E 8.25 117.3 60.2 28.7 178.9
101 D 7.85 121.1 57.7 42.1 179.9
102 F 8.32 120.5 63.6 39.8 178.4
103 R 8.15 118.7 59.0 30.6 177.9
104 Y 8.11 118.5 60.5 38.4 177.8
105 R 7.39 116.6 58.2 31.0 177.0
106 S 8.21 110.5 54.1 62.3 171.5
107 P - - 65.2 31.7 179.3
108 H 8.05 111.7 57.5 31.4 177.3
109 S 7.36 111.9 58.2 65.3 174.5
110 G 9.32 111.9 48.9 - 174.1
111 D 8.62 119.5 58.4 40.2 178.1
112 V 7.47 119.2 65.8 31.6 178.2
113 L 8.14 119.2 58.0 42.5 177.8
114 Y 8.21 118.7 62.3 38.0 177.6
115 E 7.81 118.0 58.8 28.9 179.6
116 L 8.69 122.4 57.8 42.4 178.8
117 L 8.35 120.8 57.9 40.7 178.4
118 Q 7.87 115.0 57.9 27.4 178.8
119 H 8.09 119.4 60.1 30.6 178.3
120 I 8.21 120.5 65.4 38.2 178.5
121 L 8.19 119.7 57.3 42.2 178.8
122 K 7.53 116.9 57.6 32.7 177.3
123 Q 7.77 117.9 56.5 29.1 176.3
124 R 7.95 120.6 56.0 30.4 175.0
125 K 7.86 128.0 57.8 33.5 181.3

Table A-4 Chemical shift assignments (in ppm) for the V112E PNT domain as a heterodimer with the A93D PNT domain (20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 8.0 and 25 °C)

Residue Number  Amino Acid  1HN  15N  13Cα  13Cβ
46 D 8.41 119.5 54.9 41.1
47 S 8.06 114.7 58.8 64.0
48 I 8.08 121.9 61.3 38.3
49 R 8.74 127.1 55.1 30.5
50 L 8.21 124.7 53.1 42.0
51 P - - - -
52 A 8.70 123.7 55.8 18.5
53 H 8.24 112.3 57.5 29.0
54 L 7.03 119.9 53.9 42.8
55 R 7.34 116.8 56.9 31.3
56 L 7.98 120.0 54.5 43.8
57 Q - - - -
58 P - - - -
59 I 7.09 112.9 63.5 37.5
60 Y 7.87 119.5 57.5 39.0
61 W 8.02 121.4 55.4 29.9
62 S 10.66 126.0 57.4 65.9
63 R 9.37 121.3 59.5 30.6
64 D 8.33 120.5 55.1 32.8
65 D 7.95 122.1 57.8 39.3
66 V 8.22 120.1 67.7 32.1
67 A 7.68 120.8 56.1 18.0
68 Q 8.30 117.8 58.8 28.4
69 W 8.91 125.9 61.5 26.8
70 L 8.33 118.2 58.2 42.0
71 K 7.41 117.3 58.1 32.0
72 W 8.39 122.1 61.4 27.8
73 A 8.88 122.0 54.7 18.0
74 E 7.92 116.3 59.4 29.8
75 N 7.33 114.8 55.9 39.6
76 E 8.62 120.6 58.1 28.8
77 F 7.53 112.3 57.0 38.1
78 S 7.22 113.4 58.4 61.0
79 L 7.61 117.3 53.2 42.4
80 R 8.24 122.0 63.1 29.2
81 P - - - -
82 I 8.39 121.1 59.9 41.3
83 D 8.60 124.4 54.5 41.6
84 S - - - -
85 N 8.82 119.0 53.8 37.4
86 T 7.83 113.3 65.5 69.3
87 F 8.08 119.6 56.9 40.0
88 E 8.27 125.6 56.5 28.2
89 M 7.83 118.1 54.5 31.2
90 N 7.60 113.0 50.5 39.0
91 G 8.37 104.1 47.2 -
92 K 7.58 118.1 60.0 32.4
93 A 7.40 118.4 54.0 21.3
94 L 7.80 118.5 57.8 43.7
95 L 7.29 113.8 56.3 41.7
96 L 7.72 118.8 55.6 43.2
97 L 6.91 115.5 55.1 42.7
98 T 9.46 113.3 59.9 71.7
99 K 8.57 122.6 61.2 31.8
100 E 8.15 116.9 59.9 29.2
101 D 7.74 122.2 57.8 41.4
102 F 8.25 120.8 63.3 39.5
103 R 8.50 123.4 58.8 30.6
104 Y 8.23 119.6 61.1 38.4
105 R 7.71 117.0 58.5 30.8
106 S 8.29 109.9 53.5 62.2
107 P - - - -
108 H 8.38 111.6 56.9 31.4
109 S 7.44 112.1 58.0 65.9
110 G 9.36 112.3 48.9 -
111 D 8.69 121.1 58.4 40.4
112 E 8.49 119.4 60.5 29.4
113 L 8.63 119.7 58.1 42.8
114 Y 8.46 119.8 62.3 37.9
115 E 8.02 118.1 58.8 28.9
116 L 8.49 122.6 57.9 42.1
117 L 8.41 120.9 58.0 40.8
118 Q 7.91 115.0 58.0 27.5
119 H 8.06 119.1 60.0 30.6
120 I 8.24 120.8 65.5 38.3
121 L 8.23 119.7 57.3 42.2
122 K 7.55 117.1 57.7 32.7
123 Q 7.75 117.8 56.5 29.1
124 R 7.93 120.8 56.1 30.6
125 K 7.88 128.2 57.7 33.7

Table A-5 Chemical shift assignments (in ppm) for the A93D PNT domain as a heterodimer with the V112E PNT domain (20 mM MOPS, 50 mM NaCl, 0.5 mM EDTA and 5% D2O at pH 7.0 and 25 °C)

Residue Number  Amino Acid  1HN  15N  13Cα  13Cβ
44 E 8.64 121.7 56.9 30.0
45 E 8.52 121.6 57.4 30.2
46 D 8.35 120.2 54.6 41.0
47 S 8.09 115.7 59.5 63.9
48 I 8.08 121.8 61.1 38.3
49 R 8.67 126.9 55.0 30.7
50 L 8.32 125.3 53.0 41.7
51 P - - - -
52 A 8.69 123.4 55.8 18.5
53 H 8.22 112.4 57.5 29.0
54 L 7.10 120.2 53.8 42.7
55 R 7.32 117.0 57.0 31.0
56 L 7.92 120.6 54.6 43.6
57 Q - - - -
58 P - - - -
59 I 7.18 113.4 63.3 37.5
60 Y 7.86 119.9 57.4 38.9
61 W 8.04 121.6 55.6 30.1
62 S 10.61 125.4 57.2 66.1
63 R 9.16 120.2 59.6 29.9
64 D 7.82 122.0 57.8 39.5
65 D - - - -
66 V 8.17 120.1 67.6 32.3
67 A 7.84 121.0 56.1 17.9
68 Q 8.29 117.9 58.9 28.4
69 W 8.89 126.2 61.4 26.9
70 L 8.40 118.5 58.1 42.0
71 K 7.53 117.7 58.2 32.0
72 W 8.40 122.0 61.5 27.7
73 A 8.91 122.4 54.6 17.5
74 E 8.15 117.2 59.8 29.9
75 N 7.06 114.4 56.0 39.7
76 E 8.53 122.3 58.1 28.9
77 F 7.78 112.5 58.0 38.0
78 S 7.19 113.3 57.8 61.9
79 L 7.75 116.7 52.9 43.1
80 R 8.64 121.9 54.7 33.0
81 P - - - -
82 I 8.53 124.3 60.5 40.0
83 D 8.60 126.5 54.3 41.1
84 S - - - -
85 N 8.75 119.1 54.7 37.8
86 T 7.75 111.8 64.0 69.2
87 F 7.91 120.7 57.2 40.1
88 E 8.06 123.4 55.8 27.9
89 M 8.18 121.8 54.5 34.5
90 N 8.51 114.9 51.0 38.8
91 G 8.43 105.0 47.4 -
92 K 7.76 120.0 60.0 31.7
93 D 7.84 117.8 56.9 41.0
94 L 8.31 125.2 58.0 43.8
95 L 7.70 115.9 56.5 41.8
96 L 7.36 117.4 54.8 42.4
97 L 7.33 119.7 55.5 42.3
98 T 9.49 114.4 60.4 71.7
99 K 8.71 122.9 61.5 31.3
100 E 8.15 117.1 59.9 28.9
101 D 7.79 121.0 57.7 42.1
102 F 8.24 120.4 63.6 39.6
103 R 8.30 118.2 59.4 30.4
104 Y 8.00 117.8 60.4 38.4
105 R 7.57 116.8 58.3 31.1
106 S 8.17 109.7 54.0 62.2
107 P - - - -
108 H 7.62 109.9 57.6 31.1
109 S 7.58 113.2 58.1 66.0
110 G 9.46 111.5 48.7 -
111 D 8.58 119.9 57.0 37.5
112 V 8.43 122.7 66.4 32.2
113 L 8.15 116.5 57.9 42.1
114 Y 8.29 119.4 62.3 38.1
115 E 8.26 118.1 58.3 29.8
116 L 9.20 123.7 58.0 42.0
117 L 8.43 121.1 58.1 40.8
118 Q 7.96 115.0 57.9 27.6
119 H 8.20 119.5 60.5 30.4
120 I 8.38 120.7 65.4 38.2
121 L 8.26 119.9 57.3 42.3
122 K 7.55 116.8 57.6 32.6
123 Q 7.71 117.9 56.5 29.1
124 R - - - -
125 K 7.88 128.1 57.7 33.7

Appendix C - Amide hydrogen exchange protection factors

Table A-6 Amide HX protection factors (log(PF))

Residue  A93D PNT Monomer  V112E PNT Monomer  A93D PNT Heterodimer  V112E PNT Heterodimer  amide 1HN h-bond acceptor  secondary structure
L41
R42
M43
E44
E45
D46
S47
I48
R49
L50
P51
A52      3₁₀
H53      3₁₀
L54     P51 CO 3₁₀
R55     A52 CO
L56
Q57
P58      3₁₀
I59     Q57 CO 3₁₀
Y60 4.82 4.72 5.17 5.35 Q57 CO 3₁₀
W61 5.47 5.29 6.16 6.39 P58 CO
S62
R63          α
D64 No Assignment No Assignment      α
D65 3.78 3.63 3.94 4.02 S62 CO α
V66 5.67 5.19 7.17 > 8.5 * S62 CO α
A67 5.98 5.90 6.21 6.07 R63 CO α
Q68 5.98 5.79 6.16 6.18 D64 CO α
W69 > 7.0 * 6.71 > 8.5 * > 8.5 * D65 CO α
L70 > 7.0 * > 7.0 * > 8.5 * > 8.5 * V66 CO α
K71 6.86 6.44 7.66 7.42 A67 CO α
W72 6.51 6.35 6.97 6.98 Q68 CO α
A73 6.17 5.90 > 8.5 * 6.06 W69 CO α
E74 5.59 5.42 Peak Overlap 5.51 L70 CO α
N75         K71 CO α
E76 4.00 3.89 4.60 3.94 W72 CO α
F77 4.39 4.30 7.12 4.28 A73 CO
S78 4.44  5.70   A75 CO
L79   5.24   Q74 CO
R80
P81 Proline Proline Proline Proline
I82
D83
S84 No Assignment No Assignment No Assignment No Assignment
N85
T86         D83 CO
F87   No Assignment     S84 CO
E88   No Assignment
M89 4.03 No Assignment 4.16 6.99
N90   No Assignment   7.35 F77 CO
G91         W61 CO α
K92 4.39 4.13 4.64 7.19  α
A93 4.84 4.83 5.29 > 8.5 * N90 CO α
L94 5.23 No Assignment 6.68 > 8.5 * N90 CO α
L95 5.38 No Assignment 6.36 7.13 G91 CO
L96 3.37 3.70 3.41 > 8.5 * A93 CO
L97 5.57 No Assignment 6.62 > 8.5 * L94 CO
T98 4.14 4.37   6.98
K99          α
E100     Peak Overlap    α
D101 3.97 3.83 4.57 5.73 T98 CO α
F102  4.74 7.45 7.67 T98 CO α
R103         K99 CO α
Y104     5.24 6.51 E100 CO α
R105 4.31     D101 CO α
S106  5.12 7.14 8.16 F102 CO
P107 Proline Proline Proline Proline
H108   No Assignment
S109
G110         S106 CO α
D111     4.12    α
V112 3.18 No Assignment 5.29 3.46 S109 CO α
L113 4.90 No Assignment > 8.5 * 4.95 S109 CO α
Y114 5.96 No Assignment > 8.5 * 7.61 G110 CO α
E115 5.86 No Assignment 5.58 6.22 D111 CO α
L116 > 7.0 *  > 8.5 * 6.74 R112 CO α
L117 > 7.0 * > 7.0 * > 8.5 * > 8.5 * L113 CO α
Q118 5.84 6.01 6.95 5.89 Y114 CO α
H119    5.65 E115 CO α
I120 3.49 4.03 4.37   L116 CO α
L121 3.01 3.71 3.35   L117 CO α
K122     5.03   Q118 CO α
Q123         H119 CO
R124     No Assignment
K125

* Represents a lower limit due to incomplete exchange after 3 months in D2O solution. The highest PF calculated for the monomeric species was 6.56 and for the heterodimeric complex was 8.16. The lower limits were arbitrarily set to be higher than the largest calculated PF.

Secondary structure and main chain carbonyl hydrogen bond acceptors for the amide 1HN are from an analysis of chains A and B of 1LKY.pdb using PDBsum and VADAR, respectively (Laskowski et al., 2018; Willard et al., 2003). If multiple acceptors are identified, only the closest is indicated.
Appendix D - BUDE compounds tested by NMR spectroscopy

Table A-7 Chemical structures of compounds selected from BUDE
[Chemical structure drawings not reproduced in this text version.]

BUDE Rank  MolPort ID  Target
6 MolPort-010-780-927 A93D Hydrophobic Residues
11 MolPort-005-019-637 A93D Hydrophobic Residues
28 MolPort-007-725-254 A93D Hydrophobic Residues
39 MolPort-007-701-986 A93D Hydrophobic Residues
51 MolPort-005-140-761 A93D Hydrophobic Residues
56 MolPort-002-668-848 A93D Hydrophobic Residues
78 MolPort-005-090-877 A93D Hydrophobic Residues
86 MolPort-005-123-375 A93D Hydrophobic Residues
93 MolPort-005-308-219 A93D Hydrophobic Residues
95 MolPort-007-773-309 A93D Hydrophobic Residues
129 MolPort-007-690-631 A93D Hydrophobic Residues
130 MolPort-007-702-123 A93D Hydrophobic Residues
155 MolPort-007-899-016 A93D Hydrophobic Residues
208 MolPort-002-736-787 A93D Hydrophobic Residues
255 MolPort-001-906-708 A93D Hydrophobic Residues
292 MolPort-009-154-411 A93D Hydrophobic Residues
319 MolPort-009-723-175 A93D Hydrophobic Residues
374 MolPort-005-299-926 A93D Hydrophobic Residues
378 MolPort-005-689-600 A93D Hydrophobic Residues
421 MolPort-009-483-070 A93D Hydrophobic Residues
428 MolPort-009-178-639 A93D Hydrophobic Residues
453 MolPort-009-077-356 A93D Hydrophobic Residues
466 MolPort-005-035-860 A93D Hydrophobic Residues
478 MolPort-009-113-626 A93D Hydrophobic Residues
631 MolPort-001-930-558 A93D Hydrophobic Residues
825 MolPort-001-930-557 A93D Hydrophobic Residues
2585 MolPort-000-512-139 A93D Hydrophobic Residues
3186 MolPort-004-881-711 A93D Hydrophobic Residues
3584 MolPort-004-284-939 A93D Hydrophobic Residues
4073 MolPort-008-311-002 A93D Hydrophobic Residues
7648 MolPort-007-566-792 A93D Hydrophobic Residues
7994 MolPort-004-882-201 A93D Hydrophobic Residues
8088 MolPort-003-155-823 A93D Hydrophobic Residues
N/A (Structural Analog) MolPort-019-818-459 A93D Hydrophobic Residues
2 MolPort-007-690-631 V112E Hydrophobic Residues
3 MolPort-007-690-630 V112E Hydrophobic Residues
75 MolPort-005-970-014 V112E Hydrophobic Residues
76 MolPort-002-694-487 V112E Hydrophobic Residues
89 MolPort-001-020-317 V112E Hydrophobic Residues
92 MolPort-009-696-679 V112E Hydrophobic Residues
95 MolPort-000-473-014 V112E Hydrophobic Residues
104 MolPort-009-746-988 V112E Hydrophobic Residues
129 MolPort-007-969-298 V112E Hydrophobic Residues
134 MolPort-004-826-786 V112E Hydrophobic Residues
136 MolPort-005-004-054 V112E Hydrophobic Residues
196 MolPort-044-259-270 V112E Hydrophobic Residues
253 MolPort-009-724-755 V112E Hydrophobic Residues
280 MolPort-002-736-787 V112E Hydrophobic Residues
356 MolPort-003-175-993 V112E Hydrophobic Residues
376 MolPort-000-839-413 V112E Hydrophobic Residues
385 MolPort-002-571-074 V112E Hydrophobic Residues
483 MolPort-003-109-637 V112E Hydrophobic Residues
31 MolPort-002-007-977 A93D Salt Bridge Residues
43 MolPort-005-102-430 A93D Salt Bridge Residues
2860 MolPort-002-668-848 A93D Salt Bridge Residues
3221 MolPort-009-704-070 A93D Salt Bridge Residues
1 MolPort-009-747-005 V112E Salt Bridge Residues
3 MolPort-009-746-988 V112E Salt Bridge Residues
22 MolPort-003-156-017 V112E Salt Bridge Residues
45 MolPort-009-747-002 V112E Salt Bridge Residues
75 MolPort-001-630-482 V112E Salt Bridge Residues
3580 MolPort-008-295-789 V112E Salt Bridge Residues
