Open Collections

UBC Theses and Dissertations

Template-directed protein misfolding in neurodegenerative disease Guest, William Clay 2012

Template-Directed Protein Misfolding in Neurodegenerative Disease

by

William Clay Guest

B.Sc.(Hons.), University of Manitoba, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

in

THE FACULTY OF GRADUATE STUDIES (Experimental Medicine)

The University Of British Columbia (Vancouver)

April 2012

© William Clay Guest, 2012

Abstract

Protein misfolding diseases represent a large burden to human health for which only symptomatic treatment is generally available. These diseases, such as Creutzfeldt-Jakob disease, amyotrophic lateral sclerosis, and the systemic amyloidoses, are characterized by conversion of globular, natively-folded proteins into pathologic β-sheet rich protein aggregates deposited in affected tissues. Understanding the thermodynamic and kinetic details of protein misfolding on a molecular level depends on accurately appraising the free energies of the folded, partially unfolded intermediate, and misfolded protein conformers. There are multiple energetic and entropic contributions to the total free energy, including nonpolar, electrostatic, solvation, and configurational terms. To accurately assess the electrostatic contribution, a method to calculate the spatially-varying dielectric constant in a protein/water system was developed using a generalization of Kirkwood-Fröhlich theory along with brief all-atom molecular dynamics simulations. This method was combined with previously validated models for nonpolar solvation and configurational entropy in an algorithm to calculate the free energy change on partial unfolding of contiguous protein subsequences. Results were compared with those from a minimal, topologically-based Gō model and direct calculation of free energies by steered all-atom molecular dynamics simulations. This algorithm was applied to understand the early steps in the misfolding mechanism for β2-microglobulin, prion protein, and superoxide dismutase 1 (SOD1).
It was hypothesized that SOD1 misfolding may follow a template-directed mechanism like that discovered previously for prion protein, so misfolding of SOD1 was induced in cell culture by transfection with mutant SOD1 constructs and observed to stably propagate intracellularly and intercellularly much like an infectious prion. A defined minimal assay with recombinant SOD protein demonstrated the sufficiency of mutant SOD1 alone to trigger wtSOD1 misfolding, reminiscent of the “protein-only” hypothesis of prion spread. Finally, protein misfolding as a feature of disease may extend beyond neurodegeneration and amyloid formation to cancer, in which derangement of protein folding quality control may lead to antibody-recognizable misfolded protein present selectively on cancer cell surfaces. The evidence for this hypothesis and possible therapeutic targets are discussed as a future direction.

Preface

Some of the content in this thesis has been published previously, as described below. The papers for which my supervisors and I are the sole authors represent in their entirety my original work prepared with my supervisors’ direction, advice and editing. For the papers with other co-authors, I have extracted in this thesis the work to which I directly contributed, through design of experiments, performance of experiments, and/or data analysis.

• The overview of the clinical features and treatment of Creutzfeldt-Jakob disease in Section 1.1 has appeared in “Guest W, Pfeffer G, and Cashman N (2011). ‘Elderly Lady with Right-Sided Hallucinations.’ Chapter 30 in Case Studies in Dementia: Common and Uncommon Presentations. ed. S. Gauthier and P. Rosa-Neto. Cambridge University Press.”
• The review of PrPSc structural models in Section 1.2 has appeared in “Guest W, Plotkin S, and Cashman N (2011). ‘Toward a Mechanism of Prion Protein Misfolding and Structural Models of PrPSc: Current Models and Future Directions.’ Journal of Toxicology and Environmental Health Part A, 74, 154–160.”
• The description of the prion-like features of other protein misfolding diseases in Section 1.3 has appeared in “Guest W, Silverman M, Pokrishevsky E, O’Neill M, Grad L, Cashman N (2011). ‘Generalization of the Prion Hypothesis to Other Neurodegenerative Diseases: An Imperfect Fit.’ Journal of Toxicology and Environmental Health Part A, 74, 1433–59.”
• The theory of protein dielectric response in Chapter 2 has been published in “Guest W, Cashman N, and Plotkin S (2011). ‘On the Anisotropic and Inhomogeneous Dielectric Properties of Proteins.’ Physical Chemistry Chemical Physics, 13, 6286–6295.”
• The analysis of prion protein electrostatic energies in Section 3.1 has appeared in “Guest W, Cashman N, and Plotkin S (2010). ‘Electrostatics in the Stability and Misfolding of the Prion Protein: Salt Bridges, Self-Energy, and Solvation.’ Biochemistry and Cell Biology, 88, 371–381.”
• The study of prion protein tail length influence on dynamics of the PrP folded domain appeared in “Li L, Guest W, Huang A, Plotkin S, Cashman N (2009). ‘Immunological Mimicry of PrPC-PrPSc Interactions: Antibody-Induced PrP Misfolding.’ Protein Engineering Design and Selection, 22, 523–529.”
• Data on template directed misfolding of superoxide dismutase 1 in Chapter 5 has been published as “Grad L, Guest W, Yanai A, O’Neill MA, Pokrishevsky E, Gibbs E, Semenchenko V, Yousefi M, Wishart D, Plotkin S, and Cashman N (2011). ‘Intermolecular transmission of SOD1 misfolding in living cells.’ Proceedings of the National Academy of Sciences of the United States of America, 108, 16398–403.”

Patent applications have been filed on some of the work contained in this thesis:

• The dielectric calculator in Chapter 2 is covered by a US patent application, “Guest W, Cashman N, Plotkin S.
‘Methods and Systems for Determining Localized Dielectric Properties of a Molecule.’ US Application Number 12/952,140 filed 30 November 2010.”
• The regional protein unfolding algorithm in Chapter 3 and the misfolding-specific epitope predictions for prion protein in Chapter 4, superoxide dismutase 1 in Chapter 5, and cancer-associated protein in Chapter 6 are covered by an application under the Patent Co-operation Treaty: “Cashman N, Plotkin S, Guest W. Methods and Systems for Predicting Misfolded Protein Epitopes. PCT Application Number PCT/CA2009/001413 filed October 6, 2009.”

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Symbols
Acknowledgments
1 Introduction
  1.1 Clinical Aspects of the Prion Diseases
    1.1.1 Differential Diagnosis
    1.1.2 Testing
    1.1.3 Treatment
  1.2 The Mechanism of Prion Misfolding and Structural Models of PrPSc
    1.2.1 Environmental Variables Influencing Prion Misfolding
    1.2.2 Known Facts about the Infectious Species of PrPSc
    1.2.3 Current Structural Models of PrPSc
  1.3 Generalization of the Prion Hypothesis to Other Neurodegenerative Diseases
    1.3.1 Defining the Prion Principle
    1.3.2 Amyloid-β
    1.3.3 Tau Protein
    1.3.4 TDP-43 and FUS
    1.3.5 Superoxide Dismutase 1
    1.3.6 Huntingtin
    1.3.7 α-Synuclein
    1.3.8 Disrupted in Schizophrenia 1
    1.3.9 Other Amyloid Proteins
    1.3.10 Thermodynamic Considerations
    1.3.11 Cell-to-Cell Propagation
    1.3.12 The Infection Question
    1.3.13 New Therapeutic Avenues from the Prion Hypothesis
    1.3.14 Toward a Unified Theory of Misfolding Disease?
  1.4 Aims of the Thesis
2 The Dielectric Properties of Proteins
  2.1 Introduction
  2.2 Protein Dipoles in an Applied Field
    2.2.1 Internal Protein Constraints
    2.2.2 Collective Dipole Fluctuation Modes
    2.2.3 Linear Response Relation for Induced Moments
  2.3 The Dielectric Constant at an Arbitrary Point
  2.4 Implementation in a Protein System
  2.5 Dielectric Anisotropy
  2.6 Spatial Variation in Protein Dielectric Response
  2.7 Dependence on Sphere Size and Lattice Point Spacing
  2.8 Averaged Dielectric Properties
  2.9 Applications to Ion Pair (Salt Bridge) Energies
  2.10 Applications to Ion Channel Passage
  2.11 Conclusions
3 Energy Landscapes for Partial Protein Unfolding
  3.1 Introduction
  3.2 A Gō Model of Protein Unfolding
  3.3 An Ensemble-Based Model of Protein Unfolding
    3.3.1 Polar Protein-Protein Energy
    3.3.2 Polar Solvation Energy
    3.3.3 Nonpolar Protein-Protein Energy
    3.3.4 Nonpolar Solvation Energy
    3.3.5 Configurational Entropy
  3.4 Generation of Folded and Unfolded Protein Ensembles
  3.5 Convergence of Terms in the Energy Function
  3.6 Behaviour of Individual Terms in the Energy Function
  3.7 The Unfolding Energy Landscape of Beta-2-Microglobulin
  3.8 Comparison with Experimental NMR Data
  3.9 Cooperativity in Regional Unfolding
  3.10 Capabilities, Assumptions, and Limitations of the Energy Landscape Model
4 Prion Protein Stability and Misfolding
  4.1 Introduction
  4.2 Methods
  4.3 Dynamics of Dipoles at Equilibrium
  4.4 Salt Bridge Energies
  4.5 Total Residue Electrostatic Energies
  4.6 Transfer to Hydrophobic Environment
  4.7 Discussion of Electrostatic Effects in Prion Misfolding
5 SOD1 Misfolding as a Template-Directed Process
  5.1 Introduction
  5.2 Development of Antibodies Specific to Misfolded SOD1
  5.3 Expression of Misfolded SOD1 Induces Misfolding of wtSOD1 in Human Cells
  5.4 G127X SOD1 Induction of wtSOD1 Misfolding is Species-Specific
  5.5 G127X Induces wtSOD1 Misfolding in a Recombinant Cell-Free System
  5.6 Cells Export Aggregated Misfolded SOD1 to their Culture Medium
  5.7 Induction of wtSOD1 Misfolding can be Transmitted Intercellularly
  5.8 Antibodies Against SOD1 can Block Transmission of Misfolding
  5.9 Discussion
6 Conclusions and Future Directions
  6.1 Summary of Conclusions from the Present Work
  6.2 Future Directions
    6.2.1 Protein pKa Prediction
    6.2.2 Direct Calculation of Regional Protein Unfolding Energy from Steered MD
    6.2.3 A Mean Hydrophobicity Model for Misfolded Protein Assembly
    6.2.4 Small Molecule Development to Inhibit SOD1 Misfolding
  6.3 Protein Misfolding in Cancer
    6.3.1 Gene Mutations Causing Constitutive Protein Misfolding in Cancer
    6.3.2 Gene Copy Number Alterations and Subunit Imbalance
    6.3.3 Impaired Protein Quality Control in the ER
    6.3.4 Protein Relocalization to the Cell Surface
    6.3.5 Tumour Microenvironment Stressors
    6.3.6 Candidate Protein Targets for Misfolding in Cancer
    6.3.7 Therapeutic Practicalities and Limitations
  6.4 Parting Thoughts
Bibliography
A Prion Protein Salt Bridge Energy Data
B Supplemental Methods and Figures for Chapter 5
  B.1 Cell Culture and Transfection
  B.2 Molecular Dynamics Simulation of Wild-Type and G127X SOD1
  B.3 Immunofluorescence Microscopy
  B.4 SDS-PAGE and Immunoblotting
  B.5 Spectroscopic Analysis
  B.6 Recombinant SOD1 Production and Purification
  B.7 In Vitro G127X-Mediated Misfolding Induction of Recombinant wtSOD1 Protein
C Unfolding-Prone Epitopes in Cancer-Associated Proteins

List of Tables

Table 1.1 Prion-like features of other diseases
Table 4.1 Salt bridges in the human prion protein
Table 4.2 The most attractive and repulsive prion protein salt bridges
Table 4.3 Residues in human PrP with the greatest total electrostatic energy
Table 6.1 PDB structures for cancer-associated misfolded epitope prediction
Table A.1 Salt bridge energies in species and mutants of prion protein
Table A.2 Total residue electrostatic energies in a selection of prion protein structures at pH 7
Table C.1 Cancer-associated protein epitope sequences

List of Figures

Figure 1.1 Schematic of agents relevant to the prion misfolding process
Figure 1.2 Current structural models of PrPSc
Figure 1.3 Prion-like misfolding energy landscape
Figure 2.1 Dipole fluctuation statistics
Figure 2.2 Schematic of dielectric theory
Figure 2.3 Dipole correlations and convergence in simulation
Figure 2.4 The spatially varying dielectric function for adenylate kinase
Figure 2.5 Dielectric function for a polyglutamine helix
Figure 2.6 Residue nuclear and electronic polarizabilities
Figure 2.7 The effect of cavity sphere radius
Figure 2.8 Mean properties of the protein dielectric function
Figure 2.9 Salt bridge energies calculated with the heterogeneous dielectric theory
Figure 2.10 Geometry of strong salt bridges
Figure 2.11 Energy profile of sodium passage through an ion channel
Figure 2.12 Consequences of anisotropy in the dielectric field
Figure 3.1 Loop entropy reduction
Figure 3.2 Generation of folded and unfolded protein ensembles
Figure 3.3 Convergence of terms in the free energy function
Figure 3.4 Comparison of pairwise energies between folded and unfolded states
Figure 3.5 Contribution of polar solvation energy to the free energy function
Figure 3.6 Configurational entropy in the folded and unfolded states
Figure 3.7 Relative contributions to unfolding free energy
Figure 3.8 Unfolding free energy landscape for β2-microglobulin
Figure 3.9 Validation of landscape theory from hydrogen exchange data
Figure 3.10 Cooperativity in regional unfolding of β2-microglobulin
Figure 4.1 Local heterogeneous dielectric map of human PrP
Figure 4.2 Method of calculating salt bridge and total electrostatic energies
Figure 4.3 The largest amplitude dipole correlation modes for human PrP
Figure 4.4 Nonlocal salt bridges in PrP
Figure 4.5 The most attractive and repulsive PrP salt bridges
Figure 4.6 Total salt bridge energies in PrP by species
Figure 4.7 Hydrophobic transfer energy
Figure 4.8 Validity of electrostatic energy approximation methods
Figure 5.1 Overview of SOD1 disease-specific epitope sites
Figure 5.2 Structural characterization of the DSE2 peptide
Figure 5.3 Equilibrium MD simulations of wild type and G127X SOD1
Figure 5.4 Expression of G127X mutant misfolded SOD1 induces misfolding in wtSOD1
Figure 5.5 Conformational conversion of misfolded mutant SOD1 and misfolded wtSOD1
Figure 5.6 Mutant SOD1-mediated wtSOD1 misfolding is sequence/structure-specific
Figure 5.7 Induction of SOD1 misfolding in a cell-free system
Figure 5.8 Detection of misfolded SOD1 in transfected cell media
Figure 5.9 Template-directed transduction experimental design
Figure 5.10 Mutant SOD1-mediated wtSOD1 misfolding is transmissible between cells
Figure 5.11 SOD1 antibodies can block misfolding transmission
Figure 5.12 Hypothetical energy landscape of structural transitions during SOD1 misfolding
Figure 6.1 Direct calculation of unfolding free energy from harmonic biasing MD
Figure 6.2 Candidate druggable sites on the SOD1 homodimer
Figure 6.3 Factors contributing to protein misfolding in cancer
Figure 6.4 Predicted misfolding-specific epitopes for P-glycoprotein
Figure 6.5 Predicted misfolding-specific epitopes for EGFR and HER2
Figure 6.6 Predicted misfolding-specific epitopes for Notch, CD44, N-cadherin, and MUC1
Figure 6.7 Predicted misfolding-specific epitopes for CD46, CD55, and CD59
Figure 6.8 Predicted misfolding-specific epitopes for TNF Receptor and Fas
Figure 6.9 Predicted misfolding-specific epitopes for Kit and Ret
Figure 6.10 Predicted misfolding-specific epitopes for CD38 and neuropilin
Figure A.1 Hydrophobic transfer energies for different prion protein structures
Figure B.1 Controls ruling out nonspecific SOD1 misfolding
Figure B.2 Recognition of misfolded mouse SOD1 by DSE mAbs
Figure B.3 Purification of recombinant SOD1
Figure B.4 Cells export misfolded SOD1 to their culture medium

List of Abbreviations

Aβ  Amyloid-β
AD  Alzheimer’s Disease
ALS  Amyotrophic Lateral Sclerosis
AMBER  Assisted Model Building with Energy Refinement
APBS  Adaptive Poisson-Boltzmann Solver
β2M  β2-Microglobulin
BSE  Bovine Spongiform Encephalopathy
CHARMM  Chemistry at Harvard Molecular Mechanics
CJD  Creutzfeldt-Jakob Disease
CSF  Cerebrospinal Fluid
CWD  Chronic Wasting Disease
DISC1  Disrupted in Schizophrenia 1
DSE  Disease-Specific Epitope
DTT  Dithiothreitol
EDTA  Ethylene Diamine Tetra-acetic Acid
EGFR  Epidermal Growth Factor Receptor
ELISA  Enzyme-Linked Immunosorbent Assay
ER  Endoplasmic Reticulum
FALS  Familial Amyotrophic Lateral Sclerosis
FFI  Fatal Familial Insomnia
FTIR  Fourier Transform Infrared Spectroscopy
FUS/TLS  Fused in Sarcoma/Translated in Liposarcoma
GB  Generalized Born
GSS  Gerstmann-Sträussler-Scheinker Syndrome
HSP  Heat Shock Protein
ICP-MS  Inductively Coupled Plasma Mass Spectrometry
IF  Immunofluorescence
IHC  Immunohistochemistry
IP  Immunoprecipitation
MD  Molecular Dynamics
β-ME  β-Mercaptoethanol
NMR  Nuclear Magnetic Resonance
PB  Poisson-Boltzmann
PD  Parkinson’s Disease
PDB  Protein Data Bank
PrPC  Cellular Prion Protein (native)
PrPSc  Scrapie Prion Protein (infectiously misfolded)
ROS  Reactive Oxygen Species
SALS  Sporadic Amyotrophic Lateral Sclerosis
SASA  Solvent-Accessible Surface Area
SOD1  Superoxide Dismutase 1
SPR  Surface Plasmon Resonance
TDM  Template-Directed Misfolding
TDP  TAR DNA-Binding Protein
UPR  Unfolded Protein Response
VCJD  Variant Creutzfeldt-Jakob Disease
VMD  Visual Molecular Dynamics
XRC  X-Ray Crystallography

List of Symbols

By Order of Appearance

ε  Scalar dielectric constant
  Tensor dielectric constant
ε_1  Scalar dielectric constant surrounding cavity
_2  Tensor dielectric constant within cavity
k_B  Boltzmann’s constant
T  Temperature
µ  Dipole moment
E  Electric field
r  Position vector
G  Gibbs free energy
|φ〉  Basis of individual dipole fluctuation
|ψ〉  Dipole fluctuation eigenvector
M  Degree of coupling between individual-dipole fluctuation modes
α  Scalar polarizability
  Tensor polarizability
p  Permanent dipole moment of water
g  Kirkwood partial dipole alignment factor for water (2.67)
a  Cavity radius
f_A  Fraction of dipole A inside a cavity
E_ext  Externally applied electric field
E_in  Total electric field inside cavity
G  Cavity electric field
E_e  Local field experienced by dipoles within cavity
m_p  Induced dipole moment of protein dipoles inside cavity
R  Reaction field
Φ_in  Electric potential inside cavity
Φ_out  Electric potential outside cavity
T  Set of all residues in protein
U  Set of unfolded residues in protein
N  Set of folded residues in protein
E_Go  Gō model unfolding energy
S_Go  Gō model unfolding entropy
F_Go  Gō model unfolding free energy
Θ  Heaviside function
∆F_solv  Total solvation free energy
∆F^pol_solv  Polar contribution to solvation free energy
∆F^np_solv  Nonpolar contribution to solvation free energy
∆E_protein  Total protein contact energy
∆E^pol_protein  Polar contribution to contact energy
∆E^np_protein  Nonpolar contribution to contact energy
E_Born  Born transfer energy due to change of dielectric environment
∆S_conf  Configurational entropy
ρ_w(r)  Solvent radial distribution function
V^att  Weeks-Chandler-Anderson decomposition of Lennard-Jones potential
φ_i  Protein dihedral angle
p_i  Dihedral angle probability distribution
I_0(κ)  Von Mises distribution
∆S_return  Loop entropy penalty on unfolding
f_w  Probability of a random walk satisfying boundary conditions

Acknowledgments

Above all, I would like to express my deepest gratitude to my supervisors Neil Cashman and Steve Plotkin for their unwavering support, good advice, and helpful insights through all stages of this project. Thank you to my supervisory committee members John Schrader, Andre Marziali, and Cheryl Wellington, who have guided the course of my work and helped me focus on the most important aspects of my research problem.
Thank you to my colleagues in the Cashman and Plotkin labs, Eddie Pokrishevsky, Ali Mohazab, Max Silverman, Shirin Hadizadeh, Les Grad, Atanu Das, Ebrima Gibbs, Eric Abrahamsson, Megan O’Neill, Miguel Garcia, Dwayne Ashman, Jing Wang, Li Li, Alan Huang, and Masoud Yousefi, for many productive discussions and brainstorming sessions. Thank you to our collaborators at UBC and other universities, Nahid Jetha, Valentyna Semenchenko, Trent Bjorndahl, David Wishart, Olivier Julien, Subhrangsu Chatterjee, and Brian Sykes, for supplying reagents and experimental data that moved forward the work in this thesis tremendously. Thank you to the MD/PhD program directors, Lynn Raymond and Torsten Nielsen, for helping me smoothly navigate the competing demands of medical school and research. Thank you to Jane Lee, the MD/PhD program coordinator, for managing so well the administrative and official aspects of being enrolled as a graduate student. Finally, I would like to thank UBC for graduate scholarships, the Michael Smith Foundation for Health Research for a Junior Graduate Scholarship, and the Canadian Institutes of Health Research for an MD/PhD Studentship and a Vanier Canada Graduate Scholarship.

Chapter 1

Introduction

Protein misfolding diseases are a cause of much human suffering and mortality. The best known is Alzheimer’s disease, which affects more than 10% of the population over 65. Other misfolding diseases, although less common, are equally devastating. Misfolded prion protein (PrP) is the cause of several rare human conditions, including Creutzfeldt-Jakob disease, Gerstmann-Sträussler-Scheinker syndrome, and fatal familial insomnia. Familial amyotrophic lateral sclerosis has been linked to mutations of the superoxide dismutase 1 (SOD1) protein, leading to a toxic gain-of-function that increases production of cytotoxic free radical species.
These diseases have a common pathogenesis: accumulation of insoluble aggregates arising from cellular protein that was previously natively folded.

1.1 Clinical Aspects of the Prion Diseases

The human prion diseases are a comparatively rare cause of dementia, with an estimated incidence of 1 case per million people per year, although some uncorroborated reports would place the true incidence much higher [1]. Sporadic CJD (SCJD) accounts for 85% of cases, with most of the remainder due to autosomal dominantly-inherited mutations of the PRNP gene on chromosome 20 that codes for human prion protein. Depending on the site of the mutation and polymorphic codon 129 status, the illness may manifest as familial CJD (FCJD), Gerstmann-Sträussler-Scheinker syndrome (GSS), or fatal familial insomnia (FFI). Prion-contaminated dura mater grafts, cadaveric pituitary hormones, corneal transplants and depth electrodes have led to iatrogenic CJD in over 400 people [2], while there have been almost 200 cases to date of new variant CJD (VCJD) from consumption of bovine spongiform encephalopathy prions. VCJD transmission has also been observed through blood and blood products [3], and there is a hypothetical risk of transmission through urine-derived pharmaceuticals [4].

1.1.1 Differential Diagnosis

The differential diagnosis for a patient presenting with signs suggestive of CJD is broad, as it encompasses most causes of dementia. The conventional clinical picture of possible CJD involves a rapidly progressive dementia and two of the following signs: myoclonus, cerebellar and/or visual symptoms, extrapyramidal and/or pyramidal signs, and akinetic mutism. In fact, this description belies considerable heterogeneity in the presentation of the prion diseases. There are many atypical presentations of prion disease that resemble more common dementing conditions.
The Oppenheimer-Brownell variant of CJD, for example, accounts for roughly 10% of cases and manifests exclusively with ataxia and cerebellar deficits [5]. It is particularly important to differentiate CJD from Alzheimer’s Disease (AD), which is often the presumed etiology for symptoms of cognitive decline in the elderly. AD is associated with loss of memory and executive function, but motor deficits are rare. Furthermore, prion disease is generally more rapidly progressive than AD, although patients with GSS may exhibit a more gradual deterioration spanning 2–7 years that resembles Alzheimer’s-type dementia [6]. In at least one family with GSS, the disease mimicked an atypical frontotemporal dementia [7]. Neuropsychiatric symptoms are common in the early stages of VCJD and SCJD and may point toward an incorrect diagnosis; patients with cognitive and affective variants of CJD have a longer delay in receiving diagnostic testing for CJD [5]. Diffuse Lewy body disease, like CJD, is characterized by motor signs but follows the classic Parkinsonian pattern of tremors, rigidity and bradykinesia rather than ataxia and myoclonus. In vascular dementia the changes in cognitive function are usually abrupt rather than progressive, but with multi-infarct dementia (including CADASIL, cerebral autosomal-dominant arteriopathy with subcortical infarcts and leukoencephalopathy) damage from each accumulated lesion may be incremental and produce an apparently continuous decline. Structural abnormalities like normal pressure hydrocephalus may present with the classic triad of ataxia followed by dementia and urinary incontinence, which closely resembles CJD. Imaging studies showing enlarged ventricles out of proportion to cortical atrophy suggest this diagnosis, which can be confirmed by large-volume lumbar puncture. Treatable metabolic possibilities include vitamin B12 deficiency or hypothyroidism and can be identified by routine lab tests.
Progressive multifocal leukoencephalopathy, AIDS dementia complex, and tertiary neurosyphilis may seem similar to CJD and would be suggested by exposure history and HIV and VDRL testing.

New variant CJD (vCJD) is phenotypically distinct from sCJD, with a younger average age of onset (28 vs. 68 for sCJD) and a longer average disease course (14 months vs. 8 months). Psychiatric disturbances including depression or a schizophrenia-like psychosis often precede dementia by several months [8], and an unusual sensory symptom, a feeling of “stickiness” of the skin, has been reported in half of cases. In younger patients with suspected vCJD the differential diagnosis also includes inherited conditions like Huntington’s disease or adult-type neuronal ceroid lipofuscinosis (Kufs’ disease) that can be explored by genetic testing.

Heidenhain variant CJD accounts for approximately 20% of sCJD cases and is characterized by presentation with visual symptoms, including visual field defects, hallucinations, disturbed colour perception, optical anosognosia, and cortical blindness [9]. Clinical signs of limbic system involvement (like aggressiveness) or basal ganglia deterioration (like rigidity, tremor, athetosis, or limb hypertonicity) are seen less often in the Heidenhain variant than in classic CJD. The clinical course is typically more rapid than other CJD variants, with death occurring an average of 6 months after presentation. MRI findings unique to this variant include cortical signal abnormality predominantly affecting the occipital lobes and occipital volume loss, although findings typical to other variants of CJD may also be present. EEG classically demonstrates periodic sharp wave complexes but may reveal intermittent nonspecific changes depending on the stage of disease. Neuropathological changes include neuronal loss and cortical spongiosis which predominantly affects the occipital lobes.
The majority of patients have the M/M type 1 PrP genotype at codon 129 of PRNP [10].

1.1.2 Testing

Given the phenotypic variability of CJD and the risk of mistaking it for other conditions, diagnostic tests are useful to investigate the possibility of CJD. According to a study of 364 patients with suspected CJD [11], a positive 14-3-3 CSF protein test is 95% sensitive and 93% specific for CJD, although caution should be exercised when interpreting this test as virtually any syndrome accompanied by neuronal injury can occasion “false positives.” Bilateral hyperintensity of the striatum on long repetition time MRI images is 67% sensitive and 93% specific. Periodic sharp wave complexes on EEG are 65% sensitive and 93% specific but are more prominent in the late phases of disease; frontal intermittent rhythmical delta activity may be more suggestive of the initial phases of disease [12]. Positive findings on these tests do not exclude other causes of disease but do make them much less likely and elevate a possible diagnosis of CJD based on clinical findings to a probable diagnosis; definite diagnosis requires neuropathological studies, except in the case of subjects satisfying probable criteria of CJD with pathogenic PRNP mutations. Although there are no curative treatments for CJD available, rapid arrival at a correct diagnosis is important to exclude treatable causes of dementia and aid in end-of-life planning in consultation with family members.

1.1.3 Treatment

Therapeutic options for all the human prion diseases are currently limited to palliation, as there are no agents capable of reliably causing a sustained improvement in clinical course. A large and diverse selection of drugs has been tried with limited success [13], including antivirals, antifungals, antibiotics, antimalarials, antidepressants, antioxidants, and analgesics.
A small double-blind, placebo-controlled study of the analgesic flupirtine in 26 sCJD patients demonstrated reduced deterioration in cognitive function, but it did not affect survival. Clinical trials of doxycycline, quinacrine, pentosan polysulfate, and simvastatin are currently ongoing. Other approaches, including active vaccination with prion protein and specific fragments, or passive antibody infusion directed toward the prion protein, have been validated in animal disease models [14] but have not yet reached human trials.

1.2 The Mechanism of Prion Misfolding and Structural Models of PrPSc

Despite extensive investigation, many features of prion protein misfolding remain enigmatic. Physico-chemical variables are known to influence misfolding, so studying these variables helps elucidate the mechanism of prionogenesis and identify salient features of PrPSc, the misfolded conformer of the prion protein. Prospective work on refinement of candidate PrPSc models based on thermodynamic considerations will help to complete atomic-scale structural details missing from experimental studies and may explain the basis for the templating activity of PrPSc in disease.

For over a decade the protein-only hypothesis of prion replication has been the prevailing theory to explain the propagation of the transmissible spongiform encephalopathies. This hypothesis asserts that a pathologic conformational variant of the prion protein, denoted PrPSc, can induce endogenous natively-folded prion protein, PrPC, to misfold and itself become PrPSc by modification of secondary and tertiary structure in a process of template-directed misfolding.
Until recently, only a handful of other proteins, mainly from yeast, had shown a capability like that of PrPSc to catalyse the misfolding of a protein with the same primary sequence as itself, but a growing body of evidence implicates a prion-like misfolding mechanism for many proteins involved in human neurodegenerative diseases, including tau in Alzheimer’s and other tauopathies [15], α-synuclein in Parkinson’s disease [16], and superoxide dismutase 1 in familial amyotrophic lateral sclerosis [17]. This suggests that template-directed misfolding may be a central event in the pathogenesis of many CNS diseases. Unraveling the mystery of PrPSc formation and activity is therefore potentially of large consequence to human health beyond its direct application to the (relatively rare) prion diseases. The following sections identify physical and chemical variables that are known to influence formation of PrPSc and must therefore be considered in a comprehensive theory of prionogenesis.

Figure 1.1: Schematic of agents relevant to the prion misfolding process, including lipid membrane (at bottom), solution counterions (red and blue spheres), the N-terminal unstructured domain (at left), copper ions (brown spheres associated with the octapeptide repeats in the N-terminal domain), glycans, and acidic (red) and basic (blue) sites on the C-terminal structured domain.

1.2.1 Environmental Variables Influencing Prion Misfolding

At least one aspect of the PrPC → PrPSc conversion reaction is understood in detail: the structure of the natively folded PrPC. This structure represents a starting point for misfolding as it is the “substrate” on which PrPSc acts and has the same primary sequence; indeed, it is possible that some elements of the structure of PrPC may not participate in the reorganization that takes place during misfolding and are conserved in PrPSc.
To date, the structure of PrPC has been solved by NMR or X-ray crystallography for more than twenty different species and mutants of PrPC (e.g. [18, 19, 20]), all showing a highly conserved fold: of the full-length sequence comprising residues 23 to 230, the C-terminal residues from 125 to 230 have a globular fold that contains a small antiparallel β-sheet and three α-helices, while the N-terminus does not have a defined secondary structure (see Figure 1.1). As the prion protein departs from the PrPC structure during misfolding, however, its geometry becomes increasingly ambiguous. Studies of the misfolding reaction in a diverse range of conditions have revealed the importance of solution conditions and other biomolecules, each of which provides some insight into the mechanism of PrPSc production:

• pH and Acidity: It has been known for some time that acidic conditions facilitate prion misfolding [21]. The effect of an acidic environment is to change the protonation state of ionizable side chains, increasing the net positive charge on PrP and changing the network of salt bridges that help to hold secondary structural elements of the protein together in the native fold. This may lower the energetic barrier to internal reconfiguration of the protein necessary as part of misfolding.
• Salt concentration: The presence of solution counterions is necessary for efficient conversion to the misfolded form. As the concentration of salts rises, the Debye length for screening of electric fields in solution decreases, so that like charges can be brought into closer proximity while incurring a lesser energetic cost. Since the prion protein carries a large positive charge (+11) at neutral pH, the cost of putting two PrP molecules in spatial contact is high in the absence of screening by solution counterions.
At an ionic strength of 150 mM, the Debye length is approximately 8 Å, which attenuates the repulsion between distant like charges in the protein while nonetheless projecting an electric field into the nearby space. This field may mediate the initial recognition of PrPC by PrPSc and draw the molecules together so that other shorter-range effects like van der Waals forces may augment the interaction. Anionic substances like pentosan polysulfate have shown in vitro efficacy at reducing the rate of misfolding. Interestingly, when recombinant human PrP is denatured with urea in a salt-free environment, the protein merely unfolds; in the presence of urea and even low concentrations of NaCl (50 mM), oligomerization and an increase in β-sheet content are observed [22]. This supports the conclusion that screening of electrostatic interactions between PrP molecules is a prerequisite to conversion.
• Lipid association: PrPC is held to the outer leaflet of the plasma membrane by a glycophosphatidylinositol (GPI) anchor at its C-terminus in the physiologic state, but it has been shown that a novel transmembrane form of PrP can arise in which residues 111 to 134 span the plasma membrane [23], and residues 34 to 94 are able to anchor PrPC to synthetic sphingolipid-cholesterol-rich raft-like liposomes even without the GPI anchor [24]. Interaction with anionic lipid membranes has been shown to induce β-sheet structure and promote aggregation [25]. Lipids have also been found associated with PrPSc in samples purified from infected hamster brain [26]. A general feature of protein amyloid is an increased exposure of hydrophobic groups that are sequestered in the protein interior in the native state, so the presence of lipids in PrPSc may help to lessen the energetic unfavourability of surface hydrophobic groups.
It has been hypothesized that the thermostability of PrPSc is due to the strength of protein/lipid associations in the aggregate [27], but lipids (mainly sphingomyelin, α-hydroxy-cerebroside, and cholesterol) are believed to comprise only around 1% of the mass of purified prions [28]; this would entail approximately one lipid molecule for every one or two PrP monomers. Nonetheless, lipids may serve a specific physical role in the conversion process, perhaps by shielding exposed hydrophobic regions of PrP. No current PrPSc models or conversion mechanisms explicitly consider the contribution of lipids, but their potential role in stabilizing regions of PrPSc or one of its intermediates is important to consider.
• Metal ions: PrPC contains 4 octapeptide repeats from residues 56 to 87 that have an affinity for copper. This region lies outside the proteinase K-resistant core of PrPSc but may nonetheless participate in its toxicity through free radical production, loss of a physiologic copper-binding function, or oxidation of side chains in PrP [29]. An NMR structure of the octapeptide repeat from residues 61 to 84 bound to pentosan polysulfate has recently been released showing a series of loosely defined loops with hydrophobic exposed tryptophans [30]. Interestingly, copper can also influence the protease resistance of PrPSc molecules, and PrPSc formed in a copper-free environment is 20 times more susceptible to proteinase K digestion; addition of copper restores normal protease resistance [31].
• Glycans: There are N-linked glycans attached to asparagines 181 and 197 of PrPC, each with a molecular mass of approximately 5 kDa. During conversion glycans may present steric restrictions to the approach of PrPSc, as it is a bulky multimolecular aggregate, and may also reduce or eliminate certain structural transitions; however, unglycosylated, monoglycosylated, and diglycosylated chains are detected in approximately equal quantity in PrPSc.
Other sugar molecules separate from the glycans are found in scrapie aggregates and appear to contribute to the resilience of prion rods [26]. Thermodynamically, the presence of the sugars enhances the stability of protein regions near their attachment site by approximately 1 kcal/mol, primarily by reducing accessible states and thus the configurational entropy of the unfolded chain, but this may or may not be a significant effect of glycosylation. MD simulation of PrPC with chitobiose glycosylation demonstrated a minimal effect on the protein’s overall structure, but the absence of glycosylation did result in an observed increase in β-content [32]. Removal of PrP glycosylation by the mutations N181Q and N197Q in a human astrocytoma cell line resulted in an increased rate of apoptosis [33], but this is believed to be the result of a Bcl2-associated mechanism rather than protein misfolding. The balance of current evidence is that glycans are not a central actor in misfolding but are useful in constraining models of PrPSc, which must accommodate the large volume of the glycans, presumably by projecting them into solution.
• Protein-protein interactions: It has been hypothesized that another protein different from PrPC or PrPSc, a so-called “Protein X”, may facilitate unfolding of PrPC and its subsequent conversion to PrPSc. Various roles have been ascribed to Protein X, including physiologic endoproteolytic cleavage of PrPC [34] or binding to a discontinuous epitope on the surface of PrPC [35]. However, given the absence of successful purification of Protein X and the increasing success in generating infectious prions from purified PrPC [36], it seems unlikely that an accessory protein is a necessary contributor to the misfolding process. The sufficiency of PrPC, PrPSc, non-protein biomolecules, and solvation conditions to account for prion misfolding simplifies substantially the considerations in establishing the mechanism of prionogenesis.
• Interactions between the N- and C-terminal domains: Since only residues 125 to 230 adopt a folded conformation in PrPC and residues 90 to 230 participate in the protease-resistant core of PrPSc, most attention has focused on structural transitions involving the C-terminal part of the protein. However, the N-terminal unstructured domain is capable of exerting a subtle but detectable influence on the C-terminal domain: deletion of residues 34 to 99 resulted in formation of protease-resistant PrP in an in vitro cell-free conversion assay with accessible protease cleavage sites at residues 130 and 157; similarly truncated PrP was also observed in brain homogenates from mice exposed to mouse-passaged hamster scrapie, raising the possibility that the N-terminal domain of PrP is involved in the barrier to prion disease transmission between species [37]. Even in the native state, antibody binding to an epitope near the N-terminus can promote loosening of the C-terminal structured domain [38], supporting the notion of a significant if inconstant association between the N- and C-terminal domains of PrPC that may modulate misfolding.

1.2.2 Known Facts about the Infectious Species of PrPSc

Knowledge of the structure of PrPSc is expected to reveal the origin of its remarkable ability to induce template-directed misfolding. A variety of chemical, physical, and biological techniques detailed below have provided low-resolution structural data, but as demonstrated by the existence of different prion strains with distinct neuropathological phenotypes in animal models of disease, there appears to be significant heterogeneity in the structure of PrPSc that renders it difficult to precisely characterize.
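As an aside on the salt-concentration argument in Section 1.2.1, the Debye length quoted there can be checked with a short calculation. The following is a minimal sketch, assuming the standard Debye-Hückel screening length for an aqueous electrolyte, λ_D = sqrt(ε_r ε_0 k_B T / (2 N_A e² I)), where I is the ionic strength. The function name, the default temperature (298.15 K), and the relative permittivity of bulk water (78.4) are illustrative assumptions and are not values taken from this thesis.

```python
import math

# Physical constants in SI units.
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_A = 6.02214076e23           # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19    # elementary charge, C


def debye_length_angstrom(ionic_strength_molar,
                          temperature_k=298.15,
                          rel_permittivity=78.4):
    """Debye screening length in angstroms (hypothetical helper).

    ionic_strength_molar: ionic strength in mol/L.
    temperature_k and rel_permittivity are assumed defaults for
    bulk water near room temperature.
    """
    # Convert mol/L to mol/m^3 so that all quantities are in SI units.
    ionic_strength_si = ionic_strength_molar * 1000.0
    numerator = rel_permittivity * EPSILON_0 * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    # Result is in metres; convert to angstroms.
    return math.sqrt(numerator / denominator) * 1e10


if __name__ == "__main__":
    for molar in (0.050, 0.150):
        length = debye_length_angstrom(molar)
        print(f"I = {molar * 1000:.0f} mM: Debye length = {length:.1f} angstrom")
```

Under these assumed defaults, an ionic strength of 150 mM gives a screening length just under 8 Å, consistent with the value quoted in Section 1.2.1, and the length increases as the salt concentration falls because it scales as the inverse square root of the ionic strength.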
• Secondary structure: It has been known for almost twenty years from circular dichroism (CD) and Fourier transform infrared spectroscopy (FTIR) that PrP misfolding is associated with an increase in β-sheet content and a decrease in α-helical content [39]: PrPC contains approximately 3% β-sheet and 43% α-helix, while PrPSc contains 34% β-sheet and 20% α-helix. This is a useful if nonspecific constraint to apply in the determination of the PrPSc structure. More recently, unique FTIR spectroscopic properties have been demonstrated for several strains of TSE agents [40, 41], which suggests that the misfolded PrPSc may be better represented by an ensemble of varying geometries determined by sequence and conversion conditions rather than a single unifying structure.
• Proteinase K sensitivity: Resistance to proteinase K (PK) digestion is the biochemical sine qua non of misfolded prion protein. PrPSc is known to be resistant to PK attack from residues 90 to 230 [26], identifying the subsequence involved in misfolding. Sajnani et al. [42] have identified by mass spectrometry other PK cleavage sites within PrPSc isolated from hamster brains at residues 117, 119, 135, 139, 142, and 154; these sites presumably represent regions of higher flexibility that are sufficiently exposed (at least in a proportion of PrPSc molecules) to enable attack by PK. Similarly, Zou et al. [43] have determined other PK cleavage sites in PrPSc at residues 146, 153, 156, 162, and 167. Models of PrPSc could account for these internal PK cleavage sites by placing these sites in accessible non-α, non-β conformations.
• Epitope exposure: Antibodies specific to PrPSc have been prepared that recognise epitopes formed by the two strands of the PrPC β-sheet in unstructured conformations [44, 38].
This identifies β-sheet dissociation as a necessary prerequisite to misfolding and indicates that exposure of the two segments comprising the native β-sheet persists in PrPSc. An X-ray crystal structure of another antibody capable of recognising both PrPC and PrPSc was determined in complex with three mutants of ovine PrP [45], and the site of antibody binding was found to be the last two turns of α-helix 2. This is evidence for the retention of α-helical secondary structure at this site in PrPSc. Another antibody to α-helix 1 in the native conformation has shown in vivo efficacy at inhibiting conversion [46]; this is one of the sites of PrPSc-PrPC interaction [47], so it is possible that this antibody functions by inhibiting recruitment of PrPC to PrPSc.
• Pathologic mutations: Over 30 different point mutations of PrP are associated with familial prion disease [48]; pathologic sites of mutation are spread from codon 102 to 238, with no apparent sequence clustering to implicate a particular region as a hotspot for conversion. There is variation in the presentation of disease associated with PrP mutants, with different mutations associated with each of the three hereditary human prion diseases; however, correlation of this information to mutational influences on molecular mechanisms of conversion is not yet possible. Several molecular dynamics studies of prion mutants have been undertaken [49], generally revealing reduced stability and increased flexibility compared to wild-type PrP.
• Infectivity of different aggregate sizes: Silveira et al. [50] established that the most infectious prion species are oligomers comprising 14 to 28 monomers, while large fibrils containing hundreds or thousands of monomers exhibit lower infectivity and isolated monomers, dimers, and trimers appear to have almost no infectivity. This is significant for the purpose of establishing the quaternary organization of PrPSc, as the assembled structure must apparently contain a “magic number” of monomers with the right combination of a viable templating interface and perhaps sufficient flexibility to facilitate large-scale motions in PrPC during conversion.

Figure 1.2: Three current models of PrPSc: extension of the native β-sheet [51] (one monomer of residues 90-230 shown), a parallel in-register β-sheet [52] (five monomers of residues 160-220 shown), and a β-helix [53] (one monomer of residues 90-230 shown). The structures presented here comprise a small part of the total misfolded aggregate. Note the large qualitative differences between these models.

1.2.3 Current Structural Models of PrPSc

Speculation on the structure of PrPSc has been an active area of interest for over a decade, but the heterogeneity and insolubility of PrPSc preclude study by the highest-resolution techniques like nuclear magnetic resonance or X-ray crystallography. An assortment of lower-resolution data has provided useful if incomplete descriptions of certain structural features of the PrPSc molecule.

• Extension of the native beta sheet: The first model of PrPSc proposed by F.E. Cohen [54] envisioned PrPSc forming from PrPC through an expansion of the small native beta sheet to involve adjacent residues. A similar observation was reported by DeMarco [51] based on all-atom molecular dynamics simulations of the prion protein at low pH, in which additional residues from the unstructured N-terminal domain are recruited to form additional strands in the beta sheet, while other residues in the folded C-terminal domain acquire β-type backbone angles. Despite the eminent simplicity of these models they are not in accord with immunologic data showing exposure of the residues comprising the native beta sheet in PrPSc [44].
• A parallel in-register beta sheet: Using electron paramagnetic resonance and spin-labelling techniques to study recombinant PrP amyloid, Cobb et al.
[52] have determined that PrP molecules associate into a stacked β-sheet in which residues 160 to 220 in the primary sequence are in close proximity to the same residues in adjacent chains. This implies the fibril exists as a stacked parallel in-register beta sheet and represents the most detailed structural characterization of a misfolded PrP structure to date. As the fibrils studied were generated solely by guanidinium chloride denaturation without exposure to brain-derived PrPSc, it is not known if this structure in fact corresponds to that of endogenous PrPSc. Hydrogen-deuterium exchange studies of aggregate structure in spontaneously-formed recombinant hamster PrP(90-231) fibrils and those produced in the presence of an inoculum of brain-derived PrPSc show that the solvent-protected region of the spontaneous fibrils extends from residues 160 to 220 (in accord with the EPR structure described above), while a subpopulation of the PrPSc-seeded fibrils shows solvent protection extending more N-terminally to cover the residues from 117 to 133 and from 145 to 220 [55]. The peptide PrP(106-126), which is capable of inducing neuronal apoptosis and appears to mediate the PrPC → PrPSc conversion, has also been shown by solid state NMR to form a parallel in-register β-sheet from residues 113 to 123 with antiparallel intersheet packing in fibrils [56]. Caution in generalizing these results to PrPSc is warranted, as the surrounding sequence in full-length PrP may exert effects not captured in this peptide study. It is apparent that large segments of the PrPC primary sequence have a high intrinsic propensity to form aggregates with a parallel in-register beta sheet structure, but there is not yet evidence that these intriguing in vitro findings recapitulate processes in animal prion disease.
• A stacked beta helix: Downing and Lazo [57] originally proposed a β-helix for PrPSc based on molecular modelling, and others [58, 53, 59] have elaborated this hypothesis with electron microscopy and fibre diffraction studies of infectious mammalian prions. These studies suggest that PrPSc adopts a left-handed β-helix, in which α-helices 2 and 3 retain the same conformation as in PrPC while residues 90 to 160 refold to form a four-layered triangular helix with parts of the sequence protruding as loops at turns between helical strands (see Figure 1.2). Electron microscopy studies of PrPSc have been based on 2-dimensional (2D) crystals observed to be associated with purified scrapie samples. Heavy metal cation staining of the 2D crystals showed strong binding to the central region of individual oligomers, indicating the presence of a strong negative potential at this site. Recent fibre diffraction data from natural and synthetic prion fibrils [60] reveal several features of PrPSc that are useful in constraining potential models: the presence of a 4.8 Å diffraction band in these studies strongly supports the existence of cross-β structure in the aggregate, while a 19.2 Å repeat indicates the presence of four β-strands composing at least part of the fibril. Significant differences were noted between brain-derived PrPSc and recombinant misfolded PrP. Theoretical simulations of the kinetics of prion misfolding have been undertaken based on the proposed β-helix structure [61] and proposed a hexamer as the minimum infectious unit.
• Other related structures: The structure of the yeast prion HET-s was solved by hydrogen exchange and solid state NMR to reveal a stacked beta sheet with organization suggestive of a β-helix [62]. Each monomer contributes four β-strands to the fibril. Even if this structure is not directly relevant to PrPSc, it is informative to observe the features of other proteins capable of template-directed refolding.
Structures based on work for other amyloidogenic proteins, such as a water-filled nanotube for polyglutamine amyloid [63], may provide inspiration for PrPSc.

The lack of agreement on even basic features of the PrPSc fold may derive in part from an inherent heterogeneity in misfolding: whereas there is one native structure for PrPC, there may be multiple misfolded structures each arising from a local energetic minimum in the conformation space of PrP. Environmental conditions, including the presence of PrPSc seeds of a particular subtype, and variation in the PrP primary sequence may steer PrPC molecules into one of a range of misfolded geometries. Since each study above employed a different methodology for generating PrPSc, it is possible that each one accessed a different stable misfolded conformer, some or all of which may be relevant in disease. This is supported by the well-known existence of strains in prion disease with distinct neuropathological features. In this sense the search for the PrPSc structure, while not futile, may yield a plurality of valid possibilities rather than a single definitive answer.

To date the experimentally-derived PrPSc models have not been subject to a theoretical analysis of their thermodynamic stability, which aside from confirming their plausibility may help to refine important details like side chain configurations that are beyond the precision of the experimental evidence. Computational tools to separately calculate the enthalpic and entropic contributions to the free energy of a protein structure are developed in Chapter 3 of this thesis and applied to describe local unfolding of PrPC and thereby predict features of the PrPSc formation mechanism. A complete picture of PrPSc structure and function remains a distant goal, but a combination of empirical and theoretical techniques offers the best hope of unravelling this central mystery of neurodegenerative disease.

1.3 Generalization of the Prion Hypothesis to Other Neurodegenerative Diseases

Protein misfolding diseases have been classically understood as diffuse errors in protein folding, with misfolded protein arising autonomously throughout a tissue due to a pathologic stressor. The field of prion science has provided an alternative mechanism whereby a seed of pathologically misfolded protein, arising exogenously or through a rare endogenous structural fluctuation, provides a template to catalyse misfolding of the native protein. The misfolded protein may then spread intercellularly to communicate the misfold to adjacent areas and ultimately infect a whole tissue. Mounting evidence implicates a prion-like process in the propagation of several neurodegenerative diseases, including Alzheimer’s, Parkinson’s, Huntington’s, amyotrophic lateral sclerosis, and the tauopathies. However, the parallels between the events observed in these conditions and those in prion disease are often incomplete.

The discovery of prions 30 years ago [64, 65] as agents capable of transmitting scrapie of sheep and kuru of humans between organisms without the requirement for nucleic acid was a radical development in the field of infectious disease. For the first time, it was shown that exogenous misfolded protein from the environment may recruit and induce misfolding of host endogenous protein. Since that time, the uniqueness of misfolded mammalian prion protein as the sole example of a transmissible protein conformation has been challenged. The study of yeast [66] and fungal prions [67] in the 1990s revealed the same inter-organismal passage of protein conformation, this time as an adaptive tool to convey epigenetic information within a population.
More recently, increasingly detailed elucidation of the molecular details of human neurodegeneration has indicated that, at least within a single organism, several other proteins in a multiplicity of diseases may obey a prion-like propagation mechanism:

• Amyloid-β (Aβ) in Alzheimer’s Disease (AD)
• Tau protein in AD, frontotemporal dementia (FTD), and other dementias
• Superoxide dismutase 1 (SOD1), Tar-DNA Binding Protein 43 (TDP-43) and FUsed in Sarcoma (FUS) in amyotrophic lateral sclerosis (ALS)
• α-Synuclein in Parkinson’s Disease (PD)
• Huntingtin in Huntington’s Disease (HD)
• Disrupted in Schizophrenia 1 (DISC1) in schizophrenia

The level of evidence supporting the prion hypothesis in these conditions varies considerably in quality and amount. Excellent biological reviews on this topic [68, 69, 70] have appeared recently, but three aspects merit particular attention: a framework for understanding the general propensity of proteins in the CNS to seed their own aggregation, possible mechanisms of intercellular spread of misfolded protein, and therapeutic strategies to arrest the misfolding cascade. As a starting point, the essential features of the prion hypothesis will be discussed and the concordance and discordance between the hypothesis and the data for each protein will be considered.

1.3.1 Defining the Prion Principle

With the labels “prion”, “prion-like”, or “prionoid” being used with increasing casualness in describing protein misfolding whenever it is seen to spread from cell to cell, it is important to define what is meant by the term “prion” and specify the attributes that a misfolded protein must have to qualify.
A hierarchy of characteristic features has developed in the study of prion disease that is helpful in understanding other disorders:

• Organism level: Early epidemiologic studies of kuru, an infectious dementia due to ritual cannibalism among the Fore people of Papua New Guinea [71], scrapie in sheep [72], and bovine spongiform encephalopathy in cattle [73] established the ability of these degenerative conditions to spread within a population. Misfolded protein was subsequently revealed as the causative agent. The highest standard that could be demanded of a condition to call it a prion disease is transmission of misfolded protein from host organism to recipient organism through exposures in the natural environment that subsequently causes disease in the recipient. With the exception of AA amyloidosis among cheetahs as discussed below, inter-organismal spread has not yet been observed for non-PrP protein misfolding diseases.
• Tissue level: Neuropathological studies [74] show a characteristic spread of misfolded protein through the central nervous system (and in some cases peripheral lymphatic tissue). This is related to the prion strain concept [75], where misfolded infectious PrP from different sources and preparation methods exhibits different regional patterns of involvement in the CNS. Interestingly, many other neurodegenerative diseases show a contiguous spread from one initial locus outward along known neuroanatomical pathways. Functional imaging studies have shown that degeneration during AD and PD follows regular avenues of neuronal connectivity [76, 77, 78]. Likewise, in ALS, spatiotemporal propagation through the neuroaxis and the linked degeneration of upper and lower motor neurons is well documented [79, 80].
• Intercellular level: Spread of neurodegeneration through contiguous tracts of tissue in these diseases implicates cell-to-cell propagation of misfolding and suggests a pathogenic connection between affected cells and their neighbours. A misfolded protein that cannot escape the local environment in which it formed has no way to effectively communicate the misfold to other regions, so the protein must be trafficked (or simply diffuse) in a way that allows it to spread intercellularly. Connected to the intercellular-level description of prion disease is the deposition of protease-resistant amyloid aggregates that have long been viewed as the sine qua non of prion diseases [81]. However, the dogma that these aggregates are the toxic entities has recently been upset by evidence pointing to soluble oligomers in prion disease [50], AD [82], and PD [83] as more harmful and the large-scale aggregates as the inert or even protective end-stage of protein deposition.
• Molecular level: Arguably the greatest controversy exists over the nanometer-scale features of the infectious protein agent and how it facilitates the misfolding of native protein, as discussed in Section 1.2. Related to the molecular-level description of prion disease is the “protein-only” hypothesis, which asserts that misfolded protein alone is necessary and sufficient to catalyse misfolding of native protein, without the requirement for any co-factors. Although early attempts at generating infectious misfolded PrP in vitro used crude brain homogenates as the substrate, more recent work has successfully generated infectious protein in a reductionist system using recombinant protein [84].

It is readily possible for proteins in other diseases to satisfy some but not all of the attributes above, thereby falling within the penumbra of the prion concept but not fully equalling it.
In the following sections (and summarized in Table 1.1), the weight of evidence for and against a prion-like mechanism of disease in other neurodegenerative conditions is discussed.

1.3.2 Amyloid-β

Similarities between the behaviour of Aβ during Alzheimer’s disease and infectious prions have led to a rich discussion of the prion-like nature of Aβ [74, 85, 86]. As with prions, it is clear that Aβ oligomers are neurotoxic [87, 86]. Furthermore, brain homogenates containing Aβ from human AD, or mouse model AD, are capable of seeding AD pathology in naïve mouse models of AD [88, 89]. Indeed, the data suggest that the templating agents of AD in “infected” mouse models are structured but soluble species, consistent with small Aβ oligomers as opposed to insoluble fibrils or plaques [88, 89, 90]. These experiments are demonstrations of a limited interpretation of the amyloid cascade hypothesis [91] whereby seeded nucleation-polymerization of Aβ monomers into oligomers and (debatably) plaques accounts for AD pathology. It is sensible that a soluble seeding species would have the greatest potential to diffuse in the CNS interstitium to propagate misfolding in susceptible brain regions. However, in contrast to prions, Aβ has no single “native” ordered configuration; it apparently oscillates between an unstructured state, an α-helical state, and a primarily β-sheet conformation [92, 93]. Conversely, β-sheet structure dominates in Aβ oligomers, fibrils, and plaques. The formation of Aβ aggregates may follow a nucleation-polymerization mechanism; alternatively, a particular, unknown conformation of Aβ (likely with a high β-sheet content itself) can engage in template-directed folding of less-structured Aβ monomers to catalyze their conformational change into a β-sheet dominated structure leading to oligomer and plaque formation.
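The difference between unseeded and seeded nucleation-polymerization can be illustrated with a toy kinetic model. The sketch below is not taken from this thesis: it assumes a minimal nucleation-elongation scheme in which new aggregates nucleate at a rate proportional to a power of the free monomer concentration and existing aggregates grow by monomer addition. All rate constants, concentrations, and names (such as `half_time`) are hypothetical and in arbitrary units, chosen only to make the qualitative behaviour visible.

```python
def half_time(seed_conc, m_total=1.0, k_n=1e-4, n_c=2, k_plus=1.0,
              dt=0.01, t_max=1000.0):
    """Time at which half of the monomer has been incorporated into aggregates.

    Toy nucleation-elongation model (all parameters are illustrative
    assumptions in arbitrary units, not fitted to any Abeta data):
        dP/dt = k_n * m**n_c      # primary nucleation creates new aggregates
        dM/dt = k_plus * m * P    # elongation consumes free monomer
    where m = m_total - M is the free monomer concentration.
    """
    P = seed_conc   # number concentration of aggregates (preformed seeds)
    M = 0.0         # monomer mass already incorporated into aggregates
    t = 0.0
    while t < t_max:
        m = m_total - M
        if M >= 0.5 * m_total:
            return t
        # Forward-Euler step using the current state
        dP = k_n * m ** n_c * dt
        dM = k_plus * m * P * dt
        P += dP
        M += dM
        t += dt
    return None  # half-conversion not reached within t_max


unseeded = half_time(seed_conc=0.0)
seeded = half_time(seed_conc=0.05)
print(f"half-time without seed: {unseeded:.1f}")
print(f"half-time with seed:    {seeded:.1f}")
```

Under these assumed parameters the seeded run reaches half-conversion much sooner than the unseeded one, because elongation no longer has to wait for rare nucleation events. The model ignores fragmentation and secondary nucleation, so it should be read only as a qualitative picture of seeding.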

1.3.3 Tau Protein

Tau is a microtubule-associated protein shown to aggregate in over 20 different dementias, including AD and frontotemporal dementia, making it the most commonly misfolded protein in human neurodegenerative disease. Hyperphosphorylation and filament formation are observed in pathological states; the morphology of filaments exhibits considerable heterogeneity. Six tau isoforms are expressed in the adult brain from alternative mRNA splicing. The aggregation propensity of tau at a molecular level has been studied in detail and was shown to depend on a small protein subsequence, residues 306-311 [94]. Extracellular tau aggregates, but not monomers, are taken up by cultured cells; these internalized aggregates are capable of inducing aggregation of intracellular endogenous tau, a finding replicated in vitro by seeding of recombinant tau monomers [95]. Significantly, the internalized tau was found to localize with dextran, implying an endocytic method of entry rather than direct penetration of the membrane. Migration of tau misfolding out from a nidus in the transentorhinal cortex is a well-known feature of AD [96], satisfying the tissue-level features of the prion phenomenon. At an organismal level, injection of brain homogenates from transgenic mice expressing mutant P301S tau into other transgenic mice expressing wild-type human tau caused misfolding and filamentation of the wild-type tau [15]. Although the study claimed to observe spreading of pathology from the injection site, simple extracellular diffusion of the injected material rather than physiological intercellular transmission cannot be excluded. Interestingly, molecular crowding agents appear to favour the misfolding of tau, especially when it is hyperphosphorylated [97].

1.3.4 TDP-43 and FUS

TAR DNA-binding protein 43 (TDP-43), a 414 amino acid predominantly nuclear protein encoded by the TARDBP gene, is a major constituent of neuronal and glial cytoplasmic inclusions in sporadic ALS (SALS) patients [98, 99, 100]. The functions of TDP-43 are not yet fully understood, but it is known to be involved in gene regulation, mRNA splicing and localization [101]. In addition to its cleavage into cytotoxic 25- and 35-kDa C-terminal fragments [102], ALS-associated mutants of TDP-43 are hyperphosphorylated, ubiquitinated and prone to aggregation [103], likely due to the C-terminal protein interaction domain. TARDBP mutations affecting the C-terminus of the protein are associated with familial ALS (FALS) and an ALS phenotype in transgenic mice [104, 105, 106, 107, 102, 108]. In 2009 another protein functionally related to TDP-43, FUsed in Sarcoma/Translated in LipoSarcoma (FUS/TLS), was found to be associated with ALS [109, 110]. FUS/TLS is primarily localized in the nucleus but shuttles between nucleus and cytoplasm via its nuclear export signal and a non-classical C-terminal nuclear localization signal [111, 112, 113]. Although the functions of FUS/TLS remain to be characterized, its structural similarity to TDP-43 suggests its involvement in transcription regulation, RNA and microRNA processing, DNA repair, and regulation of neuronal spine morphology [114, 115].

To identify prion-like proteins in yeast, Alberti et al. developed a hidden Markov model-based approach using experimentally known yeast prions [116]. Screening the entire database of known human proteins using this hidden Markov model algorithm, FUS/TLS and TDP-43 rank 15th and 65th, respectively, as predicted prions [117]. In fact, FUS/TLS and TDP-43 rank higher as potential prions than the yeast prions used to establish the algorithm. Cushman et al.
[117] used FoldIndex, a bioinformatics tool that predicts intrinsic unfolding of specific amino acid sequences [118], to determine the regions in FUS/TLS and TDP-43 that are prone to misfolding. The C-terminal region in TDP-43, specifically residues 277-414, was determined to be most susceptible to misfolding. This terminus includes the glycine-rich domain proposed to be involved in protein-protein interactions [119]; the vast majority of mutations identified to date in TDP-43 fall within the glycine-rich domain. These mutations may represent an additional destabilizing stress on a region that is already misfolding-prone: in vitro, TDP-43 spontaneously forms aggregates mediated by its C-terminus that are ultrastructurally similar to TDP-43 deposits in degenerating neurons of ALS [101]. In FUS/TLS, the N-terminal region that contains both glycine- and Q/G/S/Y-rich domains, specifically amino acids 1-239, was identified as the misfolding-prone region.

1.3.5 Superoxide Dismutase 1

Mutations in the ubiquitously-expressed gene encoding the free radical defense enzyme superoxide dismutase 1 (SOD1) have been implicated in a proportion of familial amyotrophic lateral sclerosis (FALS) cases; over 150 different disease-causing SOD1 mutations have been identified to date [120, 121, 122]. These mutations induce misfolding of the protein and its subsequent aggregation. Formation of these aggregates occurs through a mechanism by which the normally-stable native SOD1 homodimer is disrupted, producing misfolded monomers that are incorporated into higher order oligomers [123, 124]. Evidence is accumulating that all types of ALS, including non-SOD1-linked familial and sporadic cases, are associated with SOD1 misfolding, oxidation, and aggregation [125, 126].
Neural deposits of aggregated misfolded SOD1 have been detected in FALS and SALS [127, 128], and biochemical, genetic, and immunological evidence implicating SOD1 in non-SOD1-linked SALS [129, 130, 131, 132] also supports this notion. Natively folded, functional SOD1 scavenges destructive superoxide radicals from the cytosol, converting them into less toxic hydrogen peroxide [133]. However, misfolded SOD1 is capable of reacting non-specifically with a variety of substrates to become a net producer of reactive oxygen and nitrogen species [134]. These toxic agents damage important intracellular structures, including microtubules, metabolic enzymes, and signalling proteins.

An important clinical feature of ALS is its spatiotemporal propagation through the neuroaxis; the initial clinical presentation is usually focal, with expansion of affected muscle groups in a fashion suggesting contiguous spread through anatomic regions of the nervous system [79]. The outward spreading of pathology in ALS from an originating focus has been well documented in cross-sectional and longitudinal studies [135, 136]. Several theories have been proposed to account for this phenomenon, including diffusion of paracrine substances like cytokines through the extracellular environment, sequential activation of microglia, and axonal transport of a deleterious agent. Additionally, astrocytes expressing mutant SOD1 have been observed to secrete a toxic factor selectively injurious to motor neurons [137].

An alternative hypothesis to account for contiguous spread of motor neuron deficits in ALS is a prion-like mechanism akin to either template-directed misfolding (TDM) or seeded polymerization. In TDM, protein that has adopted an aberrant non-native conformation is capable of binding to natively-folded protein molecules and inducing a structural reconfiguration that causes them to adopt the same aberrant conformation.
In seeded polymerization, the misfolded isoform rapidly adds monomer components to form an amyloid-like fibril [138, 54]. Furthermore, misfolded SOD1 has been previously shown to be efficiently exported and imported by cells [139, 140, 141, 142, 143], supporting the notion that misfolded molecules of SOD1 can pass between cells in a manner reminiscent of prion infectivity, although the precise molecular mechanism of this process has yet to be determined.

Intermolecular conversion of wtSOD1 to a pathogenic conformation has been implicated in a number of studies in vivo [144, 145, 146, 147]. Co-expression of human wtSOD1 with mutant SOD1 variants often exacerbates disease phenotype, suggesting wtSOD1 participates in disease progression through molecular association with mutant SOD1. More direct evidence of intermolecular conversion of wtSOD1 to a misfolded isoform has been observed in human cell lines, as discussed in [17] and Chapter 5 of this thesis. Cytosolic expression of misfolded SOD1 mutants G127X and G85R can confer a misfolded conformation on wtSOD1, as revealed by exposure of natively inaccessible peptide epitopes, markedly enhanced protease sensitivity consistent with structural loosening, and a novel pro-oxidant gain of function acquired by nascently misfolded wtSOD1. Conformational conversion of wtSOD1 by copper-deficient G127X and G85R SOD1 mutants is accompanied by generation of ROS, as revealed by SOD1 non-native interchain disulfide bonds, C-terminal oxidation at Cys146, and carbonyl modification of other non-SOD1 proteins. Previous studies have recognized SOD1 non-native disulfide bond formation, but have posited that this reaction may take place in the oxidizing environment of mitochondria [145].
A more intriguing parallel between induced SOD1 misfolding and the prion diseases is the presence of sequence/structural requirements for conversion to take place, commonly described in prion disease circles as the species barrier. In transgenic mice expressing human SOD1 mutants, the presence or absence of murine endogenous SOD1 has minimal impact on clinical disease, and murine SOD1 is not incorporated in mutant human SOD1 aggregates [145, 147, 148]. By contrast, human wtSOD1 expression can dramatically accelerate clinical disease in transgenic mice expressing a range of human SOD1 mutants, and is associated with incorporation of human wtSOD1 in aggregates [144, 145, 146, 147]. Conversely, over-expression of human wtSOD1 in a mouse carrying an endogenous SOD1 mutation does not accelerate disease onset [149], supporting the notion of a species barrier for misfolded SOD1 propagation.

In a recent study, Münch et al. [150] observed aggregates of mutant SOD1 entering cells by macropinocytosis, then nucleating aggregation of soluble, cytosolic mutant SOD1 protein. This reaction was found to be self-perpetuating, with efficient passage from cell to cell by extracellular diffusion of aggregates. While this study clearly supports a prion-like mechanism in SOD1 misfolding, it is silent on the question of wtSOD1 involvement in the process and does not attempt to disentangle the molecular details of native SOD1 conformational conversion by SOD1 aggregates.

1.3.6 Huntingtin

Huntington’s Disease (HD) is an autosomal dominant genetic degenerative condition belonging to the category of triplet expansion diseases, which are caused by an abnormally long sequence of trinucleotide repeats in the affected gene. In the case of HD, the huntingtin gene on chromosome 4 contains an expanded number of CAG triplets, causing the huntingtin protein to contain an unusually long stretch of polyglutamine. This length of polyglutamine is aggregation-prone past a critical length of 35-39 residues, and the longer the polyglutamine tract present, the earlier the age of disease onset (usually middle age). Extended polyglutamine tracts in recombinant proteins have been observed to aggregate spontaneously and form amyloid fibrils ultrastructurally similar to Aβ [151]. As a disease requiring a rare genetic susceptibility in affected individuals, there is essentially no possibility of inter-organismal spread, but there is evidence for an intercellular seeding mechanism whereby an aggregate arising de novo in one cell may pass to adjacent cells and catalyze aggregation of polyglutamine-containing protein. Fibrillar polyglutamine peptide aggregates can be internalized by mammalian cells in culture, where they gain access to the cytosolic compartment and become co-sequestered in aggresomes together with components of the ubiquitin-proteasome system and cytoplasmic chaperones [152]. At a tissue level, striatal projection neurons are affected at an early stage in HD, and imaging studies demonstrate shrinking of the basal ganglia before symptoms involving involuntary movements emerge. Additionally, recent brain imaging studies show that various cortical regions involved in motor, sensory and visual functions already undergo thinning in asymptomatic CAG-expanded huntingtin gene carriers [153].

1.3.7 α-Synuclein

Parkinson’s disease (PD) is the most common movement disorder and the second most common neurodegenerative disorder behind AD. The median age of onset is 55 years, and the primary symptoms are bradykinesia, resting tremor, rigidity and eventually possible cognitive involvement [154]. Akin to motor neuron specificity in ALS, the dopaminergic neurons of the substantia nigra are most susceptible in PD [155]. α-Synuclein is an abundant neuronal protein of unknown function that aggregates in PD to form Lewy bodies (LBs) [155, 156]. In PD, neuronal α-synuclein aggregation initiates in the lower brainstem nuclei and spreads to the midbrain, mesocortical and neocortical regions [157]. This progression of neurodegeneration through the neuroaxis is suggestive of the tissue-level attributes of a prion.

α-Synuclein is a relatively small intrinsically unstructured protein (140 amino acids) with a conserved amphipathic N-terminus and an acidic C-terminus. The central region of the protein tends to self-associate to form an amyloidogenic intermediate, and subsequently contributes to aggregation. Oligomers of α-synuclein are formed by self-assembly of partially folded intermediates (including non-native dimers), which, interestingly, are stabilized by dopamine [158]. Intrastriatal grafts of human fetal mesencephalic dopaminergic neurons have been attempted as a possible therapy for PD, but post-mortem analysis of transplant recipient brain tissue revealed that the grafted neurons were positive for LBs containing α-synuclein and ubiquitin [159, 160]. It is speculated that α-synuclein aggregation and deposition observed in the transplanted dopaminergic neurons was triggered by misfolded α-synuclein in the host, which was transmitted into the graft cells; nonspecific environmental effects are another possibility. This potentially indicates a prion-like mechanism of cell-to-cell transmission and propagation of disease-related LBs.

1.3.8 Disrupted in Schizophrenia 1

Schizophrenia and other psychiatric illnesses arise from the complex interplay of genetic, environmental, and social risk factors, but recently a gene coding for a protein called Disrupted in Schizophrenia 1 (DISC1) was identified at the breakpoint of a balanced t(1;11) translocation in a Scottish family with an unusually high burden of psychiatric illness [161].
This protein appears to have multiple roles, including a scaffold for protein-protein interactions, neurodevelopment, cytoskeleton function and cAMP signalling [162]. One study [163] demonstrated insoluble DISC1 aggregates present in the brains of individuals with chronic mental illness on postmortem examination which were absent from controls without mental illness. The authors propose that aggregation of DISC1 causes a harmful loss-of-function phenotype through abolition of its ability to interact with downstream signalling proteins, particularly Nuclear Distribution Element 1 (NDEL1). Parallel in vitro studies with recombinant DISC1 confirmed the ability of DISC1 to multimerize and disrupt NDEL1 binding. This striking observation means that some mental illnesses may also belong in the category of protein misfolding diseases, causing neuronal dysfunction and perhaps neurodegeneration, although intercellular propagation of misfolding has yet to be demonstrated.

1.3.9 Other Amyloid Proteins

The examples discussed so far are drawn from human neurological diseases, but protein misfolding affects many other organ systems in the form of amyloidosis, characterized by pathological accumulation of insoluble β-sheet rich protein in the heart, kidneys, blood vessels, or other sites. Over 25 proteins are known to be capable of forming amyloid, often due to massive overproduction (for example immunoglobulin light chains from neoplastic plasma cells in multiple myeloma) or impaired clearance (β2-microglobulin amyloid deposition in joints for patients on hemodialysis).

Interestingly, the best evidence for a disease satisfying all facets of the prion hypothesis is AA amyloidosis of captive cheetahs [164]. Serum amyloid A (SAA) fibrils are present in the feces of affected animals and get transmitted orally to other animals. SAA is a secreted protein, so intercellular transmission per se is not a necessary aspect of disease pathogenesis in this case as misfolding takes place in the interstitium. There is, however, clear evidence for molecular seeding of misfolding, tissue migration, and inter-organismal passage. There is also evidence that the ultrastructure of amyloid fibrils varies from site to site, with fibrils in feces being more frangible (and therefore possibly more infectious) than fibrils in the liver.

There is currently no evidence suggesting a similar mechanism of AA amyloid infectivity in humans (but absence of evidence is not evidence of absence). AA amyloidosis almost always occurs in the setting of protracted chronic inflammation from other conditions like rheumatoid arthritis [165], although only a very small proportion of individuals with such chronic inflammation develop amyloidosis. The endogenous emergence (or even exogenous acquisition) of a minute seed of AA amyloid could be the precipitating factor. In experimentally induced murine AA amyloidosis, amyloid deposits can develop in less than 24 hours in mice injected with ex vivo amyloid material and subsequently given an inflammatory stimulus [166]. Moreover, at a clinical level, renal function improvements after many months of anti-inflammatory therapy that slowly reduces the burden of amyloid protein can be reversed in a matter of days by an acute inflammatory flare-up [165]. The diminished quantity of AA amyloid is still sufficient to cause rapid induction of misfolding when there is recruitable substrate available. AA amyloidosis (and perhaps other systemic amyloidoses as well) may therefore follow a two-hit mechanism, in which a background of protein overproduction is first required followed by secondary emergence of the misfolded seed that catalyses the precipitation of protein from an effectively supersaturated solution.

Figure 1.3: Schematic of the role of template assistance in protein misfolding (vertical axis: free energy). The presence of the misfolded protein seed (indicated by areas in red) creates an alternative pathway to aggregation of native protein (indicated by areas in blue) with a lower activation energy barrier, thereby accelerating the misfolding process.

Islet amyloid polypeptide (IAPP) is known to form amyloid deposits in the pancreatic islets of patients with Type II diabetes and may contribute to islet cell destruction. As with Aβ in AD, there is speculation that IAPP oligomers rather than fibrils are cytotoxic, but this hypothesis is as yet unproven [167]. IAPP amyloid deposition appears to be a local phenomenon within the islet and shows no evidence of intercellular transmission, tissue migration, or spread between individuals, so the case for a prion-like mechanism of spread is at present minimal.

1.3.10 Thermodynamic Considerations

It is clear that intercellular spread of misfolding is a prevalent feature of neurodegenerative disease, with the distribution of misfolded protein seeds playing the critical role of expanding the pathology beyond the site of original involvement. The field of protein folding has been heavily influenced by Anfinsen’s Thermodynamic Hypothesis, which asserts that the native fold of a protein is its unique, kinetically accessible global free energy minimum determined by its primary sequence. Importantly, this hypothesis is traditionally applied to monomeric proteins or small protein assemblies; frequently, the global free energy minimum for a system containing multiple proteins is not a solution of natively folded monomeric proteins, but rather an aggregate formed by their hydrophobic packing, desolvation, and precipitation from solution.
The activation energy barrier needed to form the aggregate may be very high, but once crossed it enables access to a system state so much lower in free energy that the process is effectively irreversible (this is intuitive for anyone who cooks an egg and isn’t surprised that the white doesn’t turn back to liquid when the egg cools). The question of whether or not a misfolding process is prion-like depends on the manner in which the misfolded protein interacts with native protein to expedite its misfolding. As suggested several years ago, there are two potential mechanisms that may account for this phenomenon: seeded polymerization, in which the misfolded aggregate scavenges misfolded monomers from solution, thereby causing a product pull that depletes the reservoir of natively folded protein while adding to the aggregate; and template assistance, in which direct interaction between the native protein and the misfolded aggregate creates an alternative pathway with a lower activation energy barrier (that is, classic catalysis) to enable the previously native protein to join the aggregate (see Figure 1.3). In either case, participation in the aggregate is thermodynamically favoured over the native fold in solution, so the aggregate will grow at the expense of the native protein. If the aggregate is toxic, this is a gain-of-function mechanism; if the native protein is physiologically essential and cannot be regenerated fast enough, a loss-of-function defect results. Fortunately, formation of the aggregate seed in the first place is not easy and represents the principal obstacle to the whole process. The kinetic barrier to aggregate formation de novo may be so high that it only happens once a decade or less within the whole CNS.
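The kinetic effect of lowering an activation barrier can be put on a rough numerical scale. The sketch below assumes simple transition-state (Arrhenius/Eyring-type) kinetics in which the rate is proportional to exp(−ΔG‡/RT) and the pre-exponential factor is the same for the templated and untemplated pathways. The barrier reductions, the ten-year waiting time used for scale, and the function names are hypothetical illustrations rather than values derived in this thesis.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ/(mol*K)
BODY_TEMP_K = 310.15    # 37 degrees C


def rate_enhancement(barrier_drop_kj, temp_k=BODY_TEMP_K):
    """Factor by which a rate increases when the barrier falls by barrier_drop_kj.

    Assumes the rate is proportional to exp(-dG_act / (R*T)) with the same
    prefactor for both pathways (an idealization).
    """
    return math.exp(barrier_drop_kj / (R_KJ * temp_k))


def templated_waiting_time(untemplated_wait, barrier_drop_kj, temp_k=BODY_TEMP_K):
    """Mean waiting time for an event once its barrier has been lowered."""
    return untemplated_wait / rate_enhancement(barrier_drop_kj, temp_k)


# Hypothetical barrier reductions (kJ/mol), applied purely for scale to an
# event with a ten-year mean waiting time on the untemplated pathway.
for drop in (0.0, 10.0, 20.0, 40.0):
    years = templated_waiting_time(10.0, drop)
    print(f"barrier lowered by {drop:4.1f} kJ/mol -> ~{years:.3g} years per event")
```

Because the dependence is exponential, even a modest reduction in the barrier shortens the expected waiting time by orders of magnitude under these assumptions, which is the sense in which template assistance acts as a catalyst.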
The exceptionally slow rate of neuronal turnover in the CNS, though, means that nascent aggregates are difficult to clear once formed, as this would require proteolysing the aggregate without interfering with the continued health and operation of the neuron (or other cell) harbouring it. Cells have sophisticated and robust ways of handling misfolded protein with chaperones, proteasomal degradation, and the endosome-lysosome system, but aggregates that evade these systems by being protease resistant, too large, or unrecognizable as a misfolded protein by the cellular machinery may therefore grow inexorably. At some point, fragments of the aggregate may be trafficked passively or actively to the cell surface and thence into the extracellular milieu. Adjacent cells may then take them up, thereby allowing the misfold to spread through the tissue.

1.3.11 Cell-to-Cell Propagation

Intercellular spread of misfolded protein has now been observed for PrP, tau, SOD1, and others, but the mechanism of this process remains to be established (and may vary from protein to protein). A recent review [74] discusses the possibilities in detail, including simple diffusion of naked protein aggregates, passage through nanotubes, and encapsulation in exosomes.

Free extracellular diffusion of naked protein aggregates is the least complicated possible mechanism and clearly applies to proteins like β2-microglobulin and transthyretin that aggregate in the interstitium. However, for intracellular proteins like SOD1 and tau, the plasma membrane presents a natural obstacle to the uptake of protein aggregates. Since protein aggregates are too polar to freely diffuse across the membrane, they must cross it by other means. Direct, mechanical penetration of the membrane is possible and supported by the observation that large β2-microglobulin fibrils are internalized by cultured cells [168].
Alternatively, internalization could occur through the endocytic pathway: α-synuclein aggregates have been found to co-localize with endosomal and lysosomal markers, and ablating the function of dynamin, a GTPase essential for endosome-membrane fusion, inhibited its uptake [169]. In the study of prion-like SOD1 activity by Münch et al. [150], aggregates are believed to enter cells by macropinocytosis [170], a type of endocytic process.

Many of the concerns with free diffusion of intracellular misfolded protein could in principle be overcome by exosome encapsulation. Exosomes are 30-100 nm organelles now known to be released by numerous mammalian cell types including various hematopoietic cells, adipocytes, and neurons. These vesicles are formed within endosomes, by invagination of the limiting membrane, resulting in the formation of multi-vesicular bodies (MVBs). Exosomes are then released into the extracellular environment by fusion of MVBs with the plasma membrane [171]. Since the lumen of an exosome is topologically continuous with the cytosol, exosomes could package intracellular aggregates from one cell, spread through the extracellular space, and fuse with the plasma membrane of neighbouring cells to release the aggregates into their cytosol.

A growing body of work has implicated secreted exosomes in the cell-to-cell transfer of prion protein in disease [172, 173, 174, 175]. Mammalian neuronal cell lines harbouring misfolded prion protein (PrPSc) were found to release exosomes containing the infectious prions [175]. These prion-containing exosomes were captured by bystander cells, and after delivery of their cargo, transformation of native PrPC into infectious PrPSc was observed [175].
Since these initial studies were conducted, attempts to uncover the normal function of PrPC, as well as the mechanisms of disease, have found that the prion protein is also secreted in exosomes released by macrophages [172], platelets [174, 176], and erythroblasts [177], suggesting that these cell types may also be involved in spreading disease. To date no work with primary neurons has been conducted to confirm these findings; however, exosomes isolated from ovine cerebrospinal fluid were found to be relatively enriched in PrPC [173]. This study, though it did not examine diseased fluids for the presence of infectious prions, is the first in vivo evidence supporting a potential role of exosomes in propagation of prion misfolding during disease.

Aβ, α-synuclein, and SOD1 have been found associated with exosomes. It was recently reported that α-synuclein is secreted in association with exosomes from neuronal cell lines (SH-SY5Y), and that these vesicles cause neuronal death when used to treat naïve cells [178]. The evidence discussed above pointing to soluble Aβ species as primarily responsible for neurotoxicity and propagation during AD [179, 180, 181, 89, 88] inherently argues against a role for exosomes in AD. SOD1 is a well-known component of mammalian exosomes [182], and a mouse hybrid cell model of ALS was found to secrete exosomes associated with both wild-type and mutant SOD1 [141]. No work has yet shown that misfolded Aβ or SOD1-containing exosomes are neurotoxic. In addition, these investigations did not determine the nature of the protein-vesicle association, that is, whether the proteins were located within the lumen of the secreted vesicles or associated with the outer leaflet of the membrane. This information is critical to understanding the potential role of vesicles in neurodegenerative disease propagation.

1.3.12 The Infection Question

An important distinction in the prion field is made between conversion, the template-directed misfolding of PrPC by PrPSc, and infection, the passage of PrPSc between organisms in sufficient quantity and in the right form to trigger disease. Whereas conversion is primarily a question of chemistry, infection is a multidimensional question of biology, epidemiology, even sociology (as demonstrated by kuru). The fact that protein misfolding diseases in the CNS other than those caused by PrP do not seem to be transmissible between individuals may be a commentary on the stability of the aggregate in the extracorporeal environment, rather than a result of intrinsic differences in disease mechanism within the CNS. The CNS offers a carefully regulated environment in which aggregates may spread and recruit additional protein, but the external environment may simply be hostile to preservation of the templating fold: misfolded prion protein is sufficiently resilient to survive ex vivo, while other aggregates may be too fragile to be successfully passed between individuals through an oral or even parenteral route.

Inadvertent transmission through neurosurgical procedures is a theoretical possibility, as inadequately decontaminated surgical equipment could in principle spread a misfolded protein seed directly from an affected brain to a healthy brain while bypassing exposure to environmental conditions that would normally inactivate it. Serendipitously, many of the precautions instituted to prevent iatrogenic CJD, such as more rigorous cleaning of neurosurgical instruments, may help to prevent transmission of other potentially infectious protein misfolding diseases as well.
1.3.13 New Therapeutic Avenues from the Prion Hypothesis

The prion paradigm of protein misfolding diseases offers new opportunities for therapeutic intervention beyond those available if misfolding is a diffuse process occurring independently at multiple sites. Some of the most promising strategies in current development are:

• Enhanced proteolysis of misfolded protein: In the case of PrPSc, the near indestructibility of the misfolded aggregate may preclude this approach, but the active misfolded proteins in other diseases may be more amenable to degradation. Misfolded SOD1, for example, is highly protease sensitive while the native protein is protease resistant, so proteasome activators could potentially expedite the clearance of misfolded protein. This strategy has been attempted in Huntington’s disease [183], in which proteasome activators involved in either ubiquitinated or non-ubiquitinated proteolysis were overexpressed in HD patients’ skin fibroblasts or mutant huntingtin-expressing striatal neurons. In AD, proteasome inhibition has been shown to potentiate presenilin accumulation, and it is hypothesized that proteasome activation may accelerate the degradation of Aβ40 or Aβ42 [184]. Indeed, proteasome activation has been discussed as a potential generic anti-aging remedy [185]. Unfortunately, whereas proteasome inhibition is possible with small molecules like bortezomib [186], in current clinical use for multiple myeloma, proteasome activation requires modulation of protein synthesis by gene therapy or other sophisticated approaches that are several years away from clinical application. Upregulation of protein folding chaperones and heat shock proteins is an alternative approach, but interestingly, increased levels of chaperones actually potentiated huntingtin misfolding [187].
• Stabilization of the native protein conformation: If the energy barrier to reconfiguration or unfolding of the native protein can be increased, the kinetics of the misfolding reaction will slow. Compounds binding native PrP in order to stabilize it have been identified [188], and this approach can be applied to other aggregation-susceptible proteins that have a defined native structure, like SOD1. In the case of SOD1, monomerization of the native dimer is known to be an early step in misfolding [123], so agents that stabilize the native dimer may impede misfolding.

• Decreased production of substrate protein: This approach has been used most extensively in AD with the development of β- and γ-secretase inhibitors and modulators [189], since production of the fibrillization-prone Aβ peptide depends on the action of these two enzymes. IAPP is also produced through cleavage of a propeptide by proprotein convertase 2 and carboxypeptidase E, but there has so far been little interest in developing inhibitors of these enzymes because of their wide importance in other neuroendocrine systems.

• Interference of aggregate binding to native protein: Template-directed misfolding necessitates a physical interaction between the native and misfolded protein. If the site of this interaction can be ascertained, an antagonist molecule capable of binding the native or misfolded protein at the site of interaction would disrupt the conversion process and slow misfolding. A detailed search of the interaction domains between PrPC and PrPSc has been undertaken [47], revealing three regions of association that could be targeted pharmacologically. Similarly, the importance of tryptophan 32 in mediating SOD1 misfolding as discussed in Chapter 5 suggests that compounds binding native SOD1 at this site may prevent template-assisted conversion.
• Neutralization of misfolded aggregates: If misfolded aggregates travel through the extracellular environment without encapsulation in exosomes or nanotubes (or are present on the outer surface of these structures), epitopes on the misfolded protein are accessible for antibody recognition. Prion infection in vitro propagates from cell to neighbouring cell, and this process can be inhibited or abrogated immunologically with antibodies directed against PrPC, PrPSc, or epitopes exposed by both conformers [44, 190, 191]. Raising antibodies against extracellular aggregates in AD has had mixed results: immunotherapy against Aβ has been shown to reduce amyloid deposits as well as improve cognitive behavior in both transgenic mice and human patients [192, 193], but early AD vaccine studies grossly targeting Aβ peptide were limited by autoimmune meningoencephalitis [194, 195].

  There is abundant evidence that motor neuron death in ALS is non-cell autonomous, and that extracellular “secreted” SOD1 plays a key role. Motor neurons expressing wild-type SOD1 degenerate when surrounded by glial cells expressing mutant SOD1 [196]. Consistent with a prion-like activity of misfolded SOD1, wtSOD1 misfolding can be induced in naïve human cells cultured in media conditioned by cells expressing mutant SOD1 over several passages of cell growth, and this intercellular propagation can be abrogated by incubation with antibodies immunoreactive against wild-type and misfolded SOD1. These results imply that the templating molecular species are exported from cells, which efficiently occurs with mutant, oxidized, and monomeric species of SOD1 [140], and that extracellular misfolded SOD1 enters the cytosol of exposed cells to propagate SOD1 misfolding, a process which also applies to the intracellular proteins α-synuclein and Tau [15, 16].

  In light of the immunogenic abrogation of misfolded wtSOD1 propagation mentioned above, the intra- and extracellular toxicities of misfolded/oxidized SOD1 provide potential targets for reducing motor neuron cell death in ALS by blocking intercellular transmission or by immunogenic clearance of pathogenic SOD1. Indeed, a previous study established that active vaccination of a mild-phenotype transgenic mouse model with whole recombinant G93A SOD1 protein significantly slowed disease onset and progression [197]. Other studies have demonstrated that vaccine-directed immunoglobulin might attenuate disease in transgenic models of ALS [198]. Appel and colleagues have shown that IgG can be detected intracellularly in ALS motor neuron perikarya, presumably through retrograde transport from the neuromuscular junction [199]. Immunoglobulin has also been detected in motor neurons of G93A SOD1-immunized transgenic mice expressing human G37R SOD1 [197]. However, perhaps a more attractive possibility is that the toxicity of secreted misfolded/oxidized SOD1 could be neutralized in the CNS interstitium, which is relatively accessible to circulating IgG.

  However promising misfolded/oxidized SOD1 might be as an immunotherapy target, native SOD1 shares many surface epitopes with the toxic misfolded protein and is also present in an antibody-accessible interstitial compartment in the CNS. In order to avoid autoimmune complications of immunization with holo-SOD1, it is vital to discriminate between native SOD1 and aberrant conformers that presumably cause motor neuron disease. There is a growing literature of misfolding-specific SOD1 antibodies [124, 131, 132, 200, 201, 202] that could serve as the precursors for future therapeutic agents against ALS.
• Antagonism of infectious oligomers: Currently approved treatments for AD rely on reduction of symptoms and are minimally effective in altering the course of the disease [203], but the intra- and extracellular toxicity of Aβ oligomers provides a potential target for improving neuronal survival in AD. Using a proven hypothesis-driven approach, the Cashman group has generated disease-specific antibodies raised against an Aβ epitope specifically available to antibody binding when the protein is misfolded in disease. These disease-specific antibodies are excellent alternatives to immunotherapies based on antibodies broadly targeting Aβ because, by design, they specifically neutralize the toxicity of a particular conformation of the protein of interest, while sparing natively or alternatively structured proteins from immune recognition and potential widespread autoimmunity [204, 38, 205]. We are currently using these antibodies to identify the Aβ species responsible for neurotoxicity and propagation of disease. Following a similar logic, Abbott Pharmaceuticals used synthetic Aβ oligomers to generate oligomer-specific antibodies [206]. These “globulomer” antibodies were capable of preventing the pathological effects of Aβ oligomers both in vitro and in vivo. Despite these successes, the size and conformation of the Aβ oligomer responsible for Alzheimer’s disease remain unknown.

1.3.14 Toward a Unified Theory of Misfolding Disease?

The spread of aggregates from cell to cell now appears to be a prevalent mechanism in protein misfolding diseases, which elevates the importance of the initial misfolded seed in models of pathogenesis.
The contiguous spatiotemporal spread of tissue involvement in multiple neurodegenerative diseases and the demonstrated intercellular passage of misfolded protein argue that the event precipitating disease is endogenous, spontaneous misfolding arising from a rare conformational fluctuation of a vulnerable protein. However, once the misfolded material is created it is resistant to physiologic clearance mechanisms and may then migrate through the local environment, eventually traversing macroscopic distances over months of disease progression. Although the probability of the initial misfolding event may be very low at any given time, over years and decades of life the cumulative probability may approach unity. The alternative model in which protein misfolding arises separately but simultaneously in a large population of cells now seems less likely.

If misfolded protein seeds itself through the CNS, it does indeed satisfy the molecular, cellular, and tissue definitions of a prion. These proteins may then be understood in the same framework as the canonical prions from an individual (if not populational or epidemiological) perspective. The considerable investment of scientific resources in understanding the prion protein, its structure and biology, may therefore have applications far beyond the relatively rare prion diseases: the uniqueness of the prion is diminished, but its importance has never been greater.

| Protein | Molecular Recruitment and Misfolding Induction | Intercellular Transmission | Tissue Migration | Spread Between Organisms | Evidence of Strains | Protein Only Sufficient For Infection | Protease Resistance of Aggregates |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Prion Protein | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Serum Amyloid A | Yes: seeded polymerization of native protein by aggregates in vitro | No: Amyloid formation takes place in interstitium | No: Diffuse simultaneous deposition in many tissues | Yes, but not in humans (currently only in cheetahs) | Yes: different ultrastructural fibril properties depending on tissue site | Amyloid fibrils sufficient for infection, but background inflammation in host necessary | Yes |
| Amyloid-β | Yes: seeded polymerization of native protein by aggregates in vitro | No: Aggregation takes place extracellularly | Possibly: characteristic spread of Aβ in AD neuropathological staging | Possibly: transmission in mice on intracerebral inoculation | No evidence to date | Possibly: Aβ oligomers are cytotoxic, unclear if aggregates are cause or consequence of disease | Yes |
| Tau | Yes: seeded polymerization of native protein by aggregates in vitro | Yes: extracellular aggregates can be taken up by cells | Possibly: outward spread of tau aggregates from transentorhinal cortex in AD | Possibly: pathology in mice intracerebrally inoculated with misfolded tau | Yes: fibrillar aggregates in brain have range of ultrastructural features | No evidence to date | Yes |
| SOD1 | Yes, but currently only under non-physiologic conditions | Yes: SOD1 misfolding may be propagated serially in cell culture | Possibly: Indirect evidence from contiguous spread of neuropathology in ALS | No evidence to date | No evidence to date | Possibly: infection in cell culture blocked by antibodies against misfolded SOD1 | No: misfolded protein shows enhanced protease sensitivity |
| FUS/TDP-43 | No experimental evidence to date; primary sequence similarity to yeast prions | No evidence to date | No evidence to date | No evidence to date | No evidence to date | No evidence to date | No evidence to date |
| Huntingtin | Yes: polyglutamine peptides exceeding critical length aggregate in vitro | Yes: aggregates are internalized by mammalian cells in culture | Yes: Contiguous spread of pathology through CNS in HD | No: Unlikely as genetic susceptibility in host necessary | No evidence to date | No: Etiology is underlying genetic mutation | Yes |
| α-Synuclein | Yes: seeded polymerization of native protein by aggregates in vitro | Yes: premature α-synuclein misfolding in transplanted stem cell grafts | Yes: Contiguous spread of pathology through CNS in PD | No evidence to date | No evidence to date | No evidence to date | Yes |
| IAPP | Yes: seeded polymerization of native protein by aggregates in vitro | No: Aggregation takes place extracellularly | No: Confined to pancreatic islets | No evidence to date | No evidence to date | No evidence to date | Yes |
| DISC1 | Yes: multimerization of recombinant DISC1 | No evidence to date | No evidence to date | No evidence to date | No evidence to date | No evidence to date | No evidence to date |

Table 1.1: Prion-like features of protein misfolding in other neurodegenerative diseases.

1.4 Aims of the Thesis

Ultimately, the protein misfolding process that takes place in prion disease, ALS, and the amyloidoses is driven by the balance of thermodynamic effects that determine the free energies of folded, partially unfolded intermediate, and misfolded protein conformers. The overall free energies of these conformers are the sum of multiple contributions, such as configurational entropy, polar and nonpolar solvation energy, and polar and nonpolar protein self energy. For some of these terms, there are well-validated theoretical models that enable their calculation from atomic-resolution structural data.
However, current methods for estimating electrostatic energies in particular suffer from the severe approximation of treating proteins as homogeneous, isotropic dielectric media, which belies the complexity of their internal structure and consequent response to electric fields. It is therefore useful to explicitly incorporate this complex response behaviour into the calculation of the protein dielectric function, leading to the first goals of this thesis:

1. Develop a theory to quantitatively describe the protein dielectric environment, accounting for inhomogeneity and anisotropy due to internal protein structural constraints.

2. Apply this electrostatic theory to better understand the role of salt bridges in stabilizing misfolding-associated proteins, particularly the prion protein.

Once the electrostatic theory is developed, it can be combined with existing methods of calculating other contributions to protein free energy and applied to the study of early events in the loss of protein structure leading to misfolding:

3. Incorporate the inhomogeneous protein dielectric theory into a comprehensive protein conformation-dependent free energy function.

4. Investigate structural fluctuations and regional unfolding events in PrP, SOD1, and other amyloidosis-associated proteins as a way of understanding the first steps of the misfolding mechanism.

5. Reconcile these results with available experimental NMR data on partial unfolding.

Regions with a high propensity to loss of structure in the native state have a high likelihood of adopting a different, possibly unfolded conformation in the misfolded state. Identifying these unstable regions thus provides a basis for predicting structural features of the misfolded protein that can be used for immunologic recognition.

6. Predict regions of PrP, SOD1, and other proteins that are likely to lose structure early in the misfolding process and that may therefore be misfolding-specific epitopes amenable to antibody tagging.
Extending the theoretical work on protein misfolding into an experimental study, it would be informative to investigate the misfolding dynamics of SOD1. SOD1 misfolding as a feature in FALS was discovered almost two decades ago, but compared to other neurodegeneration-associated proteins like PrP, relatively little is understood about the details of SOD1 misfolding in disease. An especially pressing question is whether SOD1 misfolding is a nucleation/polymerization process, as with Aβ and other amyloid-forming proteins, or a template-directed process, as with prion protein. Further, the dependence of SOD1 misfolding on solution conditions and other possible protein co-factors remains to be settled.

7. Investigate SOD1 misfolding as a prion-like, template-directed process. Specifically, explore whether misfolded SOD1 can propagate its misfolded conformation intra- and intercellularly to previously native SOD1 molecules.

8. Examine whether other co-factors are required for the transmission of SOD1 misfolding; that is, the degree to which the “protein-only” hypothesis of prion spread applies to SOD1.

Finally, while neurodegeneration and the systemic amyloidoses receive the most attention from the protein misfolding academic community, this does not mean that protein misfolding as a feature of disease is restricted to these illnesses. Emerging evidence suggests a role for misfolding in cancer, which merits further exploration as a possible therapeutic strategy.

9. Consider the possible involvement of protein misfolding in oncogenesis and the routes by which misfolded protein could reach the surface of cancer cells.

10. Predict unstable regions of cancer-associated cell surface proteins that may be selective epitopes for distinguishing neoplastic cells.
In sum, the goal of the thesis is to investigate the physico-chemical forces that drive protein misfolding, develop computational tools to predict steps in the misfolding process, and apply these tools to proteins relevant in human disease.

Chapter 2

The Dielectric Properties of Proteins

Using results from the dielectric theory of polar solids and liquids, we calculate the mesoscopic, spatially-varying dielectric constant at points in and around a protein by combining Kirkwood-Fröhlich theory with short all-atom molecular dynamics simulations of equilibrium protein fluctuations. The resulting dielectric permittivity tensor is found to exhibit significant heterogeneity and anisotropy in the protein interior. Around the surface of the protein it may exceed the dielectric constant of bulk water, especially near the mobile side chains of polar residues, such as K, N, Q, and E. The anisotropic character of the protein dielectric selectively modulates the attractions and repulsions between charged groups in close proximity.

2.1 Introduction

A quantitatively accurate theory for the dielectric properties of polar liquids and solids took form only after several decades of research, starting with the early work of Lorentz [207] and reaching predictive power with the theories of Kirkwood and Oster for fluids [208, 209] and Mott and Littleton [210] for solids. Debye’s original formulation [211] followed Langevin’s theory of paramagnetism and quantifies the earlier observations of Clausius and Mossotti for the dielectric constant $\varepsilon$ of gases:

\[
\frac{\varepsilon - 1}{\varepsilon + 2} = \frac{4\pi}{3} \sum_i n_i \left( \alpha_i + \frac{\mu_i^2}{3 k_B T} \right). \tag{2.1}
\]

Here the sum is over species of molecules, $\alpha_i$ and $\mu_i$ are the electronic polarizability and permanent electric dipole moment of species $i$, $n_i$ is the number per cm$^3$ of species $i$, and $k_B T$ is Boltzmann’s constant times the temperature.
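As a rough numerical illustration of Equation (2.1), the following Python sketch evaluates its right-hand side for a single species and inverts the relation for $\varepsilon$. The parameter values for water (number density, electronic polarizability, gas-phase dipole moment) are approximate literature figures assumed here for illustration only, and the function names are hypothetical; none of them are taken from this thesis.

```python
import math

K_B = 1.380649e-16  # Boltzmann constant in erg/K (CGS units, matching Equation 2.1)

def debye_rhs(number_density, alpha, mu, temperature):
    """Right-hand side of Equation (2.1) for one molecular species (CGS units)."""
    return (4.0 * math.pi / 3.0) * number_density * (
        alpha + mu ** 2 / (3.0 * K_B * temperature)
    )

def epsilon_from_rhs(rhs):
    """Invert (eps - 1)/(eps + 2) = rhs, which gives eps = (1 + 2 rhs)/(1 - rhs)."""
    return (1.0 + 2.0 * rhs) / (1.0 - rhs)

# Assumed, approximate inputs for water (not values from the thesis).
ALPHA_WATER = 1.45e-24   # electronic polarizability, cm^3
MU_WATER = 1.85e-18      # gas-phase dipole moment, esu cm (1.85 D)
N_LIQUID = 3.34e22       # molecules per cm^3 in the liquid
N_DILUTE = 2.5e19        # molecules per cm^3 in a hypothetical dilute gas

rhs_liquid = debye_rhs(N_LIQUID, ALPHA_WATER, MU_WATER, 298.0)
eps_liquid = epsilon_from_rhs(rhs_liquid)
eps_dilute = epsilon_from_rhs(debye_rhs(N_DILUTE, ALPHA_WATER, MU_WATER, 298.0))

print(f"liquid: rhs = {rhs_liquid:.2f}, eps = {eps_liquid:.2f}")
print(f"dilute gas: eps = {eps_dilute:.4f}")
```

With these assumed inputs the right-hand side for the liquid comes out well above unity, so the inverted $\varepsilon$ is negative, whereas the dilute gas gives a value only slightly greater than 1.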
Debye’s analysis yields a dielectric constant that increases as temperature is decreased, due to the more effective alignment of dipoles against thermal randomization. However, for a positive dielectric constant the left-hand side of Equation (2.1) is bounded by unity, while the right-hand side is not. In fact, if one substitutes known values for the electronic polarizability, molecular dipole moment, and density at room temperature for a substance such as water, Equation (2.1) can only be satisfied by a negative dielectric constant. This is a consequence of the assumption of the Lorentz field for the local field in the model; i.e., the model predicts ferroelectricity for a substance such as water below a temperature $T_c = 4\pi n \mu^2 / 9 k_B \approx 1900$ K, analogous to Weiss ferromagnetism.

Onsager’s treatment of the local field resulted in a reaction field which could polarize a molecule but not align it, and a cavity field which could provide torque on a dipolar molecule [212]. This theory removed the dielectric catastrophe, but still predicted dielectric constants about half of the experimental values for substances such as water. Oster and Kirkwood’s more explicit treatment of dipole-dipole correlations [208] predicted dielectric constants within a few percent of experiment for water by treating the alignment of a water molecule dipole with that of its neighbors [209]. This treatment results in an increased dielectric constant when molecules align their neighbors in a ferromagnetic fashion, with Kirkwood’s expression for the dielectric constant (for one species)

\[
\frac{(\varepsilon - 1)(2\varepsilon + 1)}{12 \pi n \varepsilon} = \alpha + \frac{\mu^2}{3 k_B T} \left( 1 + n \int_{v_o} d\Omega \, \cos\gamma \, e^{-W/k_B T} \right) \tag{2.2}
\]

reducing to that in the Onsager theory when the local effective dipole moment is treated at the mean field level, determined by the reaction field.
In Equation (2.2), $W$ is the potential of mean force acting on a pair of molecules, $\gamma$ is the angle between the dipole moments of a pair of molecules, and the integration is over all relative orientations and positions of the molecules within a sphere of volume $v_o$.

A natural application for the theory of polar dielectric media is the study of electrostatic effects in biomolecules such as proteins, as these effects are key to their stability and function. A complete understanding of these effects requires an accurate description of protein dielectric properties, which determine the strength of interactions between charges in the protein. However, unlike a homogeneous liquid whose dielectric constant does not vary throughout its volume, the dielectric response of a biomolecule varies from site to site depending on the local molecular structure. Furthermore, complex constraint forces within the molecule may cause only partial alignment of local dipoles with an applied external field, introducing anisotropic effects.

The importance of biomolecule dielectric behavior in such fields as protein-protein interactions and enzyme reaction catalysis has led to interest in a method for calculating protein dielectrics that accounts for their varying local behavior. Explicit microscopic approaches to the calculation of the dielectric constant by extracting fluctuations appearing in Kirkwood-Fröhlich theory from molecular dynamics (MD) simulations have been developed by Wada [213], van Gunsteren [214], and Simonson and Perahia [215, 216, 217]. Masunov and Lazaridis [218] have calculated the potentials of mean force between pairs of charged amino acid side chains and found that a uniform dielectric is unsatisfactory for explaining the results of explicit simulation.
The results of explicit and implicit solvation for calculating protein pKa values have been compared [219], with the conclusion that the larger scale, anisotropic structural reorganization that can accompany (de)protonation is difficult to capture using Poisson-Boltzmann (PB) methods, but may be captured using molecular dynamics with generalized Born implicit solvent. Voges and Karshikoff [220] have provided a theory that enables the iterative calculation of a heterogeneous (but isotropic) dielectric constant in a small cavity containing part of a protein and have applied it to the calculation of amino acid dielectric constants.

The advent of automated PB equation solvers like APBS [221] and DelPhi [222] has enabled the rapid calculation of protein electrostatic energies. PB methods are used extensively in biomolecular simulation, including ligand docking studies for high-throughput drug screening [223] and implicit solvent molecular dynamics [224]. However, PB calculations commonly assume a constant internal dielectric environment for proteins, which neglects local variation in susceptibility, while measurements such as pKa shifts indicate a much richer profile for the effective dielectric constant in proteins [225]. It is thus desirable to calculate the local dielectric constant at all points in and around a protein to use as input for these programs, enabling a more accurate and descriptive calculation of protein electrostatic energies. Moreover, knowledge of the local dielectric function allows for an understanding of the mesoscopic structure of the susceptibility within and around a protein, which has consequences for many aspects of stability, folding, binding, and biological function. For these reasons we have developed a mesoscopic-scale theory to calculate the spatially-varying and anisotropic static dielectric constant in and around a protein.
This theory is applied to investigate electrostatic contributions to regional stability of the prion protein in Chapter 4 [226].

2.2 Protein Dipoles in an Applied Field

The first step in deriving the local dielectric constant of the protein is to characterize its response to applied electric fields. The protein can be thought of as an assembly of fluctuating dipoles at locations determined by the native protein fold. Unlike in the liquid case, where these dipoles are relatively free to orient with the prevailing applied electric field (subject to local organization of the liquid), stereochemical intramolecular forces constrain the motion of the dipoles in the protein, so they may only partially align with an applied field. Furthermore, dipoles in the protein do not move independently, as the coupling of fluctuations due to the above-mentioned steric constraints in the protein may cause the dipoles to react in a coordinated fashion.

In principle, any two atoms participating in a covalent bond in the protein may be viewed as a dipole, but the motions of atoms within the backbone and side chains of residues in a protein are highly correlated by the covalent bond network. We will thus define the effective dipoles as groups of atoms in each residue backbone or side chain. Exceptions are made for glycine, alanine, and proline, since their side chains are structurally incapable of motion substantially independent of their backbones; all the atoms in each one of these residues are considered as a single dipole.

In the absence of an applied electric field, these dipoles undergo thermal fluctuations that are not necessarily isotropic: there may be greater average motion in some directions compared to others. For example, fluctuations perpendicular to the time-average dipole orientation are typically greater than those parallel to the average dipole.
Thus each dipole has its own system of principal axes characterizing the response of its three components to an external field.

2.2.1 Internal Protein Constraints

Let the protein be composed of $n$ dipoles, each with dipole moment components in the $x$, $y$, and $z$ directions. Construct a vector $\boldsymbol{\mu}$ of length $3n$ that contains the deviations of all the protein dipole moment components from their equilibrium values, such that the $x$, $y$, and $z$ components of the $i$th dipole’s deviations are $\mu_{3i-2}$, $\mu_{3i-1}$, and $\mu_{3i}$ respectively. This means that $\langle \boldsymbol{\mu} \rangle_0 = 0$, where the angle brackets refer to the thermal average in zero external field. In the presence of a local electric field $\mathbf{E} = (E_x, E_y, E_z)$, which can vary from dipole to dipole, the change in free energy on perturbing the configuration of dipoles from their equilibrium positions is (to second order in $\boldsymbol{\mu}$)

\[
\Delta G = \frac{1}{2} \sum_{i,j=1}^{3n} K_{ij} \mu_i \mu_j - \sum_{i=1}^{3n} E_i \mu_i \tag{2.3}
\]

where $E_i$ are the components of the vector $\mathbf{E} = (E_1, E_2, \ldots, E_{3n})$ representing the local field on all $n$ dipoles, and $K_{ij}$ are the second derivative matrix elements $\left. \delta^2 G / \delta\mu_i \, \delta\mu_j \right|_0$ evaluated at the equilibrium position $\boldsymbol{\mu} = 0$. The probability for the system to occupy such a configuration is proportional to $\exp(-\Delta G / k_B T)$. Thus the averages of the induced dipole moment and cumulant matrix elements can be found by diagonalizing the free energy [227], and are given by

\[
\langle \mu_i \mu_j \rangle_c = k_B T \, (K^{-1})_{ij} \tag{2.4}
\]

\[
\langle \mu_i \rangle = \sum_j (K^{-1})_{ij} E_j = \frac{1}{k_B T} \sum_j \langle \mu_i \mu_j \rangle_c \, E_j. \tag{2.5}
\]

Since the average of $\boldsymbol{\mu}$ is 0 in the absence of an external field, the cumulants above may be replaced by the unperturbed averages $\langle \ldots \rangle_0$. So long as the free energy in the above analysis is linear in the field strength (linear response), the statistics of the dipole fluctuations need not be Gaussian, and consequently the local potential of mean force need not be harmonic.
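The fluctuation relation in Equation (2.5) can be sketched numerically as follows. This is a minimal illustration rather than the procedure used in this thesis: the six "frames" are invented two-component dipole deviations, the units are arbitrary with $k_B T = 1$, and the function names are hypothetical. The sketch estimates the cumulant matrix from the frames, predicts the induced moment for a weak field from Equation (2.5), and compares it with the average obtained by reweighting each frame with the Boltzmann factor $\exp(+\boldsymbol{\mu} \cdot \mathbf{E} / k_B T)$ implied by the $-\sum_i E_i \mu_i$ term of Equation (2.3).

```python
import math

def cumulant_matrix(samples):
    """Zero-field covariance <mu_i mu_j>_c of dipole-component deviations."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    return [
        [sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
         for j in range(dim)]
        for i in range(dim)
    ]

def induced_moment(cov, field, kT):
    """Equation (2.5): <mu_i> = (1/kT) * sum_j <mu_i mu_j>_c E_j."""
    return [sum(cov[i][j] * field[j] for j in range(len(field))) / kT
            for i in range(len(cov))]

def reweighted_moment(samples, field, kT):
    """Field-on average from zero-field frames, weighting each by exp(+mu.E/kT)."""
    weights = [math.exp(sum(m * e for m, e in zip(s, field)) / kT) for s in samples]
    z = sum(weights)
    dim = len(samples[0])
    return [sum(w * s[i] for w, s in zip(weights, samples)) / z for i in range(dim)]

# Invented, deliberately bimodal (non-Gaussian) toy frames with zero mean.
frames = [(-1.2, 0.3), (-1.0, -0.1), (-0.8, -0.2),
          (0.8, 0.2), (1.0, 0.1), (1.2, -0.3)]
weak_field = (0.01, 0.0)
kT = 1.0

cov = cumulant_matrix(frames)
linear = induced_moment(cov, weak_field, kT)
exact = reweighted_moment(frames, weak_field, kT)
print(linear, exact)
```

For a field this weak the two estimates agree closely even though the toy distribution is bimodal, and the off-diagonal cumulant produces a small induced moment in the component on which no field acts.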
To see that Gaussian statistics are not required, it is sufficient to consider an isolated dipole $\boldsymbol{\mu}$ in the presence of a local field $\mathbf{E}$, with an arbitrary unperturbed probability distribution $P_o(\boldsymbol{\mu})$ and $\langle \boldsymbol{\mu} \rangle_o = 0$. Because the field contributes $-\boldsymbol{\mu} \cdot \mathbf{E}$ to the free energy, as in Equation (2.3), in the presence of a weak field the probability distribution becomes $P_o \, e^{\boldsymbol{\mu} \cdot \mathbf{E} / k_B T} \approx P_o (1 + \boldsymbol{\mu} \cdot \mathbf{E} / k_B T)$. The thermal average of the dipole moment $\boldsymbol{\mu} = \mu_1 \hat{\mathbf{i}} + \mu_2 \hat{\mathbf{j}} + \mu_3 \hat{\mathbf{k}}$ then has components $\langle \mu_i \rangle = \sum_j \langle \mu_i \mu_j \rangle E_j / k_B T$, which is precisely Equation (2.5). Thus, whatever the statistics of the dipoles, Equation (2.5) gives the average moment in the presence of a weak field.

In the all-atom molecular dynamics simulations of proteins described below, it was observed that most dipoles were tightly bound by harmonic potentials, with mean fluctuations much less than the total dipole magnitude. However, some polar amino acids near the protein surface underwent significant rearrangement due to the lack of steric constraints. In this sense the protein is more liquid-like on its surface than in its interior. The probability distributions of all dipoles in simulation were compared to normal distributions by the Lilliefors test [228]. Figure 2.1A shows the distribution of Lilliefors test statistics from all of the dipole probability distributions of ubiquitin, taken from a 20 ns all-atom classical molecular dynamics (MD) simulation in explicit solvent of the native state of ubiquitin. Simulations were performed using the NAMD simulation package [229], with the CHARMM22 force field [230, 231]. More details of the simulation protocol are described in the section “Implementation in a Protein System” below. The large majority of dipole probability distributions closely followed the normal distribution, although some dipole modes exhibited decidedly non-Gaussian potentials, often bi- or tri-modal, indicating multiple energetic minima for these dipoles.
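The normality comparison described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used for Figure 2.1A: the Lilliefors statistic is taken as the largest gap between the empirical distribution function and a normal distribution whose mean and standard deviation are estimated from the same sample, the 5% threshold uses the commonly tabulated large-sample approximation $0.886/\sqrt{N}$ (assumed valid for more than about 30 samples), and the two test samples are synthetic stand-ins for unimodal and bimodal dipole components.

```python
import math
from statistics import NormalDist

def lilliefors_statistic(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    statistic = 0.0
    for rank, x in enumerate(sorted(values)):
        cdf = 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
        # Compare against the empirical CDF just after and just before this point.
        statistic = max(statistic, (rank + 1) / n - cdf, cdf - rank / n)
    return statistic

def approx_critical_value(n):
    """Assumed large-sample 5% critical value for the Lilliefors test."""
    return 0.886 / math.sqrt(n)

# Synthetic samples (not simulation data): evenly spaced normal quantiles,
# and a two-peaked sample built from two narrow clusters at -3 and +3.
std_normal = NormalDist()
gaussian_like = [std_normal.inv_cdf((i + 0.5) / 200) for i in range(200)]
half = [std_normal.inv_cdf((i + 0.5) / 100) for i in range(100)]
bimodal = [0.5 * q - 3.0 for q in half] + [0.5 * q + 3.0 for q in half]

threshold = approx_critical_value(200)
d_gaussian = lilliefors_statistic(gaussian_like)
d_bimodal = lilliefors_statistic(bimodal)
print(d_gaussian, threshold, d_bimodal)
```

In this sketch the unimodal sample falls below the threshold while the two-peaked sample lies far above it, mirroring the separation between normal and non-normal dipole distributions marked by the dotted line in Figure 2.1A.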
These multiple minima correspond to different metastable configurations of the amino acid side chain (Figure 2.1A inset); however, as mentioned above, linear response does not require the fluctuation distribution to be Gaussian.

2.2.2 Collective Dipole Fluctuation Modes

If the motion of each dipole were uncorrelated with that of the other dipoles, then the eigenbasis of fluctuations would be the principal axes of the individual dipoles. The actual eigenbasis taken from all-atom molecular dynamics simulations can be projected onto this individual-dipole basis to determine the degree of coupling between dipoles.

In the individual-dipole basis $|\phi\rangle$, define a fluctuation eigenvector $|\psi\rangle = \sum_\phi a_\phi|\phi\rangle$, where by normalization $\sum_\phi |a_\phi|^2 = 1$. Each squared modulus $|a_\phi|^2$ can be interpreted as the probabilistic weight $p_\phi$ of the eigenvector $|\psi\rangle$ in the basis vector $|\phi\rangle$. Borrowing from spin-glass theory the concept of the average cluster size of spin glass states [232], we let

$$M = \frac{1}{3}\left(\sum_{\phi=1}^{3N} p_\phi^2\right)^{-1} \tag{2.6}$$

denote the degree of coupling between individual-dipole fluctuation modes. When a mode is uncoupled, it has a weight distribution given by a Kronecker delta and so $M = 1/3$. If a mode is fully coupled to all individual-dipole modes, $p_\phi = 1/3n$ and so $M = n$, the total number of dipoles. Figure 2.1B plots the distribution of $M$ for ubiquitin. The number of dipoles participating in each mode varies from 1 to 20, with 70% of modes containing fewer than 10 dipoles. Thus dipole motion exhibits coordination between moderately sized groups of neighboring dipoles, and only relatively few dipoles move independently of other dipoles; these independent dipoles tend to be less sterically constrained and reside on the protein surface, as seen in the inset of Figure 2.1B. It is therefore important to consider collective dipole modes in proteins to arrive at an accurate response relation.

Figure 2.1: Dipole fluctuation statistics. A) The distribution of Lilliefors test statistics for ubiquitin dipoles. Values greater than the dotted line indicate non-normal distributions with 95% confidence. Representative normal and non-normal dipole distributions for the side chains of Y59 and Q31 respectively, along with several molecular configurations, are shown in the insets. The values of the Lilliefors test statistic for these side chains are also indicated. B) The distribution of the spin-glass parameter $M$ (a measure of the effective number of dipoles involved in each fluctuation mode, see text) for the dipole fluctuation modes in ubiquitin. Inset images show the residues involved in localized and collective modes. Residues are color-coded so as to indicate whether their motions are correlated (same color) or anticorrelated (different colors). [Axes: A) number of dipole modes vs. Lilliefors test statistic; B) number of dipole modes vs. $M$, the number of dipoles involved in each mode.]

2.2.3 Linear Response Relation for Induced Moments

We calculate the effective local dielectric constant at a point in a protein by considering the equivalence between microscopic and macroscopic descriptions of the electric response of nearby media, as depicted schematically in Figure 2.2. In the microscopic description, we imagine the matter within a region of radius $a$ of this point to have a dipole moment $\mathbf{m}$ and tensor polarizability $\hat{\boldsymbol{\alpha}}$, placed in a cavity of the same radius within an environment consisting of various scalar dielectric constants, accounting for the response of the water and/or protein surrounding the cavity and represented generically as $\varepsilon_A, \varepsilon_B, \varepsilon_C, \ldots$ in Figure 2.2. The average dielectric of the regions surrounding the cavity is $\varepsilon_1$. In the macroscopic description, this cavity is instead filled with a dielectric medium of permittivity $\boldsymbol{\varepsilon}_2$, again surrounded by the dielectric $\varepsilon_1$ (see Figure 2.2). Following the approach taken by Voges and Karshikoff [220], we solve for $\boldsymbol{\varepsilon}_2$ in terms of $\varepsilon_1$, $\mathbf{m}$, $\hat{\boldsymbol{\alpha}}$, and $a$.

Figure 2.2: Schematic representation of the approach used to calculate the dielectric constant $\boldsymbol{\varepsilon}_2$. In the microscopic view, a cavity of radius $a$ containing media with induced dipole moment $\mathbf{m}$ and tensor polarizability $\hat{\boldsymbol{\alpha}}$ is surrounded by a heterogeneous dielectric of various permittivities (in this particular case $\varepsilon_A \ldots \varepsilon_H$; the number of different dielectric regions will depend on the sphere size and lattice point spacing). In the macroscopic view, this cavity is instead filled with an anisotropic dielectric of tensor permittivity $\boldsymbol{\varepsilon}_2$ surrounded by an effective homogeneous isotropic dielectric $\varepsilon_1$. In both views, an arbitrary external electric field $\mathbf{E}_{ext}$ is applied.

In practice, we discretize the space in and around the protein into a lattice with spacing $b$, with $b$ typically about 1 Å. For each lattice point at $\mathbf{r}$ we consider a spherical cavity centered at $\mathbf{r}$ of radius $a$, with $a$ typically a few Å. The cavity may contain parts of several dipoles, and has inside it a local field $\mathbf{E}(\mathbf{r}|a)$ due to both the external field and the system's response. All dipoles in a given cavity are taken to experience the same total field. We take the contribution of each dipole to the induced cavity dipole moment to depend on the volume fraction of the backbone or side chain containing the dipole that is within the cavity. Let $f_A(\mathbf{r}|a)$ be the volume fraction of residue $A$ inside the cavity centered at position $\mathbf{r}$, given that the cavity has radius $a$. To obtain the static dielectric response, $f_A(\mathbf{r}|a)$ should be the time-averaged fraction of $A$ in the cavity.
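As an illustration of the cavity weighting just described, the sketch below estimates a time-averaged $f_A(\mathbf{r}|a)$ for one cavity. It is a simplification under stated assumptions: the volume fraction is approximated by the fraction of a group's atoms whose centers fall inside the sphere, and the function name and array layout are hypothetical rather than taken from the program used in this work.

```python
import numpy as np


def cavity_fractions(frames, atom_group, n_groups, center, a):
    """Time-averaged fraction of each group's atoms inside a spherical cavity.

    frames     : (T, n_atoms, 3) atomic coordinates for T snapshots
    atom_group : (n_atoms,) index of the backbone or side chain owning each atom
    n_groups   : number of backbone + side-chain groups
    center, a  : cavity center (3,) and cavity radius

    Assumption: the atom-count fraction is used as a proxy for volume fraction.
    """
    frames = np.asarray(frames, dtype=float)
    group = np.asarray(atom_group, dtype=int)
    center = np.asarray(center, dtype=float)

    # Number of atoms belonging to each group.
    total = np.bincount(group, minlength=n_groups).astype(float)

    # For every snapshot, flag the atoms within distance a of the cavity center.
    inside = np.linalg.norm(frames - center, axis=2) <= a      # (T, n_atoms)

    # Time-average the occupancy per atom, then sum the occupancies per group.
    occupancy = inside.mean(axis=0)
    hits = np.bincount(group, weights=occupancy, minlength=n_groups)

    # Groups with no atoms get a fraction of zero.
    return np.divide(hits, total, out=np.zeros(n_groups), where=total > 0)
```

Calling this function once per lattice point gives the weights that enter the volume-weighted sums over residues; a production implementation would more likely use atomic volumes and a spatial neighbor search.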
The $i$th component of the field-induced moment inside the cavity is given by a sum over both residues and components. It is clearest to write the sums separately, rewriting (2.5) as

$$\langle\mu_i^A\rangle = (k_BT)^{-1}\sum_{B=1}^{n}\sum_{j=1}^{3}\langle\mu_i^A\mu_j^B\rangle\,E_j^B$$

for the $i$th component of the dipole of residue $A$. The induced moment of the protein dipoles in the cavity, $\mathbf{m}_p(\mathbf{r}|a)$, is given by the sum of the induced moments of all residues weighted by the fraction of those residues inside the cavity:

$$\mathbf{m}_p(\mathbf{r}|a) = \sum_{A=1}^{N}\langle\mathbf{m}^A(\mathbf{r}|a)\rangle \equiv \sum_{A=1}^{n} f_A(\mathbf{r}|a)\,\langle\boldsymbol{\mu}^A\rangle. \tag{2.7}$$

The cavity that determines the local dielectric may also be near the surface of the protein, where it will contain some number of solvent molecules (usually water). In this case the sum over residues in the cavity includes a contribution due to the water molecules inside it. The number of water molecules $n_w(\mathbf{r}|a)$ in a cavity is determined by taking the available volume in the sphere but outside the protein, and dividing by the average volume of a water molecule at STP.

The electronic polarizability of the media depends on the proportions of protein backbones, side chains, and water in the cavity. Analogously with the permanent dipole response, the total electronic polarizability in the cavity is weighted by volume fractions:

$$\alpha(\mathbf{r}|a) = \sum_{A=1}^{N} f_A(\mathbf{r}|a)\,\alpha_A + n_w(\mathbf{r}|a)\,\alpha_w. \tag{2.8}$$

Scalar values of the electronic polarizability $\alpha$ for each residue are taken from the literature [233]; they cannot be obtained directly from conventional classical MD simulation, because atomic partial charges are fixed by the CHARMM22 parameter set.

We take Kirkwood's analysis as a starting point to determine the contribution of water to the cavity's dipole moment, in which the induced moment due to permanent dipole reorientation is given by

$$\mathbf{m}_w(\mathbf{r}|a) = n_w(\mathbf{r}|a)\,\frac{g\,p^2}{3kT}\,\mathbf{E}_e. \tag{2.9}$$

In this equation, $p$ is the permanent dipole moment of water and $\mathbf{E}_e$ is the local effective field orienting the molecules.
The constant $g$ arises from the Kirkwood-Oster nearest-neighbor approximation of the term in parentheses on the right-hand side of Equation (2.2). It has been calculated previously for water and found to be 2.67 [209]. Constrained motion of water molecules, relative to that in bulk [234], has been observed at the surfaces of proteins [235, 236]. The heterogeneity of hydrogen bonding between protein and water has been studied by Bagchi and co-workers [235], with particularly long lifetimes observed near positively charged residues, as well as reduced hydration layer rigidity near functionally relevant sites on a villin headpiece subdomain. Such constrained and correlated motions may effectively increase the local dielectric of water near the protein surface if the local moments are positively correlated, and several examples of this effect are discussed below. A detailed investigation of water structure at the protein surface and its consequences for the dielectric is a topic of future work. Here we make the simplifying assumption that water dipole correlations are essentially the same as those occurring in bulk.

In what follows, a relationship analogous to Equation (2.9) is derived with the refinement that the local electric field be reinterpreted as a field proportional to the cavity field. In a cavity containing protein and water, the local field $\mathbf{E}_e$ experienced by the water and protein dipoles consists of a cavity field $\mathbf{G}$ due to the externally applied perturbing field $\mathbf{E}_{ext}$ and a reaction field $\mathbf{R}$ due to the response of the medium outside the cavity (with an assumed average scalar dielectric $\varepsilon_1$) to the induced dipole within the cavity:

$$\mathbf{E}_e = \mathbf{G} + \mathbf{R} = \frac{3\varepsilon_1}{2\varepsilon_1+1}\,\mathbf{E}_{ext} + F\,(\alpha\mathbf{E}_e + \mathbf{m}). \tag{2.10}$$

Here, $F = 2(\varepsilon_1-1)/((2\varepsilon_1+1)a^3)$, $\alpha$ is the total electronic polarizability of the cavity from Equation (2.8), and $\mathbf{m} = \mathbf{m}_p + \mathbf{m}_w$ is the total dipole moment due to the positions of atomic nuclei inside the cavity.
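To give a feel for the magnitudes in Equations (2.9) and (2.10), the following sketch evaluates the water count, the Kirkwood reorientation term $n_w g p^2/3k_BT$, the reaction field factor $F$, and the cavity field prefactor $3\varepsilon_1/(2\varepsilon_1+1)$. The numerical inputs are assumptions chosen for illustration only: a water dipole of 1.85 D (the gas-phase value; water models used in simulation carry a larger moment), a molecular volume of about 30 Å$^3$, and Gaussian (CGS) units so that polarizabilities come out in Å$^3$. The function names are hypothetical.

```python
import math

K_B = 1.380649e-16   # Boltzmann constant in erg/K (CGS)
DEBYE = 1.0e-18      # 1 debye in esu*cm
CM3_TO_A3 = 1.0e24   # cubic centimetres to cubic angstroms


def water_count(free_volume_A3, v_water_A3=30.0):
    """n_w: solvent-accessible cavity volume divided by an assumed molecular volume."""
    return free_volume_A3 / v_water_A3


def kirkwood_water_polarizability(n_w, p_debye=1.85, g=2.67, T=298.0):
    """n_w * g * p^2 / (3 k_B T), returned in cubic angstroms (Equation 2.9 prefactor)."""
    p = p_debye * DEBYE
    return n_w * g * p * p / (3.0 * K_B * T) * CM3_TO_A3


def reaction_field_factor(eps1, a):
    """F = 2(eps1 - 1) / ((2 eps1 + 1) a^3), with a in angstroms."""
    return 2.0 * (eps1 - 1.0) / ((2.0 * eps1 + 1.0) * a ** 3)


def cavity_field_factor(eps1):
    """Prefactor relating the cavity field G to the external field E_ext."""
    return 3.0 * eps1 / (2.0 * eps1 + 1.0)


if __name__ == "__main__":
    a = 3.5                                   # cavity radius in angstroms (assumed)
    n_w = water_count(4.0 / 3.0 * math.pi * a ** 3)
    print("n_w                 :", n_w)
    print("n_w g p^2 / 3kT (A^3):", kirkwood_water_polarizability(n_w))
    print("F (A^-3), eps1 = 78 :", reaction_field_factor(78.0, a))
    print("G / E_ext, eps1 = 78:", cavity_field_factor(78.0))
```

With these assumed inputs the reorientation term is several tens of Å$^3$ per water molecule, far larger than typical electronic polarizabilities, which is why water-filled cavities respond so strongly.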
Solving for $\mathbf{E}_e$,

$$\mathbf{E}_e = \frac{1}{1-\alpha F}\,(\mathbf{G} + F\mathbf{m}) \equiv \gamma\,(\mathbf{G} + F\mathbf{m}), \tag{2.11}$$

where for convenience $\gamma \equiv 1/(1-\alpha F)$.

The total potential energy of the protein dipole component in the cavity is the sum of the electric potential energy $-\mathbf{m}_p\cdot\mathbf{E}_e$ and the steric potential energy $U_s(\boldsymbol{\mu}_p) = \tfrac{1}{2}K^{AB}_{ij}\mu_i^A\mu_j^B$, where the implied sums on $A$ and $B$ run from 1 to $N$ (the number of backbone and side-chain moieties) and the sums on $i$ and $j$ run from 1 to 3. The steric potential constants $K^{AB}_{ij}$ taken from simulation implicitly include the protein dipole self-interaction term $\gamma F f_A f_B \mu_i^A\mu_i^B$ due to the effect of the protein dipole reaction field on the moment $\mathbf{m}_p$ itself. Using Equations (2.3), (2.7), and (2.11),

$$U(\mathbf{m}_p) = \frac{1}{2}K^{AB}_{ij}\mu_i^A\mu_j^B - \gamma F f_A\,\mu_i^A m_i^w - \gamma f_A\,\mu_i^A G_i, \tag{2.12}$$

where the cavity field is applied only to dipoles in the cavity, resulting in the prefactor $f_A$ in the third term of Equation (2.12). The total steric potential energy must be included to properly account for the statistics of dipoles inside the cavity. Thus Equation (2.12) can be thought of as a hybrid potential energy.

The potential energy of the water, on the other hand, is determined only by the total effective field, as there are assumed to be no internal steric constraints on its motion (except as embodied in the Kirkwood $g$-factor in Equation (2.9)). This approximation can be refined by explicitly considering the simulated statistics of water molecules at the protein surface. The water dipole self-interaction term $\gamma F m_i^w m_i^w$ is zero in the case of isotropic polarizability $\alpha$, since the reaction field produced by the water dipole is parallel to the dipole itself and therefore cannot apply a torque to it. We neglect effects of the distensibility in magnitude of the nuclear part of the water dipole moment. The water dipole potential energy is

$$U(\mathbf{m}_w) = -\gamma F f_A\,\mu_i^A m_i^w - \gamma\, m_i^w G_i. \tag{2.13}$$

Thus the potential energy of the water and protein dipoles is a sum of terms bilinear in the water and protein dipoles and linear in the effective cavity field $\gamma\mathbf{G}$. This has the form of the problem solved above in Equation (2.5), so the $i$th components of the field-induced protein and water moments in the cavity are

$$\langle m_i^A\rangle = \sum_{j=1}^{3}\left(\sum_{B=1}^{N}\langle m_i^A m_j^B\rangle_0 + \langle m_i^A m_j^w\rangle_0\right)\frac{\gamma G_j}{k_BT} \tag{2.14}$$

$$\langle m_i^w\rangle = \sum_{j=1}^{3}\left(\sum_{B=1}^{N}\langle m_i^w m_j^B\rangle_0 + \langle m_i^w m_j^w\rangle_0\right)\frac{\gamma G_j}{k_BT}, \tag{2.15}$$

where the time average $\langle\cdots\rangle_0$ is taken in the absence of an external perturbing field. We have used the generality of the potential elaborated in the comments below Equation (2.5). The dipole polarizability to the cavity field is a property only of correlations within the system itself. Only mutual reaction fields influence the motion of the dipoles in this case. The correlation functions involving water in Equations (2.14) and (2.15) can be evaluated by direct integration. The integration for protein dipoles is over all space, while it is confined to a sphere of radius $p$ for the water dipoles, which can reorient but are fixed in magnitude. The potential energies in Equations (2.12) and (2.13) appear in a Boltzmann factor with the cavity field $\mathbf{G} = 0$; the energies in those Boltzmann factors not containing $K_{ij}$ are small compared to $k_BT$, so these factors may be linearized to give

$$\langle m_i^A\rangle = \sum_{B=1}^{N}\sum_{j=1}^{3} f_A f_B\left(1 + F\,\frac{n_w g p^2}{3k_BT}\right)\frac{\langle\mu_i^A\mu_j^B\rangle_0}{k_BT}\,\gamma G_j \tag{2.16}$$

$$\langle m_i^w\rangle = \sum_{j=1}^{3}\frac{n_w g p^2}{3k_BT}\left(\delta_{ij} + \sum_{A,B=1}^{N} F f_A f_B\,\frac{\langle\mu_i^A\mu_j^B\rangle_0}{k_BT}\right)\gamma G_j \tag{2.17}$$
Equations (2.16) and (2.17) can be combined and written generally as a matrix equation, with a nuclear polarizability tensor $\hat{\boldsymbol{\alpha}}$ relating the effective field $\gamma\mathbf{G}$ and the induced moment $\mathbf{m} = \mathbf{m}_w + \mathbf{m}_p$:

$$\mathbf{m}(\mathbf{r}|a) = \hat{\boldsymbol{\alpha}}(\mathbf{r}|a)\cdot\gamma(\mathbf{r}|a)\,\mathbf{G}(\mathbf{r}|a), \tag{2.18}$$

with components

$$\hat{\alpha}^{AB}_{ij}(\mathbf{r}|a) = f_A f_B\left(1 + 2F\,\frac{n_w g p^2}{3k_BT}\right)\frac{\langle\mu_i^A\mu_j^B\rangle_0}{k_BT} + \delta_{ij}\,\frac{n_w g p^2}{3k_BT}. \tag{2.19}$$

The quantities $\langle\mu_i^A\mu_j^B\rangle_0$ may be obtained directly from MD simulations of the protein in the absence of an external field, as described below.

2.3 The Dielectric Constant at an Arbitrary Point

With the response of permanent dipoles and polarizable media in a cavity now established, we calculate the dielectric permittivity tensor at a location $\mathbf{r}$ by following the recipe outlined in Figure 2.2. In a microscopic description, a set of polarizable constituents with induced and permanent dipole moments exists in a cavity of a heterogeneous dielectric medium with various scalar dielectric constants. The present theory approximates the medium external to the cavity by a single scalar dielectric $\varepsilon_1$ equal to the average of the neighboring effective scalar dielectrics over the surface of the cavity ($\varepsilon_A, \varepsilon_B, \varepsilon_C, \ldots$ in Figure 2.2). The medium inside the cavity is assigned a single tensor dielectric $\boldsymbol{\varepsilon}_2$ because the polarizability in the cavity is a tensor. As described below, we take the effective scalar value of the cavity's dielectric to be the geometric mean of its principal components.

The total field $\mathbf{E}_{in}$ inside the cavity at a position $\mathbf{r}$ is the superposition of the cavity field, the reaction field, and the permanent and induced dipole fields from the water and protein [220], so

$$\mathbf{E}_{in} = \mathbf{G} + F\,(\alpha\mathbf{E}_e + \langle\mathbf{m}\rangle) - \nabla\left(\frac{\alpha\mathbf{E}_e\cdot\mathbf{r}}{r^3} + \frac{\langle\mathbf{m}\rangle\cdot\mathbf{r}}{r^3}\right). \tag{2.20}$$

On substituting for $\langle\mathbf{m}\rangle$ from (2.18) and $\mathbf{E}_e$ from (2.11), the potential inside the cavity is

$$\Phi_{in} = \left[(1+\gamma F\alpha)(\mathbf{I}+\gamma F\hat{\boldsymbol{\alpha}})\,\mathbf{G}\right]\cdot\mathbf{r} + \left[(\gamma\hat{\boldsymbol{\alpha}} + F\gamma\hat{\boldsymbol{\alpha}}\alpha\gamma + \alpha\gamma)\,\mathbf{G}\right]\cdot\frac{\mathbf{r}}{r^3}. \tag{2.21}$$

The potential outside the cavity is formed by the superposition of the potentials from the external field (taken to be uniform), the field due to the cavity in the dielectric, and the field due to the dipole in the cavity:

$$\Phi_{out} = -\mathbf{E}_{ext}\cdot\mathbf{r} + a^3\,\frac{\mathbf{r}}{r^3}\cdot\mathbf{W}\,\mathbf{E}_{ext}, \tag{2.22}$$

where $\mathbf{W}$ is defined by

$$\mathbf{W} \equiv \frac{9\varepsilon_1}{(2\varepsilon_1+1)^2 a^3}\,(\gamma\hat{\boldsymbol{\alpha}} + F\gamma\hat{\boldsymbol{\alpha}}\alpha\gamma + \alpha\gamma) - \frac{\varepsilon_1-1}{2\varepsilon_1+1}. \tag{2.23}$$

Shifting to the equivalent macroscopic description of the system as a dielectric with permittivity tensor $\boldsymbol{\varepsilon}_2$ surrounded by a dielectric with scalar permittivity $\varepsilon_1$, the potential in the surrounding dielectric is found to be [237]

$$\Phi_{out} = -\mathbf{E}_{ext}\cdot\mathbf{r} + a^3\left(2\mathbf{I} + \varepsilon_1^{-1}\boldsymbol{\varepsilon}_2\right)^{-1}\left(\varepsilon_1^{-1}\boldsymbol{\varepsilon}_2 - \mathbf{I}\right)\mathbf{E}_{ext}\cdot\frac{\mathbf{r}}{r^3}. \tag{2.24}$$

Equating the microscopic and macroscopic expressions for the potential outside the cavity and solving for $\boldsymbol{\varepsilon}_2$,

$$\boldsymbol{\varepsilon}_2 = \varepsilon_1\,(2\mathbf{W}+\mathbf{I})(\mathbf{I}-\mathbf{W})^{-1}. \tag{2.25}$$

In the case where $\mathbf{W}$ is a scalar and the dipole response in Equation (2.18) is approximated by freely rotating Langevin dipoles (a severe approximation due to steric constraints in the protein, as mentioned above), Equation (2.25) reduces to that in the theory of Voges and Karshikoff [220].

2.4 Implementation in a Protein System

To calculate the local dielectric constant according to the approach described here, the first step is to obtain the matrix of correlations $\langle\mu_i^A\mu_j^B\rangle_0$ appearing in Equation (2.19), which describes coupled fluctuations between the dipoles in the native state. This is done with an all-atom classical MD simulation of the various proteins at 298 K using the CHARMM22 force field [229, 230, 231], with particle-mesh Ewald electrostatics and a Lennard-Jones cut-off distance of 13.5 Å. Proteins are solvated in a box of explicit water molecules that exceeds the dimensions of the native protein by 10 Å on all sides and has periodic boundary conditions. Basic residues (Lys and Arg) are protonated, acidic residues (Asp and Glu) are deprotonated, and histidines are neutrally charged to reflect ionization conditions at pH 7.
Na+ and Cl− ions are added to the solvent to achieve overall system charge neutrality and an ionic strength of 150 mM. The simulation time step is 2 fs, and snapshots are taken every 1 ps for ensemble averaging. As seen in Figure 2.3B, a 1 ps interval is sufficient for dipole positions to decorrelate from those in the previous snapshot. The total simulation time required for convergence to reliable values was typically 1 ns; longer simulations did not appreciably change the distribution of correlations (Figure 2.3C). All simulations for proteins used in ion pair energy calculations were run for 2 ns.

Figure 2.3: Dipole correlations and convergence in simulation. A) Correlation map for dipoles in ubiquitin. The area of the circle at each coordinate $(i, j)$ indicates the magnitude of the correlation function $\langle\mu_i\mu_j\rangle$ averaged over snapshots of the system, for all pairs of dipole components $\{\mu_i, \mu_j\}$ in ubiquitin. The numbering of the components runs from the N- to the C-terminus and accounts for the x-, y-, and z-components of each backbone and side chain dipole. Blue indicates $\langle\mu_i\mu_j\rangle > 0$; red indicates $\langle\mu_i\mu_j\rangle < 0$ (anticorrelation). B) The correlation coefficient between dipole products, $\mathrm{corr}\left(\mu_i(0)\mu_j(0),\,\mu_i(t)\mu_j(t)\right)$, as a function of the time $t$ between frames from the simulation. The dipole motion decorrelates with a time constant of < 0.5 ps. C) The 1st, 2nd, 3rd, and 4th moments of the distribution of dipole correlation values for a 10 ns simulation of ubiquitin. The dipole correlations converge to a stable distribution after a total simulation length of 1 ns. [Axes: A) dipole component vs. dipole component; B) correlation vs. simulation time (ps); C) distribution moment (relative scale) vs. simulation time (ns).]
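The ensemble averaging described above can be sketched as follows. This is an illustrative outline rather than the code used in this work: it assumes the dipole components of every snapshot have already been collected into an array of shape (frames, 3N), it subtracts the time-averaged dipole so that the matrix describes fluctuations (an assumption about how the averages are taken), and it tracks raw moments of the distribution of matrix entries as the trajectory is lengthened, in the spirit of Figure 2.3C. The function names are hypothetical.

```python
import numpy as np


def dipole_correlations(mu_frames):
    """Fluctuation correlation matrix <mu_i mu_j>_0 from snapshots.

    mu_frames : (T, 3N) array, one row of dipole components per snapshot.
    Assumption: the time-averaged dipole is removed before averaging products.
    """
    mu = np.asarray(mu_frames, dtype=float)
    dmu = mu - mu.mean(axis=0)
    return dmu.T @ dmu / mu.shape[0]


def correlation_moments(mu_frames, lengths):
    """1st to 4th raw moments of the correlation entries for growing trajectory lengths."""
    mu = np.asarray(mu_frames, dtype=float)
    rows = []
    for n in lengths:
        c = dipole_correlations(mu[:n]).ravel()
        rows.append([np.mean(c ** k) for k in (1, 2, 3, 4)])
    return np.array(rows)


if __name__ == "__main__":
    # Synthetic stand-in for a trajectory: 2000 snapshots of 2 dipoles (6 components).
    rng = np.random.default_rng(1)
    frames = rng.standard_normal((2000, 6))
    print(correlation_moments(frames, [250, 500, 1000, 2000]))
```

Convergence would be judged by the rows of the returned array ceasing to change as the trajectory length grows.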
The dipole moment of each side chain and backbone is calculated from the partial charges assigned to atoms in the CHARMM force field, with distances to atoms measured from the center of mass of the set of atoms.

As one would expect, components of the same dipole exhibit significant autocorrelation (Figure 2.3A diagonal), but there are also significant cross-correlations between dipoles that are distant in sequence but spatially close in the native protein structure. A band of large fluctuations is typically observed at the N- and C-termini of the protein due to their high flexibility.

After the MD simulation of dipole fluctuations is complete, the protein is centered in a rectangular box with dimensions that exceed the minimum and maximum x, y, and z coordinates of the protein on all sides by at least the cavity radius $a$. The dielectric constant is calculated at each lattice point within this box (generally with a 1 Å spacing). The effective surrounding dielectric constant $\varepsilon_1$ used for each point is determined by averaging the scalar dielectric constant of all points in a shell within 0.5 Å of the cavity boundary. An iterative solution is necessary, since the dielectric constant at one point depends on the dielectric constant at surrounding points. An approximate function of $\varepsilon = 10$ inside the protein and $\varepsilon = 78$ outside was used as an initial dielectric function for the first iteration, and the system was then iteratively relaxed until the spatially varying dielectric function converged, typically after fewer than 20 iterations. We found that the final values of the dielectric function were independent of the choice of initial conditions.

The dielectric calculation program is implemented in Tcl and MATLAB (The MathWorks, Natick, MA).
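The iterative relaxation described above can be outlined in a few lines. The sketch below is written in Python for illustration and is not the Tcl/MATLAB program itself. The callback `local_eps` is a hypothetical stand-in for the per-point evaluation of Equations (2.19) to (2.25) followed by reduction to a scalar; the shell width, tolerance, and all names are assumptions. The all-pairs distance matrix is only practical for small lattices.

```python
import numpy as np


def relax_dielectric(points, inside, local_eps, a, shell=0.5, tol=1e-3, max_iter=50):
    """Self-consistently relax a scalar dielectric map on a lattice.

    points    : (n, 3) lattice coordinates
    inside    : (n,) boolean mask, True where the lattice point lies in the protein
    local_eps : callable (point_index, eps1) -> scalar dielectric at that point
                (hypothetical stand-in for the cavity calculation)
    a         : cavity radius; the surrounding shell is |d - a| <= shell
    Returns the converged map and the number of iterations used.
    """
    points = np.asarray(points, dtype=float)
    eps = np.where(np.asarray(inside, dtype=bool), 10.0, 78.0)   # initial guess

    # Precompute which lattice points sit in the shell around each cavity.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    shell_mask = np.abs(d - a) <= shell
    counts = shell_mask.sum(axis=1)

    for iteration in range(1, max_iter + 1):
        # eps1 at each point: mean of the current map over its shell
        # (falls back to the point's own value if the shell is empty).
        shell_sum = (shell_mask * eps).sum(axis=1)
        eps1 = np.where(counts > 0, shell_sum / np.maximum(counts, 1), eps)

        new = np.array([local_eps(k, e1) for k, e1 in enumerate(eps1)], dtype=float)
        if np.max(np.abs(new - eps)) < tol:
            return new, iteration
        eps = new
    return eps, max_iter
```

Because each sweep costs one cavity evaluation per lattice point, the run time of such a loop scales with the number of lattice points times the number of iterations.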
For a protein of length 100 residues, the initial simulation to obtain the dipole correlation matrix takes roughly 12 hours on an 8-core 2.5 GHz Intel Xeon workstation, while the subsequent calculation of the dielectric constant takes less than 30 minutes running on one core with a lattice point spacing of 1 Å. The time needed to calculate the dielectric function varies linearly with the number of lattice points.

2.5 Dielectric Anisotropy

As demonstrated above, to properly capture the behavior of a protein the local dielectric constant must be a tensor. However, for many practical applications, it is desirable to have an equivalent scalar dielectric constant that replicates the behavior of the tensor as well as possible. The best choice of method for converting the tensor $\boldsymbol{\varepsilon}_2$ to the scalar $\varepsilon$ may depend on the situation, but as shown by Mele [238], the transmission of free charge fields into an anisotropic dielectric depends on the geometric mean of the dielectric constants in each direction. That is, if $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of $\boldsymbol{\varepsilon}_2$, we define the equivalent scalar $\varepsilon$ to be $\varepsilon = (\lambda_1\lambda_2\lambda_3)^{1/3}$. (It is worth noting that if all three eigenvalues are equal, their harmonic, geometric, and arithmetic means are the same. Even if one eigenvalue exceeds the other two by 50%, among the most pronounced anisotropy we have seen in our dielectric calculations for 25 proteins, the difference between the harmonic, geometric, and arithmetic means is still less than 5%. Thus the choice of approach for averaging the principal components of the dielectric tensor to produce a scalar does not significantly affect the quantitative result.)

2.6 Spatial Variation in Protein Dielectric Response

An example of a dielectric map for adenylate kinase (PDB 1AKY) is shown in Figure 2.4. Panel A depicts the scalar dielectric function as a surface plot for a slice through the protein, while panel B shows the scalar dielectric function as a 3D isosurface plot.
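The tensor-to-scalar reduction of Section 2.5, together with Equation (2.25), can be checked numerically with a short sketch. It assumes that a 3×3 matrix $\mathbf{W}$ has already been evaluated from Equation (2.23); the function names are hypothetical, and the tensor is symmetrized before diagonalization as a numerical safeguard.

```python
import numpy as np


def eps2_from_W(W, eps1):
    """Equation (2.25): eps2 = eps1 (2W + I)(I - W)^-1 for a 3x3 matrix W."""
    W = np.asarray(W, dtype=float)
    I = np.eye(3)
    return eps1 * (2.0 * W + I) @ np.linalg.inv(I - W)


def scalar_means(eps_tensor):
    """Arithmetic, geometric, and harmonic means of the principal dielectric values."""
    eps_tensor = np.asarray(eps_tensor, dtype=float)
    sym = 0.5 * (eps_tensor + eps_tensor.T)      # guard against round-off asymmetry
    lam = np.linalg.eigvalsh(sym)
    arithmetic = lam.mean()
    geometric = np.prod(lam) ** (1.0 / 3.0)
    harmonic = 3.0 / np.sum(1.0 / lam)
    return arithmetic, geometric, harmonic


if __name__ == "__main__":
    # One eigenvalue 50% larger than the other two, as in the case discussed in Section 2.5.
    arithmetic, geometric, harmonic = scalar_means(np.diag([1.0, 1.0, 1.5]))
    print(arithmetic, geometric, harmonic)
    print("relative spread:", (arithmetic - harmonic) / geometric)
```

For eigenvalues in the ratio 1 : 1 : 1.5 the three means lie within about 4% of each other, in line with the bound quoted in Section 2.5.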
Panel C depicts the regions of anisotropy in the dielectric function, which tend to be localized around the surface of the protein.

Figure 2.4: The spatially varying dielectric function for adenylate kinase (1AKY). A) The effective scalar dielectric constant on a horizontal plane through the geometric center of the protein. B) Dielectric contours around the 1AKY structure, showing surfaces of $\varepsilon$ = 5, 25, 70, and 80. Regions inside the blue globules have dielectric constants larger than that of water. C) Representation of the anisotropic dielectric constant $\boldsymbol{\varepsilon}_2$. The orientation of each ellipsoid is given by the eigenbasis of the dielectric tensor at that point; the lengths of the semi-axes are directly proportional to the eigenvalues of the tensor. Only ellipsoids with a difference between eigenvalues of >25% are shown. [Axes, panel A: $\varepsilon$ vs. X coordinate (Å) and Y coordinate (Å).]

A notable feature of the heterogeneous protein dielectric theory is the presence of regions with relative permittivity comparable to or exceeding that of water on the surface of the protein, as can be seen in Figure 2.4B. The solvation energy of a charged group varies inversely with the solvent dielectric constant, so the presence of these regions lowers the potential energy of protein surface charges and enhances protein stability. They arise from the presence of charged or polar groups with large dipole moments on the protein surface that can fluctuate extensively, as there are fewer native steric constraints restricting their motion. This effect is particularly pronounced for proteins and peptides having significant numbers of polarizable residues, such as the poly-Q helix shown in Figure 2.5, of interest in CAG triplet expansion diseases such as Huntington's disease [239].
This peptide shows large regions of dielectric higher than that of water distributed over its surface.

Large values of the dielectric constant approaching that of the solvent have been observed on the surface of proteins [216, 214], and values of the effective dielectric constant approaching 150 have been seen for salt bridges on the surface of barnase [241, 242]. Dielectric values much greater than that of water have also been observed just outside the charged head groups of lipid bilayers [243, 244]. Having the surface of the protein surrounded by this region of high dielectric constant would attenuate the projection of electric fields from solution into the protein and vice versa, potentially reducing electrostatic attractions or repulsions between nearby proteins. This reduces the effects of generic electrostatically driven interactions which may be involved in protein aggregation, potentially allowing for higher intracellular protein concentration. A variable dielectric profile on the protein surface may also allow for optimization of specific binding strategies for ligands or protein complexes.

Figure 2.5: Dielectric function for a hypothetical polyglutamine α-helix, as generated by the I-TASSER structure prediction server [240]. Values of the dielectric in the helix are color-coded as indicated in the figure (color scale from $\varepsilon$ = 1 to 100). Because of the large polarizability of the amino acids comprising this peptide, there exist large spatial regions of high dielectric constant at the surface of the helix. Iso-dielectric surfaces for $\varepsilon$ = 78 are shown, so that within these regions the dielectric is larger than that of water.

Additionally, the dielectric function calculated by the present theory varies from over 100 in places on the protein surface to as low as 2 in the hydrophobic protein interior (see Figures 2.4 and 2.5).
Other studies have reported values in this range for the effective dielectric constant of proteins: $\varepsilon \approx 2$ for PARSE parameter sets [245], $\varepsilon \approx 4$ from bulk measurements of anhydrous protein on application of Kirkwood-Fröhlich theory to an idealized protein [246], $\varepsilon$ = 2 to 8.9 by site-dependent thermodynamic integration/molecular dynamics studies [247], and $\varepsilon \approx 20$ for best agreement with the experimental pKa values of titratable groups in proteins [248]. The wide range of measured and calculated dielectric constants in previous studies reaffirms the considerable heterogeneity of protein dielectric response and may arise from differences in the local environment where the processes investigated take place.

The timescale for relaxation processes is also important. Nuclear polarization due to protein dipole relaxation, which dominates the dielectric response on the protein surface, happens on a timescale of several picoseconds (see Figure 2.3B); electronic polarization due to electron motion, which occurs throughout the molecule, happens much faster. Processes occurring on a timescale longer than several picoseconds would therefore experience both nuclear and electronic polarization effects, while processes on shorter timescales would experience only electronic polarization effects and a consequently lower value for the effective dielectric constant. Frequency-dependent relaxation plays a role in the electrostatics of enzyme catalysis [249, 250]; similarly, a frequency-dependent friction coefficient has been seen to strongly affect reaction rates or transition states in diverse systems ranging from gas-phase and condensed-phase reactions [251, 252, 253] to protein folding [254, 255]. A frequency-dependent relaxation response could be obtained from the spectrum of normal mode relaxation in Figure 2.1; investigation of these topics is reserved for future studies.

The total nuclear polarizability of all dipoles in the set of proteins used for salt bridge energy calculations was calculated according to Equation (2.19), and is shown in Figure 2.6. For most residues the nuclear polarizability due to dipole motion exceeds (often substantially) the electronic polarizability taken from literature data [233], especially for polar side chains like Asn, Gln, Lys, Arg, Asp, and Glu. Interestingly, there is considerable variability in nuclear polarizability for a given dipole depending on its local environment within a protein, as represented by the error bars in the figure. This means that a dielectric constant for a residue cannot be reliably assigned based only on its identity; it is necessary to explicitly compute the dipole-dipole correlations to understand each dipole's unique response to electric fields.

Figure 2.6: Nuclear and electronic polarizabilities for all residues in the set of 21 proteins studied. Error bars give standard deviations of the nuclear polarizabilities for each residue; thus, for example, some instances of a Lys, Asn, or Gln may have polarizabilities larger than that of water. Residue labels refer to side chains alone, except for Gly, Pro, and Ala, which refer to the whole residue as discussed in the text.

2.7 Dependence on Sphere Size and Lattice Point Spacing

Figure 2.7: The effect of cavity sphere radius $a$ on the calculated dielectric function. Plots are taken on a line through the geometric center of ubiquitin (1UBQ), for $a$ = 2, 4, 6, and 8 Å. [Axes: dielectric constant $\varepsilon$ vs. X coordinate (Å).]

The cavity radius $a$ is an adjustable parameter in this approach. A smaller value of $a$ gives a more local description of the dielectric response of the protein but suffers from the application of a macroscopic description to the atomic-scale behavior within a smaller cavity.
Conversely, a larger value of $a$ may properly capture the effective macroscopic response of a protein region but conceal important shorter-length phenomena. In Figure 2.7, the calculated dielectric function on a line through the middle of ubiquitin is shown for various cavity radii $a$. Based on these observations, we use a cavity radius of 3-4 Å as an optimal length scale to capture both locally averaged behavior and mesoscopic dielectric structure.

The choice of cavity radius determines the spacing of lattice points in the calculation, since it is necessary to have an adequate density of them near the surface of the cavity to accurately reflect the nature of the surrounding dielectric. We have found that once the lattice point spacing is 1/4 of the cavity radius, the dielectric map thus produced has converged, in that it no longer changes with an increasing density of lattice points. We thus choose a lattice spacing of ∼1 Å.

2.8 Averaged Dielectric Properties

Protein N- and C-termini tend to have high flexibility, and for 1AKY the ends also have high net charge, so the dielectric function tends to be larger in these regions as well. To see whether this is a general trend, we plot in Figure 2.8A the dielectric constant averaged over proteins, as a function of sequence index. So that proteins of different lengths may be compared, the index is chosen to start at zero and is normalized by $N-1$, where $N$ is the number of residues. One can see from the plot that the dielectric constant is, on average, larger at the ends of the protein.

To investigate how the dielectric function varies as one moves from a protein's interior to its surface, we plotted the scalar dielectric constant at a distance $r$ from the geometric center of the protein, averaged over the surface of the sphere of radius $r$, i.e. $\langle\varepsilon(r)\rangle \equiv \sum{}'\,\varepsilon(\mathbf{r})/4\pi r^2\Delta r$, where all points in a spherical shell between $r$ and $r+\Delta r$ are summed. A plot of this is shown in Figure 2.8B for the several proteins indicated. To investigate the general trend across proteins, this quantity was then averaged again over a dataset of 21 proteins to obtain a protein-averaged dielectric constant as a function of radius. To compare differently sized proteins, the radius $r$ was normalized by the effective protein radius $r_p$, defined as the radius of a sphere that would have the same radius of gyration $r_G$ as the protein, i.e. $r_p = \sqrt{5/2}\,r_G$. The resulting quantity $\langle\varepsilon(r/r_p)\rangle_{prot}$ is also plotted in Figure 2.8B. It is worth noting that at a given radius $r$ within a given protein, there is significant scatter in the data.

Figure 2.8: Mean properties of the protein dielectric function. A) Average dielectric constant as a function of fractional distance along the protein backbone from a set of 21 proteins². Note the increased permittivity at the N- and C-termini due to their large flexibility. B) Average dielectric constant as a function of the fractional distance from the protein geometric center. [Axes: A) mean dielectric constant at Cα vs. fraction of distance along protein backbone; B) mean dielectric constant vs. distance from geometric center of protein (as a fraction of effective radius), with curves for 1CEX, 1AMM, 1AKZ, 1ARB, 1MLA, 1PTX, 1RIE, 2END, and the average of 21 proteins.]

2.9 Applications to Ion Pair (Salt Bridge) Energies

As shown in Figure 2.9A, the correspondence between the ion pair energies calculated with the heterogeneous scalar dielectric theory and a homogeneous protein dielectric of 8 is best for weak salt bridges with energies around zero. There is, however, significant divergence in the calculated energies of stronger salt bridges, which are likely to play a larger role in the stabilization of native structure.
Using a lower value for a homogeneous protein dielectric would enable a better fit to the stronger salt bridge energies as calculated from the heterogeneous dielectric, but at the expense of overestimating the strength of weaker salt bridges. Best fit between the two data sets is obtained for a protein dielectric of 6.5, but at this value neither the weak nor strong salt bridge strengths in the biphasic dielectric approximation agree well with those obtained from the full heterogeneous dielectric theory. In principle, for each ion pair in a protein there is a choice of homogeneous protein dielectric that would give the same energy as the heterogeneous theory, but this choice of dielectric is not obvious a priori. Thus an advantage of the heterogeneous theory described here is the ability to appropriately describe the local environment of each ion pair and thereby determine its strength without additional assumptions.

Figure 2.9: Salt bridge energies calculated with the heterogeneous dielectric theory. A) Comparison of 370 ion pair (salt bridge) energies from a set of 21 proteins calculated using a homogeneous protein dielectric of 8 (and water dielectric of 78), and the heterogeneous dielectric according to the theory in the text. The superimposed line has equation $E_{sb}^{\mathrm{homogeneous}} = E_{sb}^{\mathrm{heterogeneous}}$. Inset shows the recipe for calculating the ion pair energy by subtracting the left two individual Born transfer energies $E_A - E_0$ and $E_B - E_0$ from the total (solvation + ion pair) charging energy $E_{AB} - E_0$, shown by the rightmost double arrow. B) Comparison of the approximate ion pair energies calculated with an anisotropic dielectric to those calculated with the equivalent scalar dielectric. The error bars give the range of possible ion pair energies depending on the orientation of the dielectric tensor at the midpoint of the ion pair. Inset shows a magnification of the plot for weak ion pairs.
From Figure 2.9A it is apparent that there are two classes of salt bridges, “weak” and “strong”, with strong salt bridges of energy < −30 kJ/mol falling off the line where the salt bridge energies of the biphasic and heterogeneous predictions agree. We investigated whether the weak salt bridges tended to be on the surface of the protein and the strong salt bridges buried. The average dielectric at the center of the salt bridge for the strong SBs is 7.5 (standard deviation s = 6.0), and that for the weak is 36.31 (s = 21.3). By the two-tailed t-test, the average dielectric constants at the centers of the weak and strong salt bridges differ significantly with p < 0.0001. The average fractional distance from the center of mass of the protein $\langle r/r_p \rangle$ for the strong SBs is 0.40 (s = 0.33) and that for the weak is 0.55 (s = 0.35); the average distances differ significantly with p = 0.0272. In contrast, the average SASA of side chains of the residues involved in the strong SBs was 58 Å² (s = 44 Å²), while that for the weak SBs was 59 Å² (s = 52 Å²), a nonsignificant difference (p = 0.89). The average distances between charged atoms were 4.0 Å (s = 0.92 Å) for the strong SBs and 3.8 Å (s = 0.82 Å) for the weak SBs, also a nonsignificant difference (p = 0.0877). These numbers indicate that the most reliable indicator of salt bridge strength is the local dielectric environment around the salt bridge, while other parameters used to estimate salt bridge energy, such as degree of solvent exposure and charged group separation distance, are inadequate on their own to differentiate strong and weak salt bridges. The 4 strongest SBs are shown in Figure 2.10. Their strong attraction is attributable to their location in regions of low permittivity within their respective proteins, as well as favourable side chain geometries.

Figure 2.10: The four strongest ion pairs (salt bridges) among the proteins in Figure 2.9. The protein PDB and participating amino acids are shown in each panel, along with the energy of each salt bridge. Dashed lines are shown between the charged atomic moieties participating in each salt bridge and are labeled by the separation of the charged groups in Angstroms. The local dielectric function is also color coded, where one can see that the local dielectric for these SBs is low.

Protein dielectric anisotropy, which is most pronounced near the protein surface in the vicinity of partially constrained polar residues, enables the modulation of attractions and repulsions between closely spaced charged groups in a way that may maximize the stability or utility of the protein. Figure 2.9B depicts the correspondence between salt bridge energies calculated with the heterogeneous scalar dielectric as in Figure 2.9A, and the anisotropic dielectric salt bridge energy calculated as described in the methods. Accounting for anisotropy increases the strength of both attractive and repulsive ion pairs, as indicated by the fact that the slope of the best fit line in Figure 2.9B is greater than unity. Moreover, the range of ion pair energies depending on the orientation of the dielectric tensor is considerable, demonstrating the importance of accounting for dielectric anisotropy. For attractive ion pairs, the orientation of an anisotropic dielectric interposed between charged groups affects the local field geometry so as to amplify field components parallel to the lesser axis of the dielectric tensor and attenuate those parallel to the greater axis. For repulsive charges in proximity (repulsive ion pairs) the converse is true. Thus the best estimates of the ion pair energies tend to lie near the lower bound of the range for each ion pair, i.e. the data lie towards the bottom of the vertical bars in Figure 2.9B. This suggests a possible evolutionary selection of dielectric tensor orientation in the vicinity of ion pairs to maximize their stability. In regions of proteins containing many charged residues of like and unlike charge, such as an active site or binding pocket, adjustment of the orientation of the dielectric tensor may increase attraction energies and decrease repulsion energies (or vice versa as needed) to improve protein stability or binding stability, while maintaining necessary charges in a relatively fixed geometry. Construction of a PB solver capable of directly handling anisotropic dielectrics will be a topic for future work.

[Figure 2.11 plot. Left axis: Na+ ion electrostatic transfer energy (kJ/mol); right axis: ε; horizontal axis: distance from pore center (Angstroms). Curves: homogeneous protein ε = 2, ε = 5, ε = 10, and heterogeneous ε with a = 3 Angstroms.]

Figure 2.11: Energy profile of sodium ion passage through the acetylcholine receptor pore (1OED) for various choices of homogeneous and heterogeneous dielectric. The background shows the dielectric map through the pore calculated by the heterogeneous theory with a cavity radius of 3 Å.

2.10 Applications to Ion Channel Passage

Another frequent use of continuum electrostatics calculations is to model the transit of an ion through a membrane channel. The energy of the ion at different sites in the channel describes the barrier to be overcome during the kinetics of ion travel. To test the heterogeneous protein dielectric theory in an ion channel system, the dielectric map was computed for the acetylcholine receptor pore based on the electron microscopy structure (1OED) [256]. The linear PB equation was solved with APBS under the same conditions of water dielectric and ionic concentration as for the salt bridge energy calculations above.
The transfer energy of a sodium ion at a position r along the central axis of the pore was determined by taking the energy of the ion and the protein together and subtracting from it the energy of the protein alone and the Born energy of the ion in free water: $E_{\mathrm{transfer}}(r) = E(r) - E(\infty)$. This calculation was performed for several choices of homogeneous protein dielectric (ε = 2, 5, and 10) and for the heterogeneous dielectric theory with a cavity radius of 3 Å (the radius of the ion channel [256]). The energy profiles of the ion as a function of transit through the pore are shown in Figure 2.11. As the homogeneous protein ε increases, the energy barrier to ion transit decreases. Overall there is a much higher energy cost to ion transit for the homogeneous theory, with a peak barrier height of 6 kT for a homogeneous ε of 10, compared with a peak barrier height of 3.5 kT from the heterogeneous theory. MD simulation of the 1OED structure suggests that it represents the pore in the “closed” state [257], so a potential energy barrier to ion passage through the pore is expected and is consistently observed. In this channel the barrier seems to derive from the presence of hydrophobic residues at the constriction point in the channel (coinciding with the lowest dielectric in the heterogeneous theory), forcing the ion to pass through a region of low dielectric, as can be seen in Figure 2.11 at the maximum of the transfer energy function.

2.11 Conclusions

Electrostatic interactions between charges are critical in determining the stability of a protein. The strength of these interactions is modulated by the local environment around the charges, which can relax or polarize in response to the electric fields. This “dielectric screening” weakens forces between charges. We found that in the interior of a protein, the dielectric is not constant but instead is spatially heterogeneous, with many local minima and maxima.
Moreover, our studies show the polarizability of an amino acid is context-specific and large on the surface of the protein, where the local dielectric constant can be even larger than that of water. These regions can thus act as “stability shells” for charges, because charges tend to migrate towards higher dielectrics. We found that the dielectric response inside a protein tended to be direction-dependent.

Figure 2.12: Consequences of anisotropy in the dielectric field (panel A: εy > εx; panel B: εy < εx). The energy of interaction between two unlike charges depends on the orientation of the anisotropic dielectric in which they are placed. The red lines show the electric field between the charges. The energy of attraction is greater when εx < εy.

As shown in Figure 2.12, the orientation of an anisotropic dielectric interposed between charged groups affects the local field geometry so as to amplify field components parallel to the lesser axis of the dielectric tensor and attenuate those parallel to the greater axis. In regions of proteins containing many charged residues of like and unlike charge, such as an active site or binding pocket, adjustment of the orientation of the dielectric tensor may increase attraction energies and decrease repulsion energies (or vice versa as needed) to improve stability while maintaining necessary charges in a fixed geometry. Indeed, residues near charged groups may be evolutionarily selected to favor those with anisotropic fluctuations in a preferred direction.

This theory fits in the middle of the microscopic-to-macroscopic continuum of techniques to describe biomolecule electrostatic properties.
It is not fully microscopic in that individual atoms are collected into backbone and side chain dipoles to improve computational efficiency, and the applied fields are assumed to be approximately uniform over distances of a few Å; conversely, by allowing the dielectric characteristics of a protein to vary throughout its volume it captures subtleties in electric effects that a purely macroscopic model would efface. It is useful to have a robust and versatile tool for capturing much of the microscopic electrostatic behavior in a simple parameter like a locally-varying dielectric constant, which may then be refined by MD simulation or density functional methods to explore interesting or noteworthy effects identified by the mesoscale method. An alternative approach of extracting local electrostatic properties directly from all-atom MD simulation often requires the subtraction of large quantities of comparable magnitude, introducing large errors that require long simulations to satisfactorily average. Moreover, the most popular MD force fields are non-polarizable, so they do not account for the effects of electronic polarizability which are integrated into this method. Polarizable MD force fields tend to be computationally demanding due to the need to frequently recalculate charge distributions, so that correlations between electronic polarization and relatively long time-scale nuclear motions are difficult to characterize at present.

On a practical level, this theory requires only a brief (1-2 ns) equilibrium simulation of the protein of interest. Calculation of the protein dielectric map can therefore be accomplished on a single workstation in 1 day, enabling a rapid analysis of several targets when needed.
The low computational cost of this method is particularly important when studying large proteins or oligomeric protein complexes, for which longer-length MD simulations as a means of obtaining electrostatic energies may be impractically slow and a continuum electrostatics approach is therefore more appropriate. To increase speed for large multiprotein simulations, the correlation matrices for individual proteins, or subsets of the system including protein interfaces, may be obtained in isolation and then appropriately combined to produce an overall dielectric map.

Salt bridge formation and disruption is known to play an important role in the misfolding of amyloid-β [258, 259, 260, 261]; it is also instructive to investigate the significance of electrostatics in other misfolding-prone proteins. The theory developed here has been applied to the prion protein to elucidate the role that salt bridges and hydrophobic transfer energies may play in its misfolding [226]. Salt bridges known to be absent in disease-causing human mutants of the prion protein were found to be among the strongest present in the protein, so that the human mutants were electrostatically the least stable of those proteins studied. Conversely, the prion protein with the most stable salt bridges belonged to a species known to be resistant to prion disease (frog). It was also demonstrated that a Coulomb law with a single local effective dielectric constant was insufficient to fully capture salt bridge energetics, which necessitated the calculation of the full dielectric map to accurately predict the strength of salt bridges.

The utility of the dielectric calculator extends to any protein system in which electrostatics may play a role. Prominent examples include protein interactions with polyanions like DNA or RNA, protein-protein recognition and binding, oligomerization and aggregation, and membrane protein transport and selectivity.
Furthermore, this approach is not limited to using water as a solvent, as solvent conditions in simulations may be tuned to reflect different environments where needed. We have at present only calculated dielectric profiles for natively folded ensembles, but the same technique could be applied to partially folded or misfolded structural ensembles.

In principle, screened interaction energies may be obtained directly from all-atom MD; however, the present theoretical framework may provide a clearer intuitive picture of why certain interactions may be strong or weak within the protein. While the present theory improves the quantitative description of protein dielectric response, we are currently limited in the application of our theory by the absence of a Poisson-Boltzmann equation solver capable of handling an anisotropic dielectric function; only an approximate isotropic (though heterogeneous) dielectric function can be used at present. We are currently developing a tool capable of accounting for such anisotropy to enable the accurate calculation of electrostatic energies and plan to apply it to the study of protein pKa prediction, salt bridge energies, and protein thermodynamic stability.

Calculation of electrostatic energies by continuum electrostatics methods requires a description of the spatially-varying dielectric constant for the system under study. We have presented a robust tool to calculate this dielectric function in a protein-water system that accounts explicitly for the complex dynamic properties of protein and solvent dipoles. The method may be straightforwardly generalized to any biomolecule-solvent system. Heterogeneity and anisotropy are important characteristics of the protein dielectric, and strongly affect the electrostatic interactions that govern protein stability.
Modulation of dielectric heterogeneity and anisotropy, through the evolution of residue fluctuations tailored to specific tasks, may provide a mechanism to simultaneously satisfy requirements for protein stability and function.

Chapter 3

Energy Landscapes for Partial Protein Unfolding

Amyloid-forming proteins such as β2-microglobulin (β2M) can convert from a natively folded globular form to a misfolded multimolecular aggregate, a reaction that involves transient formation of at least partially unfolded intermediate structures. Understanding the free energy barriers to partial unfolding of these proteins informs the early steps of the misfolding mechanism. An algorithm has been developed to calculate the free energy of unfolding for all regions of a protein and we have applied it to the study of early events in the β2M misfolding mechanism.

3.1 Introduction

The partial loss of tertiary structure by previously folded proteins is often the first step to misfolding and aggregation, with pathologic consequences. There are over 20 human proteins capable of forming amyloid, a phase of protein structure typically composed of multimolecular packing of extended β-sheets [262]. Three of the best studied proteins capable of forming amyloid outside the central nervous system are transthyretin (TTR), a thyroid-hormone binding protein involved in familial amyloid polyneuropathy [263], β2-microglobulin (β2M), the light chain of the class I major histocompatibility complex, which accumulates in dialysis-associated amyloidosis [264], and lysozyme (LYZ), an antibacterial protein ordinarily present in mucous secretions [265]. These proteins are all globular in the native state, but destabilization due to inherited mutations, or changes in concentration due to overproduction or impaired clearance, cause amyloid accumulation at the expense of the globular phase.
As discussed in Chapter 1, the mechanism of misfolding may include features of template assistance that extend beyond a simple nucleation-polymerization process.

Experimental and theoretical studies of amyloid protein formation have provided an increasingly detailed picture of the molecular events in amyloidogenesis. Radford and colleagues [266] have used NMR to study an intermediate of β2M at low pH, identifying loss of structure in two of the protein’s seven native beta strands. Partially unfolded ensembles of lysozyme have been generated by high temperature molecular dynamics (MD) and compared to HSQC NMR data [267]. Rennella et al. [268] have used hydrogen exchange NMR to monitor the equilibrium unfolding of β2M, in which they observed both global unfolding in the core region and partial openings of the native state. Similarly, a biophysical study of lysozyme amyloid formation demonstrated a locally cooperative loss of native tertiary structure, followed by progressive unfolding of a compact, molten globule-like denatured state ensemble as the temperature is increased [269]. Molecular dynamics simulations of transthyretin [270, 271] identified differences between wild-type and amyloid-forming mutant unfolding pathways, particularly in the α-helix, and the formation of potentially amyloidogenic unfolding intermediates. Experimental studies of transthyretin in urea [272] have described its complex unfolding pathway, which proceeds through loss of quaternary homotetrameric structure to disruption of tertiary structure. The detailed molecular mechanism of amyloid formation remains unknown, but the studies mentioned above support the transient formation of partially unfolded intermediate structures as a necessary step in the transition between the native fold and the aggregated misfold.
It is well known that folded proteins in solution are dynamic, sampling an ensemble of thermally accessible conformations around the free energy minimum native structure. Approaches to modelling the native state ensemble have been developed previously by Hilser and co-workers [273] and used to understand the energetic contributions to protein stability. We are interested in extending this approach to describe larger-scale unfolding events that move a protein away from its native conformation toward partially unstructured intermediates on the misfolding pathway. Furthermore, we seek to understand the cooperativity between unfolding events in different protein secondary structures as a way of building a descriptive mechanistic model of amyloid formation.

In this chapter we present two methods for calculating the energy of unfolding subregions of a protein: a minimal model based on the topology of the native fold and a detailed molecular model that accounts categorically for enthalpic and entropic changes on unfolding. We apply these methods to the partial unfolding of β2M, to identify regions involved in the early steps of misfolding and the cooperativity between loss of structure in different regions of the protein. Our motivation for studying this protein is twofold: first, the previous experimental studies by biophysical methods like NMR provide valuable benchmarking data to validate the accuracy of our theoretical approach; second, with our method we can probe in detail the rare fluctuations that contribute to amyloid formation in vivo and disentangle unfolding events in different regions without relying on non-physiologic conditions like addition of denaturant or extremes of pH. The methods developed here are fully general and can be applied to any protein structure for which partial unfolding may contribute to its physiologic or pathologic function.
The free energy of a given protein conformation arises from the fine balance of several competing contributions. The unfolded state is favoured by its large configurational entropy and solvation of polar functional groups, but enthalpically disfavoured by the absence of the electrostatic and van der Waals intra-protein interactions that stabilize the native state. We have concentrated on unfolding events that originate at a given site in the protein and then spread contiguously outward in the primary structure from this initial focus, while the rest of the protein remains folded. The site of unfolding may be varied along the primary sequence. This is the inverse of the single sequence approximation employed in the study of protein folding kinetics [274], in which native structure grows from localized regions that then fuse to form the complete native molecule. This approach enables the identification of regions of low stability in the protein of interest that may lose structure early in the unfolding process. A generalization of this method, the double sequence approximation, involves simultaneous unfolding of two contiguous sequences in the protein to identify cooperative or correlated unfolding events that advance the protein further along the unfolding pathway.

3.2 A Gō Model of Protein Unfolding

A benchmark model to account for the free energy changes that take place on unfolding is to assign a fixed energy to all contacts in the native state, where a contact is defined as a pair of nonhydrogen atoms within a fixed cutoff distance. Such Gō models [275, 276, 277, 278, 279, 280] have been successfully implemented in studies of protein folding [281]. The total free energy cost of unfolding depends on the number of interactions disrupted, to which an entropy term accounting for the greater conformational flexibility of the unfolded region is added. In the following equations, lower case variables refer to atoms, while upper case variables refer to residues.
Let $\mathcal{T}$ be the set of all residues in the protein, $\mathcal{U}$ be the set of residues unfolded in the protein, and $\mathcal{F}$ be the subset of residues left folded in the protein (thus $\mathcal{T} = \mathcal{U} \cup \mathcal{F}$). A model for the set of unfolded residues is required. The unfolding mechanism at high degrees of nativeness generally consists of multiple contiguous strands of disordered residues [282]. Here we adopt the approximation of a single contiguous unfolded strand as described above. The total free energy change $\Delta F_{\mathrm{Go}}(\mathcal{U})$ for unfolding the set of residues $\mathcal{U}$ is

$$\Delta F_{\mathrm{Go}}(\mathcal{U}) = \Delta E_{\mathrm{Go}}(\mathcal{U}) - T \Delta S_{\mathrm{Go}}(\mathcal{U}) \tag{3.1}$$

The unfolding enthalpy function $\Delta E_{\mathrm{Go}}$ is given by the number of interactions disrupted by unfolding of the set of residues $\mathcal{U}$:

$$\Delta E_{\mathrm{Go}}(\mathcal{U}) = a \cdot \sum_{\substack{\mathrm{Atoms} \\ i \in \mathcal{T},\, j \in \mathcal{U}}}^{i > j} \Theta\left( r_{\mathrm{cutoff}} - \left| \mathbf{r}_i - \mathbf{r}_j \right| \right) \tag{3.2}$$

In Equation 3.2, the sum on i, j is over all unique pairs of nonhydrogen atoms that have either one or both atoms in the unfolded region, $\mathbf{r}_i$ and $\mathbf{r}_j$ are the coordinates of atoms i and j, $r_{\mathrm{cutoff}}$ (taken to be 4.8 Å) is the interaction distance cutoff, and Θ is the Heaviside function defined by Θ(x) = 1 if x is positive and 0 otherwise. A less abrupt cutoff function can also be used. The energy per interaction a is chosen to recapitulate the overall experimental stability $\Delta F_{\mathrm{exp}}(\mathcal{U})|_{\mathcal{U}=\mathcal{T}}$ on complete unfolding of the protein at room temperature:

$$a = \frac{\Delta F_{\mathrm{exp}}(\mathcal{U})|_{\mathcal{U}=\mathcal{T}} + T \Delta S_{\mathrm{Go}}(\mathcal{U})|_{\mathcal{U}=\mathcal{T}}}{\sum_{i,j \in \mathcal{N}}^{i > j} \Theta\left( r_{\mathrm{cutoff}} - \left| \mathbf{r}_i - \mathbf{r}_j \right| \right)}. \tag{3.3}$$

The unfolding entropy term $\Delta S_{\mathrm{Go}}(\mathcal{U})$ is given by the sum over all residues K in the unfolded region

$$\Delta S_{\mathrm{Go}}(\mathcal{U}) = \sum_{K \in \mathcal{U}} b_K, \tag{3.4}$$

where the $b_K$’s are the residue side chain and backbone unfolding configurational entropy changes calculated previously [283, 284], which range from 3.49 cal/(mol·K) for valine to 10.28 cal/(mol·K) for lysine. The Gō model isolates the effects arising from the topology of native protein interactions, and in practice the unfolding free energy landscape can be readily calculated from a single native state structure.
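Equations 3.1 to 3.4 can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the toy coordinates, and the value of the experimental stability are hypothetical, all heavy-atom pairs within the cutoff are counted (no exclusion of covalently bonded neighbours is applied), and the abrupt Heaviside cutoff of Equation 3.2 is used.

```python
import numpy as np

R_CUTOFF = 4.8  # contact cutoff in Angstroms, as quoted in the text


def count_disrupted_contacts(coords, atom_residue, unfolded):
    """Count unique atom pairs closer than R_CUTOFF with at least one atom
    belonging to a residue in the unfolded set (the sum in Eq. 3.2)."""
    coords = np.asarray(coords, dtype=float)
    in_u = np.isin(np.asarray(atom_residue), list(unfolded))
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    close = dist < R_CUTOFF                      # Heaviside step
    either = in_u[:, None] | in_u[None, :]       # one or both atoms unfolded
    unique = np.triu(np.ones(close.shape, dtype=bool), k=1)  # each pair once
    return int(np.sum(close & either & unique))


def go_entropy(unfolded, b):
    """Eq. 3.4: sum of per-residue unfolding entropies b_K over the unfolded set."""
    return sum(b[k] for k in unfolded)


def calibrate_contact_energy(coords, atom_residue, b, delta_f_exp, temperature):
    """Eq. 3.3: choose a so that complete unfolding reproduces delta_f_exp."""
    all_residues = set(np.asarray(atom_residue).tolist())
    n_total = count_disrupted_contacts(coords, atom_residue, all_residues)
    return (delta_f_exp + temperature * go_entropy(all_residues, b)) / n_total


def go_unfolding_free_energy(coords, atom_residue, unfolded, b, a, temperature):
    """Eq. 3.1: Delta F = a * (contacts disrupted) - T * Delta S."""
    n = count_disrupted_contacts(coords, atom_residue, unfolded)
    return a * n - temperature * go_entropy(unfolded, b)


# Toy system (hypothetical): four atoms on a line, two atoms per residue.
coords = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [6.0, 0.0, 0.0], [9.0, 0.0, 0.0]]
atom_residue = [0, 0, 1, 1]
b = {0: 3.49e-3, 1: 10.28e-3}   # kcal/(mol K); valine and lysine values from the text
a = calibrate_contact_energy(coords, atom_residue, b, delta_f_exp=5.0, temperature=298.0)
print(go_unfolding_free_energy(coords, atom_residue, {1}, b, a, 298.0))
```

Scanning the unfolded set over every contiguous window of residues would give the single sequence landscape described above; that loop is omitted here for brevity.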
3.3 An Ensemble-Based Model of Protein Unfolding

The Gō model approach to protein unfolding has the advantage of computational efficiency but does not attempt to account in detail for the underlying physical contributions to the free energy function. The main contributions to the free energy function must include:

• The solvation free energy $\Delta F_{\mathrm{solv}}$ due to protein-solvent interactions, which comprises polar/electrostatic ($\Delta F^{\mathrm{pol}}_{\mathrm{solv}}$) and nonpolar ($\Delta F^{\mathrm{np}}_{\mathrm{solv}}$) effects.
• The protein contact energy $\Delta E_{\mathrm{protein}}$ due to disruption of protein-protein interactions in the native state, which also comprises polar ($\Delta E^{\mathrm{pol}}_{\mathrm{protein}}$) and nonpolar ($\Delta E^{\mathrm{np}}_{\mathrm{protein}}$) effects.
• The configurational entropy $\Delta S_{\mathrm{conf}}$ due to the greater number of microstates accessible to an unfolded region.

The overall free energy change is the sum of these contributions:

$$\Delta F_{\mathrm{unfolding}} = \Delta F^{\mathrm{pol}}_{\mathrm{solv}} + \Delta F^{\mathrm{np}}_{\mathrm{solv}} + \Delta E^{\mathrm{pol}}_{\mathrm{protein}} + \Delta E^{\mathrm{np}}_{\mathrm{protein}} - T \Delta S_{\mathrm{conf}} \tag{3.5}$$

All of these contributions are sequence-dependent and cannot be fully determined from knowledge of the native protein structure alone. Detailed physico-chemical studies of the above free energy contributions have been undertaken by others as described below, which we have adapted to the task of calculating local unfolding free energies based on an ensemble of states for the fully folded and fully unfolded protein.

3.3.1 Polar Protein-Protein Energy

A subset of attractions and repulsions between charged groups in the native protein are lost on partial unfolding. Electronic bond polarization gives rise to partial charges at all atomic locations. We use the CHARMM parameterization for all partial charges in electrostatics calculations.
The interaction energy between all residue pairs {A, B} was calculated according to [226] (see Figure 4.2), where $E_{P-A-B}$ contains only the self energy of the part of the protein not including A and B, $E_P$ contains the self energies of the whole protein, and $E_{P-A}$ (and analogously $E_{P-B}$) contains the self energies and mutual interaction energies of the protein with residue A (respectively B) removed:

$$E_{AB} = E_P + E_{P-A-B} - E_{P-A} - E_{P-B}. \tag{3.6}$$

Combining the terms as shown causes all the energies to cancel except the interaction energy of A and B. The total polar protein-protein interaction energy change on unfolding is due to the disruption of interactions in the native state, which may be partially compensated by other interactions in the unfolded state:

$$\Delta E^{\mathrm{pol}}_{\mathrm{protein}} = \sum_{A \in \mathcal{U}} \sum_{B \in \mathcal{N}} \langle E_{AB} \rangle_{\mathrm{unfolded}} - \langle E_{AB} \rangle_{\mathrm{folded}}. \tag{3.7}$$

The angle brackets denote an average for each ensemble, folded and unfolded. For the folded ensemble, the heterogeneous dielectric function in and around the protein was calculated as described previously [285] (see Chapter 2) by applying a generalization of Kirkwood-Fröhlich theory to the spectrum of dipole-dipole fluctuation correlations from a molecular dynamics (MD) simulation of the folded state. For the unfolded ensemble, residues were taken to be fully solvated in a bulk water dielectric with relative permittivity 78. In both cases, energies were obtained by solution of the linear Poisson-Boltzmann (PB) equation with APBS [221].

3.3.2 Polar Solvation Energy

The dielectric environment changes around a protein region during unfolding; the change is generally from a region of low dielectric in the native protein to a region of high dielectric in the unfolded region, thereby favouring the unfolded structure.
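The size of this effect can be gauged from the classical Born expression for a single spherical ion, $E = q^2 / (8\pi \varepsilon_0 \varepsilon a)$. The Python sketch below is only an order-of-magnitude illustration and swaps in this textbook formula for the Poisson-Boltzmann calculation on residue partial charges that is actually used in this chapter; the ion radius of 2 Å and the interior dielectric of 4 are hypothetical values chosen for the example.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def born_energy_kj_per_mol(charge_e, radius_angstrom, eps_r):
    """Born self-energy q^2 / (8 pi eps0 eps_r a) of a charged sphere, in kJ/mol."""
    q = charge_e * E_CHARGE
    a = radius_angstrom * 1e-10
    joules = q * q / (8.0 * math.pi * EPS0 * eps_r * a)
    return joules * N_A / 1000.0


# Hypothetical example: a unit charge of radius 2 Angstroms moved from a
# low-dielectric interior (eps = 4, assumed) to bulk water (eps = 78).
e_folded = born_energy_kj_per_mol(1.0, 2.0, 4.0)
e_unfolded = born_energy_kj_per_mol(1.0, 2.0, 78.0)
print(e_unfolded - e_folded)  # negative: the high-dielectric environment is favoured
```

Because the Born energy scales as 1/ε, most of the gain is realized once the local dielectric exceeds a few tens, which is one reason the spatially varying dielectric map matters most for buried charges.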
The polar solvation energy is the so-called Born energy due to the effect of charged groups in storing energy in the surrounding dielectric alone; interactions between different charged groups (which are mediated by the dielectric) are accounted for in the polar protein-protein energy term above. The energy of charge transfer is determined by calculating the Born energy EBorn of each residue in the folded protein dielectric environment ε f olded , determined by the methods in Chapter 2, and the unfolded dielectric environment εun f olded , in which each residue is placed in the middle of a sequence-appropriate tripeptide (to account for end effects) and then fully solvated with a surrounding bulk water dielectric (ε = 78). ∆F polsolv = ∑ A∈U 〈 EBornA (ε f olded) 〉−〈EBornA (εun f olded)〉 . (3.8) In the folded ensemble, EBornA (ε f olded) is the energy of the partial charges comprising a residue in the heterogeneous dielectric function from solution of the PB equation. EBornA (εun f olded) is the energy calculated by the same method but solvated to represent the unfolded ensemble. 3.3.3 Nonpolar Protein-Protein Energy The nonpolar contribution arises from van der Waals attractions between atoms in the protein, calculated according to the 6-12 Lennard-Jones potential in the Amber force field [286], from which the constants εi j, Ai j, and Bi j are derived. The average van der Waals energy over an ensemble of states taken from an all-atom molecular dynamics simulation of the protein of interest is computed according to Enpprotein = 〈 Atoms ∑ i∈U i< j ∑ j∈N εi j ( Ai j r6i j − Bi j r12i j )〉 . (3.9) The change on regional unfolding is therefore ∆Enpprotein = 〈 Enpprotein 〉 un f olded − 〈 Enpprotein 〉 f olded . (3.10) 64 CHAPTER 3. ENERGY LANDSCAPES FOR PARTIAL PROTEIN UNFOLDING 3.3.4 Nonpolar Solvation Energy It is only in the last decade that reliable approaches to calculating this parameter have been devel- oped. 
Earlier methods parameterized nonpolar solvation energy changes solely in terms of surface area or volume (with mixed results), but Levy and co-workers [287] have more successfully divided the energy into attractive and repulsive terms,

\[ \Delta F^{sol}_{np} = \Delta F^{sol}_{np,rep} + \Delta F^{sol}_{np,att}, \tag{3.11} \]

where $\Delta F^{sol}_{np,rep}$ is the energy penalty due to cavity creation in the solvent and $\Delta F^{sol}_{np,att}$ is the free energy from forming solvent-solute interactions (and also includes solvent-solvent reorganization). For the attractive term, Chandler and co-workers [288] approximate the free energy change by the mean attractive interaction potential energy between protein and solvent,

\[ \Delta F^{sol}_{np,att} \approx \left\langle \Delta U^{sol}_{np,att} \right\rangle = \sum_{i \in U} \int \rho_w(\mathbf{r}) \, V^{att}_{iw}(\mathbf{r}) \, d\mathbf{r}. \tag{3.12} \]

In Equation 3.12, $\rho_w(\mathbf{r})$ is the distribution function of solvent around protein atom i. To calculate this term, we follow the method of Tan, Tan, and Luo [289], using a Weeks-Chandler-Andersen [290] decomposition scheme in which

\[ V^{att}_{iw} = \begin{cases} -\varepsilon_{iw} & r \le r_{min} \\ \varepsilon_{ij} \left( \dfrac{A_{ij}}{r_{ij}^{6}} - \dfrac{B_{ij}}{r_{ij}^{12}} \right) & r > r_{min}. \end{cases} \tag{3.13} \]

In this equation, $r_{min}$ is the radius at which the Lennard-Jones potential is minimized. The repulsive term is proportional to changes in solvent accessible volume (SAV):

\[ \Delta F^{sol}_{np,rep} = p \cdot \Delta SAV + c, \tag{3.14} \]

where the constants p and c are taken from [289]. $\Delta F^{sol}_{np,rep}$ may also be treated as proportional to solvent-accessible surface area or solvent-excluded volume, but the overall fit to SAV involves the use of the most realistic solvent probe radius and does not require the constant offset (c = 0).

3.3.5 Configurational Entropy

The number of microstates accessible to the protein in the unfolded state is much greater than the number accessible in the native state, so there is a favourable gain of configurational entropy on unfolding. Recent work by Li and Brüschweiler [291, 292] has provided a direct way of efficiently calculating the total configurational entropy from the dihedral angle space of a protein ensemble.

Figure 3.1: Schematic of approach used to account for loop entropy loss in an unfolded region. The ends of the unfolded region are fixed to their sites in the folded structure, while the rest of the unfolded region is free to diffuse subject to a planar absorbing boundary accounting for the steric barrier from the still-folded part of the protein. Disulfide bonds, when present in the unfolded region, are treated as an additional constraint on the loop's motion.

For a dihedral angle $\phi_i$ represented by a set of values $\{\phi_{i1} \ldots \phi_{iN}\}$ taken from N frames of an MD simulation, the dihedral angle probability distribution $p_i(\phi_i)$ may be estimated by

\[ p_i(\phi_i) = \frac{1}{N} \sum_{j=1}^{N} \frac{e^{\kappa \cos(\phi_i - \phi_{ij})}}{2\pi I_0(\kappa)}. \tag{3.15} \]

The function on the right-hand side of Equation 3.15 is the von Mises distribution, the circular analogue of a Gaussian, and $\kappa$ is a constant inversely related to the width of the distribution and chosen so that $\kappa^{-1/2} = \pi/180$ rad = 1°. $I_0(\kappa)$ is the modified Bessel function of the first kind of order zero. If there are M dihedral angles in residue A of the protein, the total configurational entropy of this residue is the sum of the Gibbs entropy for each angle:

\[ S_A = -k_B \sum_{i=1}^{M} \int_0^{2\pi} p_i(\phi_i) \ln\left[ p_i(\phi_i) \right] d\phi_i. \tag{3.16} \]

The configurational entropy for residue backbones and side chains may be extracted from simulations of the folded and unfolded states. However, a correction needs to be applied to the unfolded state configurational entropies, since in the single sequence approximation the end points of the partially unfolded region are fixed to be equivalent to their positions in the folded protein.
This means that there is a loop entropy penalty to be paid for constraining the ends in the partially unfolded structure which is not present in the fully unfolded state:

\[ \Delta S_{return} = -k_B \ln\left( f_w(\mathbf{r}|n) \, \Delta\tau \right), \tag{3.17} \]

where $f_w(\mathbf{r}|n)\,\Delta\tau$ is the probability that a random walk will return to a box of volume $\Delta\tau$ centered at position $\mathbf{r}$ after n steps, without penetrating back into the protein during the walk. The typical size of a melted region is $\sim b n^{1/2}$, with $b \approx 3.8$ Å the Cα-Cα distance. The characteristic dimension of the protein is $\sim b N^{1/3}$, with $N \approx 100$ structured residues. Thus for strand lengths shorter than about $n \approx 20$ residues, the size of the melted strand is smaller than the protein diameter and the steric excluded volume of the protein is well-treated as an impenetrable plane. The number of polymeric states of the melted strand must be multiplied by the fraction of random walks that travel from an origin on the surface of the protein to a location where the melted polymer re-enters the protein [293]. This position is taken to be anywhere within a box of side b located at $(x_o, y_o, b/2)$, where $\sqrt{x_o^2 + y_o^2}$ is the distance between the exit and entrance locations. The above fraction of states is given by

\[ f_w = \frac{b}{2n} \left( \frac{3}{2\pi n} \right)^{3/2} \exp\left( -\frac{3}{2} \, \frac{x_o^2 + y_o^2}{b^2 n} \right). \tag{3.18} \]

The total configurational entropy change on unfolding is thus

\[ \Delta S_{config} = \sum_{A \in U} \left( S^{unfolded}_A - S^{folded}_A \right) + \Delta S_{return}. \tag{3.19} \]

Disulfide bonds require additional consideration in the loop entropy term since they further restrict the motion of the unfolded segment. When present, the disulfide is treated as an additional node through which the loop must pass, in effect dividing the full loop into two smaller loops both subject to the boundary conditions described above (see Figure 3.1).
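The dihedral part of the entropy estimate (Equations 3.15 and 3.16) can be illustrated with a minimal numerical sketch. It is not the thesis implementation: the function names, the integration grid, and the chunked summation are assumptions made for illustration. Because $\kappa = (180/\pi)^2$ is large, $e^{\kappa}$ overflows double precision, so the sketch works with $e^{\kappa(\cos\theta - 1)}$ and evaluates $I_0(\kappa)e^{-\kappa}$ from the standard large-argument asymptotic series; this substitution is also an assumption and is only appropriate for large $\kappa$. The loop-closure term of Equation 3.17 is not included.

```python
import numpy as np


def log_scaled_bessel_i0(kappa):
    """ln[I0(kappa) * exp(-kappa)] from the large-kappa asymptotic series.

    Assumption: kappa is large (as it is for a 1 degree kernel width), so the
    first three terms of the series are sufficient.
    """
    series = 1.0 + 1.0 / (8.0 * kappa) + 9.0 / (128.0 * kappa**2)
    return np.log(series) - 0.5 * np.log(2.0 * np.pi * kappa)


def dihedral_density(samples, grid, kappa):
    """Equation 3.15: average of von Mises kernels centred on the sampled angles."""
    log_norm = np.log(2.0 * np.pi) + log_scaled_bessel_i0(kappa)
    density = np.zeros_like(grid)
    # Process the samples in chunks so the (grid x samples) array stays small.
    for chunk in np.array_split(samples, max(1, len(samples) // 256)):
        # exp(kappa * (cos - 1)) avoids overflow; exp(-kappa) is absorbed in log_norm.
        expo = kappa * (np.cos(grid[:, None] - chunk[None, :]) - 1.0) - log_norm
        density += np.exp(expo).sum(axis=1)
    return density / len(samples)


def dihedral_entropy(samples, kappa=(180.0 / np.pi) ** 2, n_grid=7200):
    """Equation 3.16 for a single dihedral angle, returned as S / k_B."""
    grid = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    p = dihedral_density(np.asarray(samples, dtype=float), grid, kappa)
    integrand = np.zeros_like(p)
    mask = p > 0.0  # p * ln(p) tends to 0 where the density underflows to zero
    integrand[mask] = p[mask] * np.log(p[mask])
    # Rectangle rule on a periodic grid.
    return -integrand.sum() * (2.0 * np.pi / n_grid)
```

Under these assumptions, the entropy of a residue is the sum of `dihedral_entropy` over its M dihedral angles, multiplied by $k_B$. A tightly restrained angle gives a strongly negative value and a freely rotating one approaches $\ln 2\pi$, so the difference between unfolded and folded ensembles is positive for residues that gain freedom on unfolding.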
3.4 Generation of Folded and Unfolded Protein Ensembles

The free energy change to unfold a protein region is determined by combining energy terms described above for the folded and unfolded ensembles. For the set of residues U to be unfolded, the difference between their ensemble average free energies in the folded state and the unfolded state is calculated. Generation of the folded ensemble is straightforward, as it involves a standard all-atom equilibrium MD simulation. The structure of human wild-type β2-microglobulin was obtained from the Protein Data Bank (ID 1JNJ [294]), and a patch was applied to the protein structure file to join cysteines 25 and 80 in a disulfide bond. Protonation states of ionizable groups were assigned to reflect conditions at pH 7. To generate the folded ensemble, the protein was solvated in a box of explicit water with 150 mM NaCl that exceeded the maximum dimensions of the protein in the x, y, and z directions by 10 Å. The simulation was performed using NAMD [229] with the CHARMM22 force field [230]. A time step of 2 fs was used, frames were extracted from the simulation every 2 ps, and the total length of the simulation was 25 ns. The first 1 ns of simulation was used for equilibration and not analysed.

To generate the unfolded ensemble, the simulation is steered away from the native state by applying a biasing potential related to the fraction of native contacts present in the simulation.

Figure 3.2: Generation of folded and unfolded ensembles for energy landscape calculation. a) The value of each contact C as a function of the difference in separation distance $d_{ij}$ between two atoms in simulation and in the native structure (see Equation 3.20). b) Fraction of native contacts q (see Equation 3.21) as a function of simulation time for the folded (blue) and unfolded (red) ensembles. The folded ensemble is derived from an unconstrained equilibrium MD simulation, which retains the native structure (q ≈ 0.9). The unfolded ensemble is generated by steering a second simulation from the PDB structure (q = 1) through a series of partially unfolded intermediates (states with 0.05 < q < 1) to reach the unfolded state with q ≈ 0.05. The first 4 ns of simulation are shown, but each simulation continues to 25 ns. c) Superimposed representative conformations of the folded ensemble. Conformations are shaded from red to blue according to the time they were taken in the simulation. d) Representative conformations from the unfolded ensemble. The image at the bottom left shows several superimposed unfolded conformations.

Two nonhydrogen atoms i and j in different, nonadjacent residues (separated by three or more residues in the primary sequence) in the protein are taken to form a native contact if their separation distance $d_{ij}$ in the PDB structure is less than a cutoff distance of 4.8 Å. A numerical value $C_{ij}$ as a function of simulation time t and the atomic coordinates $\mathbf{r}_i(t)$ and $\mathbf{r}_j(t)$ is assigned to each contact according to

\[ C_{ij}(\mathbf{r}_i(t), \mathbf{r}_j(t)) = \frac{1 - \left( |\mathbf{r}_i - \mathbf{r}_j| / (d_{ij} + 2.5) \right)^{6}}{1 - \left( |\mathbf{r}_i - \mathbf{r}_j| / (d_{ij} + 2.5) \right)^{12}}. \tag{3.20} \]

The function $C_{ij}$ amounts to a soft cutoff for the definition of individual native contacts (see Figure 3.2). If C is the set of all native contacts, the fraction q of native contacts present at a given time in the simulation is

\[ q(t) = \frac{\sum_{(i,j) \in C} C_{ij}(\mathbf{r}_i(t), \mathbf{r}_j(t))}{\sum_{(i,j) \in C} C_{ij}(\mathbf{r}_i(0), \mathbf{r}_j(0))}, \tag{3.21} \]

where time t = 0 corresponds to the PDB structure used as a starting point for the simulation.
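The contact definition and Equations 3.20 and 3.21 can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the code used in the thesis: the function names are hypothetical, coordinates are assumed to be in Å and to contain nonhydrogen atoms only, and "separated by three or more residues" is read here as a residue-index difference of at least 3 (exposed as the `min_sep` parameter in case the other reading is intended). The sketch evaluates $C_{ij}$ as $1/(1 + x^6)$ with $x = |\mathbf{r}_i - \mathbf{r}_j| / (d_{ij} + 2.5)$, which is algebraically identical to Equation 3.20 and avoids the 0/0 form at x = 1.

```python
import numpy as np


def native_pairs(native_xyz, residue_of_atom, cutoff=4.8, min_sep=3):
    """Atom pairs (i < j) in contact in the native structure.

    native_xyz      : (n_atoms, 3) nonhydrogen coordinates in Angstrom (assumed).
    residue_of_atom : residue index of each atom.
    min_sep         : minimum residue-index difference (an assumed reading).
    """
    xyz = np.asarray(native_xyz, dtype=float)
    res = np.asarray(residue_of_atom)
    i, j = np.triu_indices(len(xyz), k=1)
    dist = np.linalg.norm(xyz[i] - xyz[j], axis=1)
    keep = (np.abs(res[i] - res[j]) >= min_sep) & (dist < cutoff)
    return i[keep], j[keep], dist[keep]


def contact_value(r, d_native, offset=2.5):
    """Equation 3.20, written as 1 / (1 + x^6) with x = r / (d_native + offset)."""
    x = np.asarray(r, dtype=float) / (np.asarray(d_native, dtype=float) + offset)
    return 1.0 / (1.0 + x**6)


def fraction_native(xyz, native_xyz, pairs):
    """Equation 3.21: summed contact values relative to the t = 0 (PDB) structure."""
    i, j, d_native = pairs

    def total(coords):
        coords = np.asarray(coords, dtype=float)
        r = np.linalg.norm(coords[i] - coords[j], axis=1)
        return contact_value(r, d_native).sum()

    return total(xyz) / total(native_xyz)
```

In use, `native_pairs` would be called once on the starting structure and `fraction_native` once per trajectory frame. The all-pairs distance calculation is quadratic in the number of atoms, which is acceptable for a protein of about 100 residues but would need a neighbour list for much larger systems.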
A harmonic biasing potential P that depends on q and the target native contact fraction $q_c$ is calculated according to

\[ P(t) = \frac{1}{2} k \left( q(t) - q_c(t) \right)^2. \tag{3.22} \]

The value of the prefactor k is set such that P = 10 kcal/mol when $|q - q_c| = 0.1$ (that is, k = 2000 kcal/mol). The potential P is added to the CHARMM force field Hamiltonian $H_0$ to give the overall system Hamiltonian $H = H_0 + P$. The steering parameter in the simulation is $q_c$, which is chosen to be

\[ q_c(t) = \begin{cases} 1 - 0.95 \, \dfrac{t}{1\ \text{ns}} & 0 < t < 1\ \text{ns} \\ 0.05 & t > 1\ \text{ns}. \end{cases} \tag{3.23} \]

Thus in the first nanosecond the simulation is steered from q = 1 to q = 0.05; the nonzero final target allows for the possibility of some residual native structure in the unfolded ensemble. The simulation then continues until adequate sampling is obtained for the energy function to converge, as described in the following section. Running the unfolding simulation using explicit water was not computationally tractable since the volume of the water box required to enclose the unfolded chain is much larger than that required for the folded protein. A Generalized Born solvation model recently implemented in NAMD was used instead, with a solvent dielectric of 78.5, an ionic concentration of 150 mM, and a cutoff distance of 16 Å for electrostatic interactions. The first two nanoseconds of simulation, in which the protein is steered to q = 0.05 and equilibrates in the unfolded state, are omitted from analysis.

3.5 Convergence of Terms in the Energy Function

It is important to verify that simulations of the folded and unfolded proteins have run long enough for the energy terms to have arrived at stable values. Figure 3.3 shows the fluctuation and convergence properties of the energy terms in the folded and unfolded ensembles, as a function of the length of simulation used. Due to the low computational cost of calculating the configurational entropy and nonpolar protein-protein energy, all simulation frames (taken every 2 ps) were used to calculate these terms.
The other terms are more computationally demanding and were calculated for every 100th frame (an interval of 200 ps). Interestingly, there is considerable variation in the rate of convergence between the energy terms. The solvation energy terms (polar and nonpolar) stabilize most rapidly, in less than 3 ns, while the protein-protein energies take longer to reach constant values, up to 15 ns. Unsurprisingly, convergence times for the unfolded ensemble tend to be longer than for the folded ensemble, reflecting its greater heterogeneity and thus the need for more extensive conformational sampling.

The slowest term to converge, and thus the limiting factor in determining the length of the simulation needed, is the configurational entropy. Both the folded and unfolded simulations take 25 ns to plateau. Unlike the other terms, which tend to oscillate around their ensemble values, the configurational entropy terms approach them monotonically, because more states are sampled with more simulation until convergence is achieved.

Figure 3.3: Convergence of terms in the energy function for the folded and unfolded ensembles. The upper plot in each pair shows the change in the value of each energy term as frames are accumulated through the simulation. The lower plot gives the 1st, 2nd, 3rd, and 4th moments of the overall distribution of energy terms as a function of simulation time. For compactness, all four moments are shown on the same graph on a relative linear scale. (Panels: nonpolar and polar protein-protein energy, nonpolar and polar solvation energy, and configurational entropy, each for the folded and unfolded ensembles; horizontal axes: simulation time in ns; upper plots on an absolute energy scale in kcal/mol.)

The convergence behaviour of the energy function establishes the computational cost of the simulations needed to generate the folded and unfolded ensembles. For a small monomeric protein like β2-microglobulin, the 25 ns folded and unfolded simulations can each be completed in about 1 week using 8 cores on one node of a modern cluster. This is not particularly onerous and makes the method described here suitable for medium-throughput analysis of protein structures. The Gō model, on the other hand, does not require simulation, so calculations even for large proteins take less than 1 minute on a standard workstation.

3.6 Behaviour of Individual Terms in the Energy Function

Each term in the energy function has been examined to determine how it changes between the folded and unfolded ensembles. In Figure 3.4, the difference in contact energies between the folded and unfolded states is shown; the presence of residual interactions in the unfolded state is noteworthy. Figure 3.5 gives the polar solvation energy, which is driven both by the polarity of each residue and its degree of burial in the protein. The configurational entropies of the folded and unfolded states are instructive to compare (see Figure 3.6A). The residues with the greatest entropy increase on unfolding are (intuitively) those with the greatest steric restriction in the folded state, such as Phe 62, which therefore gain the most freedom on unfolding.
Surprisingly, a few residues were found to have slightly greater entropy in the folded state, such as Arg 97. In the native state, the long side chain points directly out into solvent, giving it essentially complete rotational freedom, and the location of the residue near the C-terminus of the protein gives its backbone considerable flexibility too. Meanwhile, in the unfolded state the residue side chain encounters other unfolded parts of the protein that somewhat curtail its motion, leading to the lower entropy.

The probability distribution of improper angles (defined for planar structures such as amide groups and aromatic rings) was also analysed as an additional possible contributor to configurational entropy changes, but the effect of these angles was found to be negligible and thus omitted from the entropy calculation (see Figure 3.6B). This is reasonable because of the strong potentials constraining improper angles in MD force fields, which do not leave much scope for changes in the probability distribution. As well, the degrees of freedom reflected in improper angles are accounted for in surrounding dihedral angles. Adding improper contributions to the dihedral entropies would therefore risk a (slight) double counting.

Figure 3.4: Comparison of energies between folded and unfolded states for the three terms (polar protein-protein energy, nonpolar protein-protein energy, and nonpolar solvation) that depend on pairwise interactions between protein residues. Panels: a) nonpolar protein-protein energy, b) polar protein-protein energy, c) nonpolar solvation energy; both axes give the residue index and energies are in kcal/mol. Polar solvation energy is treated in Figure 3.5.
The upper diagonal shows energy due to residues i and j > i in the folded state at coordinates (i, j); the lower diagonal shows the energy due to these residues in the unfolded state at coordinates (j, i). Note the presence of residual structure in the unfolded state.

Figure 3.5: Contribution of polar solvation energy to the free energy function, as given by the transfer energy from vacuum to a medium with a dielectric profile corresponding to either the folded or unfolded ensemble. Values for each residue in β2-microglobulin are shown for the folded and unfolded ensembles (dielectric transfer energy in kcal/mol against residue index, with the difference also plotted). The change in polar solvation energy on unfolding is depicted on the structure at right; the transparent contour is a contour of constant relative permittivity (ε = 50) around the protein.

Figure 3.6: Configurational entropy contribution to the unfolding free energy of β2-microglobulin, shown as −T·S (kcal/mol) against residue index for the folded ensemble, the unfolded ensemble, and their difference. a) The change in entropy due to dihedral angle rotation between the folded and unfolded states. b) The change in entropy due to improper angle flexion between the folded and unfolded states. The contribution to configurational entropy from improper flexibility is very small relative to that from dihedral rotation.
The spikes correspond to Pro residues (which have an anomalous backbone amide improper angle) or residues with more than one improper angle, such as Phe, Tyr, Trp, or His. c-d) The change in individual dihedral bond angle probabilities (probability against dihedral angle in radians, for each dihedral angle index) for the residues with the smallest (Arg 97) and largest (Phe 62) changes in configurational entropies on unfolding.

Figure 3.7: Relative magnitude of the five terms in the unfolding free energy function (polar and nonpolar protein-protein energy, polar and nonpolar solvation energy, and configurational entropy), together with the overall free energy, in cal/mol against residue index. A cross section through the β2-microglobulin landscape for an unfolded length of 5 residues is shown.

3.7 The Unfolding Energy Landscape of Beta-2-Microglobulin

The free energies of unfolding for all contiguous subsequences in β2M were calculated by the methods above and used to construct the protein's unfolding energy landscape (see Figure 3.8) as a visual tool for understanding the early events in loss of structure leading to misfolding. The high stability of the two faces of the β-sandwich that form the structural core of the protein is clearly seen, but interestingly the region between the two halves of the sandwich, including β strand 4, is conspicuously less stable. This suggests that unfolding of this region occurs with relatively high probability. Moreover, unfolding at this site would expose a free edge of the β sheet, which is pro-amyloidogenic.

Examining the individual contributions to the overall energy landscape (Figure 3.7) gives insight into the balance of effects that control overall protein stability. All five terms meaningfully influence the free energy, although nonpolar solvation energy is consistently smallest and nonpolar protein-protein energy is consistently largest.
Nonpolar (van der Waals) protein-protein energy correlates best with the overall free energy, which explains why the landscape generated by the detailed ensemble-based energy function and the much simpler Gō model show good qualitative agreement: the contact-based Gō model essentially uses a simplified version of the van der Waals energy term.

Figure 3.8: The unfolding free energy landscape for β2-microglobulin, calculated with (a) the detailed ensemble-based energy function and (b) the simple Gō energy function. The horizontal axis gives the residue at the centre of an unfolded region, the depth axis (into the page) gives the length of the region unfolded, and the vertical axis gives the free energy change (cal/mol) for partial unfolding of a given region. The positions of β strands β1 to β7 are marked along the residue axis. Small unfolding events are at the front of the landscape, and larger unfolding events are at the back of the landscape.

Figure 3.9: Comparison of the β2-microglobulin free energy landscape with experimental data from NMR hydrogen/deuterium exchange [295]. A cross section through the landscape for an unfolded region length of 5 residues is shown for the ensemble and Gō models (free energy of unfolding in kcal/mol against the residue at the middle of the unfolded segment), superimposed on residues with a full (> 24 hr) or partial (> 30 min) protection from exchange [295].
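The bookkeeping behind a landscape such as Figure 3.8 (one free energy value for every contiguous segment, indexed by segment centre and length) can be sketched as follows. This is a toy illustration under strong assumptions, not the thesis algorithm: it supposes that the per-residue contributions to the unfolding free energy have already been collected in a hypothetical array `single`, that the pairwise contributions are in a hypothetical symmetric matrix `pair`, and that the set N paired with the unfolded set U is simply every residue outside the segment. The loop-closure entropy, which depends on the segment as a whole, is left out.

```python
import numpy as np


def unfolding_landscape(single, pair, min_len=1, max_len=None):
    """Free energy change for unfolding every contiguous segment.

    single : (n,) per-residue contributions (hypothetical input).
    pair   : (n, n) symmetric pairwise contributions (hypothetical input).
    Returns an array indexed as [length - min_len, centre residue]; entries
    with no segment of that length centred there are NaN.
    """
    single = np.asarray(single, dtype=float)
    pair = np.asarray(pair, dtype=float)
    n = len(single)
    max_len = n if max_len is None else max_len
    landscape = np.full((max_len - min_len + 1, n), np.nan)
    for row, length in enumerate(range(min_len, max_len + 1)):
        for start in range(n - length + 1):
            inside = np.zeros(n, dtype=bool)
            inside[start:start + length] = True
            # Per-residue terms for the unfolded set U ...
            energy = single[inside].sum()
            # ... plus pair terms between U and the residues left folded.
            energy += pair[np.ix_(inside, ~inside)].sum()
            centre = start + (length - 1) // 2
            landscape[row, centre] = energy
    return landscape
```

A cross section such as the one in Figure 3.9 then corresponds to a single row of the returned array (for example, the row for an unfolded length of 5 residues).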
3.8 Comparison with Experimental NMR Data

As a test of the concordance between this theory and experimental data, we have compared a cross section through the energy landscape with hydrogen/deuterium exchange data from NMR studies of native β2M [295]. H/D exchange is well suited to comparison with this theory because it reports directly on local unfolding and refolding, the process by which exchange takes place. Less stable regions have a higher probability of unfolding and thereby exchange protons more readily, and vice versa. As seen in Figure 3.9, there is generally very good correspondence between the parts of the protein found to be protected from exchange and the regions predicted to have high stability in the native state. The only exceptions are residues 42 and 92, which are predicted to be relatively unstable in the native fold yet are protected from exchange. In the β2M structure these residues are located in the middle of a loop and point out into solvent, so it is not clear why Hoshino et al. found them to be protected.

Figure 3.10: Cooperativity in unfolding of secondary structural elements in β2-microglobulin (N terminus, β strands 1 to 7, loops 1 to 6, and C terminus; energies in kcal/mol). The main diagonal gives the total free energy change for unfolding each of the secondary structures listed (see the structure at right for reference). The upper triangular part of the figure gives the total free energy of unfolding the two secondary structural elements indicated by the labels at the left and bottom. The brighter the red, the greater the overall barrier to unfolding the two regions simultaneously.
The lower triangular part gives the difference in free energy between unfolding the two indicated secondary structural elements simultaneously, compared to unfolding them separately (a measure of cooperativity in unfolding). The darker the blue, the more unfolding of one region facilitates unfolding of the other.

3.9 Cooperativity in Regional Unfolding

In Figure 3.8, only single regions of the protein are unfolded at a time, but generalization to simultaneous unfolding of two regions is straightforward as a way of investigating cooperativity in the unfolding mechanism. In Figure 3.10, all pairs of secondary structural elements are unfolded, and the degree to which unfolding of one reduces the barrier to unfolding of the other is shown. The high degree of cooperativity between β-strand 4 and β-strand 5 is noteworthy. Since β4 has the lower unfolding free energy, it is likely to unfold as a first step, which would then facilitate unfolding of β5 as a second step. Furthermore, the loop between β4 and β5 is also marginally stable, so unfolding of β4 and β5 would entail unfolding of this loop as well and thereby free a 20-residue segment to interact with other molecules in solution, including other β2M proteins or amyloid aggregates.

3.10 Capabilities, Assumptions, and Limitations of the Energy Landscape Model

As with any modelling approach, there are limits in the descriptiveness of the energy landscape theory. In forming the landscape by combining energies from two separate simulations of the folded and unfolded protein, it is implicitly assumed that the interaction of an unfolded region with the rest of the folded protein is similar, at least in sum, to its interaction with the rest of the protein when unfolded. This approximation is difficult to overcome without performing a separate simulation for each region of interest.
The effects of polar solvation energy are also not specifically accounted for in the energy function. A separate modelling effort, perhaps based on thermodynamic integration to determine the free energy of inserting each amino acid into solvent, could be undertaken for this purpose. A further limitation is that the parts of the protein that remain folded are not able to relax in response to unfolding of the region of interest. The free energies predicted therefore represent upper limits to the true values, because the folded part of the protein will in reality deform to find a new free energy minimum. However, the utility of the energy landscape theory lies less in calculating the absolute magnitude of free energy changes than in capturing the relative variations in stability between parts of the structure. The validity of the theory for this purpose should not be affected.

There is a range of maximum applicability for unfolded region length in the energy landscape theory. For sequence lengths less than 5 residues, unfolding is in some sense only a notional process, as the sequence is so tightly constrained by its endpoints that it cannot acquire much additional freedom. Conversely, past a length of 20-25 residues, the extensive structural disruption entailed by unfolding such a large segment means that the rest of the protein would likely unfold spontaneously. The theory is thus most useful for understanding unfolding events in the 5-20 residue range, corresponding to early steps in the unfolding pathway.

Notwithstanding these complications, this theory provides a unified and efficient way to extensively characterize and predict events in protein unfolding. As described in Chapters 5 and 6, the energy landscape theory has been used to confirm misfolding-specific epitopes in SOD1 and predict sites of local unfolding in cancer-associated proteins as a potential new avenue in tumour-specific immunotherapy.
In general this method may be adapted to compare any two ensembles of protein states, not just folded and unfolded.

Chapter 4 Prion Protein Stability and Misfolding

Using the mesoscopic theory of protein dielectrics developed in Chapter 2, we have calculated the salt bridge energies, total residue electrostatic potential energies, and transfer energies into a low dielectric amyloid-like phase for 12 species and mutants of the prion protein. Salt bridges and self energies play key roles in stabilizing secondary and tertiary structural elements of the prion protein. The total electrostatic potential energy of each residue was found to be invariably stabilizing. Residues frequently found to be mutated in familial prion disease were among those with the largest electrostatic energies. The large barrier to charged group desolvation imposes regional constraints on involvement of the prion protein in an amyloid aggregate, resulting in an electrostatic amyloid recruitment profile that favours regions of sequence between alpha helix 1 and beta strand 2, the middles of helices 2 and 3, and the region N-terminal to alpha helix 1. Among the proteins studied, stabilization due to salt bridges is lowest for disease-susceptible human mutants of prion protein. The average electrostatic binding potential correlates with a species' resistance to prion infection.

4.1 Introduction

Misfolded prion protein is the causative agent for a unique category of human and animal neurodegenerative diseases characterized by progressive dementia, ataxia, and death within months of onset [296]. These include Creutzfeldt-Jakob disease (CJD), fatal familial insomnia, and Gerstmann-Sträussler-Scheinker syndrome in humans, bovine spongiform encephalopathy in cattle, scrapie in sheep, and chronic wasting disease in cervids.
Unlike other infectious conditions that are transmitted by conventional microbes, the material responsible for propagation of prion diseases consists of an abnormally folded conformer of an endogenous protein, possibly in complex with host nucleic acids or sulfated glycans [297]. Soluble, natively-folded monomers of the prion protein (known as PrPC) may adopt an aggregated protease-resistant conformation known as PrPSc that is capable of recruiting additional monomers of PrPC and inducing them to misfold in a process of template-directed conversion. This results in ordered multimers of prion protein that, when fractured, act as additional seeds to propagate the misfold through the reservoir of PrPC present in brain. Although the conversion process may be initiated by an infectious inoculum of PrPSc, it may also arise spontaneously or due to mutations in the gene coding for PrP that predispose to misfolding.

Structurally, PrPC is a glycophosphatidylinositol-anchored glycoprotein of 232 amino acids comprising an N-terminal unstructured domain and a C-terminal structured domain of 3 α-helices (hereafter referred to as α1, α2, and α3 in order) and a short two-stranded antiparallel β-sheet (made of strands β1 and β2), while PrPSc has substantially enriched β content speculated to form a stacked β-helix [53] or extended β-sheet [52] conformation in the amyloid fibril. At a molecular level PrP misfolding is a physico-chemical process, with the propensity to misfold determined by the free energy difference between folded and misfolded states and the magnitude of the energy barrier separating them.
As in any protein system, electrostatic effects make significant contributions to the energies of the various states and take two forms: salt bridge energy due to spatial proximity of charged groups within the native protein, and solvation/self energy due to field energy storage in the ambient protein and water dielectric media. A priori, it is expected that electrostatic effects generally favour the well-solvated monomeric PrPC over the more hydrophobic amyloid PrPSc, since formation of PrPSc necessitates disruption of salt bridges in the native structure (although this may be compensated for by the formation of alternative salt bridges in PrPSc) and transfer of some charged groups into an environment of lower permittivity, both of which are energetically costly. However, these penalties on formation of PrPSc are counterbalanced by hydrogen bonding, hydrophobic, and possibly entropic contributions that favour the amyloid form [298]. Regional variation in the electrostatic transfer energy to water and amyloid may be useful in predicting participation in the amyloid core of PrPSc. Furthermore, several of the causative mutations for familial prion disease involve replacement of a charged residue by an uncharged one (such as the D178N mutation responsible for fatal familial insomnia or familial CJD, depending on mutant allele polymorphism status at codon 129) or charge reversal of a residue (such as E200K, the most common mutation in classical familial CJD) [48], offering an indication of the importance of electrostatic effects in the misfolding process. More broadly, it has been found that changes in the charge state of a mutant protein compared to wild-type relate to its tendency to form aggregates [299], and the aggregation propensity of a polypeptide chain is inversely correlated with its net charge [300]; similarly, aggregation propensity is maximal at the protein isoelectric point, where the net charge is zero [301].
Intrinsically unstructured proteins tend to have a high net charge [302], which increases the electrostatic cost for the system to condense into the folded structure. Sequence correlations between charged groups may affect the kinetics of amyloid formation as well [303].

The role of salt bridges in prion disease has been investigated previously by molecular dynamics simulation (MDS) and experimental studies of mutant protein. MDS of human PrPC has identified salt bridges that play a role in stabilization of the native structure [304]. Other MDS studies of the R208H mutation, which disrupts a salt bridge with residues D144 and E146 of α-helix 1, have shown that it results in global changes to the backbone structure [305]. Experimentally, the E200K mutant of PrPC has been shown through calorimetry to be 4 kJ/mol less stable than wild type [306]. Mutation of two aspartates participating in α1 intra-helix salt bridges to neutral residues increases misfolding fourfold in cell-free conversion reactions under conditions favouring salt bridge formation [307]. Interestingly, complete reversal of charges in α1 appears to inhibit conversion, possibly by preventing docking of PrPC and PrPSc [307]. The pH dependence of charge interactions in PrPC has also been investigated to identify those most sensitive to pH changes [308]; this is an important aspect of the problem because of the observed increased PrPC misfolding rate at low pH. A unifying analysis of all PrPC salt bridges would be useful in understanding their role in structural stability. As well, to our knowledge solvation energy contributions to the misfolding process have not yet been investigated; they would offer a helpful perspective for probing the propensity of different regions of the prion protein to participate in the PrPSc amyloid core.
Direct extraction of salt bridge and solvation energies from molecular dynamics is complicated by the need to run long simulations that sample the equilibrium between states of interest, which can be prohibitively slow for states that differ significantly in energy. An alternative approach is to use a continuum electrostatics description of the protein-water system, in which the response of the surrounding material is modelled as a macroscopic dielectric in the Poisson-Boltzmann equation, taking a low value (usually 4) within the volume of the protein and 78 (the dielectric constant of bulk water) outside it. The downside of this method is that it ignores subtleties of the protein response to perturbing fields, such as cooperative internal reorganization. Using results from Kirkwood-Fröhlich theory [209, 309, 220], we have recently developed a procedure to compute a spatially-varying dielectric function for a protein based on fluctuation statistics obtained from brief equilibrium MD simulations that capture much of the microscopic response of the protein at moderate computational cost [285] (see Chapter 2). This provides a convenient tool to calculate solvation and salt bridge energies for all residues in a protein from a single simulation. In what follows we apply this method to determine the energies for all salt bridges in 12 molecular species of prion protein and the transfer energy for all residues in these proteins into a hypothetical protein amyloid core.

[Surface plot; axes: X coordinate (Angstroms), Y coordinate (Angstroms), dielectric constant.]
Figure 4.1: Local heterogeneous dielectric map of human PrP as calculated by mesoscopic dielectric theory. This is a slice taken through the geometric centre of the molecule in the X-Y plane.
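To give a feel for how strongly the choice of dielectric constant scales a charge-charge interaction, the following minimal sketch evaluates the Coulomb energy of two point charges in a homogeneous dielectric, Esb = qA qB / (4π ε0 ε r), at the two limiting values mentioned above (4 and 78). This is the simple homogeneous expression, not the heterogeneous dielectric calculation developed in this work. The function name, the 5 Å separation, and the unit charges are illustrative assumptions rather than values taken from the simulations.

```python
# Illustrative sketch only: Coulomb energy of two point charges embedded in a
# homogeneous dielectric. This is NOT the heterogeneous-dielectric calculation
# described in the text.

# e^2 / (4*pi*eps0) expressed in kJ mol^-1 Angstrom (standard conversion factor).
COULOMB_CONSTANT_KJ_MOL_ANGSTROM = 1389.35


def coulomb_energy(q_a, q_b, r_angstrom, epsilon):
    """Return the interaction energy in kJ/mol.

    q_a, q_b   -- charges in units of the elementary charge
    r_angstrom -- separation in Angstroms (must be positive)
    epsilon    -- relative dielectric constant of the medium (must be positive)
    """
    if r_angstrom <= 0 or epsilon <= 0:
        raise ValueError("separation and dielectric constant must be positive")
    return COULOMB_CONSTANT_KJ_MOL_ANGSTROM * q_a * q_b / (epsilon * r_angstrom)


if __name__ == "__main__":
    # Hypothetical +1/-1 pair at 5 Angstroms in protein-like and water-like media.
    for epsilon in (4.0, 78.0):
        energy = coulomb_energy(+1.0, -1.0, 5.0, epsilon)
        print(f"epsilon = {epsilon:5.1f}: E = {energy:8.2f} kJ/mol")
```

Because the energy scales as 1/ε, the same pair interacts 78/4 = 19.5 times more strongly at ε = 4 than at ε = 78, which is why the treatment of the dielectric near the protein surface matters for the energies reported below.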
4.2 Methods

Twelve structures of various species and mutants of PrPC were selected from the Protein Data Bank (PDB), including the species human 1QLZ and 1QLX [310], cow 1DX0 [311], turtle 1U5L, frog 1XU0, chicken 1U3M [312], mouse 1AG2 and 1XYX [313, 19], dog 1XYK, pig 1XYQ, cat 1XYJ [18], wallaby 2KFL [314] and the human mutants D178N 2K1D [315] and E200K 1FKC [316]. They were taken as starting points for 5 ns all-atom molecular dynamics simulations using the CHARMM22 force field [230] with explicit pure solvent water, periodic boundary conditions, particle mesh Ewald electrostatics, a timestep of 2 fs, and a Lennard-Jones potential cutoff distance of 13.5 Å. The basic residues (ARG and LYS) were protonated, while histidine and the acidic residues (ASP and GLU) were deprotonated to reflect ionization conditions at pH 7. The system was first minimized for 200 steps before starting the simulation. Snapshots of the simulations were taken every 2 ps to build up an ensemble of equilibrium conformations for each protein. The dipole moments µ of all residue side chains and backbones were calculated at each snapshot and used to obtain the correlation coefficients

$$R_{ij} = \frac{\langle \mu_i \mu_j \rangle}{\sqrt{\langle \mu_i^2 \rangle \langle \mu_j^2 \rangle}} \tag{4.1}$$

for all pairs of Cartesian dipole components µi and µj, where the angle brackets denote an average over all snapshots (the thermal average). The matrix R of correlation coefficients was diagonalized to isolate the normal modes of dipole fluctuations, which describe the response of charged groups to perturbations around equilibrium. The R matrix for each protein was used to calculate the local dielectric map. See Figure 4.1 for the dielectric map of human PrP. These dielectric maps were then taken as input for the Poisson-Boltzmann solver APBS [221] to solve the linearized Poisson-Boltzmann equation on a $97^3$ mesh in 150 mM NaCl, again with periodic boundary conditions, to obtain the electrostatic energies required below. Atomic radii were assigned according to the CHARMM force field by the program PDB2PQR [317]. The often-used simplifying approximation of a constant internal protein dielectric constant of 4 and water dielectric constant of 78 was employed for comparison [318].

[Schematic; panels: salt bridge energy (Esb obtained from E0, EAB, EA, and EB for protein P with charged groups A and B) and total electrostatic energy (Eelec obtained from Ewhole, Ewhole−sc, and Esc for protein P with side chain SC).]
Figure 4.2: Schematic of the approach for calculating salt bridge and total electrostatic energies. Circular arrows denote solvation/self energies; straight arrows denote interaction energies. Labels refer to quantities in Equations 4.2 and 4.3.

Salt bridges in the set of proteins were identified by searching all pairs of charged groups for those within 12 Å of each other, whether the charges were alike or different. The energy of each salt bridge was determined by a mutation cycle designed to isolate the charge interaction energy from the energy in the surrounding dielectric milieu, as shown in Figure 4.2. For charged groups A and B, their salt bridge energy Esb was taken to be a function of the energy of the protein system with both charges in place, EAB, with one or the other charge removed, EA and EB, and with both charges removed, E0, as follows:

$$E_{sb} = E_0 + E_{AB} - (E_A + E_B). \tag{4.2}$$

Here, E0 contains only the self energy of the part of the protein not including A and B (labelled P in Figure 4.2), while EAB contains the self energies of A, B, and P as well as the pairwise interaction energies between A and B, A and P, and B and P. EA contains the self energies of A and P and their interaction energies (EB is analogous). Combining the terms as shown causes all the energies except the interaction energy of A and B to cancel.
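Two of the steps above reduce to a few lines of arithmetic, sketched here under stated assumptions. The first function evaluates Equation 4.1 exactly as printed (raw second moments; whether the mean dipole is subtracted beforehand is not specified in this section), and the second applies the mutation cycle of Equation 4.2. The function names and all numbers are hypothetical; in the actual procedure the four energies come from separate Poisson-Boltzmann solutions, and the subsequent diagonalization of R would need a symmetric eigensolver that is omitted here.

```python
import math


def dipole_correlation_matrix(snapshots):
    """Evaluate Equation 4.1 from a list of snapshots.

    snapshots -- list of equal-length lists; snapshots[t][i] is Cartesian
                 dipole component i at snapshot t.
    """
    n_snap = len(snapshots)
    n_comp = len(snapshots[0])
    # Thermal averages <mu_i mu_j>, taken over all snapshots.
    second = [[sum(s[i] * s[j] for s in snapshots) / n_snap
               for j in range(n_comp)] for i in range(n_comp)]
    # Normalise by sqrt(<mu_i^2><mu_j^2>); assumes no component is always zero.
    return [[second[i][j] / math.sqrt(second[i][i] * second[j][j])
             for j in range(n_comp)] for i in range(n_comp)]


def salt_bridge_energy(e_0, e_ab, e_a, e_b):
    """Equation 4.2: isolate the A-B interaction from four system energies."""
    return e_0 + e_ab - (e_a + e_b)


if __name__ == "__main__":
    # Hypothetical dipole components for four snapshots and three components.
    toy_snapshots = [[1.0, 2.0, -1.0], [2.0, 4.1, 0.5],
                     [-1.0, -2.2, 1.0], [0.5, 0.9, -0.5]]
    for row in dipole_correlation_matrix(toy_snapshots):
        print(["%6.3f" % value for value in row])

    # Hypothetical self (s_*) and interaction (i_*) energies in kJ/mol.
    s_p, s_a, s_b = -500.0, -120.0, -95.0
    i_ab, i_ap, i_bp = -5.0, -20.0, 9.0
    e_0 = s_p
    e_ab = s_a + s_b + s_p + i_ab + i_ap + i_bp
    e_a = s_a + s_p + i_ap
    e_b = s_b + s_p + i_bp
    # Every term except the A-B interaction cancels in the cycle.
    print(salt_bridge_energy(e_0, e_ab, e_a, e_b), "should equal", i_ab)
```

The toy decomposition makes the cancellation explicit: each self energy and each interaction with P appears once with a plus sign and once with a minus sign, leaving only the A-B term.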
Another cycle, also shown in Figure 4.2, was used to determine the total contribution of each residue to the electrostatic energy of the protein, Eelec. For each side chain in the protein, the electrostatic energies of the side chain Esc and the protein lacking the side chain Ewhole−sc were calculated in isolation in the protein dielectric environment and subtracted from the electrostatic energy of the intact protein Ewhole:

$$E_{elec} = E_{whole} - E_{sc} - E_{\mathrm{whole-sc}}. \tag{4.3}$$

The terms Esc and Ewhole−sc contain the self energies of the side chain and rest of the protein respectively, and Ewhole contains these self energies as well as the interaction energy of the side chain with the rest of the protein. Subtracting the terms as shown causes the self energies to cancel, leaving only the interaction energy between the side chain and the protein. This energy can be thought of as the electrostatic potential energy of a residue in the protein.

To approximate the electrostatic energy of residue transfer into a hydrophobic, low dielectric environment like the core of a PrPSc amyloid, the energy E(εPrPC) of a residue in the dielectric environment of PrPC was compared to the energy E(εPrPSc) of the residue in a homogeneous dielectric of ε = 4, which describes the dielectric response in the interior of a bulk amyloid protein phase. Since the nature of monopole fields in the PrPSc structure is unknown, interactions between charged residues are omitted from the calculation, so the transfer energy reflects only changes in the dielectric environment. For a given residue, the dielectric contribution to the transfer energy Etrans is:

$$E_{trans} = E(\varepsilon_{PrP^{Sc}}) - E(\varepsilon_{PrP^{C}}). \tag{4.4}$$

4.3 Dynamics of Dipoles at Equilibrium

The modes obtained by diagonalizing the correlation matrix R in Equation 4.1 generally involved several parts of the molecule; correlations were not limited to residues close in space or sequence.
This is consistent with phonon transmission of perturbations at one site throughout the molecule by strong steric coupling effects through solid-like elastic moduli. The four largest-amplitude dipole modes for human PrP are shown in Figure 4.3. Dipole fluctuations were not qualitatively different between species, but different regions of the molecule exhibited characteristic motions. The two long alpha helices 2 and 3 exhibited primarily synchronous motion, with the helices rocking back and forth together as a unit. Nonetheless, some dipoles in the helices exhibited contrary motion.

[Structure panels 1 to 4.]
Figure 4.3: The 4 largest amplitude dipole correlation modes for human PrP. Regions of the same colour move in synchrony, while regions of different colours move in opposition.

Alpha helix 1 did show some autonomy from the rest of the structure and tended to fluctuate as a group. Motion of the beta sheet is prominent in several of the modes. Two patterns stand out: a see-saw motion in which one strand tilts up as the other tilts down with both strands pivoting about the middle of the strands, and an in-out motion in which the outer beta strand (β1) and the N-terminal part of α2 move synchronously away from the inner beta strand (β2). The first motion is seen in modes 2 and 3 in Figure 4.3 above, while the second motion is seen in other lesser-amplitude modes. This is compatible with NMR observations of the beta sheet, which show slow exchange between a range of conformations [319, 320]. In the NMR experiments, motion of the beta strands was observed on a time scale of microseconds, while these simulations only spanned nanoseconds, but both are indicative of some degree of conformational flexibility in the beta sheet.

4.4 Salt Bridge Energies

The PrP structures analysed contained a diverse set of salt bridges, ranging from moderately attractive to weakly repulsive.
A complete list of salt bridges in all structures is presented in Table A.1; salt bridges in the human structure are shown in Table 4.1 for both the single NMR structure 1QLX and the ensemble of 20 NMR structures 1QLZ. Structurally, the salt bridges can be divided into local and nonlocal by the proximity in sequence of the participating residues. Local salt bridges, like Arg148–Glu152 in α1, Arg208–Glu211 in α3, and Arg164–Asp167 between β2 and the following loop, serve to stabilize secondary structural elements of the protein, while nonlocal salt bridges like Arg156–Glu196, Arg164–Asp178, and Glu146–Lys204 help to hold these elements together in the overall tertiary fold. Figure 4.4 shows the position of these nonlocal salt bridges in bovine PrP.

[Stereo image; labelled elements: α helix 1, α helix 2, α helix 3, β sheet, Glu 146, Arg 156, Arg 164, Asp 178, Glu 196, Lys 204.]
Figure 4.4: Stereo view of the three well-conserved nonlocal salt bridges as they are arranged in bovine PrP. The transparent surfaces show contours of equal dielectric (5 for blue; 70 for white) as determined from heterogeneous mesoscopic dielectric theory. The volume near the surface of the protein shows the greatest difference in dielectric on comparison of the homogeneous and heterogeneous dielectric fields. See also Figure 4.1 for a surface plot of the dielectric permittivity as a function of position.

Many of the salt bridges identified were near the protein surface, where the high degree of solvation attenuates their strength; the strongest salt bridges were those best sequestered from solvent, since this places them in a low dielectric environment that increases electric field strength. The strongest salt bridge of all, between residues 206 and 210 of frog PrP, features a special “two-pronged” geometry that enables the amino group of Lys 210 to associate with both carboxyl oxygens on Asp 206.
Interestingly, two strong but intermittent salt bridges are present in human 1QLZ between the C-terminal arginine and residue 167 in the β2−α2 loop and residue 221 in α3. The substantial variation between members of the NMR ensemble at the C-terminus results in large motion of the arginine side chain, so that these salt bridges are only formed in a subset of conformers. Similarly, the Arg164–Asp178 salt bridge that helps to anchor the beta sheet to α2 and α3 is not present in all members of the 1QLZ NMR ensemble, although it is quite strong in the single 1QLX structure. Although attractive salt bridges predominate, a number of repulsive salt bridges were identified, as seen in Table 4.2, especially in α1 and α3, which are crowded with several charged residues. As demonstrated in the following section, despite the presence of these destabilizing interactions, no residue experiences a net repulsive potential, as these unfavourable salt bridges are counterbalanced by the presence of other, stronger, favourable ones.

[Structure panels: Cat 147–151 (-11.6), Cat 156–202 (-11.8), Wallaby 156–202 (-13.3), Frog 206–210 (-29.7), Chicken 159–215 (+5.2).]
Figure 4.5: The 4 most attractive salt bridges (in black lettering) and the 1 most repulsive (in red lettering) identified in the set of PrP structures. Numbers in parentheses are the salt bridge energies in kJ/mol for the structures indicated. Note that interactions with all surrounding residues were considered when assessing the total effect of each residue on overall stability, as given in Table 4.3 and Table A.2.

The total energies due to all salt bridges in each molecular species studied are shown in Figure 4.6B. Of note is the much reduced total salt bridge energy in the two human mutants, E200K and D178N, compared to any other structure.
The categorization of species as susceptible or resistant to prion disease is somewhat approximate, but comparison of total salt bridge energy and disease susceptibility by Kendall’s tau gives a value of τ = 0.45, implying that the orderings of species by salt bridge energy and by disease susceptibility are significantly concordant (p = 0.046). Overall, the effect of the heterogeneous dielectric was to moderate salt bridges that appear strong under the biphasic protein-water approximation for the dielectric function.

The salt bridges listed in Table A.1 are those present at pH 7, but for human PrP an additional search was performed to identify salt bridges that would emerge at lower pH, since acidic conditions are known to drive PrPSc formation. Lower pH results in protonation of histidine residues to produce a positively charged species, which in human PrP enables the formation of three additional weakly attractive salt bridges (indicated by daggers in Table 4.1). While the dominant effect of lowering pH is to protonate acidic side chains, thus reducing electrostatic stability, this is partially compensated for by the formation of salt bridges involving histidine.
| Residues involved | r (Å) | Esb(1), 1QLX (kJ/mol) | Esb(1), 1QLZ (kJ/mol) | δEsb(1)/Esb(1) | Esb(2), 1QLX (kJ/mol) | Esb(2), 1QLZ (kJ/mol) | δEsb(2)/Esb(2) |
|---|---|---|---|---|---|---|---|
| †HIS 140 – ASP 147 | 6.6 | -4.0 | -3.7 | 0.30 | -4.6 | -4.2 | 0.40 |
| ASP 144 – ASP 147 | 7.6 | 2.7 | 4.4 | 0.19 | 2.9 | 5.4 | 0.32 |
| GLU 146 – LYS 204 | 7.4 | -2.5 | -3.0 | 0.29 | -2.5 | -3.4 | 0.43 |
| ARG 148 – ARG 151 | 7.9 | 1.8 | 2.2 | 1.23 | 1.4 | 2.3 | 0.51 |
| ARG 148 – GLU 152 | 4.8 | -7.4 | -5.0 | 0.25 | -35.4 | -14.6 | 0.51 |
| ARG 156 – GLU 196 | 4.9 | -5.5 | -5.1 | 0.32 | -18.2 | -11.8 | 0.96 |
| ARG 156 – ASP 202 | 7.1 | -3.8 | -3.0 | 0.27 | -4.8 | -3.8 | 0.39 |
| ARG 164 – ASP 167 | 6.9 | -3.0 | -2.4 | 0.54 | -3.7 | -2.7 | 0.95 |
| ARG 164 – ASP 178 | 6.0 | -20.4 | -4.8 | 1.23 | -48.2 | -8.2 | 1.44 |
| †HIS 177 – GLU 207 | 6.6 | -2.8 | -2.8 | 0.43 | -2.7 | -3.1 | 0.58 |
| †HIS 187 – ASP 202 | 8.0 | -3.0 | -3.2 | 0.25 | -5.1 | -4.6 | 0.30 |
| GLU 196 – ASP 202 | 7.9 | 2.9 | 2.4 | 0.22 | 2.9 | 2.7 | 0.25 |
| GLU 200 – LYS 204 | 4.1 | -4.9 | -3.7 | 0.39 | -7.5 | -6.5 | 0.85 |
| ARG 208 – GLU 211 | 2.6 | -9.3 | -5.8 | 0.30 | -37.9 | -14.9 | 0.66 |

Table 4.1: Salt bridges in the human prion protein from 1QLX (a single structure) and 1QLZ (an ensemble of 20 structures). †: Only present at low pH. The separation between charged groups is given by r. Esb(1) is the salt bridge energy (an average for 1QLZ) calculated with the heterogeneous dielectric theory, and Esb(2) is the same energy calculated with a constant protein dielectric of 4. The standard deviation of salt bridge energy over the structures in the NMR ensemble containing the salt bridge is given as a fraction of the total salt bridge energy by δEsb/Esb. The correlation coefficient between salt bridge energies from 1QLX and 1QLZ is 0.82.
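The agreement between single-structure and ensemble energies quoted in the caption can be checked with an ordinary Pearson correlation. The sketch below applies it to the heterogeneous-dielectric Esb(1) columns of Table 4.1 as transcribed here. It is an assumption that these fourteen rows and these two columns are the ones behind the reported coefficient, so the printed value need not reproduce 0.82; the function name is likewise illustrative.

```python
import math

# Esb(1) values in kJ/mol transcribed from Table 4.1, in row order.
ESB_1QLX = [-4.0, 2.7, -2.5, 1.8, -7.4, -5.5, -3.8,
            -3.0, -20.4, -2.8, -3.0, 2.9, -4.9, -9.3]
ESB_1QLZ = [-3.7, 4.4, -3.0, 2.2, -5.0, -5.1, -3.0,
            -2.4, -4.8, -2.8, -3.2, 2.4, -3.7, -5.8]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    print(f"Pearson r (Esb(1), 1QLX vs 1QLZ): {pearson_r(ESB_1QLX, ESB_1QLZ):.2f}")
```

A single outlying pair, such as ARG 164 – ASP 178, which is much stronger in 1QLX than in the 1QLZ average, can dominate a coefficient computed from so few points, so the result is best read alongside the per-row values.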
| PDB | Species | Residues involved | n (/20) | Esb(1) (kJ/mol) | δEsb(1)/Esb(1) | Esb(2) (kJ/mol) | δEsb(2)/Esb(2) |
|---|---|---|---|---|---|---|---|
| 1XU0 | Frog | ASP 206 – LYS 210 | 20 | -21.4 | 0.31 | -42.2 | 0.37 |
| 1QLZ | Human WT | GLU 221 – ARG 228 | 12 | -13.7 | 0.07 | -76.6 | 0.07 |
| 2KFL | Wallaby | ARG 156 – ASP 202 | 20 | -13.3 | 0.88 | -18.0 | 0.90 |
| 1QLZ | Human WT | ASP 167 – ARG 228 | 11 | -13.1 | 0.06 | -75.5 | 0.08 |
| 1XYJ | Cat | ARG 156 – ASP 202 | 20 | -11.8 | 0.45 | -21.5 | 0.69 |
| 1XYJ | Cat | ASP 147 – ARG 151 | 20 | -11.6 | 0.27 | -40.9 | 0.33 |
| 2KFL | Wallaby | ARG 156 – GLU 196 | 20 | -10.9 | 0.32 | -29.6 | 0.51 |
| 1XYX | Mouse | ARG 156 – ASP 202 | 20 | -10.4 | 0.37 | -18.9 | 0.46 |
| 1QLZ | Human WT | ASP 147 – ARG 151 | 20 | -9.9 | 0.38 | -30.4 | 0.53 |
| 1XYX | Mouse | GLU 146 – LYS 204 | 20 | -9.7 | 0.26 | -27.2 | 0.44 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2K1D | Human D178N | ARG 156 – LYS 194 | 20 | 3.4 | 0.42 | 4.9 | 0.84 |
| 1XYQ | Pig | GLU 207 – GLU 211 | 20 | 3.4 | 0.24 | 4.0 | 0.30 |
| 1FKC | Human E200K | LYS 204 – ARG 208 | 20 | 3.4 | 0.15 | 3.7 | 0.24 |
| 1XYQ | Pig | ARG 148 – ARG 151 | 20 | 3.5 | 0.09 | 4.1 | 0.24 |
| 1XYX | Mouse | GLU 207 – GLU 211 | 10 | 3.5 | 0.14 | 4.0 | 0.20 |
| 1XYK | Dog | GLU 207 – GLU 211 | 20 | 4.0 | 0.18 | 4.8 | 0.25 |
| 1QLZ | Human WT | ASP 144 – ASP 147 | 19 | 4.4 | 0.18 | 5.4 | 0.31 |
| 1XU0 | Frog | LYS 197 – LYS 210 | 19 | 4.7 | 0.49 | 7.7 | 0.84 |
| 2KFL | Wallaby | GLU 196 – ASP 202 | 20 | 4.9 | 0.14 | 7.4 | 0.22 |
| 1U3M | Chicken | GLU 159 – GLU 215 | 20 | 5.2 | 0.15 | 6.2 | 0.43 |

Table 4.2: The most attractive and repulsive salt bridges in the set of prion protein structures studied. The number of NMR conformers for each species in which the salt bridge is present is n. Esb(1) is the salt bridge energy (an average for 1QLZ) calculated with the heterogeneous dielectric theory, and Esb(2) is the same energy calculated with a constant protein dielectric of 4. The standard deviation of salt bridge energy over the structures in the NMR ensemble containing the salt bridge is given as a fraction of the total salt bridge energy by δEsb/Esb.
[Panel A: histogram; axes: Salt Bridge Energy (kJ/mol), Number of Salt Bridges. Panel B: bar chart of Total Salt Bridge Energy (kJ/mol) for, in plotted order, Human D178N, Dog, Pig, Chicken, Turtle, Bovine, Cat, Human WT, Human E200K, Mouse, Wallaby, Frog. Legend: resistant to prion disease; susceptible to infection but does not develop prion disease spontaneously; susceptible to infection and develops prion disease spontaneously; familial prion disease mutations.]
Figure 4.6: Total salt bridge energies by species. A. Histogram of average energies for all salt bridges identified. B. Total salt bridge energies in the molecular species studied, calculated according to Equation 4.2. Error bars give the 95% confidence interval for the mean salt bridge energy from each ensemble of NMR structures.

| 1QLX: Res | Site | Eelec (kJ/mol) | Eelec/Avg(Eelec) | 1QLZ: Res | Site | Eelec (kJ/mol) | Eelec/Avg(Eelec) |
|---|---|---|---|---|---|---|---|
| THR 183 | α2 | -197 | 6.0 | THR 183 | α2 | -171 | 5.9 |
| ASP 147 | α1 | -196 | 5.9 | TYR 150 | α1 | -171 | 5.9 |
| TYR 150 | α1 | -171 | 5.2 | ASP 202 | α3 | -135 | 4.6 |
| ARG 136 | β1-α1 | -151 | 4.6 | TYR 157 | α1-β2 | -97 | 3.3 |
| ASP 202 | α3 | -137 | 4.2 | ARG 164 | β2 | -81 | 2.8 |
| ARG 164 | β2 | -99 | 3.0 | THR 192 | α2-α2 | -67 | 2.3 |
| ASP 178 | α2 | -96 | 2.9 | ASP 178 | α2 | -67 | 2.3 |
| GLU 221 | α3 | -92 | 2.8 | VAL 210 | α3 | -65 | 2.2 |
| TYR 157 | α1-β2 | -75 | 2.3 | ARG 136 | β1-α1 | -63 | 2.1 |
| VAL 210 | α3 | -73 | 2.2 | GLU 221 | α3 | -61 | 2.1 |

Table 4.3: Residues in human PrP (1QLX and 1QLZ) with the greatest total electrostatic energy from Equation 4.3. The last column gives the factor by which each residue’s electrostatic energy exceeds the average for all residues in the protein (-33 kJ/mol). THR 183 has the lowest electrostatic energy in human PrP and is also a minimum on the hydrophobic transfer profile (see Figure 4.7), due presumably to its low dielectric local environment.
VAL 210 appears in the list despite being a putatively nonpolar residue because it is located in a region of particularly low dielectric in the protein core, which increases the energy of its small side chain methyl group dipoles. β1-α1: Between β1 and α1. α1-β2: Between α1 and β2. Bold: Wild-type residues at the locations of known mutation sites in familial prion diseases. The correlation between transfer energies from 1QLX and 1QLZ is 0.87. With 90% confidence, the list is significantly enriched in pathologic mutations compared to random chance (p = 0.096).

4.5 Total Residue Electrostatic Energies

The salt bridge energies describe pairwise effects, but for mutational analysis it is more important to know the total contribution of each side chain to the stability of the protein. These energies approximate the electrostatic contribution to the energy change on mutation to a residue with a small nonpolar side chain like alanine. In practice, the side chain of each residue is removed from the protein. The total electrostatic energy of each residue in all prion proteins studied was less than or equal to 0, indicating a strong degree of evolutionary selection toward residues that benefit stability in the folded conformation. Though electrically neutral, or nearly so, proteins have their internal dipoles oriented so as to lower the potential energy of every residue. In human PrP, it is instructive to correlate the energies to known pathogenic mutations: the residue with the greatest overall stabilizing energy, Thr183, is implicated in familial CJD by a T183A mutation [48]; this mutation has also been shown to radically reduce measured stability by urea denaturation [321]. It is interesting to note that this residue, although not charged, is polar and more deeply buried in the hydrophobic core of the protein than any charged residue, thereby enhancing the effect of dipolar attractions with its neighbors. Other residues that on mutation cause familial prion disease have especially high total electrostatic stabilizing energies, including D178 and D202. Table 4.3 gives the 10 human side chains with the greatest total electrostatic energies. We might anticipate that mutation of other residues in Table 4.3 may enhance the probability of developing misfolding-related disease.

[Line plot; axes: Residue Index (130 to 220), Nonpolar Transfer Energy (kJ/mol); series: Human, Bovine, Dog, Turtle. Inset, Average (kJ/mol): Human 820, Bovine 900, Dog 910, Turtle 970. Secondary structure track: β strand 1, α helix 1, β strand 2, α helix 2, α helix 3.]
Figure 4.7: Hydrophobic transfer energy for strands of 7 residues centred on a given residue index as calculated from Equation 4.4 for 4 species of PrPC. The numbers in the upper left hand corner give the average transfer energy over the whole protein. Shown below are the locations of secondary structural elements in the PrPC sequence.

4.6 Transfer to Hydrophobic Environment

In forming the amyloid core of PrPSc, some residues must migrate to a region of low dielectric constant. For highly charged residues, this transfer energy is prohibitively high and may thereby exclude their participation in the amyloid core, while for nonpolar residues the small electrostatic transfer energy cost is overcome by favourable solvation entropy changes. By mapping the transfer energy of each residue into a region of low dielectric approximating PrPSc amyloid, it is possible to predict the likelihood of recruitment for various PrP regions into the amyloid core, without the aid of specific dipole-dipole correlations as might be present in the amyloid. Figure 4.7 shows the transfer energy from the PrPC dielectric to a homogeneous dielectric of 4 for various species of PrPC.
This value for the dielectric was chosen to account for electronic polarizability contributions to the dielectric response in the amyloid phase that will raise the dielectric constant above that of vacuum (see Figure 2.6). The transfer energy to an aqueous environment would show an inverse pattern. A 7 amino acid summing window is applied because sequence heterogeneity causes large variation between adjacent residues, and individual residues cannot enter the amyloid core without placing their neighbors in it as well. The transfer energies in Figure 4.7 are quite large, but including other terms in addition to the electrostatic energies considered here will reduce the magnitude of the total transfer free energy.

There is considerable variation in the transfer energy along the sequence, with the lowest barrier to dielectric transfer for the region between α1 and β2, the middles of α2 and α3, and β1. Conversely, α1, the loop between β2 and α2, and the loop between α2 and α3 show a formidable barrier to transfer. This overall pattern is well preserved among all PrP structures studied (see Figure A.1). Immunological studies have defined β2 as a PrPSc-specific epitope [44], which presumably necessitates its surface exposure. In the human structure, β2 is located at the border between regions of low and high transfer energies, so it is possible that it is close in proximity to the amyloid core but protrudes sufficiently to be recognised by antibodies.

The overall contour of the transfer energy functions is similar for all PrP structures studied, but there is some variation that correlates with known infectivity data.
As seen in Figure 4.7, human and bovine PrP share highly similar transfer energy profiles and are both susceptible to prion disease and interspecies transmission of disease, while non-mammalian turtle PrP that does not form PrPSc has a different profile, with a higher transfer energy barrier than cow or human over 4/5 of the sequence. PrPC from dog, a mammalian species known to be resistant to prion infection [322], is intermediate between the human and turtle profiles. The average transfer energies correlate with a species’ resistance to disease (Figure 4.7 legend). Some other species, including chicken, turtle, and wallaby, have a qualitatively different transfer energy function (see Appendix A).

4.7 Discussion of Electrostatic Effects in Prion Misfolding

Although electrostatic effects are only one contribution to the enigmatic PrPC → PrPSc conversion process, they offer clues to many of the central questions in prion biochemistry. The spatial variation of the dielectric is important, as neither charge separation distance, dielectric constant at the midpoint of the salt bridge, nor burial can predict salt bridge stability or total electrostatic energy (see Figure 4.8). As shown above, salt bridges play an important role in stabilizing both secondary and tertiary aspects of the PrPC structure. In fact, the total energy of all salt bridges in human PrPC is -60 kJ/mol, almost twice the total stability of the protein as determined by calorimetry [306, 321]. Thus disruption of even a proportion of salt bridges in PrPC is sufficient to substantially destabilize the folded conformation, possibly accelerating or enabling the transition to PrPSc in the right conditions. However, as has been observed elsewhere [323], the free energy change of salt bridge disruption is not equal to the Coulombic energy of the salt bridge itself, due to the competing favourable reduction in desolvation energy. This will substantially offset the change in stability from salt bridge disruption.

[Scatter plots. Panel A: Etrans (kJ/mol) against SASA (Å²), R = 0.27. Panel B: |Esb| (kJ/mol) against 1/r (Angstroms⁻¹), R = 0.63. Panel C: |Esb| (kJ/mol) against 1/(εr) (Angstroms⁻¹), R = 0.76.]
Figure 4.8: A. Total electrostatic energies of side chains in human PrP calculated from Equation 4.3 plotted as a function of solvent-accessible surface area (SASA), a statistic related to burial in the core of the protein. There is a weak but significant (p = 0.01) correlation between the energy and SASA, but SASA alone is a poor proxy for the electrostatic energy. B. Absolute salt bridge energies as a function of the reciprocal of the separation distance between the participating charged groups. C. Absolute salt bridge energies as a function of the reciprocal of the separation distance multiplied by the dielectric constant ε at the midpoint of the line joining the charged groups. In a homogeneous dielectric environment, the energy of a salt bridge connecting charges qA and qB would be given by $E_{sb} = \frac{1}{4\pi\varepsilon_0}\frac{q_A q_B}{\varepsilon r}$. Including 1/ε in the fit in addition to 1/r improves the correlation from 0.63 to 0.76, but even taken together the quantities ε and r do not reliably predict salt bridge energies. Thus a direct calculation of energies from the full heterogeneous dielectric map is necessary to obtain accurate results.

Charge interactions may also participate in the poorly-understood association between the unstructured N-terminal domain and structured C-terminal domain of PrPC. As shown previously, the C-terminal domain organization depends on the length of N-terminal tail present [38], possibly through a collection of transient interactions below the detection threshold of NMR, resulting in an “avidity-enhanced” C-terminal structure.
Specifically, a variable set of intermittent weak contacts may stabilize a loose association between regions of a protein, even in the absence of stronger contacts revealed by NMR. Charge complementarity between the N- and C-terminal domains provides one explanation for this phenomenon. For example, the very N-terminus of PrPC contains the highly positively charged region KKRPK from codons 25–29, while α1 contains the highly negatively charged region DYED from codons 144–147. If the N-terminal tail is free to explore a random walk around the C-terminal domain, electrostatic attractions are likely to bias this part of the tail toward residence near α1, a region that is especially influenced by the length of tail present. The net attraction between the N-terminal tail and C-terminal structure may be insufficient to structure the tail but sufficient to collapse or condense the tail onto the surface of the structured domain, resulting in a kind of “molten shell.” Further exploration of this phenomenon, by molecular dynamics or other tools, may prove insightful.

The importance of acidic conditions in the PrP conversion has been known for some time [324], and acidity exerts a large effect through modification of the protonation states of charged residues. At slightly acidic pH below the pKa of histidine (6.5), protonation of the histidine imidazole ring creates mildly stabilizing salt bridges with nearby residues. In the prion literature, histidines are a subject of considerable attention for their ability to coordinate copper ions in the octapeptide repeat region of the N-terminal domain [325, 326], but it seems that they also help to protect PrPC from the stress of mildly acidic conditions. At much lower pH, however, protonation of glutamate and aspartate side chains ablates some of the stabilizing salt bridges shown in Table 4.1, which substantially reduces the energy barrier to rearrangement of PrPC components.
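The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch and not the PROPKA calculation used in this work: it assumes each titratable group behaves independently with a fixed pKa. The histidine pKa of 6.5 is taken from the text; the glutamate and aspartate values are generic model-compound assumptions, and the function and dictionary names are hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a titratable group in its protonated form
    according to the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values: histidine is from the text; Glu and Asp are assumed
# model-compound values, not residue-specific predictions.
MODEL_PKA = {"His": 6.5, "Glu": 4.25, "Asp": 3.65}

if __name__ == "__main__":
    for ph in (7.4, 6.0, 4.5):
        row = ", ".join(
            f"{res}: {fraction_protonated(ph, pka):.2f}"
            for res, pka in MODEL_PKA.items()
        )
        print(f"pH {ph}: {row}")
```

In this idealized picture histidine becomes mostly protonated once the pH falls below 6.5, whereas glutamate and aspartate begin to lose their negative charge only at considerably lower pH. Residue-specific pKa shifts in the folded protein, which PROPKA estimates, can move these thresholds substantially.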
For example, at pH 4.5, the pKa predictor program PROPKA [327] identifies glutamate residues 168, 200, 219, and 221 as being significantly protonated, which will affect the stability of the protein; a systematic investigation of this effect is a topic for future work. The influence of acidity on the monomeric PrPC structure has been extensively studied by molecular dynamics [328, 329, 51], but perhaps the most noteworthy effect of acidity may not be in the resulting structural transition of the isolated PrPC monomer but rather in lowering the barrier to induced reorganization in the presence of the templating species.

Another natural question is the role that electrostatics play in the formation of the PrPSc amyloid. It may be argued that since the transfer energy profile in Figure 4.7 neglects the possibility of forming strong salt bridges in the low dielectric amyloid core of PrPSc, it misrepresents the ability of these charges to stably occupy the amyloid. A counterexample may be constructed in a case of homogeneous dielectrics. The total energy change $\Delta E_{\mathrm{total}}$ on bringing two opposite charges A and B, both of charge magnitude $q$ and radius $r_{\mathrm{ion}}$, from a large distance apart in a medium with high dielectric $\varepsilon_{\mathrm{solv}}$ like water into close proximity $r_{AB}$ in a region of low dielectric $\varepsilon_{\mathrm{prot}}$ to form a salt bridge is equal to the sum of the solvation energy changes $\Delta E^{A}_{\mathrm{solv}}$ and $\Delta E^{B}_{\mathrm{solv}}$ and their pairwise Coulomb energy $\Delta E_{AB}$. Treating the charges as Born ions, in the limit where $\varepsilon_{\mathrm{prot}}/\varepsilon_{\mathrm{solv}} \ll 1$ (a valid assumption since generally $\varepsilon_{\mathrm{solv}} = 78$ and $\varepsilon_{\mathrm{prot}} = 4$, so $\varepsilon_{\mathrm{prot}}/\varepsilon_{\mathrm{solv}} \approx 0.05$), the total energy change to form the desolvated salt bridge is:

\[
\Delta E_{\mathrm{total}} = \Delta E^{A}_{\mathrm{solv}} + \Delta E^{B}_{\mathrm{solv}} + \Delta E_{AB}
= 2q^{2}\left(\frac{1}{2r_{\mathrm{ion}}}\right)\left(\frac{1}{\varepsilon_{\mathrm{prot}}} - \frac{1}{\varepsilon_{\mathrm{solv}}}\right) - \frac{q^{2}}{\varepsilon_{\mathrm{prot}}\, r_{AB}}
\approx \frac{q^{2}}{\varepsilon_{\mathrm{prot}}}\left(\frac{1}{r_{\mathrm{ion}}} - \frac{1}{r_{AB}}\right). \tag{4.5}
\]

This is always positive since $r_{AB}$ is greater than $r_{\mathrm{ion}}$ to satisfy the stereochemistry of the atoms.
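A numerical illustration of Equation 4.5 may help here. The sketch below evaluates both the full expression and its limiting form for one pair of unit charges. The ionic radius and separation are hypothetical round numbers chosen only for illustration, the function names are not from this work, and the prefactor 1389.35 kJ mol⁻¹ Å is the standard value of e²/(4πε₀) used to convert the Gaussian-unit formula into kJ/mol.

```python
# e^2 / (4*pi*eps0) in kJ mol^-1 Angstrom, for charges given in units of e.
COULOMB_KJ_MOL_A = 1389.35


def born_desolvation(q, r_ion, eps_prot, eps_solv):
    """Born energy change for moving one ion from solvent into protein."""
    return COULOMB_KJ_MOL_A * q**2 / (2.0 * r_ion) * (1.0 / eps_prot - 1.0 / eps_solv)


def buried_salt_bridge_energy(q, r_ion, r_ab, eps_prot=4.0, eps_solv=78.0):
    """Full form of Eq. 4.5: two desolvation terms plus the Coulomb attraction."""
    desolvation = 2.0 * born_desolvation(q, r_ion, eps_prot, eps_solv)
    coulomb = -COULOMB_KJ_MOL_A * q**2 / (eps_prot * r_ab)
    return desolvation + coulomb


def buried_salt_bridge_energy_approx(q, r_ion, r_ab, eps_prot=4.0):
    """Limiting form of Eq. 4.5 for eps_prot / eps_solv << 1."""
    return COULOMB_KJ_MOL_A * q**2 / eps_prot * (1.0 / r_ion - 1.0 / r_ab)


if __name__ == "__main__":
    # Hypothetical values: unit charges, 2 Angstrom radius, 4 Angstrom separation.
    q, r_ion, r_ab = 1.0, 2.0, 4.0
    exact = buried_salt_bridge_energy(q, r_ion, r_ab)
    approx = buried_salt_bridge_energy_approx(q, r_ion, r_ab)
    print(f"full expression: {exact:.1f} kJ/mol")
    print(f"limiting form:   {approx:.1f} kJ/mol")
```

For any r_AB greater than r_ion the limiting form is positive, and the full expression differs from it only by the term q²/(r_ion ε_solv) that the approximation drops.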
Thus although salt bridges may partially mitigate the burial of charged residues, they cannot alter the fundamental unfavourability of the electrostatic component of this process. It is also possible that solvent-exposed salt bridges may form in the misfolded state outside or on the surface of the amyloid or oligomeric core, which could occur without the desolvation penalty described above. This would provide a mechanism to stabilize charged and polar parts of the protein in the misfolded form. Such salt bridges are likely to be relatively low in energy due to the high ambient dielectric environment, and it has been observed for amyloid-beta 16-22 peptide that hydrophobic forces are more important than specific salt bridges in driving amyloid formation [330]. However, as mentioned above, it has been shown that the net charge of a polypeptide chain confers resistance to aggregation [300, 301], consistent with the notion of a higher overall energetic cost of transfer into a low dielectric medium for more highly charged polypeptides. In light of this, we believe the transfer energy profiles accurately convey this part of the obstacle to amyloid formation.

Continuum electrostatics as a tool to examine protein behaviour has limitations, namely that it ignores the microscopic response of the system and thereby risks omitting subtle but important effects. However, by deriving the dielectric map from all-atom molecular dynamics simulations of the proteins of interest, we are able to substantially incorporate the microscopic response in our model and thereby improve the reliability of the energy estimates obtained. Previous theories could not reliably predict the effective dielectric constant inside a protein, so values typically between 4 and 10 have been used as initial guesses.
Stronger salt bridges in the interior tend to be better predicted by an interior dielectric of 4, which would then overestimate the strength of the more abundant salt bridges on the protein surface. An interior dielectric of 10 best predicts the strength of the abundant surface salt bridges, but would then underestimate the strength of the buried interior salt bridges. The heterogeneous dielectric theory in Chapter 2 makes it unnecessary to guess at the value of the dielectric inside a protein and also indicates that no single value in the interior is satisfactory. Quantum effects due to electronic polarizability may be added to this approach as further refinement. The conformational variability in the ensemble of NMR structures for each PrP molecule also introduces an inherent uncertainty in the calculation of electrostatic energies, which we treated by averaging salt bridge energies over all NMR ensemble members. The molecular dynamics relaxation methods, often done in the absence of counterions, may introduce uncertainty as well.

Electrostatic considerations are relevant to many aspects of the prion question, from PrPC dynamics and stability to PrPSc amyloid organization and templating. We have presented an analysis of salt bridge, electrostatic, and hydrophobic transfer energies that provides a useful perspective for understanding the structural vulnerabilities of PrPC.

Chapter 5

SOD1 Misfolding as a Template-Directed Process

Human wild-type superoxide dismutase 1 (wtSOD1) is known to co-aggregate with mutant SOD1 in familial amyotrophic lateral sclerosis (FALS), in double transgenic mouse models of FALS, and in cell culture systems, but the structural determinants of this process and its functional consequences are unclear. In this chapter, the effects of intracellular obligately misfolded SOD1 mutant proteins on natively structured wild-type SOD1 are dissected at a molecular level.
Expression of the enzymatically inactive, natural familial ALS SOD1 mutants G127X and G85R in human mesenchymal and neural cell lines induces misfolding of wild-type natively-structured SOD1, as indicated by acquisition of immunoreactivity with SOD1 misfolding-specific monoclonal antibodies. Conversely, expression of G127X and G85R in mouse cell lines did not induce misfolding of murine wtSOD1, and a species restriction element for human wtSOD1 conversion was mapped to a region of sequence divergence in loop II and β-strand 3 of the SOD1 β-barrel (residues 24-36), then further refined, surprisingly, to a single tryptophan residue at codon 32 in human SOD1. Finally, we show that aggregated recombinant G127X is capable of inducing misfolding of recombinant human wtSOD1 in a cell-free system in buffered saline containing reducing and chelating agents. These observations demonstrate that misfolded SOD1 can induce misfolding of natively structured wtSOD1 in a physiological intracellular milieu, in a manner consistent with a direct protein-protein interaction.

As described below, culture medium from cells transiently transfected with wild-type or mutant SOD1 can induce misfolding of endogenous SOD1 when added to naive cell cultures, and this process can be stably propagated through serial passages. The agent responsible for induction of misfolding was determined to be a misfolded SOD1 aggregate. Transmission of SOD1 misfolding in vitro is abrogated by extracellular pan- and misfolding-specific SOD1 antibodies. This evidence collectively suggests that SOD1 misfolding and toxicity can propagate between cells in a prion-like process that may prompt novel targeted therapies for ALS and other neurodegenerative diseases.
5.1 Introduction

Amyotrophic lateral sclerosis (ALS) is caused by the degeneration of motor neurons in the brain, brainstem, and spinal cord [331], resulting in progressive paralysis of the limbs and of the muscles of speech, swallowing, and respiration. ALS is responsible for about 1 in 1000 adult deaths, with 80% of individuals dying within 2-5 years of diagnosis [332]. About 10% of ALS cases display autosomal dominant inheritance [333], with ∼20% of these cases due to mutations in the gene encoding superoxide dismutase 1 (SOD1) [334], a ubiquitously-expressed free radical defence enzyme abundantly expressed in motor neurons. Over 151 familial ALS (FALS) SOD1 missense, nonsense, and intron splice-disrupting mutations have been catalogued to date [120], with no benign amino acid polymorphisms as yet identified. The collective evidence suggests that a cytotoxic gain of function is conferred by SOD1 mutations [331, 335], which has been variously attributed to generation of reactive oxygen and nitrogen species, cytoskeletal disruption, caspase activation, mitochondrial dysfunction, proteasome disruption, microglial activation, and other mechanisms [331, 336].

A well-studied consequence of SOD1 mutation and/or oxidation is a propensity of the protein to misfold and aggregate [337]. SOD1-containing neural deposits can be detected by immunohistochemistry (IHC) in motor neurons from familial ALS patients [338], and in transgenic [339] and tissue culture [340] models of the disease. Emerging evidence with misfolding-specific antibodies also identifies misfolded SOD1 in sporadic ALS (SALS) [132, 341], although some antibodies recognizing misfolded SOD1 in FALS do not show immunoreactivity in SALS, e.g. [202]. Natively folded, functional SOD1 scavenges destructive superoxide radicals from the cytosol, converting them into less toxic hydrogen peroxide [133].
However, misfolded SOD1 is capable of reacting nonspecifically with a variety of substrates to produce reactive oxygen and nitrogen species [134]. These toxic agents damage important intracellular structures, including microtubules, metabolic enzymes, and signalling proteins. Positive feedback loops have been identified between protein misfolding and excitotoxicity [342]; moreover, microglial activation can also be triggered by aggregated SOD1 [343], which is toxic for neurons via a glutamatergic mechanism [344].

Protein misfolding diseases have been classically understood as errors in proteostasis, in which the burden of misfolded species eventually overwhelms the compensatory mechanisms that normally keep their concentration low [337]. Alternatively, a pathologically disordered protein may recruit and induce misfolding of a natively folded isoform, by seeded polymerization or template assistance [138]. These molecular mechanisms may participate in the pathogenesis of several neurodegenerative conditions, including prion disease, Alzheimer’s disease, and Parkinson’s disease [69]. However, detailed molecular mechanisms are lacking for the propagation of protein misfolding in these diseases.

An important clinical feature of ALS is its spatiotemporal propagation through the neuroaxis [145]. The initial clinical presentation is usually focal, with expansion of affected muscle groups in a fashion suggesting contiguous spread through anatomic regions of the nervous system [79]. The outward spreading of pathology in ALS from an originating focus has been well documented in cross-sectional and longitudinal studies [135, 136]. Several theories have been proposed to account for this phenomenon, including diffusion of paracrine substances like cytokines through the extracellular environment, sequential activation of microglia, and axonal transport of a deleterious agent.
Additionally, astrocytes expressing mutant SOD1 have been observed to secrete a toxic factor selectively injurious to motor neurons [137]. An alternative hypothesis to account for contiguous spread of deficits in ALS is template-directed misfolding (TDM, the mechanism underlying the prion diseases [54]) of SOD1, in which protein that has adopted an aberrant non-native conformation is capable of binding to natively folded protein molecules and inducing a structural reconfiguration that causes them to adopt the same aberrant conformation. Misfolded SOD1 molecules may diffuse through the neuroaxis, propagating the misfold and its associated toxic effects as described above.

In an intriguing recent study [150], aggregates of misfolded mutant SOD1 were shown to be taken up from the extracellular environment by macropinocytosis and cause misfolding of endogenously expressed mutant SOD1. To validate the SOD1 TDM hypothesis as a contributor to the pathogenesis of ALS, it is necessary to show at a molecular level that misfolded SOD1 is capable of inducing misfolding of native SOD1, and that at a cellular level the misfolded species may transit between cells and induce misfolding in cells expressing wild-type native SOD1. In the following sections, we provide evidence to

• demonstrate intracellular recruitment and TDM of SOD1,
• characterize the structural determinants of this process,
• show that cells export misfolded SOD1 to their extracellular environment, and
• show that this exported species may induce misfolding in naive cells, even after serial passage.

5.2 Development of Antibodies Specific to Misfolded SOD1

We developed a constrained system in vitro in which cause and effect of participating SOD1 molecular species in misfolding could be effectively disentangled. We exploited two natural FALS SOD1 mutants: G127X, comprising a TGGG frameshift insertion in exon 5, and the full-length missense mutation G85R (Figure 5.1). Both G127X and G85R translation products migrate faster than wtSOD1 on gel electrophoresis, which can be visualized by direct immunoblotting as probed with pan-specific SOD1 affinity purified rabbit IgG. G127X possesses the added convenience of being distinguishable from wtSOD1 by mutually exclusive polyclonal antibodies (pAbs): one directed against the five non-native amino acids following Gly127 in the G127X SOD1 variant (GX-CT), the other directed against the C-terminal 25 amino acids of wtSOD1 deleted in the G127X mutant (WT-CT) (Figure 5.1). G127X expression also provides an opportunity to unambiguously identify induced misfolding of wtSOD1 by recognition of SOD1 misfolding-exposed epitopes in the deleted region, particularly by non-denaturing methods such as immunoprecipitation (IP) and immunofluorescence (IF) of “native” misfolded wtSOD1.

Figure 5.1: Sequence overview of (A) wild type (wt) and (B) G127X SOD1. Residue G85, which is mutated in FALS, and the relative locations of the DSE, GX-CT, and WT-CT epitopes are noted, with the corresponding antibodies used in this study. The pan-SOD1 antibody used in this study was a rabbit polyclonal antibody prepared by immunization with the whole wtSOD1 polypeptide. (C) Unfolding energy landscape for human wild-type SOD1. Minima in this landscape represent candidate DSEs. Panel C is labelled with the centre of the unfolded region (residue index), the length unfolded, and the unfolding free energy (kcal/mol). (D) Location of two DSEs in the native SOD1 structure.

Figure 5.2: The isolated DSE2 peptide adopts a random conformation at physiological temperatures. (A) The DSE2 peptide circular dichroism spectrum (molar ellipticity per residue against wavelength in nm) at a range of temperatures (5°C - 50°C); the spectrum deconvolution is shown in (B) and indicates a predominantly random coil conformation at all temperatures studied.

Using molecular dynamics and molecular modelling, the free energy of unfolding was calculated for all sequence-contiguous regions of the SOD1 protein. Electrostatic, van der Waals, solvation, pKa, and configurational entropy contributions were included in the calculation (see Chapter 4). Minima in the resulting unfolding energy landscape (Figure 5.1) represent regions of the protein with an increased propensity to loss of native structure that are therefore likely to be disrupted in the misfolding process. Sequences corresponding to three of the landscape minima were chosen as disease specific epitopes (DSEs). The monoclonal antibodies against the DSEs were prepared by immunizing mice with free peptides of the same sequence as the regions of SOD1 prone to loss of structure on misfolding, but the specific conformation of these SOD1 subsequences recognised by the antibodies as DSEs is unknown. Possible recognised conformations include: 1) a disordered structure representing a subset of the conformational ensemble accessible to the unfolded region; 2) an “avidity structure” stabilized by multiple, potentially transient but redundant, non-native interactions [38] that form on partial disruption of the native fold; or 3) extrusion or greater accessibility of intact secondary structural elements such as the short α-helix present in the SOD1 electrostatic loop within DSE2.
Planned peptide-antibody co-crystallization experiments will resolve this question by elucidating the structural features that enable DSE recognition.

We selected two disease specific epitopes (DSEs) of wtSOD1 (DSE1a and DSE2) (Figure 5.1), comprising regions that are inaccessible to antibody binding in natively-structured wtSOD1 but exposed on the molecular surface of the misfolded isoforms [200, 345]. DSE1a comprises residues 145-151 (the SOD1 epitope of the dimer interface, SEDI, previously reported [124]) with cysteic acid replacing Cys146. Substitution of this sulfonic acid derivative in SEDI was based on the reasoning that sulfhydryl Cys146, exposed by Cys57-Cys146 intrachain disulfide bond reduction accompanying dimer dissociation [346], is a ready substrate for oxidative modifications. Interestingly, the greatest regional fluctuations in molecular dynamics simulations (MDS) of wtSOD1 are seen in the vicinity of the electrostatic loop containing DSE2 and near DSE1a (Figure 5.1), which suggests that these epitopes are present in relatively unstable parts of the protein that are most susceptible to immunologically detectable conformational changes. 10C12, the DSE1a mAb utilized in these studies, specifically binds to in vitro oxidized SOD1 and disease-associated misfolded SOD1 as determined by ELISA, IP, and IHC [345]. DSE2 comprises residues 125-142 that form a segment of the SOD1 electrostatic loop (ESL), a structural element which is extruded and interacting with β-barrel elements in crystal structures of aggregated SOD1 [347]. 3H1, the DSE2 mAb utilized in these studies, specifically binds to SOD1 that has been misfolded in vitro by denaturants and mild oxidation, and to disease-associated misfolded SOD1 by IP, IF, and IHC [200, 345].

Figure 5.3: (A) Comparison of residue-by-residue root mean square deviations between wtSOD1 and G127X from 20 ns all-atom molecular dynamics. (B) Residue-by-residue comparison of B-factors, a measure of structural uncertainty, from the crystal structures of wt (PDB 1PU0) and G85R (3CQP). (C) 100 superimposed protein conformers of wt and G127X SOD1 taken from MD simulation and shaded according to the root-mean-square deviation of backbone Cα atoms. The structures are labelled WT (dimer) and G127X (monomer), with numbered regions marking the N-terminus, Zn-binding loop, C-terminus (WT), C-terminus (G127X), and electrostatic loop (including DSE2).

G127X and G85R SOD1 proteins are partially misfolded, enzymatically inactive SOD1 molecular species, which do not stoichiometrically bind structure-promoting copper or zinc ions [348, 349]. Moreover, the Cys146 necessary for formation of the monomer-stabilizing intrachain disulfide bond is deleted in G127X. Relative to wtSOD1, the instability of G85R has been demonstrated by its considerably reduced unfolding temperature in calorimetric studies [350]. Equivalent data is unavailable for G127X, but equilibrium MDS showed greater regional structural fluctuation compared with wild-type protein (Figure 5.3).

5.3 Expression of Misfolded SOD1 Induces Misfolding of wtSOD1 in Human Cells

For these studies, we employed an experimentally tractable cell culture system that did not overexpress human wtSOD1 (overexpression is associated with spontaneous misfolding in vitro and in vivo [144, 351]), and which tolerated misfolded or mutant SOD1 better than primary neurons. Transient transfection-mediated expression of G127X or G85R in HEK-293FT (HEK) cells induced misfolding of endogenously expressed human wtSOD1, as observed by IF microscopy with the DSE mAb 3H1 (Figure 5.4) and by IP with the DSE mAbs 3H1 and 10C12 (Figure 5.5A). To ensure that observed results are not merely due to stressors associated with the transfection process, all experiments include an empty vector (EV) control in which cells are transfected with non-coding DNA.

Figure 5.4: Expression of G127X mutant misfolded SOD1 induces misfolding in wtSOD1. IF images of G127X-transfected HEK 293FT (HEK) cells probed with (A) GX-CT polyclonal IgG, specific for the non-native C-terminus of G127X SOD1, and (B) the DSE2-specific monoclonal antibody 3H1, which recognizes only misfolded full-length SOD1. Image (C) shows the result of merging (A) and (B) with the nuclear-specific stain 4’,6-diamidino-2-phenylindole (DAPI); the inset shows untransfected cells stained with GX-CT, 3H1, and DAPI. Cells positive for 3H1 immunoreactivity also co-express G127X. Image (D) shows an empty vector (EV) control stained with GX-CT, 3H1, and DAPI; inset shows EV control stained with DAPI and pan-SOD1 antibody. Scale bars: 10 µm.

Figure 5.5: Association with and conformational conversion of misfolded mutant SOD1 and misfolded wtSOD1. (A) IP of lysates from transiently-transfected HEK cells. DSE mAbs 3H1 and 10C12 precipitate wtSOD1 in lysates from cells transfected with G127X or G85R, but not with empty vector (EV) control. The quantitation summary (% IPed wtSOD1) is shown below; error bars represent standard error. Values are the average of a minimum of five independent IP experiments. * denotes statistically significant difference compared to EV control. (B) IP of protease K-treated lysates from G127X- and G85R-transfected HEK cells demonstrating marked protease sensitivity exhibited by the mutant SOD1 variants and misfolded wtSOD1 immunoprecipitated by the DSE mAbs. A minimum of five replicates were performed for all immunoprecipitation experiments. (C) Protease sensitivity comparison of wtSOD1 and G127X mutant SOD1. Note that G127X is only detectable in the absence of Protease K. All immunoblots were probed with pan-SOD1 pAb.

Inspection of IF images shows that within doubly positive cells, immunoreactivity for G127X and misfolded wtSOD1 was not entirely congruent and immunoreactivity for 3H1 had a punctate appearance, both of which may reflect differences in trafficking, consolidation, or degradation of the mutant and wt misfolded proteins. Because G85R SOD1 possesses the DSE epitopes, distinction of mutant and wtSOD1 misfolded species was not possible by IF. Molecular association of misfolded wtSOD1 with G127X was demonstrated by co-IP of mutant and misfolded wtSOD1 in non-denatured HEK lysates from mutant-transfected HEK cells probed with GX-CT-, 10C12- or 3H1-coupled magnetic beads and detected on subsequent immunoblotting with pan-SOD1 pAbs. IP of wtSOD1 by DSE mAbs in G85R-transfected cells is consistent with similar misfolding induction, or co-IP of wtSOD1 with natively misfolded G85R protein (which possesses the DSE epitopes).
Misfolded wtSOD1 was not detected in HEK cells transfected with the empty vector pFUW (Figures 5.4D and 5.5A), so wtSOD1 misfolding was specifically induced by expression of the mutant SOD1 species and not due to cell stressors inherent in transfection.

5.4 G127X SOD1 Induction of wtSOD1 Misfolding is Species-Specific

Motor neuron disease in mutant human SOD1 transgenic murine models of ALS can be accelerated by cross-breeding with transgenic mice expressing human wtSOD1, in which co-aggregation of mutant and wtSOD1 is observed [145, 147, 148]. Endogenous murine SOD1 in these transgenic models appears to be essentially inert to aggregation [148, 348], although minor protective effects of mouse SOD1 have been observed [148]. The epitopes for DSE1a and DSE2 are identical between mouse and human and can be exposed by oxidation-induced misfolding (Figure B.2). However, DSE immunoreactivity was not observed in transfected murine N2a neuroblastoma cells expressing abundant G127X protein (Figure 5.6B). Likewise, DSE mAb immunoprecipitation was not observed in lysates of N2a cells (Figure 5.6D), in contrast to lysates from human HEK cells. Human restriction of G127X-induced wtSOD1 misfolding was confirmed in three human cell lines (HEK, HeLa, and SH-SY5Y) and three mouse cell lines (N2a, Min6, and B16; Figure 5.6D). Notably, HEK and HeLa cells are epithelial, whereas SH-SY5Y cells are derived from a human neuroblastoma, suggesting that any cell type expressing human SOD1 is susceptible to mutant-induced misfolding.

Human and mouse SOD1 are highly conserved, sharing 83% sequence identity (BLAST) and 89% sequence similarity (T-COFFEE sequence alignment). However, visual inspection of the sequence alignment in Figure 5.6C reveals a divergent region comprising loop II and β-strand 3 of the SOD1 β-barrel (codons 24-36) in which 7 of the 13 residues are non-homologous.
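The divergent window can be checked directly against the alignment printed in Figure 5.6C. The sketch below is only an illustration: it uses the N-terminal portion of each sequence as it appears in that figure, numbers residues with the initiator methionine as position 0 (which places the human tryptophan at position 32, matching the W32 numbering used in this chapter), and counts plain identity mismatches. The function name is hypothetical. An identity count is stricter than the homology criterion used in the text, so it may exceed the quoted 7 of 13 if a conservative substitution such as valine to leucine is counted as homologous; that interpretation is an assumption.

```python
# N-terminal 52 residues (initiator Met at index 0) as printed in Figure 5.6C.
HSOD1 = "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFG"
MSOD1 = "MAMKAVCVLKGDGPVQGTIHFEQKASGEPVVLSGQITGLTEGQHGFHVHQYG"


def differing_positions(seq_a, seq_b, start, end):
    """Return (position, residue_a, residue_b) for every position in the
    inclusive range [start, end] at which the two aligned sequences differ.
    Positions use Met = 0 numbering."""
    return [
        (i, seq_a[i], seq_b[i])
        for i in range(start, end + 1)
        if seq_a[i] != seq_b[i]
    ]


if __name__ == "__main__":
    diffs = differing_positions(HSOD1, MSOD1, 24, 36)
    print(f"window 24-36, human: {HSOD1[24:37]}")
    print(f"window 24-36, mouse: {MSOD1[24:37]}")
    for pos, human, mouse in diffs:
        print(f"{human}{pos}{mouse}")
    print(f"{len(diffs)} of 13 positions differ in identity")
```

The list of substitutions produced this way includes the tryptophan to serine change at position 32 that is examined later in this section.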
We therefore tested whether the determinants for wtSOD1 conversion might map to this non-conserved region by swapping mouse amino acid sequence 24-36 into human G127X and transfecting this construct (G127Xm24-36) into HEK cells. Although G127Xm24-36 was expressed at similar levels to the parent G127X construct, conversion of human wtSOD1 was ablated (Figure 5.6E). These data indicate that the hSOD1 24-36 residue domain is required for efficient recruitment of wtSOD1 by G127X, consistent with structural or sequence dependence for conformational conversion of wtSOD1.

Further inspection of the sequence revealed a strikingly non-conservative substitution of tryptophan (W) at position 32 in human SOD1 by serine in mouse SOD1. W32 is the only tryptophan in the human protein and has previously been identified as a site of oxidative modification and a potentiator of aggregation [352]. Moreover, W32 is highly solvent exposed, ranking in the 89th percentile for solvent exposure among non-redundant tryptophans in the PDB. G127X and G85R constructs with a W32S substitution had markedly reduced ability to convert wtSOD1 compared with the parent mutant proteins (Figure 5.6F-G), suggesting that W32 directly participates in the wtSOD1 conformational conversion process.

Figure 5.6: Mutant SOD1-mediated wtSOD1 misfolding is sequence/structure-specific. IF images of G127X transiently-transfected HEK (A) and mouse N2a (B) cells probed with the GX-CT antibody and the DSE2-specific monoclonal antibody 3H1. HEK cells expressing G127X display 3H1 immunoreactivity for misfolded wtSOD1, but no 3H1 immunoreactivity is observed in N2a cells. (C) Sequence alignment of human SOD1 (hSOD1) and mouse SOD1 (mSOD1) proteins. Orange blocks indicate non-conserved residues. Box indicates region of chimeric substitution enriched in human-mouse differences. Location of the SOD1 divergence motif (loop II and β-strand 3) is indicated by blue line. The initial methionine is cleaved during processing. The aligned sequences are:

hSOD1 MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFG
mSOD1 MAMKAVCVLKGDGPVQGTIHFEQKASGEPVVLSGQITGLTEGQHGFHVHQYG
hSOD1 DNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSV
mSOD1 DNTQGCTSAGPHFNPHSKKHGGPADEERHVGDLGNVTAGKDGVANVSIEDRV
hSOD1 ISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
mSOD1 ISLSGEHSIIGRTMVVHEKQDDLGKGGNEESTKTGNAGSRLACGVIGIAQ

(D) IP of lysates from HEK or N2a cells transfected with G127X. DSE mAbs 3H1 and 10C12 precipitate wtSOD1 in lysates from human HEK cells but not from mouse N2a cells. The quantitation summary below (% immunoprecipitable wtSOD1) includes data from three human (HEK, SH-SY5Y, HeLa) and three mouse (N2a, B16, MIN6) cell lines. (E) IP of HEK cells transfected with human G127X or the human-mouse G127Xm24-36 chimera. DSE mAbs precipitate misfolded wtSOD1 in G127X-transfected cells but not in G127Xm24-36-transfected cells. For all immunoprecipitation experiments, a minimum of four independent replicates were performed for each cell line and construct. Pre-IP levels of the G127X construct were 47.2±4.5 signal volume (arbitrary units), while pre-IP levels of G127Xm24-36 were 52.8±4.5 (a nonsignificant difference). (F-G) IP of lysates from HEK cells transfected with W32S/G127X (F) or W32S/G85R (G) double mutants. SOD1-DSE mAbs immunoprecipitate SOD1 from lysates endogenously expressing original FALS-linked SOD1 single mutants, but not from those expressing the W32S mutation. Immunoblots probed with pan-SOD1 pAb. Values in quantitation summaries represent the average of five independent experiments. Error bars represent standard error; * denotes statistically significant difference between single and double mutant cell lysates.

5.5 G127X Induces wtSOD1 Misfolding in a Recombinant Cell-Free System

Induced misfolding of wtSOD1 prompted us to investigate the role of the intracellular milieu in the process, including non-SOD1 macromolecules. Purified recombinant wtSOD1 (Figure B.3) was incubated with and without pre-aggregated recombinant G127X for 24 hrs with agitation at 37°C in HEPES-buffered saline in the presence or absence of DTT (50 mM) and EDTA (5 mM) to simulate the reducing and extensively metal cation-buffered intracellular environment.
Misfolded wtSOD1 was measured in a Biacore surface plasmon resonance assay by capture with 3H1 mAb. G127X significantly potentiated wtSOD1 misfolding in the solution containing DTT and EDTA (Figure 5.7A), although a limited background of spontaneous misfolding of wtSOD1 was observed under these conditions. Conversely, in the solution without DTT or EDTA, there was no induction of wtSOD1 misfolding either in the presence or absence of G127X (Figure 5.7B). The misfolded SOD1 generated in the cell-free system was sensitive to degradation by proteinase K (Figure 5.7D), as also observed for misfolded wtSOD1 in cell culture (Figure 5.5C). Co-incubation with purified G127X/W32S instead of G127X induced considerably less wtSOD1 misfolding (Figure 5.7E), confirming the protective effect of the W32S substitution. These results suggest that wtSOD1 as a metal-replete dimer can still intermittently expose its intrachain disulfide bond for reduction and “trapping” in a partially misfolded state recognized by the 3H1 electrostatic loop mAb, and also indicate that misfolded G127X protein can facilitate this process. Loss of the monomer disulfide bond has been shown to precipitate further structural disorganization in wtSOD1 and increase its propensity for aggregation [346]. Moreover, no other macromolecule appears to be essential for the SOD1 conformational conversion reaction, consistent with direct physical interaction between isoforms.

5.6 Cells Export Aggregated Misfolded SOD1 to their Culture Medium

Evidence above that misfolded mutant SOD1 can misfold wtSOD1 intracellularly prompts the question of whether this process can transduce misfolding between cells, as has been observed for tau and α-synuclein [15, 16]. Both wild-type and mutant forms of SOD1 have previously been shown to be exported from cultured cells expressing SOD1 mutants [353, 139].
Exploiting our mutant- and wtSOD1-specific antibodies in direct immunoblotting, we detected obligate misfolded molecular species in the culture medium of G127X-transfected HEK cells, including G127X/wtSOD1 heterodimers, non-native disulfide-linked G127X oligomers and multimers (Figure B.4A), and a wtSOD1 fragment of 12 kDa (Figure B.4B).

[Figure 5.7: panels A-E. Biacore sensorgrams of 3H1 binding (RU) versus time (s) over the association and dissociation phases, in HBSN + DTT + EDTA and in HBSN alone, with and without G127X; box plot of the 3H1 binding difference with and without G127X (n = 5 per buffer; p = 0.58 and p = 0.002); 3H1 binding before and after PK; 3H1 binding for wtSOD1 + G127X versus wtSOD1 + G127X/W32S.]

Figure 5.7: Recombinant G127X can induce wtSOD1 misfolding in a cell-free system under reducing, metal cation-buffered conditions. A) Representative Biacore sensorgram showing misfolded SOD1 binding to 3H1 mAb after 24 hr incubation at 37 °C in the presence and absence of recombinant misfolded G127X protein in HEPES-buffered saline (HBSN) containing 50 mM DTT and 5 mM EDTA. Analyte is applied to the sensor surface during the association phase and allowed to elute during the dissociation phase. B) Same experiment as in (A) performed in HEPES-buffered saline without DTT or EDTA. C) Box plot showing difference in 3H1 binding between 24 hr incubated samples in the presence and absence of G127X for both buffer conditions. Binding levels are normalized to account for variation in immobilization levels of 3H1 between replicates.
In the buffer containing DTT and EDTA there is a significant increase in 3H1 binding when G127X is added (99% confidence interval 22–103 Response Units, RU), while in the buffer without DTT or EDTA there is no significant difference due to the addition of G127X (95% confidence interval -26–17 RU). P-values reported are from a paired t-test of independent replicates. D) Treatment of misfolded wtSOD1 generated by co-incubation with G127X with proteinase K (1 µg/ml for 30 min) causes its complete degradation. E) Co-incubation with G127X/W32S causes significantly less induction of wtSOD1 misfolding compared to co-incubation with G127X (p = 0.02). Error bars show mean and standard deviation of binding at the end of the association phase for three independent experimental replicates.

[Figure 5.8: panels A-B. Immunoblots and bar charts of % immunoprecipitable SOD1 (3H1 IP and 10C12 IP) in supernatant (S) and pellet (P) fractions of medium conditioned with EV, G127X, G85R, or hWT; SPR sensorgram and bar chart of 3H1 Ab binding (Response Units) versus time (s) for untreated, empty vector, human WT, G85R, and G127X samples (n = 6 each; p = 0.007, p = 0.01, p = 0.01).]

Figure 5.8: Misfolded SOD1 is detectable in the culture media of cells transfected with SOD1 constructs. Cell media 2 days post-transfection was ultracentrifuged at 100,000g for 1 hour to separate it into pellet and supernatant fractions. A) Misfolded SOD1 was detected by IP in both the pellet and supernatant fractions, but the amount detected was much higher in the pellet fraction.
B) The presence of misfolded SOD1 in the ultracentrifuged pellet was confirmed by SPR. Misfolded SOD1 was not detected in the supernatant.

To better understand the aggregation state of the exported SOD1, media from transfected cells was ultracentrifuged for 1 hr at 100,000g to separate high molecular weight multimers and aggregates from soluble oligomers. Interestingly, the substantial majority of misfolded SOD1 was detected in the pellet fraction (Figure 5.8), indicating that large insoluble multimers represent the predominant species in the culture media. It is not resolved from this experiment, however, whether the SOD1 is exported from cells as large aggregates, or whether it spontaneously forms these aggregates post-export. Furthermore, the infectivity of the supernatant and pellets remains to be determined: although the abundance of aggregates may be higher, they may be less infectious per mass of protein.

5.7 Induction of wtSOD1 Misfolding can be Transmitted Intercellularly

Having detected misfolded SOD1 in cell culture media, we thought it natural to ask if this media containing misfolded SOD1 can induce misfolding in naive cells. This question was answered using an experimental design depicted schematically in Figure 5.9, in which conditioned media from mutant SOD1-transfected HEK cells was collected and incubated with naive non-transfected HEK cells. DSE-immunoprecipitable misfolded wtSOD1 was detected in these cell lysates, but not from lysates in which cells were incubated with media from an empty vector transfection (Figure 5.10). As a control to demonstrate that the misfolded SOD1 detected is not merely due to cellular uptake of the initial misfolded seed added to the fresh culture, experiments were repeated in mouse N2a cells, which as shown in Figure 5.6 are inert to induced misfolding.
The levels of misfolded SOD1 detected in these cells by mAbs 3H1 and 10C12 are significantly less than those detected on transfection of human cells, indicating that the accumulated misfolded SOD1 in human cells is genuinely due to induction of misfolding in the naive cells. To confirm that the effect observed is not generically caused by other proteins structurally similar to SOD1, this experiment was repeated using culture media from HEK cells transfected with GFP, a β-barrel protein like SOD1: media from GFP-transfected cells was unable to induce SOD1 misfolding (Figure B.1). As well, the amount of detectable misfolded SOD1 was unaffected by addition of DNase to the conditioned media, ruling out the possibility of misfolding caused by residual transfection plasmid.

Since transfected cell culture media contains misfolded SOD1, we wondered if it is possible to propagate SOD1 misfolding through a series of cell cultures in the spirit of serial passage experiments common in prion research, as shown schematically in blue in Figure 5.9. For propagation past the first passage, the original misfolded mutant protein seed is extensively diluted, and the species transducing further SOD1 misfolding is endogenously misfolded wtSOD1 generated from cells in each fresh culture. Levels of misfolded SOD1 remain stable through each passage (Figure 5.10), so the phenomenon is not diminishing in time and appears to be self-sustaining. Interestingly, overexpression of human wtSOD1 is also able to trigger SOD1 misfolding that propagates stably thereafter.

[Figure 5.9: flow diagram. HEK293FT cells in growth medium; transfect with mutant SOD1; 48 hr growth at 37 °C; collect conditioned media and centrifuge to remove cell debris; (1) TDM transduction assay: add to naive HEK293FT cells; (2) TDM blocking assay: add designated antibody, incubate 30 min at 37 °C, add to naive HEK293FT cells; 20 hr growth at 37 °C; harvest cells and lyse for IP; multiple passages of transduction.]

Figure 5.9: Template-directed transduction experimental design. To determine whether template-directed misfolding (TDM) of wtSOD1 can be transmitted from cell to cell, in a fashion reminiscent of prion disease, we utilized extracellular mutant and misfolded wtSOD1 known to be present in the growth medium following transfection of cultured cells with mutant SOD1. This “conditioned” growth medium is added to fresh medium and overlaid onto naive cells (1) where the presence of misfolded wtSOD1 is thought to initiate TDM of native wtSOD1. Induced cells can either be harvested for analysis or washed and incubated in the presence of fresh medium to allow endogenous misfolded wtSOD1 to be exported into the growth medium for another round, or passage, of SOD1-associated TDM in a new culture of naive cells. To confirm the presence of a TDM-inducing SOD1 species in the extracellular environment, we modified our transduction assay to include pre-incubation of “conditioned” medium with pan-SOD1 or SOD1-DSE antibodies (2) that would clear the medium of native and/or misfolded SOD1, prior to its exposure to naive cells.

[Figure 5.10: immunoblots of immunoprecipitates (BSA, mIgG2a, SOD1, 3H1, 10C12; 10% input) at the 1st, 3rd, and 5th passages, and bar charts of % total immunoprecipitable wtSOD1 (3H1 IP and 10C12 IP) for EV, G127X, G85R, and hWT at passages 0, 1, 3, and 5.]

Figure 5.10: Mutant SOD1-mediated wtSOD1 misfolding is transmissible between cells.
Immunoprecipitation of lysates from naive HEK or N2a cells cultured in the presence of culture media from G127X- or empty vector-transfected HEK cells (see Figure 5.9 for experimental design). Quantitation summary is included; error bars represent standard error. * denotes a statistically significant difference in induction of wtSOD1 misfolding between species, and between control and G127X-conditioned media on naive HEK cells. SOD1 misfolding can be transmitted in serial passage. Immunoprecipitation of lysates from naive HEK cells cultured in the presence of conditioned medium from one, three and five passages of transduction assays initiated by different SOD1 variants. The percentage of total immunoprecipitable SOD1 as a function of the number of transduction assay passages is shown below. Values are the average of at least three independent experiments. Error bars represent standard deviation. All immunoblots probed with pan-SOD1 polyclonal antibody.

[Figure 5.11: panels A-B. Immunoblots of immunoprecipitates and bar charts of % immunoprecipitable wtSOD1 by IP antibody (mIgG2a, rIgG, GX-CT, 3H1, 10C12; 1% pre-IP) for each blocking antibody (rIgG, SOD1, 3H1, 10C12).]

Figure 5.11: SOD1 antibodies can block misfolding transmission. (A) Immunoprecipitation of lysates from naive HEK cells cultured in the presence of culture media from G127X-transfected HEK cells, immunologically pre-treated with either rabbit IgG (rIgG) or pan-SOD1 polyclonal antibody (SOD1). Quantitation summary is included; error bars represent standard error. * denotes statistically significant difference compared to rIgG blocking control.
(B) Immunoprecipitation experiment with lysates from naive HEK cells cultured in the presence of culture media from G85R-transfected HEK cells pre-treated with rabbit IgG (rIgG), pan-SOD1 polyclonal antibody, or DSE mAbs 3H1 or 10C12; accompanying quantitation is included; error bars represent standard error.

5.8 Antibodies Against SOD1 can Block Transmission of Misfolding

Since SOD1 misfolding can serially propagate between cell cultures, SOD1 antibodies should be able to bind the transducing species and neutralize it by abrogating its interaction with natively folded SOD1. Following the blocking protocol shown in red in Figure 5.9, the wtSOD1 converting activity of G127X-transfected HEK cell supernatants was abolished by pre-incubation with SOD1-specific polyclonal rabbit IgG, but not control rabbit IgG (Figure 5.11A). Furthermore, wtSOD1 conversion in HEK cells exposed to G85R conditioned supernatants was attenuated by anti-SOD1 rabbit IgG, as well as the DSE mAbs 3H1 and 10C12 expected to react with this full-length misfolded mutant (Figure 5.11B).

5.9 Discussion

We have developed a tractable reductionist system in vitro utilizing SOD1 isoform-specific antibodies to dissect the molecular mechanisms of mutant-induced wtSOD1 misfolding. We demonstrate unambiguously that cytosolic expression of misfolded SOD1 mutants G127X and G85R can confer a misfolded conformation on wtSOD1, as revealed by exposure of natively inaccessible peptide epitopes. Our system has also enabled us to determine for the first time that conformational conversion of wtSOD1 is sequence-restricted, dominated surprisingly by a single residue: W32 (see Figure 5.6). This residue is substantially solvent-exposed on the external convexity of the protein, distant from the native dimer interface.
The implication of this “alternate site” for SOD1-SOD1 interaction may resolve some seeming conflicts related to the participation of wtSOD1 in transgenic mouse models of ALS. In transgenic mice expressing human SOD1 mutants, the presence or absence of murine endogenous SOD1 has minimal impact on clinical disease, and murine SOD1 is not incorporated in mutant human SOD1 aggregates [148, 145, 147]. Mouse SOD1 possesses a Ser residue at position 32, which we report is unable to participate in misfolding reactions with human wtSOD1 through this site, although dimer interface interactions are not precluded [354, 355].

By contrast, human wtSOD1 expression can dramatically accelerate clinical disease in transgenic mice expressing a range of human SOD1 mutants, and is associated with incorporation of human wtSOD1 in aggregates [145, 354, 148, 147]. Human SOD1 possesses the W32 residue, which we report here is essential for misfolding induction of wtSOD1 in HEK cells. However, some studies have shown that human wtSOD1 can actually stabilize mutant SOD1, thought to be due to the formation of native heterodimers mediated by the dimer interface [356, 354, 357]. It has also been noted [354, 357] that co-aggregation of mutant and wtSOD1 is not a simple stoichiometric process, which our study confirms: e.g., massive W32-dependent disulfide-stabilized multimers are observed for G127X and G85R, but incorporation of wtSOD1 in these multimers is minimal, retaining solubility in non-denaturing detergents consistent with our detected monomer and/or non-native disulfide-bonded heterodimers. The role of W32 in the mutant misfolding-inducing species is also supported by a recent study showing wild-type human SOD1 does not accelerate motor neuron disease in mice expressing murine SOD1 with the G86R mutation [149], which lacks a W32 residue.
The conformational conversion of natively structured SOD1 is analogous to the conversion of the natively folded prion protein (PrPC) to a misfolded conformer of the same protein (PrPSc) in prion disease. Two mechanisms have been proposed to account for the PrPC → PrPSc conversion process: nucleation-polymerization, in which the misfolded monomeric PrP is intrinsically less stable as a monomer but becomes more stable than PrPC when recruited to a multimolecular PrPSc aggregate; and template-mediated assistance, in which the PrPSc conformer is more stable than PrPC but kinetically inaccessible without catalysis by interaction with PrPSc [138]. A critical distinction between these two potential mechanisms is that seeded polymerization progresses from soluble recruitable substrate by incorporation into polymeric forms, whereas template-assisted conversion requires direct physical contact between the native and misfolded isoforms to induce the recruitable substrate. In our study, aggregation of human SOD1 mutants G127X, G85R, and A4V (as detected by detergent insolubility and/or formation of higher-order disulfide-linked multimers by immunoblotting) is consistent with seeded polymerization of natively misfolded molecular species. However, in our experimentally tractable G127X system, direct contact with misfolded wtSOD1 can be inferred by co-precipitation of both species by GX-CT or DSE mAbs. Physical contact or at least close proximity of G127X and misfolded wtSOD1 is also supported by the formation of G127X-wtSOD1 heterodimers that can be stabilized by non-native disulfide bonds [17], with the source of the oxidation likely nascently misfolded wtSOD1 as detailed above. These data suggest that conversion of wtSOD1 may proceed by direct physical contact with misfolded SOD1 through a template-directed process, and not through seeded polymerization.
In support of this notion is the recent finding that wtSOD1 can only participate in seeded polymerization if it is partially denatured by exposure to low pH and the chaotrope guanidine in vitro [358], consistent with the generation of equilibrium recruitable substrate. As seen in Figure 5.7, pre-aggregated G127X is sufficient to induce wtSOD1 misfolding at physiological pH without chemical denaturants like guanidine, supporting the idea that misfolded SOD1 species catalyse the conversion of wtSOD1 into recruitable substrate as required for a template-mediated mechanism.

[Figure 5.12: schematic energy diagram. Energy versus SOD1 species (dimer, natively folded monomer, misfolded monomer, misfolded aggregate) for mouse, human, and mutant SOD1, with the human/mouse heterodimer and the misfolded human/mutant heterodimer marked.]

Figure 5.12: Schematic energy landscape of structural transitions taking place during SOD1 misfolding. Colors indicate areas of the landscape accessible to wt human SOD1, misfolding-prone mutant SOD1, misfolding-resistant mouse SOD1, mixture of mutant and human wt SOD1, and mixture of human and mouse wt SOD1. It is known that the aggregation pathway involves monomerization as an intermediate [123], which then proceeds to aggregate formation as an end point. The heterodimers containing one subunit of misfolding-resistant SOD1 deplete the pool of “recruitable substrate” available and thereby slow misfolding; conversely, heterodimers containing one subunit of misfolded mutant SOD1 create an alternate pathway to wt misfolding with a lower activation energy and thereby accelerate misfolding. The lines below show the relative differences in stability between SOD1 isoforms.

Different species and mutants of SOD1 fall into a continuum of susceptibility to misfolding, from spontaneously aggregating mutants like G85R on one end to mouse SOD1, which appears to be inert to misfolding induction, on the other end.
Wild-type human SOD1 is intermediate between these extremes. The propensity of SOD1 to aggregate is determined by the free energies of the native state relative to the misfolded aggregate and any intermediates, accounting for the kinetic barriers between the states, as depicted schematically in Figure 5.12. There is evidence for heterodimer formation [17] between wild-type and pathogenic mutant SOD1, which may be the pathway through which co-expression of mutant SOD1 catalyses wild-type protein misfolding. Conversely, one may speculate that heterodimer formation between human wild-type SOD1 and a more stable isoform of SOD1 may be protective by preventing the protein from advancing along the misfolding pathway.

Over-expression of wtSOD1 in the current study (see Figure 5.5) and in other laboratories [144, 351] is associated with a proportion of misfolded and ROS-generating conformers. Recent studies have shown misfolded wtSOD1 in sporadic ALS as well as SOD1-FALS [341, 132, 359], suggesting that the stochastic generation of misfolded SOD1 template might trigger sporadic ALS, as has been theorized for sporadic prion disease. However, a recent study [360, 202] using different antibodies than the above [341, 132] identified SOD1 aggregates from SOD1-FALS, but not sporadic ALS, patients. These data are consistent with different SOD1 misfolding epitope exposure in SOD1-FALS and typical SALS, and perhaps even different mechanisms for SOD1 aggregation. Unlike metal-replete wtSOD1, mutant SOD1 proteins support a population of partially unfolded intermediates even under physiological conditions [361, 362, 123], and it is tempting to speculate that aggregation of mutant SOD1 may progress through seeded polymerization of mutant species, whereas the recruitment of wtSOD1 to SOD1 aggregates in SOD1-FALS, and the wholly wtSOD1-mediated progression of SALS, might occur through templated assistance.
Demonstration of prion-like intercellular mutant SOD1 aggregate propagation [150] therefore contributes to understanding of the FALS disease process, but its relevance to SALS is unclear. In contrast, the present study shows a clear role for intracellular induction of wtSOD1 misfolding either by wtSOD1 overexpression or mutant SOD1 transfection, moving a step closer to a unified model of FALS and SALS pathogenesis.

Prion infection in vitro propagates from cell to neighbouring cell, and this process can be inhibited or abrogated immunologically with antibodies directed against PrPC, PrPSc, or epitopes exposed by both isoforms [190, 191]. Consistent with a prion-like activity of misfolded SOD1, we show that wtSOD1 misfolding can be induced in human cells cultured in media conditioned by cells expressing mutant SOD1, and that this intercellular propagation can be abrogated by incubation with pAbs and mAbs reacting with the misfolded species, but not by irrelevant antibodies. These results necessitate that the templating molecular species are exported from cells, which efficiently occurs with mutant, oxidized, and monomeric species of SOD1 [140], and that extracellular misfolded SOD1 enters the cytosol of exposed cells to propagate SOD1 misfolding, a process which also pertains to the intracellular proteins α-synuclein and tau [15, 16].

The characteristic features that typify a prion remain controversial, despite extensive study. At a molecular level, there is general agreement that a prion is defined by an abnormally folded protease-resistant protein conformer capable of recruiting natively folded protein with the same primary sequence as itself and transmogrifying the recruited protein to adopt the same conformation as the original prion [296]. This behaviour has been observed for SOD1, with the exception of protease resistance: misfolded SOD1 is in fact more protease-sensitive.
At an organismal level, transmission of protein misfolding from cell to cell by a seed of misfolded protein is the sine qua non of prion infection, and the serial transmissibility of SOD1 misfolding between cultures shown above demonstrates that SOD1 possesses this property. It therefore seems that misfolded aggregated SOD1 has the requisite features to be considered a prion.

Although motor neuron degeneration in ALS may initially manifest at any site in the neuroaxis, it spreads contiguously outward from the initial focus [136]. This disease process is compatible with initial spontaneous SOD1 misfolding at a particular site, followed by outward spread of misfolded SOD1 with TDM activity. The impetus for the initial misfolding event remains to be elucidated; in FALS, this can be understood as triggered by misfolding of a disease-prone SOD1 mutant, whereas in SALS it may be due to spontaneous fluctuations or an environmental stressor leading to wtSOD1 misfolding (analogous to sporadic Creutzfeldt-Jakob disease). Significantly, there is a tendency for ALS to spread in a rostral to caudal direction within the spinal cord [79], which is compatible with biased diffusion of misfolded SOD1 in the extracellular environment, either in the parenchyma or with the prevailing direction of CSF flow [363].

It is important to emphasize that ALS, unlike the canonical prion diseases, is not known to be transmissible between individuals. One explanation for this observation lies in the stability and resilience of the hypothesized prionic species involved in ALS. Whereas other misfolded proteins become protease-resistant when aggregated, SOD1 acquires increased protease sensitivity [364] and therefore does not form the large protein aggregates that are the neuropathological hallmarks of other protein misfolding diseases like Alzheimer’s disease and Creutzfeldt-Jakob disease [365].
The increased protease sensitivity of SOD1 on misfolding may indicate a loosening of structure, and it is possible that the misfolded SOD1 seed is susceptible to denaturation and loss of templating ability, unlike the highly resilient PrP amyloid. The seed may be able to propagate misfolding within the stable environment of a given individual’s nervous system but not tolerate conditions outside it. Direct inoculation of misfolded SOD1 into the nervous system may successfully cause the induction of SOD1 misfolding in vivo and will be tested in the future with transgenic mice.

Current therapeutic options for ALS are extremely limited, with only one compound, riluzole, shown to modestly prolong survival [366]; many other candidate drugs have failed in clinical trials [367, 368]. Vulnerability of the intercellular transmission process to extracellular antibodies indicates that the templating species must contain “naked” SOD1 which is not sequestered in antibody-inaccessible exosomes, gap junctions or other antibody-impermeable structures. Antibodies directed against misfolding-specific epitopes in SOD1 have the potential to neutralize misfolded SOD1 as it transits the extracellular space between cells, thereby arresting its spread and its ability to seed further misfolding, while sparing the natively folded SOD1 from autoimmune recognition.

Chapter 6

Conclusions and Future Directions

This thesis has sought to elucidate the structural transformations that take place on protein misfolding through detailed exploration of the thermodynamics of protein conformation. Particular attention has been paid to the prion protein and SOD1, but as discussed below many of the insights gained from studying these proteins may be applicable to a range of proteins involved in other diseases, notably cancer.
6.1 Summary of Conclusions from the Present Work

• Proteins exhibit a complex, partially coordinated response to electric fields, which leads to an inhomogeneous, anisotropic dielectric function in and around the protein.
• The protein dielectric response stabilizes protein electrostatic pairwise interactions while minimizing the destabilizing effects of repulsive charge pair effects, possibly reflecting evolutionary selection in charge placement within proteins.
• The protein surface exhibits a high degree of polarizability, in some cases exceeding that of bulk water. This contributes to stabilization of ionizable side chains on the periphery of the protein.
• Protein conformation-dependent free energy has multiple contributions, including polar and nonpolar solvation, polar and nonpolar protein-protein interactions, and configurational entropy. Values for each of these terms may be extracted from MD simulations of a folded and unfolded protein ensemble.
• A simple topology-based free energy function is able to approximately reproduce the unfolding free energy landscape for a protein, but subtler features of the landscape are only apparent from more detailed analysis of MD trajectories.
• Cooperativity in native protein unfolding may be understood by investigating the free energy savings in simultaneous unfolding of two secondary structural regions over their unfolding in isolation.
• Ion pair networks are important in stabilization of the native PrP structure and impose energetic constraints on the organization of the misfolded protein.
• Total protein electrostatic energies for PrP correlate with susceptibility to misfolding.
• Certain secondary structural elements of PrP have a relatively low barrier to unfolding, including the first β-strand, the loop between β-strand 2 and α-helix 2, and the loop between α-helix 2 and α-helix 3.
These regions may be sites for the initial structural loosening of PrPC preceding its conversion into PrPSc.
• Human cells transfected with mutant SOD1 constructs in culture export misfolded SOD1 into their medium, and application of medium from these cells is competent to induce misfolding in naive cells in serial passage.
• Misfolding of wild-type SOD1 can be induced by co-incubation with mutant protein both in cell culture and in a reductionist recombinant system.

6.2 Future Directions

There is scope to extend much of the theoretical and experimental work above. Some of the directions that may be fruitful to pursue based on the evidence in this thesis are listed below.

6.2.1 Protein pKa Prediction

A classic application of Poisson-Boltzmann theory is the prediction of dissociation constants for acidic and basic side chains in a protein of interest [369]. The local electrostatic environment around a charged group perturbs the free energies of the protonated and unprotonated states from their values in bulk solution, which in turn affects the dissociation constant according to

\[ -RT \ln K_a = G_{HA} - G_{H^+/A^-} \equiv \Delta G_a \quad\rightarrow\quad \mathrm{p}K_a = \frac{\Delta G_a}{\ln(10)\, RT}, \tag{6.1} \]

where \(G_{HA}\) and \(G_{H^+/A^-}\) are respectively the free energies of the protonated and unprotonated functional group in the protein, which may be calculated by PB solvers using the heterogeneous dielectric function described in Chapter 2. This method gives the so-called intrinsic pKa for a charged group, that is, its pKa in the absence of effects from the competing dissociation of other ionizable sites in the protein. Monte Carlo methods to simulate titration curves may be used to account for the simultaneous dissociation of all charged groups, to convert the intrinsic pKas determined by PB theory to effective pKas (for example the PDB2PKA module implemented within PDB2PQR [317]).
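As a rough numerical illustration of Equation 6.1, the sketch below converts a free energy ΔG_a into a pKa, and a difference in ΔG_a between the protein environment and bulk solution into a pKa shift. The function names, the kcal/mol units, and the default temperature of 298.15 K are assumptions made for this example and are not taken from the thesis; in practice the ΔG_a values would come from a PB solver.

```python
import math

# Gas constant in kcal/(mol K); the unit choice is an assumption of this sketch.
R_KCAL = 1.987204e-3


def pka_from_free_energy(delta_g_kcal, temperature_k=298.15):
    """Apply pKa = dG_a / (ln(10) * R * T), as in Equation 6.1."""
    return delta_g_kcal / (math.log(10.0) * R_KCAL * temperature_k)


def pka_shift(delta_g_protein, delta_g_solution, temperature_k=298.15):
    """pKa shift of a group on transfer from bulk solution into the protein.

    Both arguments are hypothetical dG_a values (kcal/mol) for the same
    ionizable group in the two environments.
    """
    return pka_from_free_energy(delta_g_protein - delta_g_solution,
                                temperature_k)


if __name__ == "__main__":
    # Free energy corresponding to one pK unit at the chosen temperature.
    per_unit = math.log(10.0) * R_KCAL * 298.15
    print(f"{per_unit:.3f} kcal/mol per pK unit")
    print(f"shift for +2 kcal/mol: {pka_shift(5.0, 3.0):.2f} pK units")
```

Because the relation is linear in ΔG_a, an error of roughly 1.4 kcal/mol in the computed free energy difference near room temperature already corresponds to an error of about one pK unit.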
Current empirical methodologies like PROPKA [370] have succeeded in predicting protein pKas with a mean error of slightly less than 1 pH unit, but there are many cases where the predictions of these methods are incorrect. Recent NMR experiments demonstrate a complex profile for the pKas of buried glutamic acids and lysines in staphylococcal nuclease [371, 372], so there is an experimental dataset available against which pKa predictions may be benchmarked.

6.2.2 Direct Calculation of Regional Protein Unfolding Energy from Steered MD
Instead of separately estimating all the contributions to the unfolding free energy function as described in Chapter 3, the free energy change on partial unfolding may be obtained directly from MD simulations that steer the protein from the folded state to the partially unfolded state. This model-free approach uses the Jarzynski equality [373] to relate the work W done by the steering force, averaged over all possible nonequilibrium paths (the average is denoted by angle brackets), to the free energy change ΔF:

$$e^{-\Delta F/kT} = \left\langle e^{-W/kT} \right\rangle \tag{6.2}$$

Although the Jarzynski equality holds arbitrarily far from equilibrium, the closer the simulations are to equilibrium, the faster the average converges and the fewer simulations are required [374]. For this method, a collective variable q is added to the system Hamiltonian that depends on the “degree of foldedness” of the region of interest. A suitable choice for q is the fraction of native contacts formed between the region to be unfolded and the whole protein. At the start of the simulation q is set to its value in the native protein and made to decrease linearly to a value (generally near 0) representative of the partially unfolded state by application of a harmonic biasing potential. The simulation may be performed while the rest of the protein is either free to relax during the steering or held fixed, allowing motion only in the segment to be unfolded.
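The exponential average in the Jarzynski equality is dominated by rare low-work trajectories, and evaluating it directly can underflow when W is many kT. A minimal sketch of an estimator is shown below, assuming the per-trajectory work values have already been integrated and are given in kcal/mol. The function name, the units, and the default temperature of 300 K are illustrative assumptions rather than details of the simulations described here.

```python
import math

# Boltzmann constant in kcal/(mol K); units are an assumption of this sketch.
K_B = 0.0019872041


def jarzynski_free_energy(works, temperature=300.0):
    """Estimate the free energy change from nonequilibrium work values.

    Computes dF = -kT * ln( mean( exp(-W/kT) ) ) using a log-sum-exp
    shift so that large work values do not underflow.
    """
    if not works:
        raise ValueError("at least one work value is required")
    kt = K_B * temperature
    exponents = [-w / kt for w in works]
    shift = max(exponents)  # largest exponent, i.e. the lowest-work path
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


# Hypothetical work values (kcal/mol) from replicate steered trajectories.
example_works = [12.1, 9.8, 14.3, 10.5, 11.7]
print(jarzynski_free_energy(example_works))
```

By Jensen's inequality the estimate never exceeds the mean work, and with a finite number of trajectories it tends to overestimate the true free energy change, which is why many replicate trajectories are needed for convergence.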
The results of a representative simulation to calculate the free energy change on unfolding of the human PrP β-sheet are shown in Figure 6.1.

The major limitation to this approach is the high computational cost of performing the large number of replicate simulations (generally around 20) required to achieve convergence of the average in the Jarzynski equality. Preliminary simulations with Generalized Born implicit solvent have been successful provided the folded region of the protein is fixed. If the folded part is left unrestrained, it tends to lose structure in the course of simulations performed to date, whether or not a biasing potential is applied (this is perhaps a commentary on the incomplete descriptiveness of current implicit solvent force fields). The computational resources to perform the simulation replicates in explicit solvent exceed what is currently available, but this may be less of an obstacle in the future.

[Figure 6.1, panels a to c. Panel a: structural frames at 2, 4, 6, 8, 10, and 12 ns. Panel b: contact colvar value versus time (ns), showing the actual colvar trajectory and the centre of the colvar harmonic potential. Panel c: work (kcal/mol) versus contact colvar value, showing individual steered trajectories and the free energy calculated from the Jarzynski equality; inset: harmonic restraint function.]

Figure 6.1: Proof-of-concept direct calculation of unfolding free energy from harmonic biasing MD simulations. A) Frames extracted from a steered unfolding simulation of the PrP β-sheet in Generalized Born implicit solvent, in which β-strand 1 is removed from the rest of the protein, which is held fixed. B) Value of the collective variable (colvar) parameter representing the number of native β-sheet contacts during the simulation. The simulation is linearly steered by adding a harmonic constraint potential whose centre moves along the straight line shown. C) Integrated work done in unfolding the β-sheet for 15 simulations. The thick line is the result of using the Jarzynski equality to extract the free energy change from the work paths. Inset shows to scale the shape of the biasing potential used to steer the simulation.

A further disadvantage of this model-free approach is that the individual variables contributing to the free energy change are not isolated; that is, it cannot be discerned whether the change is driven primarily by entropic or enthalpic, nonpolar or polar, solvent or protein effects. Conversely, the model-free approach is also not susceptible to potential flaws in the energy function. Whereas the model-based method is useful as a screening tool to study whole proteins or even classes of proteins, the steered MD method is useful as a way to investigate specific phenomena of interest (like the stability of a particular secondary structure), so that computational effort can be efficiently targeted.

6.2.3 A Mean Hydrophobicity Model for Misfolded Protein Assembly
The energy landscape theory described in Chapter 3 has been used to identify regions of instability in proteins and measure the cooperativity in unfolding of different regions, by comparing the free energy of the native state in water to that of the unfolded state in water. However, it can be generalized to compare the free energy changes involved in making the transition between any two ensembles. For example, to gain insight into later stages of the misfolding process, the native state may be compared to an approximation of the protein in a homogeneous amyloid phase.
Inevitably, since specific protein-protein interactions in the misfolded amyloid phase are unknown, they must be modelled based on a continuum description of the known features of such a phase: a low dielectric constant of 4-8, close van der Waals packing of the backbone and nonpolar side chains, and restricted configurational entropy equal to or less than that in the native state. Such a “mean-field hydrophobicity” model would provide an indication of the barrier to participation in the amyloid phase. A version of this approach that accounts for just the effects of dielectric transfer is shown in Figure 4.7, but this could be extended to other terms in the energy function as well.

6.2.4 Small Molecule Development to Inhibit SOD1 Misfolding
Understanding SOD1 misfolding as a template-directed process provides a new direction for possible therapeutic intervention. In addition to direct use of misfolding-specific antibodies as disease-modifying treatment, small molecules that stabilize native SOD1 to increase the kinetic barrier to misfolding, block interaction between native and misfolded protein, or reduce ROS production could attenuate the pathologic consequences of SOD1 misfolding. The advantages of developing a small molecule inhibitor of SOD1 template-directed misfolding are potentially considerable. Antibodies, by virtue of their large size, have pharmacokinetic limitations in penetrating the blood-brain barrier and reaching motor neurons in the cortex and spinal cord. Small molecules, especially those that are lipophilic, would not suffer this problem. Similarly, antibody administration must be done parenterally to avoid destruction in the digestive tract, but if an acid-stable small molecule can be devised, oral dosing is possible.
Figure 6.2: Druggable sites to be studied on the SOD1 homodimer. 1) The active site. 2) The dimer interface domain. 3) W32 (native/misfolded interaction site).

The results of Chapter 5 suggest three possible binding targets for a SOD1 TDM inhibitor:
• Active site: The SOD1 unfolding energy landscape in Section 5.2 identifies the electrostatic loop partially protecting the SOD1 active site as susceptible to unfolding, and ROS production due to greater exposure of the SOD1 active site copper is an established sequela to misfolding. A target binding pocket on the SOD1 native structure in close spatial proximity to the active site formed from residues found to be stable and unaffected by electrostatic loop unfolding would therefore be an appropriate candidate to block exposure of active site copper, its redox cycling, and ROS production. Using this approach to define the binding pocket maximizes use of available structural data to favour compounds with an affinity for the misfolded active site.
• Dimer interface: Unlike the misfolded SOD1 active site, the native dimer interface is well-defined in existing SOD1 crystal structures, which substantially simplifies the process of binding pocket selection. In a preliminary investigation, the van der Waals surface of the native SOD1 dimer has been inspected for druggable clefts at the dimer interface, revealing a candidate site with the desirable features of a ligand binding pocket (see Figure 6.2). The site is formed by residues 105-111 of both monomers in the dimer and contains a mixture of hydrophobic, polar, and charged side chains. Of note is the sulfhydryl group of residue 111 from a reduced cysteine. Previous studies have demonstrated the formation of non-native disulfide bonds in misfolded SOD1 aggregates [375, 376], and by pharmacologically blocking access to this sulfhydryl group formation of these aggregate-stabilizing bonds may be inhibited.
Notwithstanding the specific consequences of preventing non-native disulfide bond formation, a molecule that binds to the dimer interface domain and increases its enthalpy of dissociation will raise the activation energy barrier to misfolding. For every 1 kcal/mol increase in the enthalpy barrier of dimer dissociation, from the Arrhenius rate law the rate of the dissociation reaction will decrease by approximately a factor of 5. If the drug binds with an enthalpy of 5 kcal/mol, for example, the rate of dimer dissociation will be reduced by a factor of 4×10³. This would significantly slow the propagation of misfolding and thereby decrease the rate of disease progression.
• Interaction domains between folded and misfolded protein: If SOD1 misfolding follows a template-directed mechanism, physical contact between the native and misfolded species is a requirement for intermolecular transmission of the misfold. Interactions between the folded and misfolded proteins are likely to occur at defined sites, and if a ligand could block binding of misfolded SOD1 at these sites on the native protein, TDM could be prevented. W32 has been identified in this work as a key determinant of SOD1 recruitment to the misfolding template. The molecular surface of SOD1 in the vicinity of W32 is relatively flat and does not present any obvious binding clefts for small molecules. However, there are a variety of functional groups from other residues in the vicinity that could be exploited to achieve acceptable affinity and specificity. Previous work by Antonyuk et al. [377] has identified molecules able to bind SOD1 near W32, but their approach employed only small-scale library screening. Nonetheless, this study can be used to help refine molecular poses near W32.
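The rate estimates quoted for the dimer interface target follow from the exponential dependence of the Arrhenius rate on barrier height. A short sketch of the arithmetic is given below, assuming an unchanged pre-exponential factor and a temperature of 310 K; both are assumptions made here, as is the function name.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 0.0019872041


def rate_reduction_factor(barrier_increase, temperature=310.0):
    """Factor by which an Arrhenius rate falls when the barrier rises.

    For k = A * exp(-Ea / RT) with A held constant, raising Ea by
    barrier_increase (kcal/mol) divides k by exp(barrier_increase / RT).
    """
    return math.exp(barrier_increase / (R_KCAL * temperature))


for increase in (1.0, 5.0):
    factor = rate_reduction_factor(increase)
    print(f"{increase:.0f} kcal/mol -> rate reduced {factor:.3g}-fold")
```

Under these assumptions a 1 kcal/mol increase slows dissociation by a factor of about 5, and a 5 kcal/mol increase gives a factor of a few thousand, the same order of magnitude as the estimate quoted for the dimer interface.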
A comprehensive program to develop drugs against these sites would involve:
• Systematically screening compound libraries for high-affinity ligands to druggable sites on the SOD1 molecule, using known compound libraries and a molecular docking algorithm like GLIDE [378]. Especially for the highly exposed W32 site, alternate methods that more rigorously account for solvation effects, like the 3D-Reference Interaction Site Model (3D-RISM) [379, 380], may be used as part of a complementary docking search strategy.
• Determining lead compound binding affinity for misfolded SOD1 by surface plasmon resonance.
• Testing for inhibition of template-directed misfolding. The cell-free conversion assay developed in Section 5.6 provides a convenient technique for studying the effectiveness of candidate drugs at attenuating SOD1 TDM.
• Measurement of ROS/RNS production by misfolded SOD1 in the presence and absence of candidate drugs. Reduction in reactive oxygen or nitrogen species (ROS/RNS) production by SOD1 is expected to correlate with improved disease prognosis, so drugs that can achieve this effect are promising therapeutics. HEK cells transfected with disease-causing mutant SOD1 constructs could be incubated in the presence and absence of candidate drugs and then exposed to the cell-permeable dye dichlorodihydrofluorescein diacetate, which reacts to form a fluorescent molecule in the presence of ROS [381]. Cell fluorescence could be monitored by flow cytometry to determine intracellular levels of ROS; desirable drug candidates would cause reduced fluorescence, signifying inhibited ROS production.
• Co-crystallization or NMR studies of SOD1 with candidate drugs. Confirmation that a candidate drug is binding at the expected docking site on the SOD1 molecule can be provided by co-crystallization and XRC structure determination of SOD1 with bound ligand. Alternatively, solution-phase protein could be saturated with ligand and studied by conventional 15N-HSQC NMR experiments. Previously published assignments [382, 383] in conjunction with analysis of the resulting HSQC spectra could be used to identify and localize structural changes induced upon ligand binding. Chemical shift changes, peak broadening, and peak disappearance can all be used to help localize ligand binding. If the binding is weak, alternate approaches such as heteronuclear or transferred NOE (trNOE) studies may be used to identify the residues exhibiting perturbations [384].
• Testing lead compounds in mouse models of ALS. After completion of in vitro drug characterization, the most promising leads could be tested for efficacy in BL6 mice transgenic for G93A human mutant SOD1, a previously validated aggressive mouse model of ALS [385]. Protocol guidelines for this kind of efficacy study have been established by the ALS Therapeutics Discovery Initiative and require 24 mice per study arm, with untreated controls and sham-treated animals studied for comparison. Treated mice would be monitored for weight, hind limb reflex [386], RotaRod performance [387], gait, righting reflex, and survival. Postmortem neuropathological studies for SOD1 aggregate inclusions, microglial activation, and motor neuron counts could be performed and correlated with behavioural and disease progression data. If a compound with efficacy in these studies were found, it would represent a highly promising translation of basic disease mechanisms into something of medical utility.

6.3 Protein Misfolding in Cancer
The field of protein misfolding has been driven by the study of neurodegeneration and the systemic amyloidoses. However, the centrality of protein conformation in so many aspects of physiology and the inherent susceptibility to disruption of native structure according to environmental conditions suggest that misfolding may be a broader feature of human disease than commonly appreciated.
Misfolding in other illnesses may be subtler, not leaving evidence in the form of microscopically visible aggregates, but may nonetheless provide new insights into pathogenesis and avenues for therapeutic intervention. Cancer, in particular, is a set of diseases often ignored by the protein folding research community for which a better understanding of protein misfolding in the context of tumour-specific environmental and genetic stressors offers heretofore unexploited opportunities.

The life of a cancer cell is not easy. Replicating in an acidic environment deprived of oxygen and nutrients, evading surveillance by the immune system, and recovering from cytotoxic chemotherapy or radiation damage, neoplastic cells succeed in maintaining their viability despite these challenges, but often at considerable cost: DNA damage, corrupted signaling pathways, and disorganization of cellular machinery. An additional, heretofore under-recognized casualty of the cancer cell’s struggle to survive in a hostile world may be a decline in the fidelity of protein folding and the quality control that normally keeps levels of misfolded protein to a minimum. Eukaryotic cells have evolved a sophisticated set of tools to control the production, folding, and localization of proteins, which may be compromised in cancer cells as they dedicate more cellular resources to division.
There are a variety of avenues by which cancer cells could generate misfolded protein at a rate exceeding that of healthy cells:
• At a pre-translational level, genetic damage contributing to the malignant transformation causing production of mutated protein that fails to fold correctly;
• At a co-translational and maturational level, endoplasmic reticulum stress causing aberrant protein trafficking and glycosylation, imbalance between the amount of protein synthesized and the availability of chaperones, and overexpression of particular proteins;
• At a post-maturational level, induced misfolding of cell surface proteins due to the unusual tumour tissue microenvironment that deviates from physiologic norms.

The implications of protein misfolding in cancer are potentially considerable, particularly cell surface expression of incorrectly folded proteins that represent tumour-specific antigens to be targeted by antibodies or vaccination. The current evidence directly supporting protein misfolding in cancer is first reviewed, followed by a discussion of indirect evidence from known aspects of cancer cell pathophysiology suggestive of a larger role for misfolding (see Figure 6.3 for an overview of the pathways involved). The practical consequences of protein misfolding for drug and antibody development are considered, with a list of currently targetable proteins and their misfolding-specific epitopes predicted using the tools described in the body chapters of this thesis.

6.3.1 Gene Mutations Causing Constitutive Protein Misfolding in Cancer
In the conventional understanding of cancer development, a cell acquires a series of genetic mutations that cause it to start replicating independent of the signals that usually regulate its cell cycle.
The mechanisms by which these mutations contribute to oncogenesis are varied and complex, but fall broadly into two categories: activation of an oncogene or inactivation of a tumour suppressor gene, either of which can be caused by genetic mutations affecting protein primary sequence or larger-scale disruptions affecting gene copy number. A role for protein misfolding in cancer was first established by studying mutants of p53, a particularly crucial tumour suppressor protein involved in regulating DNA repair, cell cycle arrest, and apoptosis. The p53 protein is composed of independently folding DNA-binding domains, tetramerization domains, and large unstructured regions encompassing roughly 40% of its primary sequence [388] that participate in transcription factor activation, nuclear localization, and apoptotic signaling. Inactivation of p53 is a feature of almost all human cancers: in roughly half of cancers it is inactivated directly by mutation, while in the other half the inactivation is indirect through derangements of associated signalling pathways [389]. Interestingly, the vast majority of oncogenic mutations are in the core DNA-binding domain, even though it accounts for only around half of the protein's total length [390]. These mutations have been divided into two groups according to their effect on the protein [388]: contact mutations (such as R248Q, R248W, R273H, and R273C) disrupt residues involved in DNA binding, while structural mutations (such as R175H, G245S, and R249S) affect residues that contribute to the stability and overall organization of the DNA binding surface. Contact mutants belong in the large category of point mutations causing functional changes due to local active site modifications and do not entail protein misfolding per se; structural mutants are more interesting from the standpoint of protein misfolding as they achieve their effect by more comprehensive disorganization of the protein.
They are therefore analogous to pathologic mutations in protein misfolding diseases in that they deplete the supply of available natively folded protein, albeit by unfolding rather than aggregation. The susceptibility of p53 to large-scale structural disruption by point mutations arises because it is only marginally stable, with a melting point of 44-45 °C. This is commonly observed for proteins with multiple binding partners, since the ability to participate in binding interactions generally requires flexibility that comes at the expense of high stability. Approximately 30% of the mutations that inactivate p53 simply lower the melting temperature of the core DNA binding domain of p53 so that it denatures rapidly or fails to fold in cells [391]. Several crystal structures of wild-type and mutant p53 are now available [392, 393, 394], which permit a detailed understanding of the basis for destabilization. Small molecules designed to stabilize p53 are now in development [395] as a way of rescuing its tumour suppressor activity.

p53 is an example of the role of protein misfolding in cancer, but not the only one. Epidermal growth factor receptor (EGFR) is a well-studied surface protein overexpressed on non-small-cell lung cancer, pancreatic cancer, colorectal cancer, squamous cell carcinoma of the head and neck, and glioblastoma multiforme. The frequency of EGFR overexpression and its importance in potentiating tumour growth led to considerable interest in developing small molecule and antibody inhibitors. Gefitinib and erlotinib, two small molecules in clinical use, inhibit the intracellular tyrosine kinase domains of EGFR and thereby prevent downstream signalling [396]. The first EGFR antibody approved for clinical use was cetuximab, which interacts exclusively with domain III of surface EGFR, partially occluding the ligand binding region on this domain and sterically preventing the receptor from adopting the extended conformation required for dimerization [397].
Another anti-EGFR antibody, matuzumab, blocks ligand-induced receptor activation indirectly by sterically preventing the local conformational changes necessary for high-affinity ligand binding and receptor dimerization [398].

Although these EGFR-targeted therapies have proven their clinical utility, they cause side effects related to antagonism of EGF signalling in healthy tissue: the majority (45-100%) of patients receiving EGFR inhibitors develop a papulopustular rash [399], a smaller fraction develop paronychia and mucositis, and a small number develop severe reactions with life-threatening superinfection of skin lesions [400]. The ideal EGFR-based antineoplastic would avoid these adverse reactions by selectively antagonizing EGFR signalling in tumour tissue while sparing EGFR in normal tissue.

Achieving such tumour-specific EGFR antagonism is a difficult challenge: for mAb therapeutics, it requires identifying protein epitopes present at tumour sites yet absent elsewhere. Recognizing mutated regions is a relatively straightforward means of doing so for the sizable fraction of tumours expressing mutant protein isoforms. EGFR has provided an instructive example: of the 40% of glioblastoma cases overexpressing EGFR, roughly 50% have a highly oncogenic mutant, Δ2−7EGFR, generated from a deletion of exons 2 to 7 of the EGFR gene [401]. This deletion results in an in-frame loss of 267 amino acids from the extracellular domain of the receptor, preventing it from binding ligand but causing it to signal constitutively. Attempts to raise antibodies against this Δ2−7EGFR mutant yielded two antibodies, mAb175 and mAb806, with unexpected properties: they bound the overexpressed wtEGFR in tumours but did not cross-react with wtEGFR on other tissues in the body [402].
The epitope for these antibodies was mapped to residues 287-302 in the full-length protein and confirmed by co-crystallization studies, which show that these antibodies recognise a cryptic epitope sterically blocked in native EGFR but apparently exposed on tumour EGFR [403]. It appears that mAb175 and mAb806 are recognizing a locally misfolded region of EGFR, expanding their utility beyond cases with Δ2−7EGFR mutation. This approach may be broadly applicable to a range of other cancer-associated targets. If partially misfolded protein is more abundantly present on cancer cells and antibodies reactive to misfolding-specific epitopes can be developed, a new approach to targeted antineoplastic immunotherapy is possible.

6.3.2 Gene Copy Number Alterations and Subunit Imbalance
A well-known feature of neoplastic cells is genomic instability, manifesting mildly as changes in gene copy number and more severely with changes in chromosome number (aneuploidy) [404]. This imbalance in genetic material naturally leads to protein under- or overexpression, and in the case of multi-subunit proteins often creates an imbalance in subunit number. The more abundantly expressed subunits are unable to find their needed counterparts to form the correct quaternary structure, burdening the cell with incompletely folded protein. Experiments in yeast show that a single extra chromosome is sufficient to activate the unfolded protein response [405]. Furthermore, proteasome activity is necessary to clear the burden of improperly folded protein, so this extra chromosome increases the sensitivity to proteasome inhibitors [405]. This evidence supports the conclusion that aneuploid cells constitutively generate a large amount of unfolded or misfolded protein. Even in cells without gross chromosome-level abnormalities, promoter mutations leading to increased transcription or microsatellite expansion would produce the same effect.

[Figure 6.3, titled "Inducers of Protein Misfolding in Cancer". Numbered inducers (1 to 12) are grouped under genetic damage, translational and maturational anomalies, and environmental stressors. Labels: oncogene and tumour suppressor mutation, chromosomal aneuploidy, effects of radiation and chemotherapy, ER protein crowding, chaperone imbalance, different trafficking patterns, aberrant glycosylation, impaired/inadequate protein clearance, intracellular oxidative stress, reactive oxygen species, acidic microenvironment, membrane protein crowding. Legend: mitochondria, protein aggregates, proteasomes, chaperones, glycans, H+.]

Figure 6.3: Schematic of variables contributing to protein misfolding in cancer.

6.3.3 Impaired Protein Quality Control in the ER
Aneuploidy, gene copy number changes, and other genetic anomalies result in the production of improperly folded protein, which must be cleared away by the cellular machinery. Eukaryotic cells have evolved a variety of tools for this purpose: the ubiquitin-proteasome system, autophagy, and the unfolded protein response. The cell’s normal process for dealing with incorrectly folded proteins is ubiquitination and degradation in the proteasome.
It is estimated that up to 30% of the polypeptides synthesized in a healthy cell fold incorrectly and must be recycled by this route; for some transmembrane proteins like CFTR or the δ-opioid receptor, the rate of incorrect folding necessitating degradation is 40% [406]. For neoplastic cells constitutively engaged in protein production, such as multiple myeloma cells secreting antibody chains, misfolded protein is constantly produced as a byproduct that must be cleared away if the cell is to survive. Such cells are exquisitely sensitive to proteasome inhibition, since loss of proteasome activity essentially causes the cell to choke on its own debris.

The unfolded protein response (UPR) is the emergency stop button on the ER assembly line. When the number of newly translated polypeptides entering the ER exceeds its capability to fold them, the UPR is activated to attenuate further protein synthesis and upregulate the expression of ER chaperones [407]. The proteostasis boundary [408] describes the set of conditions that the cell can tolerate, defined by the combined effects of the kinetics and thermodynamics of folding and the kinetics of misfolding, which are linked to the variable and adjustable proteostasis network (PN) capacity found in different cell types. When this boundary is exceeded, the result is accumulation of improperly or incompletely folded protein necessitating clearance through the UPR. There is evidence for activation of the unfolded protein response in breast cancer [409, 410].

A further risk associated with protein buildup in the ER is the effect of macromolecular crowding on protein folding efficiency. In secretory cells, molecular crowding has been shown to impair protein folding and lead to aggregate formation in the ER [411]. A computational study investigating the effects of spatial confinement on folding of a minimal β-barrel protein found that such confinement increases the folding temperature and decreases the folding time [412].
However, this study considered only folding of an isolated monomer. In another study of confined crambin folding with coarse-grained MD, the presence of multiple protein copies with a weak inter-protein attractive potential (a more realistic scenario) hindered correct folding and predisposed to aggregation and misfolding [413]. Interestingly, it has been estimated that increasing the total intracellular protein concentration by 10% can potentially increase the rate of protein misfolding reactions following a nucleation-polymerization mechanism by a factor of 10 [414]. Applying this reasoning to the ER, the initial accumulation of misfolded protein in the ER could potentiate further misfolding.

Glycosylation of proteins destined for the cell surface is an essential function of the ER and Golgi complex [415], and aberrant glycosylation has long been recognized as a hallmark of cancer [416]. For example, the blood group precursor carbohydrate epitopes T and Tn are not detected in healthy and benign-diseased tissues but are present in roughly 90% of carcinomas [416]. In pancreatic cancer, both the expression and glycosylation of membrane-bound mucin glycoproteins are dysregulated [417]: MUC4, which is minimally expressed in the normal pancreas, becomes highly expressed; this is accompanied by cell surface expression of the novel carbohydrate Tn antigen. Similarly, MUC1 has been identified as a promising target in breast cancer [418], as it is overexpressed in more than 90% of breast cancers in an underglycosylated form. This results in exposure of core regions of the extracellular domain. A study of 100 tissue samples from lung cancer patients revealed significant alterations in N-linked glycosylation, with increases in Sialyl-Lewis-X, mono-antennary, and highly sialylated glycans but decreases in core-fucosylated biantennary glycans [419].
Aside from contributing to protein function, glycans modulate protein stability. Changes in glycosylation can therefore expose novel epitopes normally buried by glycans and destabilize protein regions around glycosylation sites.

6.3.4 Protein Relocalization to the Cell Surface

Prolonged ER stress has been shown to cause protein mislocalization, including migration of newly synthesized immunoglobulin from the secretory pathway to the cytoplasm [420]. It was discovered recently that calreticulin, an ER chaperone protein involved in proper disulfide formation, may migrate to the cell surface, where it serves as a marker for immunogenic cell death and enables phagocytosis by dendritic cells [421]. Global profiling of the cell surface proteome of multiple cancer cell lines by mass spectrometry has revealed a surprising abundance of chaperone proteins, including GRP78, GRP75, HSP70, HSP60, HSP54, HSP27, and protein disulfide isomerase [422]. Derlin-1, an ER integral membrane protein reported to participate in misfolded protein dislocation from the ER to the cytosol, has been identified at the cell surface of colon, breast, ovarian, and other cancers [423]. ER stress actively promotes GRP78 localization on the cell surface [424]. Collectively, the evidence strongly supports migration of normally ER-resident proteins to the cancer cell surface, an event likely secondary to ER stress. This somewhat contradicts the conventional understanding of ER stress, which holds that export of proteins from the ER to the Golgi is reduced [425]. It is possible that although the total amount of protein exported from the ER decreases under conditions of stress, the specificity of export is also reduced. If chaperone proteins like calreticulin, derlin, and protein disulfide isomerase reach the cell surface, it is likely that at least a proportion of them carry incompletely folded proteins. If the mechanism of chaperone expression on the cell surface is indeed nonspecific, other improperly folded proteins may also reach the surface.

6.3.5 Tumour Microenvironment Stressors

Rapid growth without adequate blood supply leads to an accumulation of metabolic end products within the tumour and creates an extracellular environment hostile to protein stability. The dependence of tumour cells on glycolysis, with lactate as a metabolic endpoint, acidifies the interstitium, lowering its pH from the physiologic range of 7.35-7.45 to as low as 6.2. This pH is a tissue average, so it is possible that within the tumour there are pockets of higher acidity. Although typical proteins retain their overall stability down to a pH of 4 or lower, protein regions containing ionizable side chains with pKa values in the 6-7 range (such as histidine and buried aspartate or glutamate) will change their charge state in the tumour microenvironment. This may eliminate a stabilizing salt bridge, in the case of Asp or Glu, or introduce an unshielded charge, in the case of His, with potentially destabilizing consequences.

Beyond pH, tumours produce high levels of ROS that can derivatize protein functional groups and modify their chemical properties. Intracellular ROS derive primarily from mitochondria, and mutations in mitochondrial DNA seen in cancer cells have been linked to increased generation of superoxide [426]. Other intracellular sources of oxidative stress include non-mitochondrial electron transport chains in the ER, phase I reactions through cytochrome P450 metabolism, β-oxidation in peroxisomes, and inflammatory cytokines. High levels of intracellular oxidative stress have been abundantly demonstrated through increased H2O2 production in ovarian, prostate, colon, pancreatic, and breast cancers [427, 428]. Extracellularly, ROS production is amplified by endogenous factors like the Nox family of NADPH oxidase membrane proteins [426] and altered levels of extracellular superoxide dismutase.
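Returning briefly to the pH argument made earlier in this subsection, the change in side-chain charge state can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch and not part of the thesis calculations: the function name and the histidine pKa of 6.5 are assumptions chosen for illustration, and the true pKa of any given residue depends on its local protein environment.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of an ionizable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed, illustrative pKa for a solvent-exposed histidine side chain.
HIS_PKA = 6.5

# pH values taken from the text: physiologic (~7.4) and acidic tumour interstitium (6.2).
for label, ph in (("physiologic", 7.4), ("tumour interstitium", 6.2)):
    fraction = protonated_fraction(ph, HIS_PKA)
    print(f"{label:>20s} pH {ph:.1f}: {fraction:.1%} of His side chains protonated")
```

Under these assumptions the protonated (positively charged) fraction rises from roughly 11% at pH 7.4 to roughly 67% at pH 6.2, which is the kind of shift in charge state the argument relies on.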
Exogenous factors like radiation, hyperthermia, chlorinated compounds, and metal ions further increase oxidative stress. The balance between reduced and oxidized glutathione is perturbed in prostate and lung cancer [429, 430], indicating a chronic imbalance in redox homeostasis. The effects of oxidative stress include DNA damage and increased metabolic requirements, but importantly also include non-physiologic protein modifications. It is estimated that roughly 10% of the 200,000 unique cysteines in the human proteome are oxidizable [431], introducing sulfonic, sulfenic, or sulfinic acid groups that are generally destabilizing. In the ER, oxidative stress is associated with increased protein conjugation to glutathione and the accumulation of misfolded protein, leading to activation of the UPR [432].

For our goal of identifying cancer-specific cell surface epitopes, oxidation is helpful in two ways. First, oxidative protein modification will affect the local conformation in the vicinity of the modification, generally disfavouring the native fold because of the need to form non-native protein-protein and protein-solvent contacts to lower the free energy of the oxidized group. In general, local unfolding is an effective way to solvate the charge on the oxidized group and accommodate its larger steric volume, so the sequences around oxidative modifications have a higher probability of being selectively unfolded on the oxidized protein. Second, non-native oxidative modifications are often highly immunogenic due to their large size and net charge, making them a relatively easy antibody target.

6.3.6 Candidate Protein Targets for Misfolding in Cancer

A wide range of protein targets are suitable for antibody-mediated recognition of misfolded subsequences, but generally the proteins of interest must be present on the extracellular side of the plasma membrane in order to be antibody accessible.
It is therefore not possible to use this method against intracellular proteins, barring more exotic technologies like intrabodies [433]. The ideal misfolding-specific antibody target in cancer would be highly expressed on cancer cell surfaces and have a high susceptibility to partial misfolding due to environmental stressors or transforming mutations. It is additionally beneficial if the protein has a function important to tumour growth or survival that could be modulated by antibody binding. For application of the tools in this thesis, a structure of the protein of interest is a necessary starting point. Listed below are some potential proteins with available structures amenable to development of misfolding-specific antibodies. Many of the proteins listed are not perfect targets, but they illustrate conceptually how this strategy may be employed.

• Fas: Fas is the prototypic representative of the death receptor subgroup of the tumor necrosis factor receptor family. Since its discovery in 1989, it has been an appealing target for anticancer therapeutic development due to its direct role in initiating the extrinsic apoptosis pathway [434]: in principle, specific agonism of Fas on cancer cells could induce their apoptosis while leaving surrounding tissue unharmed. Attempts at developing Fas antibodies, however, have met with disappointing results associated with nonspecific Fas agonism at other sites in the body, leading to massive hepatocyte apoptosis and lethal liver damage in animal models [435].
• Notch: Notch signalling is an important regulator of cell differentiation in normal growth and development, but recent evidence identifies deregulated expression of wild-type Notch receptors, ligands, and targets in many solid tumors and hematological malignancies [436].
Notch is an especially interesting target for its role in the maintenance and proliferation of cancer stem cells; for this reason several Notch antibodies have been developed and are currently in preclinical trials [437].
• CD44: The cell adhesion molecule CD44 is another known cell surface cancer stem cell marker with a role in invasion, adhesion, and metastasis [438]. Several splicing variants of the gene coding for CD44 are possible, resulting in production of different protein isoforms. The most studied of these isoforms, CD44v6, is a negative prognostic indicator in acute myeloid leukemia and high-grade non-Hodgkin's lymphoma. For the purpose of developing unfolded-specific antibodies, either the full-length or variant isoforms would be appropriate targets.
• CD38: CD38 is expressed during differentiation on B-cells [439]. As a surface marker in B-cell chronic lymphocytic leukemia, it is a negative prognostic indicator because it correlates with increased proliferative potential.
• Prion protein: Expression of the prion protein is not limited to neural tissues: it is widely distributed in many body tissues and has been identified on glioblastoma, breast cancer, prostate cancer, and gastric cancer [440]. Preliminary evidence from the Cashman lab indicates that misfolding-specific PrP antibodies differentially stain cancer cell lines, and that administration of antibodies against β-strand 1 slows the growth of melanoma xenografts in mice.
• EGFR: As mentioned above, EGFR overexpression is a feature of non-small cell lung cancer and glioblastoma. The best current evidence for cell surface misfolded epitopes on cancer cells comes from studies of EGFR [441, 403].
• P-glycoprotein: Chemotherapy resistance is often due to overexpression of P-glycoprotein, a nonspecific exporter of drug-like molecules [442].
Its large size and complex membrane topology lead to a high risk of improper folding, especially under adverse conditions. An antibody recognizing the misfolded extracellular region of P-glycoprotein may also antagonize its drug-exporting function by steric occlusion of the membrane channel.
• Kit kinase: c-Kit is a proto-oncogene receptor kinase normally found on hematopoietic stem cells that binds to stem cell factor. It can be overexpressed or mutated in gastrointestinal stromal tumors, testicular seminoma, mast cell disease, melanoma, and acute myeloid leukemia [443].
• Ret: Like c-Kit, Ret is a kinase proto-oncogene that encodes a single-pass transmembrane receptor expressed in cells derived from the neural crest and the urogenital tract [444]. The extracellular region consists of four cadherin-like domains and a cysteine-rich domain. Germline mutations of Ret are found in the multiple endocrine neoplasia (MEN) syndromes, while sporadic mutations are associated with medullary carcinoma of the thyroid.
• Mucin: As described above, MUC1 is overexpressed in more than 90% of breast cancers in an underglycosylated form, leading to exposure of core regions of the extracellular domain.
• Her2: Her2 is a transmembrane growth factor receptor overexpressed in 30% of breast cancers, where it is associated with increased disease recurrence and worse prognosis. Overexpression also occurs in other cancers such as ovarian cancer, stomach cancer, and biologically aggressive forms of uterine cancer, such as uterine serous endometrial carcinoma.
• CD46, CD55, and CD59: These three membrane-bound complement regulatory proteins attenuate complement activation and thereby limit the efficacy of anti-tumour mAbs [445].
CD46 is expressed on medulloblastomas, brain tumours that primarily occur in childhood; CD59 is highly expressed in B-cell non-Hodgkin's lymphoma, where it reduces the effectiveness of rituximab [446].
• Neuropilin: Neuropilin receptors were first discovered as nervous system development regulators but have now been shown to act as receptors for vascular endothelial growth factor, in addition to several other related functions [447]. The neuropilins are expressed in a wide variety of human tumor cell lines including those derived from carcinomas of the prostate, kidney, bladder, stomach, colon, pancreas, breast, ovary, and lung.
• N-cadherin: Cadherins are a family of transmembrane cell adhesion molecules. Levels of E-cadherin are often decreased on the surface of cancer cells, facilitating their sloughing and migration to other sites. Recent evidence [448] indicates that this loss of cadherin function may be due to misfolding in the ER and not transcriptional downregulation, at least in the case of gastric-cancer-associated E-cadherin mutants. N-cadherin, conversely, tends to be upregulated on cancer cells and facilitates their transendothelial migration (thereby contributing to metastasis).

Using the Gō model unfolding energy landscape algorithm described in Chapter 3, epitope predictions for these proteins have been generated and are given in Figures 6.4 to 6.10. Sequences corresponding to the predicted epitopes are given in Appendix B. Epitopes have been selected at local minima in the free energy landscapes, representing regions with a low barrier to loss of structure.

One aspect of epitope selection that merits further study is the role of disulfide isomerization in producing novel epitopes. Especially for disulfide-rich proteins like EGFR and TNFR, incorrect folding after translation raises the possibility that non-native cysteine-cysteine contacts could form and lead to non-native disulfide bonds.
In the case of the misfolding-specific EGFR antibody mAb 806 [403], the antibody binding site identified is close to a disulfide bond that may be disrupted as part of the misfolding process. Detailed consideration of possible alternative disulfide pairings could suggest other misfolding-specific epitopes in addition to those shown in Figures 6.4 to 6.10.

Table 6.1: PDB structures for cancer-associated misfolded epitope prediction

PDB    Reference   Description
2EC8   [449]       Extracellular domain of Kit receptor kinase
2X2U   [450]       Cadherin-like domains of human RET
1TOZ   [451]       NOTCH-1 ligand binding region
1POZ   [452]       CD44 hyaluronan binding domain
1NCG   [453]       N-cadherin N-terminal domain
1YH3   [454]       CD38 extracellular domain
1IVO   [455]       EGFR extracellular domain
3MZW   [456]       HER2 extracellular domain
3G5U   [442]       Murine P-glycoprotein
2J8B   [457]       Nonglycosylated recombinant CD59
1H03   [458]       CD55 domains 3 and 4
1CKL   [459]       CD46 two N-terminal domains
2QQO   [460]       Neuropilin A2B1B2 domains
2ACM   [461]       MUC1 SEA domain
1TNR   [462]       Tumour necrosis factor receptor extracellular domain
-      -           Fas receptor modelled from 1TNR

6.3.7 Therapeutic Practicalities and Limitations

Passive immunization with antibodies recognizing misfolded regions on the cancer cell surface presupposes that the immune system has not already seriously attempted to mount a response to these targets; otherwise, exogenous antibody administration would be largely redundant. There are several reasons to believe that the immune system would not be effective at recognizing misfolded epitopes.
First, cancer cells are notoriously effective at evading immune surveillance [463, 404]: the acidic microenvironment suppresses immune effector cells, the surface expression of HLA class I is reduced, membrane complement regulatory proteins inhibit complement fixation, and overexpression of indoleamine 2,3-dioxygenase, the enzyme that catabolizes tryptophan through the kynurenine pathway, raises kynurenine levels and thereby stimulates apoptosis of cytotoxic T-cells. Immune recognition of misfolded regions in endogenous proteins is further complicated because the primary sequences comprising the unfolded epitopes identified here have already been presented in a tolerogenic context during immune system development. Thus even if tumours do expose novel epitopes formed by partially denatured protein, it seems likely that the immune system would not be able to capitalize on them. Administration of preformed antibodies against these epitopes readily overcomes this problem. Alternatively, immunization with peptides of the same sequence as the epitopes together with a vaccine adjuvant to overcome tolerance may enable endogenous production of misfolding-specific antibodies.

An exciting, but speculative, possibility is that targeting misfolded cell surface protein could be complementary to existing treatment. The proteasome inhibitor bortezomib impairs the clearance of misfolded protein and may therefore increase the number of misfolded cell surface epitopes available for recognition. Heat shock protein inhibitors like 17AAG may similarly impair the correct folding of synthesized proteins by attenuating chaperone activity, thereby increasing the concentration of misfolded proteins. The histone deacetylase inhibitor CG0006 induces breast cancer cell death in part by disrupting chaperone pathways through hyper-acetylation of HSP90 [464], impairing the ability to rescue misfolded protein.
Even physical treatments like radiation and hyperthermia, or free radical generating chemotherapy drugs like bleomycin, increase the environmental stress. Such agents are potentially useful as adjuvants to increase the efficacy of misfolding-specific antibodies.

Conversely, there are potential limitations to this approach. The most important reservation concerns the total number of misfolding epitopes present on each cell, which depends on two factors: the number of protein molecules on the cell surface, and the number of these proteins that are partially misfolded. The number of protein molecules will depend on the protein and cell target in question. Tumour cells that overexpress signalling proteins, mucins, or cell adhesion proteins often do so at levels orders of magnitude above those found on healthy cells. The probability that a given cell surface protein will expose the epitope depends on its free energy of unfolding: the lower the energy change, the higher the probability of epitope exposure. Nonetheless, even a low abundance of epitopes may to some degree be compensated for by high antibody affinity, provided the epitopes are genuinely specific to the tumour site. In general, it seems likely that the more disorganized the cellular machinery, the larger the proportion of incorrectly folded protein evading degradation, and the greater its abundance. Aggressively disorganized, rapidly replicating, or protein-secreting tumours may have a higher density of misfolded protein on the cell surface and thereby be the most promising candidates for this approach.

Another possible drawback is cross-reactivity of antibodies with other proteins present on bystander cells. Since the proposed epitopes are unfolded peptide sequences, other proteins in the human proteome with the same primary sequence are potentially reactive. However, there are encouraging reasons to believe that this will not be a deterrent.
The sequences comprising exposed epitopes on misfolded proteins are generally 6-12 residues in length. The number of possible primary sequences corresponding to such an epitope ranges from $20^{6}$ to $20^{12}$, that is, from $6.4 \times 10^{7}$ to $4.1 \times 10^{15}$. The number of proteins in the human proteome is 33,869, and the average protein length is 375 [465], for an approximate number of (overlapping) subsequences of fixed length of $1.2 \times 10^{7}$. Thus sequences of length 7 or more are likely to be unique in the human proteome, since the number of possible sequences ($20^{7} \approx 1.3 \times 10^{9}$) then exceeds the number of subsequences present by about two orders of magnitude; even if they are not unique, they are not necessarily accessible to antibody binding in other proteins containing the same sequence. More detailed screening of candidate epitopes may straightforwardly be performed as necessary, for example with blastp searching, to definitively address the question of cross-reactivity before undertaking an experimental program.

A further question is how easily cancer cells can adapt to administration of misfolding-specific antibodies. An effective antibody would apply a strong selective pressure to promote the development of cells not exposing the target epitope, as presently occurs with other targeted biological agents in cancer treatment. However, the processes contributing to protein misfolding are for the most part intrinsic to general features of cancer cell metabolism: although production of a particular protein may decrease as an adaptive measure, there are most likely going to be other misfolded cell surface epitopes still to be exploited. This argues for a multifaceted approach involving simultaneous or sequential use of different misfolding-specific antibodies to achieve the greatest effect.

Finally, considering the increasing prominence of cancer stem cells (CSCs) in our understanding of tumour biology, it is important to ascertain if CSCs are also likely to display partially misfolded cell surface proteins.
Although the CSC phenotype remains somewhat enigmatic, it does appear that CSCs are generally more slowly dividing, better organized, and have less genetic damage than much of the other tumour tissue mass. Internal derangements leading to protein misfolding may therefore be less severe. On the other hand, since CSCs tend to be localized in the core of the tumour and thereby subject to low oxygen tension, acidity, and high levels of reactive oxygen species [466], the external environment may still induce partial misfolding of cell surface proteins. Some of the protein targets described above like CD44 and Notch are CSC markers that may be helpful in eradicating the CSC population, but separate investigation is necessary to establish the merit of CSC-directed treatments with this approach.

Figure 6.4: Predicted misfolding-specific epitopes for P-glycoprotein. Both intracellular (left) and extracellular (right) predicted epitopes are shown. There are practical limitations to antibody accessibility for the intracellular epitopes.
[Panels: P-glycoprotein N-terminal half (3G5U); P-glycoprotein C-terminal half (3G5U). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 160-170, 365-380, 488-496, 550-563, 685-705, 804-812, 1010-1020, 1065-1080, 1130-1142, 1165-1180, 1200-1220, 1225-1240, 1255-1265; 84-96, 205-215, 318-330, 733-748, 957-969.]
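Two of the quantitative arguments in Section 6.3.7 lend themselves to a quick numerical check: the dependence of epitope exposure on the free energy of unfolding, and the likelihood that a short epitope sequence recurs elsewhere in the proteome. The sketch below is illustrative only and is not the thesis algorithm. It assumes a simple two-state Boltzmann population for exposure, free energies expressed in kcal/mol, a body temperature of 310 K, and uniformly distributed amino acids for the uniqueness estimate; all function names are hypothetical. The proteome size and mean protein length are the values quoted in the text.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def exposure_probability(delta_g_unfold: float, temperature: float = 310.0) -> float:
    """Two-state estimate of the equilibrium fraction of molecules in which the
    epitope region is unfolded. delta_g_unfold is in kcal/mol (assumed unit)."""
    x = delta_g_unfold / (R_KCAL_PER_MOL_K * temperature)
    # Numerically stable form of 1 / (1 + exp(x)).
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def possible_sequences(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct peptide sequences of the given length."""
    return alphabet_size ** length


def expected_chance_matches(length: int,
                            n_proteins: int = 33_869,
                            mean_length: int = 375) -> float:
    """Expected number of proteome windows matching one specific sequence,
    assuming uniformly random residues (a strong simplification)."""
    windows = n_proteins * mean_length  # approximate count of overlapping windows
    return windows / possible_sequences(length)


for dg in (0.0, 1.0, 2.0, 4.0):
    print(f"dG = {dg:.1f} kcal/mol -> exposed fraction {exposure_probability(dg):.4f}")

for k in (6, 7, 9, 12):
    print(f"length {k:2d}: {possible_sequences(k):.2e} possible sequences, "
          f"{expected_chance_matches(k):.2e} expected chance matches")
```

Under these assumptions a 7-residue epitope has about 0.01 expected chance matches in the proteome, while a 6-residue epitope has about 0.2. Real proteomes have non-uniform composition and contain homologous families, so a database search such as blastp remains the appropriate check for any specific candidate.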
Figure 6.5: Predicted misfolding-specific epitopes for EGFR and HER2. Gaps in the landscapes arise from regions of the protein not resolved in the crystal structure.
[Panels: Epidermal Growth Factor Receptor (1IVO); HER2 (3MZW). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 10-24, 159-174, 246-260, 290-302, 319-333, 355-367, 471-485; 10-16, 166-179, 253-265, 296-308, 328-340, 364-376, 480-492.]

Figure 6.6: Predicted misfolding-specific epitopes for Notch, CD44, N-cadherin, and MUC1.
[Panels: NOTCH1 (1TOZ); MUC1 (2ACM); N-Cadherin (1NCG); CD44 (1POZ). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 445-455, 475-485, 495-505, 516-526; 92-103, 107-115, 165-178; 11-19, 26-36, 67-75, 82-92; 1042-1055, 1083-1091.]

Figure 6.7: Predicted misfolding-specific epitopes for CD46, CD55, and CD59.
[Panels: CD46 (1CKL); CD55 (1H03); CD59 (2J8B). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 160-170, 365-380, 488-496, 550-563; 10-17, 32-38, 50-60, 64-69; 160-170, 365-380, 488-496, 550-563, 685-705.]

Figure 6.8: Predicted misfolding-specific epitopes for TNF receptor and Fas.
[Panels: TNF Receptor (1TNR); Fas. Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 35-45, 48-58, 86-96, 104-114; 53-59, 73-80, 104-112, 146-157, 188-194.]

Figure 6.9: Predicted misfolding-specific epitopes for Kit and Ret. Gaps in the landscapes arise from regions of the protein not resolved in the crystal structure.
[Panels: Kit kinase (2EC8); RET (2X2U). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 71-79, 151-163, 246-260, 352-362, 363-373, 416-426, 454-463; 56-66, 103-117, 154-166, 209-223.]

Figure 6.10: Predicted misfolding-specific epitopes for CD38 and neuropilin. Gaps in the landscapes arise from regions of the protein not resolved in the crystal structure.
[Panels: CD38 (1YH3); Neuropilin (2QQO). Axes: centre of region, length unfolded, free energy of unfolding (cal/mol). Epitope labels in the figure (residue ranges): 71-82, 110-120, 212-222, 246-257, 284-296; 156-164, 230-239, 296-306, 348-358, 376-388, 389-401, 452-466, 537-550.]

6.4 Parting Thoughts

Protein misfolding diseases represent perhaps the largest category of human illness for which no disease-modifying therapy is available. Why is this? What aspects of these conditions have made them so resistant to understanding, even as we have made great strides in improving other domains of health, from infection to cancer to heart disease?
First, the problem sits at the interface of two disciplines, physics and biology, that have traditionally approached new problems in very different ways. Whereas biology has achieved its success through careful systematization and classification of phenomena, physics has sought to reduce these phenomena to generalizable principles. Each method has its strengths and weaknesses: the biological approach more readily accommodates new observations but is often not able to explain them in terms of broader truths; meanwhile, physicists may struggle to come up with an explanatory model for a new observation, but once such a model is developed it provides testable predictions to cement understanding. Despite the complexity of protein misfolding, it is driven by a small number of physical forces. Understanding these forces, however, is complicated by the multiplicity of biological factors, from post-translational modifications to cellular metabolism, that affect the environment in which the process takes place. Understanding protein misfolding in neurodegeneration necessitates unification of the mostly disjoint approaches of physics and biology, which does not happen without some friction.

Second, the length scale of the problem (measured in $10^{-10}$ m) is so far outside our realm of familiarity that intuition is limited. Structural biology has advanced hugely in providing detailed information about the native conformation of proteins, but in doing so it depends on near-absolute uniformity in the protein sample studied: every unit cell of the protein crystal and every protein molecule in the NMR tube must be the same (although progress is being made in this area). When dealing with misfolded protein, the heterogeneity of the sample due to its amorphous packing confounds these most comprehensive techniques.
Instead we must rely on techniques with lower resolution like electron microscopy, which only gives us a general sense of what is happening, or techniques with incomplete coverage like epitope mapping, which risks misinterpretation based on partial information. Computational tools like molecular dynamics provide an alternative means to probe the structural transitions that take place during misfolding, but they must be carefully validated by comparison to known experimental information to ensure their conclusions are accurate.

Third, many areas of success in medicine have amounted to harnessing the existing regenerative power of the body to address a problem that it is not quite able to deal with on its own. Immunization is a classic example: once presented with an antigen from a pathogenic organism, the immune system is capable of curtailing infection without much further help. Similarly, the ability of the body to heal after surgery is something we can take no credit for, yet it enables us to confidently cut through tissue to repair anatomical malformations. Aggregated protein, on the other hand, appears to be something the body has comparatively limited tools to deal with. This problem is compounded by the sensitivity of the nervous system and its isolation from the rest of the body, which magnify the difficulties in delivering therapy where it is needed.

What will it take to overcome the obstacles that have held back the scientific community in understanding and treating the protein misfolding diseases? Better and better technologies, experimental and theoretical, for understanding the organization of non-uniform phases of matter at atomic resolution are in development. Physicists and biologists are effectively collaborating on the problem to combine their respective insights.
New techniques for modulating physiologic protein clearance mechanisms and improving the accessibility of the nervous system to therapeutic agents are emerging. All in all, we’re headed in the right direction. Unravelling the mysteries of protein misfolding is one of the greatest challenges to have faced the medical and scientific communities, but we are up to it.

“This world is but a canvas to our imagination.”
- Henry David Thoreau

Bibliography

[1] E. E. Manuelidis and L. Manuelidis, “Suggested links between different types of dementias: Creutzfeldt-Jakob disease, Alzheimer disease, and retroviral CNS infections,” Alzheimer Dis Assoc Disord, vol. 3, pp. 100–109, 1989.
[2] T. Hamaguchi, M. Noguchi-Shinohara, I. Nozaki, Y. Nakamura, T. Sato, T. Kitamoto, H. Mizusawa, and M. Yamada, “The risk of iatrogenic Creutzfeldt-Jakob disease through medical and surgical procedures,” Neuropathology, vol. 29, pp. 625–631, 2009.
[3] A. H. Peden, M. W. Head, D. L. Ritchie, J. E. Bell, and J. W. Ironside, “Preclinical vCJD after blood transfusion in a PRNP codon 129 heterozygous patient,” Lancet, vol. 364, pp. 527–529, 2004.
[4] A. Van Dorsselaer, C. Carapito, F. Delalande, C. Schaeffer-Reiss, D. Thierse, H. Diemer, D. S. McNair, D. Krewski, and N. R. Cashman, “Detection of prion protein in urine-derived injectable fertility products by a targeted proteomic approach,” PLoS ONE, vol. 6, p. e17815, 2011.
[5] B. S. Appleby, K. K. Appleby, B. J. Crain, C. U. Onyike, M. T. Wallin, and P. V. Rabins, “Characteristics of established and proposed sporadic Creutzfeldt-Jakob disease variants,” Arch Neurol, vol. 66, pp. 208–215, 2009.
[6] K. E. Novakovic, V. L. Villemagne, C. C. Rowe, and C. L. Masters, “Rare genetically defined causes of dementia,” Int Psychogeriatr, vol. 17, no. Suppl 1, pp. S149–194, 2005.
[7] A. R. Giovagnoli, G. Di Fede, A. Aresi, F. Reati, G. Rossi, and F.
Tagliavini, “Atypical frontotemporal dementia as a new clinical phenotype of Gerstmann-Straussler-Scheinker disease with the PrP-P102L mutation. Description of a previously unreported Italian family,” Neurol Sci, vol. 29, pp. 405–410, 2008.
[8] M. D. Spencer, R. S. Knight, and R. G. Will, “First hundred cases of variant Creutzfeldt-Jakob disease: retrospective case note review of early psychiatric and neurological features,” Br Med J, vol. 324, pp. 1479–1482, 2002.
[9] S. Kropp, W. J. Schulz-Schaeffer, M. Finkenstaedt, C. Riedemann, O. Windl, B. J. Steinhoff, I. Zerr, H. A. Kretzschmar, and S. Poser, “The Heidenhain variant of Creutzfeldt-Jakob disease,” Arch Neurol, vol. 56, pp. 55–61, 1999.
[10] P. Parchi, A. Giese, S. Capellari, P. Brown, W. Schulz-Schaeffer, O. Windl, I. Zerr, H. Budka, N. Kopp, P. Piccardo, S. Poser, A. Rojiani, N. Streichemberger, J. Julien, C. Vital, B. Ghetti, P. Gambetti, and H. Kretzschmar, “Classification of sporadic Creutzfeldt-Jakob disease based on molecular and phenotypic analysis of 300 subjects,” Ann Neurol, vol. 46, pp. 224–233, 1999.
[11] S. Poser, B. Mollenhauer, A. Krauß, I. Zerr, B. J. Steinhoff, A. Schroeter, M. Finkenstaedt, W. J. Schulz-Schaeffer, H. A. Kretzschmar, and K. Felgenhauer, “How to improve the clinical diagnosis of Creutzfeldt-Jakob disease,” Brain, vol. 122, no. 12, pp. 2345–2351, 1999.
[12] H. G. Wieser, U. Schwarz, T. Blattler, C. Bernoulli, M. Sitzler, K. Stoeck, and M. Glatzel, “Serial EEG findings in sporadic and iatrogenic Creutzfeldt-Jakob disease,” Clin Neurophysiol, vol. 115, pp. 2467–2478, 2004.
[13] L. A. Stewart, L. H. Rydzewska, G. F. Keogh, and R. S. Knight, “Systematic review of therapeutic interventions in human prion disease,” Neurology, vol. 70, pp. 1272–1281, 2008.
[14] L. Li, S. Napper, and N. R. Cashman, “Immunotherapy for prion diseases: opportunities and obstacles,” Immunotherapy, vol. 2, pp.
269–282, 2010.
[15] F. Clavaguera, T. Bolmont, R. A. Crowther, D. Abramowski, S. Frank, A. Probst, G. Fraser, A. K. Stalder, M. Beibel, M. Staufenbiel, M. Jucker, M. Goedert, and M. Tolnay, “Transmission and spreading of tauopathy in transgenic mouse brain,” Nat Cell Biol, vol. 11, pp. 909–913, 2009.
[16] P. Desplats, H. J. Lee, E. J. Bae, C. Patrick, E. Rockenstein, L. Crews, B. Spencer, E. Masliah, and S. J. Lee, “Inclusion formation and neuronal cell death through neuron-to-neuron transmission of alpha-synuclein,” Proc Natl Acad Sci U S A, vol. 106, no. 31, pp. 13010–5, 2009.
[17] L. Grad, W. Guest, A. Yanai, M. O’Neill, E. G. E. Pokrishevsky, S. V., W. D., S. Plotkin, and N. Cashman, “Prion-like propagation of misfolded superoxide dismutase 1,” Submitted to Proc Natl Acad Sci U S A, 2011.
[18] D. A. Lysek, C. Schorn, L. G. Nivon, V. Esteve-Moya, B. Christen, L. Calzolai, C. von Schroetter, F. Fiorito, T. Herrmann, P. Güntert, and K. Wuthrich, “Prion protein NMR structures of cats, dogs, pigs, and sheep,” Proc Natl Acad Sci U S A, vol. 102, pp. 640–645, 2005.
[19] A. D. Gossert, S. Bonjour, D. A. Lysek, F. Fiorito, and K. Wuthrich, “Prion protein NMR structures of elk and of mouse/elk hybrids,” Proc Natl Acad Sci U S A, vol. 102, pp. 646–650, 2005.
[20] K. J. Knaus, M. Morillas, W. Swietnicki, M. Malone, W. K. Surewicz, and V. C. Yee, “Crystal structure of the human prion protein reveals a mechanism for oligomerization,” Nat Struct Biol, vol. 8, pp. 770–774, 2001.
[21] M. L. DeMarco and V. Daggett, “Local environmental effects on the structure of the prion protein,” C R Biol, vol. 328, pp. 847–862, 2005.
[22] M. Morillas, D. L. Vanik, and W. K. Surewicz, “On the mechanism of alpha-helix to beta-sheet transition in the recombinant prion protein,” Biochemistry, vol. 40, pp. 6982–6987, 2001.
[23] R. S.
Stewart and D. A. Harris, “Mutational analysis of topological determinants in prion protein (PrP) and measurement of transmembrane and cytosolic PrP during prion infection,” J Biol Chem, vol. 278, pp. 45960–45968, 2003. → pages 6 [24] G. S. Baron and B. Caughey, “Effect of glycosylphosphatidylinositol anchor-dependent and -independent prion protein association with model raft membranes on conversion to the protease-resistant isoform,” J Biol Chem, vol. 278, pp. 14883–14892, 2003. → pages 6 [25] N. Sanghera, M. J. Swann, G. Ronan, and T. J. Pinheiro, “Insight into early events in the aggregation of the prion protein on lipid membranes,” Biochim Biophys Acta, vol. 1788, pp. 2245–2251, 2009. → pages 6 [26] D. Riesner, “Biochemistry and structure of PrP(C) and PrP(Sc),” Br Med Bull, vol. 66, pp. 21–33, 2003. → pages 6, 7, 9 [27] P. Gale, “The prion/lipid hypothesis–further evidence to support the molecular basis for transmissible spongiform encephalopathy risk assessment,” J Appl Microbiol, vol. 103, pp. 2033–2045, 2007. → pages 6 [28] T. R. Klein, D. Kirsch, R. Kaufmann, and D. Riesner, “Prion rods contain small amounts of two host sphingolipids as revealed by thin-layer chromatography and mass spectrometry,” Biol Chem, vol. 379, pp. 655–666, 1998. → pages 6 [29] G. Colombo, M. Meli, G. Morra, R. Gabizon, and M. Gasset, “Methionine sulfoxides on prion protein Helix-3 switch on the alpha-fold destabilization required for conversion,” PLoS ONE, vol. 4, p. e4296, 2009. → pages 7 [30] L. M. Taubner, E. A. Bienkiewicz, V. Copié, and B. Caughey, “Structure of the Flexible Amino-Terminal Domain of Prion Protein Bound to a Sulfated Glycan,” J Mol Biol, vol. 395, pp. 475–90, 2009. → pages 7 [31] K. Nishina, S. Jenks, and S. Supattapone, “Ionic strength and transition metals control PrPSc protease resistance and conversion-inducing activity,” J Biol Chem, vol. 279, pp. 40788–40794, 2004. → pages 7 [32] L. Zhong and J. 
Xie, “Investigation of the effect of glycosylation on human prion protein by molecular dynamics,” J Biomol Struct Dyn, vol. 26, pp. 525–533, 2009. → pages 7 [33] L. Chen, Y. Yang, J. Han, B. Y. Zhang, L. Zhao, K. Nie, X. F. Wang, F. Li, C. Gao, X. P. Dong, and C. M. Xu, “Removal of the glycosylation of prion protein provokes apoptosis in SF126,” J Biochem Mol Biol, vol. 40, pp. 662–669, 2007. → pages 7 [34] N. S. Hachiya, M. Imagawa, and K. Kaneko, “The possible role of protein X, a putative auxiliary factor in pathological prion replication, in regulating a physiological endoproteolytic cleavage of cellular prion protein,” Med Hypotheses, vol. 68, pp. 670–673, 2007. → pages 7 [35] K. Kaneko, L. Zulianello, M. Scott, C. M. Cooper, A. C. Wallace, T. L. James, F. E. Cohen, and S. B. Prusiner, “Evidence for protein X binding to a discontinuous epitope on the cellular prion protein during scrapie prion propagation,” Proc Natl Acad Sci U S A, vol. 94, pp. 10069–10074, 1997. → pages 8 [36] N. R. Deleault, J. C. Geoghegan, K. Nishina, R. Kascsak, R. A. Williamson, and S. Supattapone, “Protease-resistant prion protein amplification reconstituted with partially purified substrates and synthetic polyanions,” J Biol Chem, vol. 280, pp. 26873–26879, 2005. → pages 8 [37] V. A. Lawson, S. A. Priola, K. Meade-White, M. Lawson, and B. Chesebro, “Flexible N-terminal region of prion protein influences conformation of protease-resistant prion protein isoforms associated with cross-species scrapie infection in vivo and in vitro,” J Biol Chem, vol. 279, pp. 13689–13695, 2004. → pages 8 [38] L. Li, W. Guest, A. Huang, S. S. Plotkin, and N. R. Cashman, “Immunological mimicry of PrPC-PrPSc interactions: antibody-induced PrP misfolding,” Protein Eng Des Sel, vol. 22, pp. 523–529, 2009. → pages 8, 9, 28, 94, 101 [39] K. M. Pan, M. Baldwin, J. Nguyen, M. Gasset, A. Serban, D. Groth, I. Mehlhorn, Z. Huang, R. J. Fletterick, and F. E. 
Cohen, “Conversion of alpha-helices into beta-sheets features in the formation of the scrapie prion proteins,” Proc Natl Acad Sci U S A, vol. 90, pp. 10962–10966, 1993. → pages 8 [40] A. Thomzig, S. Spassov, M. Friedrich, D. Naumann, and M. Beekes, “Discriminating scrapie and bovine spongiform encephalopathy isolates by infrared spectroscopy of pathological prion protein,” J Biol Chem, vol. 279, pp. 33847–33854, 2004. → pages 8 [41] S. Spassov, M. Beekes, and D. Naumann, “Structural differences between TSEs strains investigated by FT-IR spectroscopy,” Biochim Biophys Acta, vol. 1760, pp. 1138–1149, 2006. → pages 8 [42] G. Sajnani, M. A. Pastrana, I. Dynin, B. Onisko, and J. R. Requena, “Scrapie prion protein structural constraints obtained by limited proteolysis and mass spectrometry,” J Mol Biol, vol. 382, pp. 88–98, 2008. → pages 9 [43] W. Guest, L. Li, O. Julien, S. Chatterjee, B. Sykes, S. Plotkin, W. Zou, and N. Cashman, “Partial unfolding of the prion protein: early steps on the path to misfolding,” Proceedings of Prion 2009, 2009. → pages 9 [44] E. Paramithiotis, M. Pinard, T. Lawton, S. LaBoissiere, V. L. Leathers, W. Q. Zou, L. A. Estey, J. Lamontagne, M. T. Lehto, L. H. Kondejewski, G. P. Francoeur, M. Papadopoulos, A. Haghighat, S. J. Spatz, M. Head, R. Will, J. Ironside, K. O’Rourke, Q. Tonelli, H. C. Ledebur, A. Chakrabartty, and N. R. Cashman, “A prion protein epitope selective for the pathologically misfolded conformation,” Nat Med, vol. 9, no. 7, pp. 893–9, 2003. → pages 9, 10, 27, 93 [45] F. Eghiaian, J. Grosclaude, S. Lesceu, P. Debey, B. Doublet, E. Tréguer, H. Rezaei, and M. Knossow, “PrPSc conversion from the structures of antibody-bound ovine prion scrapie-susceptibility variants,” Proc Natl Acad Sci U S A, vol. 101, pp. 10254–10259, 2004. → pages 9 [46] A. R. White, P. Enever, M. Tayebi, R. Mushens, J. Linehan, S. Brandner, D. Anstee, J. Collinge, and S. 
Hawke, “Monoclonal antibodies inhibit prion replication and delay the development of prion disease,” Nature, vol. 422, pp. 80–83, 2003. → pages 9 [47] L. Solforosi, A. Bellon, M. Schaller, J. T. Cruite, G. C. Abalos, and R. A. Williamson, “Toward molecular dissection of PrPC-PrPSc interactions,” J Biol Chem, vol. 282, no. 10, pp. 7465–71, 2007. → pages 9, 27 [48] G. G. Kovacs, G. Trabattoni, J. A. Hainfellner, J. W. Ironside, R. S. Knight, and H. Budka, “Mutations of the prion protein gene phenotypic spectrum,” J Neurol, vol. 249, pp. 1567–1582, 2002. → pages 9, 80, 91 [49] Y. Chebaro and P. Derreumaux, “The conversion of helix H2 to beta-sheet is accelerated in the monomer and dimer of the prion protein upon T183A mutation,” J Phys Chem B, vol. 113, pp. 6942–6948, 2009. → pages 9 [50] J. R. Silveira, G. J. Raymond, A. G. Hughson, R. E. Race, V. L. Sim, S. F. Hayes, and B. Caughey, “The most infectious prion protein particles,” Nature, vol. 437, no. 7056, pp. 257–61, 2005. → pages 9, 15 [51] M. L. DeMarco and V. Daggett, “Molecular mechanism for low pH triggered misfolding of the human prion protein,” Biochemistry, vol. 46, pp. 3045–3054, 2007. → pages 10, 95 [52] N. J. Cobb, F. D. Sonnichsen, H. McHaourab, and W. K. Surewicz, “Molecular architecture of human prion protein amyloid: a parallel, in-register beta-structure,” Proc Natl Acad Sci U S A, vol. 104, pp. 18946–18951, 2007. → pages 10, 11, 80 [53] C. Govaerts, H. Wille, S. B. Prusiner, and F. E. Cohen, “Evidence for assembly of prions with left-handed beta-helices into trimers,” Proc Natl Acad Sci U S A, vol. 101, pp. 8342–8347, 2004. → pages 10, 11, 80 [54] F. E. Cohen, “Protein misfolding and prion diseases,” J Mol Biol, vol. 293, no. 2, pp. 313–20, 1999. → pages 10, 18, 99 [55] V. Smirnovas, J. I. Kim, X. Lu, R. Atarashi, B. Caughey, and W. K. 
Surewicz, “Distinct structures of scrapie prion protein (PrPSc)-seeded versus spontaneous recombinant prion protein fibrils revealed by hydrogen/deuterium exchange,” J Biol Chem, vol. 284, pp. 24233–24241, 2009. → pages 11 [56] P. Walsh, K. Simonetti, and S. Sharpe, “Core structure of amyloid fibrils formed by residues 106-126 of the human prion protein,” Structure, vol. 17, pp. 417–426, 2009. → pages 11 [57] D. T. Downing and N. D. Lazo, “Molecular modelling indicates that the pathological conformations of prion proteins might be beta-helical,” Biochem J, vol. 343, no. 2, pp. 453–460, 1999. → pages 11 [58] H. Wille, M. D. Michelitsch, V. Guenebaut, S. Supattapone, A. Serban, F. E. Cohen, D. A. Agard, and S. B. Prusiner, “Structural studies of the scrapie prion protein by electron crystallography,” Proc Natl Acad Sci U S A, vol. 99, pp. 3563–3568, 2002. → pages 11 [59] H. Wille, C. Govaerts, A. Borovinskiy, D. Latawiec, K. H. Downing, F. E. Cohen, and S. B. Prusiner, “Electron crystallography of the scrapie prion protein complexed with heavy metals,” Arch Biochem Biophys, vol. 467, pp. 239–248, 2007. → pages 11 [60] H. Wille, W. Bian, M. McDonald, A. Kendall, D. W. Colby, L. Bloch, J. Ollesch, A. L. Borovinskiy, F. E. Cohen, S. B. Prusiner, and G. Stubbs, “Natural and synthetic prion structure from X-ray fiber diffraction,” Proc Natl Acad Sci U S A, vol. 106, pp. 16990–16995, 2009. → pages 11 [61] H. K. Nakamura, M. Takano, and K. Kuwata, “Modeling of a propagation mechanism of infectious prion protein; a hexamer as the minimum infectious unit,” Biochem Biophys Res Commun, vol. 361, pp. 789–793, 2007. → pages 12 [62] C. Ritter, M. L. Maddelein, A. B. Siemer, T. Lührs, M. Ernst, B. H. Meier, S. J. Saupe, and R. Riek, “Correlation of structural elements and infectivity of the HET-s prion,” Nature, vol. 435, pp. 844–848, 2005. → pages 12 [63] M. F. Perutz, J. T. Finch, J. Berriman, and A. 
Lesk, “Amyloid fibers are water-filled nanotubes,” Proc Natl Acad Sci U S A, vol. 99, pp. 5591–5595, 2002. → pages 12 [64] T. O. Diener, M. P. McKinley, and S. B. Prusiner, “Viroids and prions,” Proc Natl Acad Sci U S A, vol. 79, no. 17, pp. 5220–5224, 1982. → pages 13 [65] S. B. Prusiner, “Novel proteinaceous infectious particles cause scrapie,” Science, vol. 216, no. 4542, pp. 136–44, 1982. → pages 13 [66] M. M. Patino, J. J. Liu, J. R. Glover, and S. Lindquist, “Support for the prion hypothesis for inheritance of a phenotypic trait in yeast,” Science, vol. 273, no. 5275, pp. 622–626, 1996. → pages 13 [67] S. M. Uptain and S. Lindquist, “Prions as protein-based genetic elements,” Annu Rev Microbiol, vol. 56, pp. 703–741, 2002. → pages 13 [68] M. Goedert, F. Clavaguera, and M. Tolnay, “The propagation of prion-like protein inclusions in neurodegenerative diseases,” Trends Neurosci, vol. 33, no. 7, pp. 317–325, 2010. → pages 14 [69] A. Aguzzi, “Cell Biology: Beyond the prion principle,” Nature, vol. 459, no. 7249, pp. 924–925, 2009. → pages 14, 99 [70] A. Aguzzi and L. Rajendran, “The transcellular spread of cytosolic amyloids, prions, and prionoids,” Neuron, vol. 64, no. 6, pp. 783–790, 2009. → pages 14 [71] D. C. Gajdusek and V. Zigas, “Degenerative disease of the central nervous system in New-Guinea - the endemic occurrence of kuru in the native population,” New Engl J Med, vol. 257, no. 20, pp. 974–978, 1957. → pages 14 [72] H. B. Parry, “Scrapie - transmissible hereditary disease of sheep,” Nature, vol. 185, no. 4711, pp. 441–443, 1960. → pages 14 [73] T. A. Holt and J. Phillips, “Bovine spongiform encephalopathy,” Br Med J, vol. 296, no. 6636, pp. 1581–1582, 1988. → pages 14 [74] P. Brundin, R. Melki, and R. Kopito, “Prion-like transmission of protein aggregates in neurodegenerative diseases,” Nat Rev Mol Cell Biol, vol. 11, no. 4, pp. 301–7, 2010. → pages 14, 15, 24 [75] J. Collinge and A. R. 
Clarke, “A general model of prion strains and their pathogenicity,” Science, vol. 318, no. 5852, pp. 930–936, 2007. → pages 14 [76] W. W. Seeley, R. K. Crawford, J. Zhou, B. L. Miller, and M. D. Greicius, “Neurodegenerative diseases target large-scale human brain networks,” Neuron, vol. 62, no. 1, pp. 42–52, 2009. → pages 14 [77] R. A. Sperling, B. C. Dickerson, M. Pihlajamaki, P. Vannini, P. S. LaViolette, O. V. Vitolo, T. Hedden, J. A. Becker, D. M. Rentz, D. J. Selkoe, and K. A. Johnson, “Functional alterations in memory networks in early Alzheimer’s disease,” Neuromolecular Med, vol. 12, no. 1, pp. 27–43, 2010. → pages 14 [78] K. A. Celone, V. D. Calhoun, B. C. Dickerson, A. Atri, E. F. Chua, S. L. Miller, K. DePeau, D. M. Rentz, D. J. Selkoe, D. Blacker, M. S. Albert, and R. A. Sperling, “Alterations in memory networks in mild cognitive impairment and Alzheimer’s disease: an independent component analysis,” J Neurosci, vol. 26, no. 40, pp. 10222–31, 2006. → pages 14 [79] J. M. Ravits and A. R. La Spada, “ALS motor phenotype heterogeneity, focality, and spread: deconstructing motor neuron degeneration,” Neurology, vol. 73, no. 10, pp. 805–11, 2009. → pages 14, 18, 99, 117 [80] M. Cudkowicz, M. Qureshi, and J. Shefner, “Measures and markers in amyotrophic lateral sclerosis,” NeuroRx, vol. 1, no. 2, pp. 273–83, 2004. → pages 14 [81] M. P. McKinley, D. C. Bolton, and S. B. Prusiner, “A protease-resistant protein is a structural component of the scrapie prion,” Cell, vol. 35, no. 1, pp. 57–62, 1983. → pages 15 [82] A. L. Lublin and S. Gandy, “Amyloid-beta oligomers: Possible roles as key neurotoxins in Alzheimer’s disease,” Mt Sinai J Med, vol. 77, no. 1, pp. 43–49, 2010. → pages 15 [83] M. R. Cookson, “The biochemistry of Parkinson’s disease,” Annu Rev Biochem, vol. 74, pp. 29–52, 2005. → pages 15 [84] F. Wang, X. Wang, C. G. Yuan, and J. Ma, “Generating a prion with bacterially expressed recombinant prion protein,” Science, vol. 327, no. 
5969, pp. 1132–5, 2010. → pages 15 [85] B. Frost and M. I. Diamond, “Prion-like mechanisms in neurodegenerative diseases,” Nat Rev Neurosci, vol. 11, no. 3, pp. 155–9, 2010. → pages 15 [86] G. Forloni, “Neurotoxicity of beta-amyloid and prion peptides,” Curr Opin Neurol, vol. 9, no. 6, pp. 492–500, 1996. → pages 15 [87] D. M. Walsh, I. Klyubin, J. V. Fadeeva, M. J. Rowan, and D. J. Selkoe, “Amyloid-beta oligomers: their production, toxicity and therapeutic inhibition,” Biochem Soc Trans, vol. 30, no. 4, pp. 552–7, 2002. → pages 15 [88] M. D. Kane, W. J. Lipinski, M. J. Callahan, F. Bian, R. A. Durham, R. D. Schwarz, A. E. Roher, and L. C. Walker, “Evidence for seeding of beta-amyloid by intracerebral infusion of Alzheimer brain extracts in beta-amyloid precursor protein-transgenic mice,” J Neurosci, vol. 20, no. 10, pp. 3606–11, 2000. → pages 15, 25 [89] M. Meyer-Luehmann, J. Coomaraswamy, T. Bolmont, S. Kaeser, C. Schaefer, E. Kilger, A. Neuenschwander, D. Abramowski, P. Frey, A. L. Jaton, J. M. Vigouret, P. Paganetti, D. M. Walsh, P. M. Mathews, J. Ghiso, M. Staufenbiel, L. C. Walker, and M. Jucker, “Exogenous induction of cerebral beta-amyloidogenesis is governed by agent and host,” Science, vol. 313, no. 5794, pp. 1781–4, 2006. → pages 15, 25 [90] Y. S. Eisele, T. Bolmont, M. Heikenwalder, F. Langer, L. H. Jacobson, Z. X. Yan, K. Roth, A. Aguzzi, M. Staufenbiel, L. C. Walker, and M. Jucker, “Induction of cerebral beta-amyloidosis: intracerebral versus systemic A-beta inoculation,” Proc Natl Acad Sci U S A, vol. 106, no. 31, pp. 12926–31, 2009. → pages 15 [91] J. A. Hardy and G. A. Higgins, “Alzheimer’s disease: the amyloid cascade hypothesis,” Science, vol. 256, no. 5054, pp. 184–5, 1992. → pages 15 [92] T. Luhrs, C. Ritter, M. Adrian, D. Riek-Loher, B. Bohrmann, H. Dobeli, D. Schubert, and R. Riek, “3D structure of Alzheimer’s amyloid-beta(1-42) fibrils,” Proc Natl Acad Sci U S A, vol. 102, no. 48, pp. 17342–7, 2005. → pages 16 [93] H. 
Sticht, P. Bayer, D. Willbold, S. Dames, C. Hilbich, K. Beyreuther, R. W. Frank, and P. Rosch, “Structure of amyloid Abeta-(1-40)-peptide of Alzheimer’s disease,” Eur J Biochem, vol. 233, no. 1, pp. 293–8, 1995. → pages 16 [94] M. von Bergen, P. Friedhoff, J. Biernat, J. Heberle, E. M. Mandelkow, and E. Mandelkow, “Assembly of tau protein into Alzheimer paired helical filaments depends on a local sequence motif ((306)VQIVYK(311)) forming beta structure,” Proc Natl Acad Sci U S A, vol. 97, no. 10, pp. 5129–5134, 2000. → pages 16 [95] B. Frost, R. L. Jacks, and M. I. Diamond, “Propagation of tau misfolding from the outside to the inside of a cell,” J Biol Chem, vol. 284, no. 19, pp. 12845–12852, 2009. → pages 16 [96] H. Braak and E. Braak, “Neuropathological staging of Alzheimer-related changes,” Acta Neuropathol, vol. 82, no. 4, pp. 239–259, 1991. → pages 16 [97] Z. Zhou, J. B. Fan, H. L. Zhu, F. Shewmaker, X. Yan, X. Chen, J. Chen, G. F. Xiao, L. Guo, and Y. Liang, “Crowded cell-like environment accelerates the nucleation step of amyloidogenic protein misfolding,” J Biol Chem, vol. 284, no. 44, pp. 30148–58, 2009. → pages 16 [98] Y. Davidson, T. Kelley, I. R. Mackenzie, S. Pickering-Brown, D. Du Plessis, D. Neary, J. S. Snowden, and D. M. Mann, “Ubiquitinated pathological lesions in frontotemporal lobar degeneration contain the TAR DNA-binding protein, TDP-43,” Acta Neuropathol, vol. 113, no. 5, pp. 521–33, 2007. → pages 17 [99] T. Arai, M. Hasegawa, H. Akiyama, K. Ikeda, T. Nonaka, H. Mori, D. Mann, K. Tsuchiya, M. Yoshida, Y. Hashizume, and T. Oda, “Tdp-43 is a component of ubiquitin-positive tau-negative inclusions in frontotemporal lobar degeneration and amyotrophic lateral sclerosis,” Biochem Biophys Res Commun, vol. 351, no. 3, pp. 602–11, 2006. → pages 17 [100] C. Lagier-Tourenne and D. W. Cleveland, “Rethinking ALS: the FUS about TDP-43,” Cell, vol. 136, no. 6, pp. 1001–4, 2009. → pages 17 [101] B. S. Johnson, D. Snead, J. J. Lee, J. M. McCaffery, J. 
Shorter, and A. D. Gitler, “TDP-43 is intrinsically aggregation-prone, and amyotrophic lateral sclerosis-linked mutations accelerate aggregation and increase toxicity,” J Biol Chem, vol. 284, no. 30, pp. 20329–39, 2009. → pages 17 [102] N. J. Rutherford, Y. J. Zhang, M. Baker, J. M. Gass, N. A. Finch, Y. F. Xu, H. Stewart, B. J. Kelley, K. Kuntz, R. J. Crook, J. Sreedharan, C. Vance, E. Sorenson, C. Lippa, E. H. Bigio, D. H. Geschwind, D. S. Knopman, H. Mitsumoto, R. C. Petersen, N. R. Cashman, M. Hutton, C. E. Shaw, K. B. Boylan, B. Boeve, N. R. Graff-Radford, Z. K. Wszolek, R. J. Caselli, D. W. Dickson, I. R. Mackenzie, L. Petrucelli, and R. Rademakers, “Novel mutations in TARDBP (TDP-43) in patients with familial amyotrophic lateral sclerosis,” PLoS Genet, vol. 4, no. 9, p. e1000193, 2008. → pages 17 [103] S. H. Kim, N. P. Shanware, M. J. Bowler, and R. S. Tibbetts, “Amyotrophic lateral sclerosis-associated proteins TDP-43 and FUS/TLS function in a common biochemical complex to co-regulate HDAC6 mRNA,” J Biol Chem, vol. 285, no. 44, pp. 34097–105, 2010. → pages 17 [104] R. Rademakers, H. Stewart, M. Dejesus-Hernandez, C. Krieger, N. Graff-Radford, M. Fabros, H. Briemberg, N. Cashman, A. Eisen, and I. R. Mackenzie, “FUS gene mutations in familial and sporadic amyotrophic lateral sclerosis,” Muscle Nerve, vol. 42, no. 2, pp. 170–6, 2010. → pages 17 [105] Y. F. Xu, T. F. Gendron, Y. J. Zhang, W. L. Lin, S. D’Alton, H. Sheng, M. C. Casey, J. Tong, J. Knight, X. Yu, R. Rademakers, K. Boylan, M. Hutton, E. McGowan, D. W. Dickson, J. Lewis, and L. Petrucelli, “Wild-type human TDP-43 expression causes TDP-43 phosphorylation, mitochondrial aggregation, motor deficits, and early mortality in transgenic mice,” J Neurosci, vol. 30, no. 32, pp. 10851–9, 2010. → pages 17 [106] N. Luquin, B. Yu, R. B. Saunderson, R. J. Trent, and R. Pamphlett, “Genetic variants in the promoter of TARDBP in sporadic amyotrophic lateral sclerosis,” Neuromuscul Disord, vol. 
19, no. 10, pp. 696–700, 2009. → pages 17 [107] J. B. Tatom, D. B. Wang, R. D. Dayton, O. Skalli, M. L. Hutton, D. W. Dickson, and R. L. Klein, “Mimicking aspects of frontotemporal lobar degeneration and Lou Gehrig’s disease in rats via TDP-43 overexpression,” Mol Ther, vol. 17, no. 4, pp. 607–13, 2009. → pages 17 [108] J. Sreedharan, I. P. Blair, V. B. Tripathi, X. Hu, C. Vance, B. Rogelj, S. Ackerley, J. C. Durnall, K. L. Williams, E. Buratti, F. Baralle, J. de Belleroche, J. D. Mitchell, P. N. Leigh, A. Al-Chalabi, C. C. Miller, G. Nicholson, and C. E. Shaw, “TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis,” Science, vol. 319, no. 5870, pp. 1668–72, 2008. → pages 17 [109] T. J. Kwiatkowski Jr., D. A. Bosco, A. L. Leclerc, E. Tamrazian, C. R. Vanderburg, C. Russ, A. Davis, J. Gilchrist, E. J. Kasarskis, T. Munsat, P. Valdmanis, G. A. Rouleau, B. A. Hosler, P. Cortelli, P. J. de Jong, Y. Yoshinaga, J. L. Haines, M. A. Pericak-Vance, J. Yan, N. Ticozzi, T. Siddique, D. McKenna-Yasek, P. C. Sapp, H. R. Horvitz, J. E. Landers, and R. H. Brown, “Mutations in the FUS/TLS gene on chromosome 16 cause familial amyotrophic lateral sclerosis,” Science, vol. 323, no. 5918, pp. 1205–8, 2009. → pages 17 [110] C. Vance, B. Rogelj, T. Hortobagyi, K. J. De Vos, A. L. Nishimura, J. Sreedharan, X. Hu, B. Smith, D. Ruddy, P. Wright, J. Ganesalingam, K. L. Williams, V. Tripathi, S. Al-Saraj, A. Al-Chalabi, P. N. Leigh, I. P. Blair, G. Nicholson, J. de Belleroche, J. M. Gallo, C. C. Miller, and C. E. Shaw, “Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6,” Science, vol. 323, no. 5918, pp. 1208–11, 2009. → pages 17 [111] R. P. Zakaryan and H. Gehring, “Identification and characterization of the nuclear localization/retention signal in the EWS proto-oncoprotein,” J Mol Biol, vol. 363, no. 1, pp. 27–38, 2006. → pages 17 [112] D. Dormann, R. Rodde, D. Edbauer, E. Bentmann, I. Fischer, A. Hruscha, M. 
E. Than, I. R. Mackenzie, A. Capell, B. Schmid, M. Neumann, and C. Haass, “ALS-associated fused in sarcoma (FUS) mutations disrupt transportin-mediated nuclear import,” EMBO J, vol. 29, no. 16, pp. 2841–57, 2010. → pages 17 [113] H. Zinszner, J. Sok, D. Immanuel, Y. Yin, and D. Ron, “Tls (fus) binds RNA in vivo and engages in nucleo-cytoplasmic shuttling,” J Cell Sci, vol. 110, no. 15, pp. 1741–50, 1997. → pages 17 [114] I. R. Mackenzie, R. Rademakers, and M. Neumann, “TDP-43 and FUS in amyotrophic lateral sclerosis and frontotemporal dementia,” Lancet Neurol, vol. 9, no. 10, pp. 995–1007, 2010. → pages 17 [115] R. Fujii, S. Okabe, T. Urushido, K. Inoue, A. Yoshimura, T. Tachibana, T. Nishikawa, G. G. Hicks, and T. Takumi, “The RNA binding protein TLS is translocated to dendritic spines by mglur5 activation and regulates spine morphology,” Curr Biol, vol. 15, no. 6, pp. 587–93, 2005. → pages 17 [116] S. Alberti, R. Halfmann, O. King, A. Kapila, and S. Lindquist, “A systematic survey identifies prions and illuminates sequence features of prionogenic proteins,” Cell, vol. 137, no. 1, pp. 146–58, 2009. → pages 17 [117] M. Cushman, B. S. Johnson, O. D. King, A. D. Gitler, and J. Shorter, “Prion-like disorders: blurring the divide between transmissibility and infectivity,” J Cell Sci, vol. 123, no. 8, pp. 1191–201, 2010. → pages 17 [118] J. Prilusky, C. E. Felder, T. Zeev-Ben-Mordehai, E. H. Rydberg, O. Man, J. S. Beckmann, I. Silman, and J. L. Sussman, “FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded,” Bioinformatics, vol. 21, no. 16, pp. 3435–8, 2005. → pages 17 [119] A. Mousavi and Y. Hotta, “Glycine-rich proteins: a class of novel proteins,” Appl Biochem Biotechnol, vol. 120, no. 3, pp. 169–74, 2005. → pages 17 [120] P. M. Andersen, “Genetic factors in the early diagnosis of ALS,” Amyotroph Lateral Scler Other Motor Neuron Disord, vol. 1 Suppl 1, pp. S31–42, 2000. → pages 17, 98 [121] W. 
Robberecht, “Genetics of amyotrophic lateral sclerosis,” J Neurol, vol. 247, pp. 2–6, 2000. → pages 17 [122] P. M. Andersen, K. B. Sims, W. W. Xin, R. Kiely, G. O’Neill, J. Ravits, E. Pioro, Y. Harati, R. D. Brower, J. S. Levine, H. U. Heinicke, W. Seltzer, M. Boss, and R. H. Brown Jr., “Sixteen novel mutations in the Cu/Zn superoxide dismutase gene in amyotrophic lateral sclerosis: a decade of discoveries, defects and disputes,” Amyotroph Lateral Scler Other Motor Neuron Disord, vol. 4, no. 2, pp. 62–73, 2003. → pages 17 [123] R. Rakhit, J. P. Crow, J. R. Lepock, L. H. Kondejewski, N. R. Cashman, and A. Chakrabartty, “Monomeric Cu,Zn-superoxide dismutase is a common misfolding intermediate in the oxidation models of sporadic and familial amyotrophic lateral sclerosis,” J Biol Chem, vol. 279, no. 15, pp. 15499–504, 2004. → pages 18, 26, 115, 116 [124] R. Rakhit, J. Robertson, C. Vande Velde, P. Horne, D. M. Ruth, J. Griffin, D. W. Cleveland, N. R. Cashman, and A. Chakrabartty, “An immunological epitope selective for pathological monomer-misfolded SOD1 in ALS,” Nat Med, vol. 13, pp. 754–759, 2007. → pages 18, 28, 101 [125] J. Matias-Guiu, L. Galan, R. Garcia-Ramos, and J. A. Barcia, “Superoxide dismutase: the cause of all amyotrophic lateral sclerosis?,” Ann Neurol, vol. 64, no. 3, pp. 356–7; author reply 358, 2008. → pages 18 [126] M. Synofzik, R. Fernandez-Santiago, W. Maetzler, L. Schols, and P. M. Andersen, “The human G93A SOD1 phenotype closely resembles sporadic amyotrophic lateral sclerosis,” J Neurol Neurosurg Psychiatry, vol. 81, no. 7, pp. 764–7, 2010. → pages 18 [127] S. M. Chou, H. S. Wang, and K. Komai, “Colocalization of NOS and SOD1 in neurofilament accumulation within motor neurons of amyotrophic lateral sclerosis: an immunohistochemical study,” J Chem Neuroanat, vol. 10, no. 3-4, pp. 249–58, 1996. → pages 18 [128] S. M. Chou, H. S. Wang, and A. 
Taniguchi, “Role of SOD1 and nitric oxide/cyclic GMP cascade on neurofilament aggregation in ALS/MND,” J Neurol Sci, vol. 139 Suppl, pp. 16–26, 1996. → pages 18 [129] A. Gruzman, W. L. Wood, E. Alpert, M. D. Prasad, R. G. Miller, J. D. Rothstein, R. Bowser, R. Hamilton, T. D. Wood, D. W. Cleveland, V. R. Lingappa, and J. Liu, “Common molecular signature in SOD1 for both sporadic and familial amyotrophic lateral sclerosis,” Proc Natl Acad Sci U S A, vol. 104, pp. 12524–12529, 2007. → pages 18 [130] W. J. Broom, M. Greenway, G. Sadri-Vakili, C. Russ, K. E. Auwarter, K. E. Glajch, N. Dupre, R. J. Swingler, S. Purcell, C. Hayward, P. C. Sapp, D. McKenna-Yasek, P. N. Valdmanis, J. P. Bouchard, V. Meininger, B. A. Hosler, J. D. Glass, M. Polack, G. A. Rouleau, J. H. Cha, O. Hardiman, and R. H. Brown Jr., “50bp deletion in the promoter for superoxide dismutase 1 (SOD1) reduces SOD1 expression in vitro and may correlate with increased age of onset of sporadic amyotrophic lateral sclerosis,” Amyotroph Lateral Scler, vol. 9, no. 4, pp. 229–37, 2008. → pages 18 [131] K. Forsberg, P. A. Jonsson, P. M. Andersen, D. Bergemalm, K. S. Graffmo, M. Hultdin, J. Jacobsson, R. Rosquist, S. L. Marklund, and T. Brannstrom, “Novel antibodies reveal inclusions containing non-native SOD1 in sporadic ALS patients,” PLoS ONE, vol. 5, no. 7, p. e11552, 2010. → pages 18, 28 [132] D. A. Bosco, G. Morfini, N. M. Karabacak, Y. Song, F. Gros-Louis, P. Pasinelli, H. Goolsby, B. A. Fontaine, N. Lemay, D. McKenna-Yasek, M. P. Frosch, J. N. Agar, J. P. Julien, S. T. Brady, and R. H. Brown Jr., “Wild-type and mutant sod1 share an aberrant conformation and a common pathogenic pathway in ALS,” Nat Neurosci, vol. 13, no. 11, pp. 1396–403, 2010. → pages 18, 28, 98, 115, 116 [133] J. S. Beckman and W. H. Koppenol, “Nitric oxide, superoxide, and peroxynitrite: the good, the bad, and ugly,” Am J Physiol, vol. 271, no. 5 Pt 1, pp. C1424–37, 1996. → pages 18, 98 [134] M. Said Ahmed, W. Y. Hung, J. S. Zu, P. 
Hockberger, and T. Siddique, “Increased reactive oxygen species in familial amyotrophic lateral sclerosis with mutations in sod1,” J Neurol Sci, vol. 176, no. 2, pp. 88–94, 2000. → pages 18, 98 [135] T. L. Munsat, P. L. Andres, L. Finison, T. Conlon, and L. Thibodeau, “The natural history of motoneuron loss in amyotrophic lateral sclerosis,” Neurology, vol. 38, no. 3, pp. 409–13, 1988. → pages 18, 99 [136] J. Ravits, P. Paul, and C. Jorg, “Focality of upper and lower motor neuron degeneration at the clinical onset of ALS,” Neurology, vol. 68, no. 19, pp. 1571–5, 2007. → pages 18, 99, 116 [137] M. Nagai, D. B. Re, T. Nagata, A. Chalazonitis, T. M. Jessell, H. Wichterle, and S. Przedborski, “Astrocytes expressing ALS-linked mutated SOD1 release factors selectively toxic to motor neurons,” Nat Neurosci, vol. 10, no. 5, pp. 615–22, 2007. → pages 18, 99 [138] A. L. Horwich and J. S. Weissman, “Deadly conformations–protein misfolding in prion disease,” Cell, vol. 89, no. 4, pp. 499–510, 1997. → pages 18, 98, 114 [139] M. Urushitani, A. Sik, T. Sakurai, N. Nukina, R. Takah