"Science, Faculty of"@en . "Chemistry, Department of"@en . "DSpace"@en . "UBCV"@en . "Fegan, Sarah Katherine"@en . "2014-10-21T14:12:33Z"@en . "2014"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "With the invention of gentle ionization methods such as electrospray ionization, mass spectrometry has been used as a tool for studying large protein complexes. Simulations of these gas-phase protein complexes will allow for better understanding of the dissociation mechanism and lead to methods for controlling the dissociation. Controlling the dissociation will help obtain structural information from the mass spectrometry experiments.In this work, the suitability of the MARTINI coarse-grained force field for gas-phase simulations is studied and a charge hopping algorithm is developed.\nUsing a coarse-grained force field makes the simulations faster so that longer simulation times can be accessed.\nThis is important because protein motions can take place on time scales of nanoseconds to milliseconds and these long times are not practical with all-atom simulations.\nMost molecular dynamics simulations use fixed charges, but including charge motion allows for better simulation of mobile protons.Two protein complexes are studied here, one dimer and one tetramer. Hopping rates, energies, radii of gyration, and distances within the complexes are calculated. Simulations with the cytochrome c$^{\prime}$ dimer (no charge hopping) are compared to published all-atom results. The MARTINI force field is found to be good for qualitative results, but slightly more attractive than the OPLS all-atom (for the isolated protein complex).\nThe transthyretin tetramer is used to study the hopping algorithm. Modifications of the protein (blocking N-termini from accepting charges and adding basic sites with a tether) are also explored. The dissociation behavior of the protein complexes is controlled by the Coulomb repulsion model. Protein modifications near the N-termini show potential for controlling the dissociation."@en . "https://circle.library.ubc.ca/rest/handle/2429/50811?expand=metadata"@en . "Charge Hopping in Coarse-GrainedSimulations of Gas-Phase ProteinComplexesbySarah Katherine FeganB.Sc., The University of Waterloo, Waterloo, ON, Canada, 2008A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHYinThe Faculty of Graduate and Postdoctoral Studies(Chemistry)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)October 2014c\u00C2\u00A9 Sarah Katherine Fegan 2014AbstractWith the invention of gentle ionization methods such as electrospray ionization, mass spec-trometry has been used as a tool for studying large protein complexes. Simulations of thesegas-phase protein complexes will allow for better understanding of the dissociation mecha-nism and lead to methods for controlling the dissociation. Controlling the dissociation willhelp obtain structural information from the mass spectrometry experiments.In this work, the suitability of the MARTINI coarse-grained force field for gas-phasesimulations is studied and a charge hopping algorithm is developed. Using a coarse-grainedforce field makes the simulations faster so that longer simulation times can be accessed.This is important because protein motions can take place on time scales of nanosecondsto milliseconds and these long times are not practical with all-atom simulations. Mostmolecular dynamics simulations use fixed charges, but including charge motion allows forbetter simulation of mobile protons.Two protein complexes are studied here, one dimer and one tetramer. Hopping rates,energies, radii of gyration, and distances within the complexes are calculated. Simulationswith the cytochrome c\u00E2\u0080\u00B2 dimer (no charge hopping) are compared to published all-atom re-sults. The MARTINI force field is found to be good for qualitative results, but slightly moreattractive than the OPLS all-atom (for the isolated protein complex). The transthyretintetramer is used to study the hopping algorithm. Modifications of the protein (blockingN-termini from accepting charges and adding basic sites with a tether) are also explored.The dissociation behavior of the protein complexes is controlled by the Coulomb re-pulsion model. Protein modifications near the N-termini show potential for controlling thedissociation.iiPrefaceThe work presented in this thesis has been published by the author, Sarah K. Fegan, inco-authorship with research supervisor Prof. Mark Thachuk. Sarah K. Fegan is the mainwriter and first author of these works.The results of Chapter 3 were published as Fegan, S K and Thachuk, M, \u00E2\u0080\u0099Suitability ofthe MARTINI Force Field for Use with Gas-Phase Protein Complexes\u00E2\u0080\u0099 J. Chem. TheoryComput., 2012, 8, 1304-1313. The results of Chapter 4 were published as Fegan, S K andThachuk, M, \u00E2\u0080\u0099A Charge Moving Algorithm for Molecular Dynamics Simulations of Gas-Phase Proteins\u00E2\u0080\u0099 J. Chem. Theory Comput., 2013, 9, 2531-2539. The results of Chapter 5were published as Fegan, S K and Thachuk, M, \u00E2\u0080\u0099Controlling Dissociation Channels of Gas-Phase Protein Complexes Using Charge Manipulation\u00E2\u0080\u0099 J. Am. Soc. Mass Spectrom., 2014,25, 722-728.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Abbreviations and Symbols . . . . . . . . . . . . . . . . . . . . . . . . . ixAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Mass Spectrometry Experiments . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Computational Background . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.3 Coarse Grained Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.4 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.1 Molecular Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.2 Input Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.3 Properties for Analysis of Trajectories . . . . . . . . . . . . . . . . . . . . . 132.4 Software and Force Field Particular to This Work . . . . . . . . . . . . . . 142.5 Moving Charges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 Suitability of the MARTINI Force Field For Use With Gas-Phase ProteinComplexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22iv3.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414 A Charge Moving Algorithm for Molecular Dynamics Simulations of Gas-phase Proteins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474.3.1 Hopping Rate and Charge Distributions . . . . . . . . . . . . . . . . 474.3.2 Structural Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 534.3.3 Energy Distributions and Pair Correlations . . . . . . . . . . . . . . 534.3.4 Computational Performance . . . . . . . . . . . . . . . . . . . . . . 614.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615 Controlling Dissociation Channels of Gas-Phase Protein Complexes UsingCharge Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78AppendicesA Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83A.1 mobile charge.c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83A.2 mobile charge.h . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93vList of Tables2.1 Proton binding energy values for basic amino acid sites[1]. . . . . . . . . . . 203.1 The residue numbers of the charged amino acids on each monomer for theten M8/M2 and ten M5/M5 charge distributions. . . . . . . . . . . . . . . . 24viList of Figures1.1 Experimental mass spectra of cytochrome c dimer . . . . . . . . . . . . . . 31.2 Experimental mass spectra of shiga-like toxin pentamer . . . . . . . . . . . 52.1 A skeletal diagram of the GROMACS molecular dynamics program. . . . . 162.2 A flowchart showing the hopping algorithm. . . . . . . . . . . . . . . . . . . 183.1 Snapshot of cytochrome c monomer . . . . . . . . . . . . . . . . . . . . . . 233.2 Snapshot of ground state cytochrome c dimer . . . . . . . . . . . . . . . . . 263.3 Pull force for M5/M5 cytochrome c dimer . . . . . . . . . . . . . . . . . . . 283.4 Pull force for M8/M2 cytochrome c dimer . . . . . . . . . . . . . . . . . . . 303.5 Percentage dissociation of cytochrome c dimer . . . . . . . . . . . . . . . . . 313.6 Minimum separation of cytochrome c dimer . . . . . . . . . . . . . . . . . . 333.7 Radius of gyration of cytochrome c dimer . . . . . . . . . . . . . . . . . . . 353.8 RMSD of cytochrome c dimer . . . . . . . . . . . . . . . . . . . . . . . . . . 363.9 Effect of secondary structure constraints on cytochrome c dimer . . . . . . . 393.10 Intramolecular energies for cytochrome c dimer . . . . . . . . . . . . . . . . 403.11 Intermolecular energies for cytochrome c dimer . . . . . . . . . . . . . . . . 423.12 Snapshots of cytochrome c dimer . . . . . . . . . . . . . . . . . . . . . . . . 434.1 Hopping rate for transthyretin tetramer . . . . . . . . . . . . . . . . . . . . 484.2 Potential and kinetic energies of transthyretin tetramer . . . . . . . . . . . 504.3 Charge configurations of transthyretin tetramer . . . . . . . . . . . . . . . . 514.4 Histogram of charge distributions of transthyretin tetramer . . . . . . . . . 524.5 Radius of gyration of transthyretin tetramer . . . . . . . . . . . . . . . . . . 544.6 Snapshots of transthyretin tetramer . . . . . . . . . . . . . . . . . . . . . . 554.7 Energy distributions of attempted hops for transthyretin tetramer . . . . . 564.8 Energy distributions of attempted hops by donor type for transthyretin tetramer 584.9 Energy distributions of attempted hops by donor and acceptor type fortransthyretin tetramer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594.10 Energy distributions of successful hops for transthyretin tetramer . . . . . . 604.11 Hopping pairs of transthyretin tetramer . . . . . . . . . . . . . . . . . . . . 62vii5.1 Snapshots of transthyretin tetramer . . . . . . . . . . . . . . . . . . . . . . 675.2 Snapshots of transthyretin tetramer with tether . . . . . . . . . . . . . . . . 685.3 Charge configurations and radius of gyration for transthyretin tetramer . . 705.4 Minimum distances in transthyretin tetramer at 300K . . . . . . . . . . . . 715.5 Minimum distances in transthyretin tetramer at 600K . . . . . . . . . . . . 72viiiList of Abbreviations and Symbols\u00CE\u00B1 hopping probability tuning parameter\u000F dielectric constant\u00CE\u00B5 Lennard-Jones well depth\u00CF\u0083 Lennard-Jones cross sectionBIRD blackbody infrared radiative dissociationCID collisionally induced dissociationE energyESI electrospray ionizationF forcef degrees of freedomk force constantkB Boltzmann constantLINCS Linear Constraint Solver for Molecular SimulationsLJ Lennard-Jonesm massM2 monomer with +2 chargesM5 monomer with +5 chargesM8 monomer with +8 chargesMD molecular dynamicsMS mass spectrometrym/z mass to charge ratioN number of beads in the systemPDB Protein Data Bankq chargeRg radius of gyrationRMSD root mean square deviationSID surface induced dissociationSORI sustained off-resonance irradiationT temperaturet timeixTTR transthyretin tetramerTOF time of flight (mass spectrometer)v velocityV(r) potentialxAcknowledgementsFirst and foremost, I would like to express my gratitude to my supervisor Dr. Mark Thachukfor all his guidance and encouragement. His support and advice in editing this thesis isgreatly appreciated.I would like to thank my committee members, especially Dr. Roman Krems, for readingthe thesis.I would also like to thank the members of the theory lab for making my time at UBCso enjoyable. In particular, Nalantha, Olga and Saheba have been a source of great helpand friendship.I would like to thank my family for their unconditional love and support.I acknowledge financial support from the Natural Sciences and Engineering ResearchCouncil (NSERC) of Canada. All computations were performed using WestGrid computingresources, which are funded in part by the Canada Foundation for Innovation, AlbertaInnovation and Science, BC Advanced Education, and the participating research institutions(www.westgrid.ca).xiThis thesis is dedicatedtomy parents.xiiChapter 1IntroductionProtein complexes are among the many systems studied by mass spectrometry. The bi-ological function of proteins is dependant on the interactions between them[2]. In orderto understand protein complexes it is necessary to determine the number, type, and spa-tial arrangement of the subunits within the complex[2]. Under the right conditions, thenon-covalent interactions in these complexes can be dissociated in collision cells withinmass spectrometers without disrupting the covalent bonds[3]. Identifying the interactionsbetween monomers can be done by comparing the composition of sets of subcomplexes gen-erated from dissociating the intact complex[4]. Understanding the dissociation pathwayscould lead to methods for controlling the dissociation. Ultimately this would aid in gaininginformation about the structures of protein complexes.1.1 Mass Spectrometry ExperimentsNumerous studies have used mass spectrometry (MS) to look at the dissociation of dimers[5\u00E2\u0080\u00938], tetramers[9\u00E2\u0080\u009313], and larger complexes[12, 14\u00E2\u0080\u009317]. In mass spectrometers, gas-phase ionsare separated based on their mass to charge ratio (m/z). When an ion moves through anelectric field the force on the ion is proportional to the charge (z) on the ion, thus theacceleration is proportional to z/m. In time of flight (TOF) mass spectrometers, after theacceleration the ions move through a drift tube with no external field. The time it takesions to reach the detector depends on the mass to charge ratio. Tandem mass spectrometry(MS/MS), where the first MS selects the parent ion of interest and the second MS is usedto probe the dissociation products of the parent ion, is often used in these experiments[18].Collision or dissociation cells can be placed between the two mass analyzers of the tandemmass spectrometer. The charge states used range from +11-15 for the smaller dimers andtetramers to over +30 for the largest complexes.Electrospray ionization (ESI) is a method for creating protein ions in the gas phase.These experiments are possible because one of the advantages of ESI is that it is gentleenough to preserve non-covalent interactions allowing large protein complexes to be studiedusing mass spectrometry techniques[19]. In ESI, the solution flows through a narrow needleand charged droplets are formed. They become smaller until the protein is all that remains.1There are two mechanisms by which the droplets become smaller. One is the evaporationof neutral solvent molecules. The other is Raleigh fission, when the charge density on thesurface of the droplet increases the repulsion between the charges causes the droplet to splitinto two smaller droplets.Energy must be added to get non-covalently bound protein complexes to dissociate.The main types of non-covalent bonds in these complexes are Van der Waals, dipole-dipoleinteractions, and hydrogen bonding. Two of the methods used for this are collisionallyinduced dissociation (CID) and surface induced dissociation (SID). In CID, the proteincomplex collides with a bath gas, and through these collisions energy is gradually transferredto the complex. In SID, the complex collides with a much more massive surface. Becausein SID the complex collides with a surface, it has a much faster rate of energy transferthan CID where there are many collisions with lighter bath gas particles. Some of thesedissociation experiments are discussed in more detail below.Jurchen and Williams[7] studied the dissociation of cytochrome c dimer (see Fig. 1.1).When the monomers could unfold asymmetric dissociation is observed. The higher chargestate (+19 dimer) dissociates symmetrically into monomers with +10 and +9 charges.The +11 dimer dissociates asymmetrically into monomers with +7 and +4 or +8 and +3charges. However, when cross-linkages were added to prevent the monomers from unfolding,only symmetric dissociation is seen. Although mass spectrometry does not give directinformation on the size of the molecules, this indicates that unfolding is involved in theasymmetric dissociation pathway. They studied the dissociation mechanism and determinedthat proton transfer plays a role. The charge on a protein is correlated to the surface area(not the mass)[20]. This means that charge accumulation (coming from the charge transfer)accompanies unfolding.In solution, the mechanism for proton transfer in proteins involves water molecules asintermediaries[21]. But mobile protons have also been found in mass spectrometry experi-ments. There is experimental and theoretical confirmation that charges are mobile in smallgas-phase peptides[22\u00E2\u0080\u009328], and a mobile proton model has been developed[23, 29]. Pep-tide fragmenting D/H exchange experiments performed by Jorgensen et al.[27] were ableto selectively label the C- or N-terminal ends of peptides with deuterium and found 100%scrambling in CID. Charge motion will be discussed in more detail in Chapters 2 and 4.Jones et al. dissociated a dimer cytochrome c using both SID and CID [8]. They selecteda +11 charge state prior to the dissociation. When CID was used, the products were mostlymonomers with +8 charges and the complementary monomers with +3 charges. When SIDwas used, the results were more symmetric with most monomers having +6 or +5 charges.Blackbody infrared radiative dissociation (BIRD) is where thermal energy is exchangedwith the surroundings by blackbody radiation. The homodimer ecotin was studied by2Figure 1.1: SORI-CAD spectra of odd charge states of cytochrome c dimer ions electro-sprayed from a 50:50 water/methanol solution with 2% acetic acid: 19+, 15+, and 11+dimer charge states. The 11+ dimer ions are produced by gas-phase deprotonation ofhigher charge state dimers. Reprinted with permission from [7]. Copyright 2003 AmericanChemical Society.3Felitsyn et al.[6]. They used the BIRD method with temperatures of 399 to 448 K todissociate dimers with charge states +14-17. Both symmetric and asymmetric monomerpairs result from the dissociation.BIRD has also been used to study the dissociation of the B5 pentamer of the shiga-liketoxin [14, 17]. Figure 1.2 shows the results of the dissociation. In all four examples, onemonomer is ejected with almost half of the total charge. They also calculated Arrheniusprefactors and from that values for the entropy of activation. The large values of Arrheniusfactors and the entropy of activation suggest that the transition state has a high entropy(likely from unfolding)[14].When tetramer streptavidin in the +14 charge state was dissociated using sustainedoff-resonance irradiation (SORI) the products were monomers with +7 or +6 charges andthe complementary trimers with +7 and +8 charges[9]. SORI is a thermal dissociationmethod.The transthyretin tetramer has been studied by several groups. When the +15 chargestate is dissociated using CID, the main products are monomers with +8 or +9 (some +7)charges and the corresponding trimers with +7 and +6 (some +8) charges[10]. When SIDis used to dissociate the same complex the products are mostly monomers with +4 charges,and dimers and trimers are also seen[12].The dissociation of +11-14 charge states of pentamer shiga-like toxin I was studiedusing the BIRD method[14]. The monomers formed had 30-50% of the total charge and thetetramers had the remainder of the charge.A dodecamer of small heat shock protein TaHSP16.9 was studied[15]. The dodecamerhad a distribution of charges states centered at +33. After CID, monomers and 11-merswere formed with charge state distributions around +14 and +20 respectively.The general trends were that collision induced dissociation and thermal dissociationproduced asymmetric charge separation involving a monomer leaving with close to half thecharge (or more for the dimers), and surface induced dissociation resulted in subunits withsymmetric mass-to-charge ratios. This is true for many different complexes of varying sizesand composition, thus any general model will be widely applicable.Using mass spectrometry, protein complex dissociation producing asymmetric chargedistributions has been observed in protein dimers[7] and in larger multimeric complexes[10,11, 14, 15]. Asymmetric charge dissociation occurs when the mass to charge ratios offragments differ, that is one of the fragments has a larger fraction of the total charge than ithas of the total mass. With slow dissociation methods, such as CID, unfolding and ejectionof one monomer from the complex is almost universally observed. Typically, the ejectedmonomer carries away up to half the total charge on the complex. In some cases, theremaining complexes again eject another charge-enriched monomer. With fast dissociation4Figure 1.2: BIRD mass spectra of protonated Stx1 Bn+5 ions: (a) B11+5 at a cell temperatureof 166 C and a reaction time of 12 s; (b) B12+5 at 165 C and 2.5 s; (c) B14+5 at 165 C and 1.5s; (d) Stx1 B14+5 at 143 C and 3.5 s. Reprinted with kind permission from [17]. Copyright2007 Springer Science and Business Media.5methods like SID, more symmetric dissociation products are seen[8, 11, 12]. That is, singlemonomer ejection does not dominate, and the charge distribution of the fragments is closeto what is expected for a uniformly charged protein complex.In practice, usually the weakest bound monomer in a complex unfolds and dissociatesupon CID activation. The predominance of this dissociation pathway excludes other path-ways which may shed more light on the binding of monomers in the complex, and allowone to build a better picture of their assembly. These monomer interactions are often thekey to understanding the relationship between structure and biological function in thesecomplexes. Understanding the detailed mechanism for monomer ejection will help to pro-vide methods by which this dissociation can be controlled and possibly other dissociationpathways selected. Hopefully, this will allow for more structural information about proteinsto be gathered from experiments.The relationship between the gas-phase structures and the solution structures is notimmediately obvious. It is clear that harsh conditions can cause structural changes. How-ever, structure can be preserved[30]. A recent study[31] compared the gas-phase collisioncross sections and the solution-phase Stokes radii for several different proteins and proteincomplexes. They found good agreement between the values, indicating that if structuralchanges occurred during the ionization process they were not significant enough to changethe size of the proteins.1.2 Computational BackgroundIn addition to experimental work, some theoretical work has also been done to explore thedissociation mechanism.Klassen and coworkers [17] used the charged droplet model and a protein structure modelto calculate energies for a single subunit leaving a pentamer. The charge droplet model[32]represents the dividing droplet as two spheres. Their protein structure model used moredetailed structural information, but assumed that the displacement of one monomer wasthe only change in the structure on dissociation. Their results show that protein unfoldingmay be important to the asymmetric charge distribution and a simple Coulomb energycalculation is useful in understanding the dissociation.Konermann and coworkers [33] used a simple model where folded proteins were repre-sented by spheres and unfolded proteins by a string of smaller spheres. The number ofbeads used for the unfolded proteins was varied to model partial as well as fully unfoldedproteins. They found that dimers dissociating without unfolding had a symmetric chargedistribution, while when one of the subunits unfolded the charge distribution was asym-metric. An asymmetric charge partitioning was also found when one unit of a tetramer6unfolded while the other three remained folded. Their study also included collision induceddissociation mass spectrometry experiments of dimer and tetramer proteins. By comparingthe experimental spectra to the charge distributions predicted by the model, they concludedthat dissociation was occurring with a range structures having varying degrees of unfolding.The work by Csiszar and Thachuk [34] used charged ellipsoids to model surface charges.They found that a constant surface charge density was the most favorable distribution. Sofor a dimer with an asymmetric charge partitioning, the monomer with more charges wouldwant to unfold to have a surface area larger than that of the monomer with fewer charges.All-atom studies were performed by Wanasundara and Thachuk [1, 35, 36] on the dissoci-ation of cytochrome c\u00E2\u0080\u00B2 dimers using the OPLS force field. Their work showed the symmetriccase with +5 charges on each monomer had a lower barrier to dissociation than the neutraldimer, and the asymmetric cases (M8/M2 and M9/M1) had higher barriers. The highercharged monomers in the asymmetric cases unfolded while the lower charged monomersremained folded.The essential conclusion of these studies[1, 17, 33\u00E2\u0080\u009336] is that the monomer ejectionprocess is controlled by Coulomb repulsion. In particular, charges arrange themselves soas to maintain a constant surface charge density and the lowest barrier for dissociationoccurs when two fragments have the same charge. Almost all experimental results can berationalized with this Coulomb repulsion model.Lill and Helms created a method (Q-HOP MD) for including proton transport in molecu-lar dynamics (MD) simulations by modeling hopping rates using transition state theory[37].They used multicopy methods to increase the sampling in regions of interest. Donnini etal.[38] performed simulations where the protonation states were changed in order to main-tain constant pH.1.3 Coarse Grained ModelsAll-atom force fields include parameters for every atom including hydrogen. In contrast,coarse-grained force fields group atoms together and represent each group as a bead. Tozzinireviewed coarse-grained models as used for molecular dynamics of proteins [39]. Althoughcoarse-grained models contain less detail than the all-atom ones, coarse-grained simulationshave the advantage of being faster. One reason that they are faster is that coarse-grainedsystems have fewer particles. This means there are fewer calculations to perform for eachstep, particularly for pairwise contributions to the energy. Also, the coarse-grained modelshave removed the light atoms (like hydrogen), this removes some of the fast motions andallows for a larger time step to be used. Thus fewer steps are needed to reach a particularsimulation time compared to when a smaller time step is used.7By reducing the number of particles in the simulation and removing the light atomswhich have fast vibrations, coarse-grained models are computationally faster than all-atommodels. One type of coarse-graining is the united atom model (for example the Amberunited atom model [40] and the OPLS united atom model [41]). These have the hydrogenatoms included with the heavy atom they are bonded to, but each heavy atom is representedby a single interaction site. They are faster than all-atom models but not as fast as somecoarser models. At the other end of the coarseness scale are models that use one or two beadsfor each amino acid. These models like the Go model [42, 43] are usually parameterized toreproduce the correct folding for a particular protein and are difficult to transfer to proteinswith different structures.1.4 GoalsWe are hoping to find ways in which any one (or more) of the monomers could be selectivelyunfolded and/or dissociated from the rest of the complex. This might then allow for therelative binding strengths to be determined providing information about the structure ofthe complex. The goal of this is to provide some qualitative insights, not quantitativeinformation.The goals of this work are to use coarse-grained force fields and charge moving. Thecoarse-graining has two main advantages, it reduces the computational time and it makesthe charge hopping easier (because there are no partial charges and moving the charge doesnot require changing the bonding connectivity). Using coarse-graining will increase thesimulation time scales that can be reached for a given amount of computational time. Thisis important because while bond vibrations take place on femtosecond timescales, proteinmotions that involve higher order structure can take from picoseconds to minutes[30]. Also,protein complexes have many possible conformations, so good sampling is required.Simulations of proton transfer in large protein systems are rare, because fixed chargesimulations are much easier than charge moving. However, we want to include chargemoving in order to better understand the role of mobile protons in the protein dynamics.8Chapter 2Methods2.1 Molecular DynamicsMolecular dynamics (MD) studies the motion of molecules over time. Using a potentialenergy function, the forces on each atom are calculated. From these forces the velocitiescan be calculated using Newton\u00E2\u0080\u0099s law and the positions at a later time can be determined.This is repeated for many time steps. One benefit of molecular dynamics is that it can showfluctuations not just a static picture. Time dependent properties can be studied, or averagesover long times can be used to calculate properties using statistical mechanics tools.In classical MD, Newton\u00E2\u0080\u0099s equation of motion must be solved for each particle, that isF = mdvdt= md2rdt2, (2.1)where F is the force, m is the mass, v is the velocity, r is the position, and t is time.The Verlet algorithm is derived from Taylor expansions of the particle\u00E2\u0080\u0099s coordinates[44].The new position of the particle isr(t+ \u00E2\u0088\u0086t) \u00E2\u0089\u0088 2r(t) \u00E2\u0088\u0092 r(t\u00E2\u0088\u0092 \u00E2\u0088\u0086t) +f(t)m\u00E2\u0088\u0086t2, (2.2)and the velocity isv(t) \u00E2\u0089\u0088r(t+ \u00E2\u0088\u0086t) \u00E2\u0088\u0092 r(t\u00E2\u0088\u0092 \u00E2\u0088\u0086t)2\u00E2\u0088\u0086t. (2.3)From the Verlet algorithm the equivalent Leap Frog algorithm can be obtained by defin-ing the velocities at half integer steps asv(t\u00E2\u0088\u0092 \u00E2\u0088\u0086t2 ) \u00E2\u0089\u00A1r(t) \u00E2\u0088\u0092 r(t\u00E2\u0088\u0092 \u00E2\u0088\u0086t)\u00E2\u0088\u0086t, (2.4)andv(t+ \u00E2\u0088\u0086t2 ) \u00E2\u0089\u00A1r(t+ \u00E2\u0088\u0086t) \u00E2\u0088\u0092 r(t)\u00E2\u0088\u0086t. (2.5)Rearranging Eq. (2.5) gives the new position asr(t+ \u00E2\u0088\u0086t) = r(t) + v(t+ \u00E2\u0088\u0086t2 )\u00E2\u0088\u0086t. (2.6)9Subtracting Eq. (2.4) from Eq. (2.5) and using Eq. (2.2) gives the velocity at time t+ \u00E2\u0088\u0086t2asv(t+ \u00E2\u0088\u0086t2 ) = v(t\u00E2\u0088\u0092\u00E2\u0088\u0086t2 ) +f(t)m\u00E2\u0088\u0086t. (2.7)The sets of functions and parameters used to calculate the forces acting on an atom (orcoarse-grained bead) are called force fields. The equations for the force fields describe thepotential energy for different types of interactions between atoms, and the parameters areused to tailor the calculations for specific atoms. The force on an atom is the derivative of thepotential energy. Many force fields use harmonic oscillator forms for bonding interactions,that isV (r) =12k(r \u00E2\u0088\u0092 r0)2, (2.8)where V (r) is the potential, k is the force constant for the bond stretching, r is the bondlength, and r0 is the equilibrium bond length. Lennard-Jones (LJ) potentials are often usedfor non-bonding (long range) interactions. They have both repulsive and attractive termsand produce a well shaped potential using the functional formVLJ(r) = 4\u00CE\u00B5ij((\u00CF\u0083r)12\u00E2\u0088\u0092(\u00CF\u0083r)6), (2.9)where \u00CE\u00B5ij is the well depth, \u00CF\u0083 is the cross section diameter, and r is the distance betweentwo atoms. The value of \u00CE\u00B5ij depends on the interacting particle types and in the MARTINIforce field it has values between 2.0 kJ/mol and 5.6 kJ/mol (higher values for more stronglyinteracting beads)[45]. The Coulomb potential is used for charges, that isVCoulomb(r) =qiqj4pi\u000F0\u000Frr, (2.10)where r is the distance separating two atoms, qi is the charge on atom i, \u000F0 is the dielectricconstant for a vacuum, and \u000Fr is the relative dielectric constant. The forces are calculatedfor every pair of atoms and added to determine the total force on each atom.Energy minimization is often performed before molecular dynamics in order to removeany situations where atoms/beads are too close together. These bad contacts produce verylarge forces which can make the molecular dynamics trajectories unstable. Because thesystem is able to relax during molecular dynamics steps, reaching the absolute minimumis not necessary as long as there are no bad contacts in the initial structure for the MDrun. Energy minimization is a process for finding a low energy structure by taking thepotential energy function and performing a mathematical minimization with respect tothe atom positions. These minimizations tend to find a local minimum not the globalminimum, and rarely move far from the initial conformation[46]. There are three different10energy minimization algorithms which GROMACS can perform: steepest descent, conjugategradient, and limited-memory BFGS[47]. Steepest descent is the simplest method. It movesatoms in the direction of the force with step-sizes which change at each iteration. Theconjugate gradient method uses information from the previous step to adjust the change inthe current iteration. L-BFGS is a quasi-Newton method[48].For many large systems such as protein complexes, there is not a single correct con-figuration. Rather an ensemble of low energy configurations all contribute to the resultsmeasured. Different trajectories can be started from the same geometry and input pa-rameters, but with different initial velocities. It is important to average the results frommany trajectories in order to get proper sampling, because a single trajectory may becometrapped in a local minimum and miss configurations that are important.2.2 Input ParametersWithin the molecular dynamics software there are a number of parameters the user selectsto control the simulation. The important ones discussed here are time step, electrostatics,thermostats, and constraints.The distance an atom or bead moves in a single step is the product of the velocity andthe time step. If the time step is too large, the particles move large distances which canproduce large forces and make the dynamics unstable. However, the smaller the time stepthe more steps are needed to reach a particular simulation time. Thus the general preferenceis to use the largest time step for which molecular dynamics trajectories are stable. Forall-atom simulations this is usually around 1 fs, but for coarse-grained simulations the timestep can go much larger (20-40 fs in some cases).Because all the systems studied here are isolated gas-phase molecules without solvent,no periodic boundary conditions were used. The absence of solvent also means that chargesand intermolecular interactions are unattenuated, so for the Lennard-Jones and Coulombelectrostatics no cutoffs or shifts were employed. Thus, the electrostatic energy calculationsare performed for all pairs of beads. The default value of the relative dielectric constantused with MARTINI accounts in part for the effect of solvent. In this study, chargesresiding on dissociating fragments act through empty space, for which a relative dielectricconstant of unity is physically appropriate. Higher values are used in solution simulationsto account for the screening effect of the solvent. In principle, charges within a monomerinteract through the protein itself and would require a relative dielectric constant differentfrom unity. However, as the protein conformation changes, the relative balance betweenthrough-space and through-protein interactions changes in a dynamic manner. Because thethrough-space contributions are most important for describing correctly the dissociation11process, the relative dielectric constant was set to unity for all simulations.The simulation temperature is calculated from the kinetic energies of all the particlesin the system usingT =2EkinkBf, (2.11)where Ekin is the average kinetic energy, kB is the Boltzmann constant, and f is the numberof degrees of freedom. For the isolated molecule both center of mass and angular motionare removed (of the whole system), so the number of degrees of freedom is 3N \u00E2\u0088\u0092 6 (N isthe number of beads). Thermostats modify the velocities of the particles to maintain aconstant temperature. The simplest type of thermostat is velocity rescaling where at eachtime step each particle has its velocity multiplied by the factor (Tdesired/T )1/2 to producethe desired temperature. However, this does not account for the fluctuations present in aphysical system. The two most common thermostats are the Andersen and Nose-Hooverthermostats[44]. The Nose\u00C2\u00B4-Hoover thermostat couples the system to a heat bath so thatthe temperature can fluctuate around the desired temperature[49\u00E2\u0080\u009351]. This is done byextending the system with an additional coordinate s, which provides a constraint on thesystem so that the average temperature remains constant. The coupling constant controlsthe time period of temperature fluctuations (usually on the order of 10 times the time step).The temperature in all of these simulations was controlled by a Nose\u00C2\u00B4-Hoover thermostatusing a coupling time constant of 0.3 ps.Constraints are used to keep particular bond lengths fixed. After the positions of theatoms are updated in an unconstrained move, a constraint algorithm is used to calculate theadditional force needed to keep the bond lengths at the values imposed by the constraints.The MARTINI force field adaptation for proteins uses constraints for the ring structuresin some amino acid sidechains to prevent them from flipping/inverting in ways that thereal (or all-atom) sidechains would not. Lagrange multipliers are often used to implementconstraints. However, the resulting second order equations can be difficult to solve, thusapproximations are used to make the problem linear.The method used in this work is the LINCS (Linear Constraint Solver for MolecularSimulations)[52]. There is a set of equations gi(r) = 0, one for each of the i constraints inthe system. LINCS uses matrices for the equations. Newton\u00E2\u0080\u0099s equation with the Lagrangemultipliers in the LINCS approach of adding the multipliers to the potential, V(r), is\u00E2\u0088\u0092Md2rdt2=\u00E2\u0088\u0082\u00E2\u0088\u0082r(V \u00E2\u0088\u0092 \u00CE\u00BB \u00C2\u00B7 g) , (2.12)where M is the matrix of the masses of all the particles, and \u00CE\u00BB is the matrix of Lagrangemultipliers (one for each constraint equation). When calculating matrix inverses powerexpansions of the matrix are used. Increasing the order of these expansions improves the12accuracy. The default lincs order in GROMACS is 4, but for use with MARTINI in thegas-phase I found that higher orders produced both better energy minimization and morestable trajectories, and would recommend using lincs order of 12.SHAKE[53], another commonly used constraint algorithm, uses an iterative method.The constraint equations are solved one at a time (without matrices), but solving the laterconstraints may move atoms involved in earlier constraints. So, the procedure is repeateduntil all the constraints are satisfied.Elastic network bonds[54] can be used to connect selected backbone beads with harmonicpotentials with a force constant of 2500 kJ/(mol nm); a value much larger than those forregular bonds. These are sometimes used for coarse-grained MD (not in atomistic MD)because the bonds between backbone atoms which would keep the secondary structure inplace are not included. These network bonds are usually located in parts of the proteinwhere there are four or more amino acids which have an extended secondary structure (theyare not part of a helix or beta-sheet). Topologies can be created with or without elasticnetwork bonds. These bonds, when made between flexible parts of a protein, increasethe stability of the protein thereby allowing a larger time step to be used. Combiningelastic network bonds with the MARTINI force field allows for faster simulations than theMARTINI force field alone. The effect of these bonds on the results will be considered inChapter 3 and all the simulations of later chapters include local elastic network bonds tostabilize flexible parts of the structure.2.3 Properties for Analysis of TrajectoriesTrajectories are analyzed using energy and structural properties. The structural propertiescalculated for this work are discussed below. Several different energy quantities can becalculated from the positions and velocities including potential, Coulomb, Lennard-Jones,and kinetic energies. The simulation temperature is derived from the average kinetic energyof all the beads. The bond, angle, and dihedral angle contributions to the total energy arevery small compared to the Lennard-Jones and Coulomb energies, so they are not discussed.All energies are reported in kJ/mol, and all distances in nm.The radius of gyration (Rg),Rg =(\u00E2\u0088\u0091i r2imi\u00E2\u0088\u0091imi) 12, (2.13)where mi and ri are the mass and position of atom i, is used to indicate the size of a protein.A small compact protein has a small radius of gyration. If the protein becomes larger orthe shape changes to a more elongated shape the radius of gyration will increase. The more13a protein expands or unfolds the larger the radius of gyration becomes.Another parameter that can be used to measure changes in the proteins is the root meansquare deviation (RMSD),RMSD =(1M\u00E2\u0088\u0091imi (ri \u00E2\u0088\u0092 ri(reference))2) 12(2.14)where M is the total mass, and mi and ri are the mass and position of atom i. RMSD mea-sures the difference between the positions of atoms in the protein and a reference structure.Another property that is measured is the minimum distance. This is the smallest dis-tance between the atoms on one monomer and atoms on the other monomer. This minimumseparation was calculated using the GROMACS g mindist utility. When the minimum sep-aration is small the monomers have not dissociated, but after dissociation the minimumseparation increases. Maximum distances between different beads or groups can be mea-sured as a way of monitoring unraveling in the complex.2.4 Software and Force Field Particular to This WorkGROMACS[55\u00E2\u0080\u009357] is a readily available software package that can be used to performmolecular dynamics simulations. All of the molecular dynamics simulations in this workwere performed using GROMACS. GROMACS has the capability for parallel computing,but here it was only used in serial (one processor per trajectory). Because the systemsstudied here are treated as single molecules, the distributed nature of the parallel versionof GROMACS is not useful. Trying to divide a single molecule into different domains doesnot result in any computational savings. The hopping algorithm has been coded to workwith the serial calculations, but there is no reason why it could not be adapted for parallelcomputing. GROMACS is available as open source software, so the code can be modified.It includes pull code that can be used to add a force to the system pulling groups of atomsapart.To run the simulations, three kinds of input are needed. First is a starting geometry(positions and names of all the atoms). Second is the topology which describes the bondingand connectivity of the atoms. This topology includes all the parameters needed to calculatethe potential energy such as bond force constants and Lennard-Jones parameters. The finalinput contains the molecular dynamics parameters (time step, number of steps, desiredtemperature, etc.). GROMACS has a preprocessing program (grompp) which turns all theinput files into a single file that is formatted for the main program (mdrun) to use.Figure 2.1 shows a skeletal outline of how the molecular dynamics part of the GROMACSprogram is organized (not every part of the code is shown, but it gives an overview of how14it works). The main program calls mdrunner, which initializes some of the data structuresand then makes a call to the integrator. do md is the integrator for molecular dynamics.There are others for stochastic and Brownian dynamics and for energy minimization (eventhough energy minimization is not an integration algorithm it takes the same place in thecode). Within the do md function are first some steps to set things up, and then a loopover the molecular dynamic time steps. During this loop, the forces are calculated and thevelocities and coordinates of each bead are updated for each step.To add charge hopping, I modified GROMACS by adding a new subroutine withinthe molecular dynamics loop. The position of this subroutine relative to the rest of theprogram is shown in Fig. 2.1 where the new function is highlighted in pink. Also, severalnew parameters were added to the molecular dynamics input (in the .mdp file) to controlthe hopping routine. These are mobile charge (yes or no), mobile cutoff, mobile prob,mobile freq, and mobile seed (their use is discussed more in the next section).The particular force field used in this work is MARTINI[45, 58]. It is a coarse-grainedforce field in which each bead represents four heavy atoms plus the hydrogens bonded tothem. It was originally created to model lipids, but has been extended to proteins. Eachamino acid is represented by between one and four beads. It has been parameterized bycomparing the results of simulations to thermodynamic properties such as free energies ofhydration, vaporization, and water/organic solvent partitioning.Baron and coworkers[59\u00E2\u0080\u009361] compared thermodynamic properties from all-atom andcoarse-grained simulations. The coarse-grained model they used was an older version ofMARTINI. They found the coarse-grained systems had a lower entropy than the corre-sponding all-atom systems, but the major structural features were similar. Monticelli etal. then extended the MARTINI force field to include proteins[45]. It was successfully usedin several studies[62\u00E2\u0080\u009365]. Simulations of a series of peptides at an interface were used tocalculate partitioning free energies which were compared to experimental data[66]. Milaniand coworkers used a modified MARTINI force field to study isolated nanofibers[67]. Theirresults were consistent with experimental results.The MARTINI website (http://md.chem.rug.nl/cgmartini/) has files containing the pa-rameters of the force field which are available for downloading. They also have scripts toconvert a Protein Data Bank format structure to the coarse grained structure (atom2cg) andto create a MARTINI topology file (seq2itp). The seq2itp script places positive charges onall the basic amino acid residues and negative charges on all the acidic amino acid residues.For this work, I modified seq2itp so that the default is for all the amino acid residues to beneutral. Positive charges were then added to selected basic amino acid residues to createthe desired charge distribution. No partial charges are used, and the beads charged by theseq2itp script are the MARTINI Qd type. In every positively charged monomer the charges15init_mdatoms() : initializes data structuremain()mdrunner()do_md()Loop over MD stepsatoms2md() : get initial values for molecular dynamicsmobile_charge() : the new partupdate() : changes velocities and coordinatesdo_force() : calculates forcesInput: filename.tprOutputFigure 2.1: A skeletal diagram of the GROMACS molecular dynamics program.16are distributed among the basic amino acid side chains (arginine, lysine and histidine) andthe N-terminus.Two protein complexes have been studied here, cytochrome c\u00E2\u0080\u00B2 and transthyretin. Cy-tochrome c\u00E2\u0080\u00B2 (Protein Data Bank (PDB) ID: 1bbh)[68] is a homodimer with a total of 262amino acid residues of which 30 are basic. The dimer was used in simulations withoutcharge hopping. The results of this are discussed in Chapter 3 where they are comparedto previous all-atom simulations[1, 35, 36]. Transthyretin (PDB ID: 3grg)[69] is a homote-tramer with a total of 460 amino acid residues of which 56 are basic. The tetramer wasused in the charge hopping study (results discussed in Chapters 4 and 5). The monomerswithin the tetramer are labeled as chains A, B, C, and D, and the same number of chargesis placed on each monomer when the initial charge configurations are being generated.For each given charge partitioning and configuration, charges were added to the groundstate crystallographic PDB structure. The structures were first minimized using the steep-est descent method, and then initialized with velocities randomly chosen consistent withthe temperature. Short molecular dynamics simulations were performed to relax the struc-tures to the equilibrium state. These relaxed structures were used as initial conditions forthe simulations from which data were collected. For each case (different temperatures orparameters), 10 trajectories were run with each of 10 different charge configurations for atotal of 100 trajectories. The results reported are the averages of the sets of 100 trajectoriesunless specified otherwise.2.5 Moving ChargesThe best description of charge hopping uses quantum mechanics. However, that is difficultand computationally expensive, so a simpler method is developed here. The scheme for thecharge moving algorithm I created is shown in Fig. 2.2 and the source code is in AppendixA.Overall, a charge hopping algorithm must have sites at which charges reside and criteriaby which charges are moved. Since most mass spectrometry studies of protein complexes useelectrospray ionization in positive ion mode, positive charges are sufficient for the currentstudy. Thus, a positive charge can hop from one basic site to an uncharged basic site. Allother amino acid residues remain neutral.Two criteria for moving charges are employed. The first is a simple distance cutoff. Ifthe distance between the charge donor and charge acceptor sites is less than rcutoff thena hop is possible, and the energy will be considered. If the distance is larger than thecutoff then the move will not happen. The second criteria depends upon energy and uses aMonte Carlo like scheme to determine hopping probability. This change in energy for a hop17int_mdao_sn(ntint_mdao_)n :lzm_enrmum cnum cnum cni:(s_(mdtm L_daam)LntpvdzaMzdLm_)tnldl:z:LD_n2_on)_g)f_d(s_d_td(sne_(Melmt_lmLymm(_b_d(s_hpvdzaMzdLm_aod(wm_:(_vnMznel_I_l:(s:(w_m(mtwDp.ss_sn(nt_O_daam)Lnt_)d:t_Ln_z: L_n2_)n :lzm_enrm pi:(s_aodtwms_ :Lm _gsn(nt f_d(s_M(aodtwms_ld :a_ :Lm _gdaam)Lnt fp\u00E2\u008C\u008C_(mdtm L_daam)Lnt_azn mt_Lod(_aMLn22_s: Ld(am?\u00E2\u008C\u008C_ \u00E2\u0088\u0086?_zm _Lod(_?mtn?\u00E2\u008C\u008C_td(sne_(Melmt_zm _Lod(_)??n(?L_on)?n)\u00E2\u00A4\u0082)?tn)dwdLm_Ltd?maLntDFigure 2.2: A flowchart showing the hopping algorithm.18is a combination of the difference in Coulomb energy between the final and initial chargeconfigurations and the difference in hydrogen binding energies for the acceptor and donorresidue types.In MARTINI, no partial charges are used, so it is easy to change the charges on theside chain beads of the amino acids when a hop occurs, since this involves simply increasingor decreasing the total charge on a single bead. In particular, no structural changes arerequired to change the charge state, so moving charges does not create any geometricalproblems, like bad contacts. More technical details of the algorithm are given below.To organize sites, two lists are created. One is a list of all the sites containing charges.These sites act as charge donors for the hops. The other is a list of the uncharged basicsites which are available as charge acceptors.The hopping algorithm examines the structure of the complex for possible hops beforethe MD algorithm propagates it to the next time step. This is seen in the first black boxof Fig. 2.2. First, for each element in the list of charged sites, the empty basic site nearestto it is found. If the distance to that site is less than rcutoff the pair is added to a listof possible hops. The value of rcutoff was set to 0.57 nm. This value was chosen becauseit is the diameter of a MARTINI coarse-grained bead plus 1 A\u00CB\u009A. In practice, the value ofrcutoff tunes the hopping rate. The larger rcutoff , the greater the number of donor/acceptorencounters and hence the greater probability a charge hop will occur. However, having toolarge a value of rcutoff simulates proton transfer over larger distances, which at some pointis hard to physically justify. A value was chosen small enough to consider the transfer tobe localized but not so small as to decrease too severely the hopping rate. Results werealso calculated for rcutoff = 2.0 nm and while the hopping rates did vary (as alluded toabove) the behavior of the protein structure did not. In other words, the results are notparticularly sensitive to the value of rcutoff . In an atomistic model the angles between theproton donor and acceptor would affect the probability of proton transfer. However, in thecoarse-grained model this structural detail is lost, so the angles between charge donor andacceptor sites are not considered.Second, for each pair of possible hops, the change in energy that would result fromthe hop is calculated. This part of the hopping decision is seen in the second black boxof Fig. 2.2. The Coulomb and proton binding energies are considered for this calculation,but not the Lennard-Jones energy because during the hop the coarse-grained beads donot move. The proton binding energy is the difference in energy between the neutral andprotonated forms of the amino acid. Because this energy is not included in the MARTINIforce field it must be explicitly included in the energy calculation. Table 2.1 lists the valuesof the proton binding energies for the basic sites used in this study. Ab initio calculationsof the energies of bare protonated and non-protonated amino acids were used to calculate19Table 2.1: Proton binding energy values for basic amino acid sites[1].Site \u00E2\u0088\u0086Ebinding (kJ/mol)Arginine -1028.8Histidine -957.7Lysine -937.2N-terminus -959.8the binding energies[1]. These calculations do not consider the environment of the aminoacid, but in the energy calculation the electrostatic environment is accounted for by theCoulomb energy contribution. The change in energy is calculated using\u00E2\u0088\u0086E = Efinal \u00E2\u0088\u0092 Einitial. (2.15)The energy of a particular configuration isE =N\u00E2\u0088\u0091i=1Ebinding,i +\u00E2\u0088\u0091j>ie24pi\u000Frij, (2.16)where Ebinding,i is the proton binding energy of the charged site i, \u000F is the dielectric constantin a vacuum, and rij is the distance between the centers of charged sites i and j. A relativedielectric constant of 1 is used in these calculations. The value of Einitial is the energy E ofthe current configuration while that of Efinal is the energy of the configuration that wouldresult if the charge were moved from the donor to the acceptor site. All possible hoppingpairs are sorted by their \u00E2\u0088\u0086E values from lowest to highest.Starting with the one of lowest \u00E2\u0088\u0086E, each possible hop is considered in turn. If \u00E2\u0088\u0086E \u00E2\u0089\u00A4 0,the charge is moved from the donor to the acceptor. If \u00E2\u0088\u0086E > 0 then the probability of ahop occurring, p, is calculated using the Monte Carlo equation[44]p = exp[\u00E2\u0088\u0092\u00CE\u00B1\u00E2\u0088\u0086EkT], (2.17)where \u00CE\u00B1 is an arbitrary scaling factor, k is the Boltzmann constant, and T is the tempera-ture. The factor \u00CE\u00B1 is an adjustable parameter that allows the hopping rate to be increasedor decreased without changing the temperature for the dynamics of the protein. As will beshown below, it will become necessary to increase the hopping rates in this way in orderto make the timescales for charge hopping events smaller. At the same time, one does notwish to increase the hopping rates by raising the temperature greatly, since this will directlyaffect protein dynamics. By adjusting \u00CE\u00B1, the hopping rates can be manipulated indepen-20dently of the temperature. The value of p is compared to a random number between 0 and1. Only if the random number is less than p is the charge moved from donor to acceptor.When a hop occurs, the charge in the GROMACS mdatoms record for the donor site isdecreased by one and the charge for the acceptor site is increased by one. The bead indexesin the lists of charged and uncharged basic sites are then swapped. The bead which hada charge of +1 before the hop is given a charge of 0 and is moved to the list of unchargedbasic sites, and what had been the corresponding uncharged basic site is given a charge of+1 and is put in the list of charged sites. In the MARTINI force field, charged beads havedifferent Lennard-Jones parameters than uncharged beads. However, in the implementationof the algorithm used in this study this difference was not taken into account, so the beadtypes were left unchanged. The total charge state of the system does not change. Aftereach hop the values of \u00E2\u0088\u0086E are recalculated for all the remaining possible hopping pairs andthe hopping pair list is resorted.Five options have been added to the GROMACS mdp parameter file. To select movingcharges the parameter mobile charge is set to yes (the default is no). The largest distancebetween beads over which a charge can move is rcutoff (called mobile cutoff in the mdp fileand given a default value of 0.57 nm). The mobile prob parameter is the scaling factor \u00CE\u00B1(the default value is 1.0). The seed for the random number generator is set using mobile seed(the default value of -1 uses time-of-day and process id to generate the seed).If the mobile charge algorithm is executed at every time step, one would expect to seea lot of charges flip-flopping between two basic sites, but this is not physically correct.Therefore the protein, 10 MD steps are performed after each execution of the mobile chargealgorithm to allow the protein to adapt to the charge configuration before more hops areconsidered. The mobile freq parameter is used to control the number of MD steps performedbetween each hopping event.The algorithm generates two output files (in addition to the standard GROMACS out-put). One lists the charged sites and the other gives information about all the hops consid-ered each time the algorithm is executed. The time, temperature, number of hops consid-ered, and the number of successful hops are given. As well, the charged donor site index,uncharged acceptor site index, \u00E2\u0088\u0086E, p, and whether the hop was successful are listed foreach move considered. The computational overhead of this algorithm is small (discussed inChapter 4).21Chapter 3Suitability of the MARTINI ForceField For Use With Gas-PhaseProtein Complexes3.1 IntroductionSince the MARTINI force field was parameterized with molecules in solution it is necessaryto validate the force field for use in the gas-phase. The goal of this chapter is to use theMARTINI force field to replicate the all-atom studies performed with the cytochrome c\u00E2\u0080\u00B2dimer[1, 35, 36], compare the results, and thereby assess MARTINI\u00E2\u0080\u0099s utility for furthergas-phase simulations.Experimental evidence for protein unfolding has already been presented by Williamsand coworkers [7]. The difficulty with all-atom simulations is that they are not practical forsimulating long times or large systems. We would like to simulate protein complexes largerthan dimers, and be able to include charge hopping.3.2 MethodThe system studied here is the cytochrome c\u00E2\u0080\u00B2 dimer. Three partitionings of charges be-tween the two monomers were considered. The first are neutral complexes with no charges.The second have a M5/M5 charge partitioning, being symmetric with five charges on eachmonomer for a total charge of +10, and the third have a M8/M2 charge partitioning, beingasymmetric with eight charges on one monomer and two charges on the other for a totalcharge of +10. Figure 3.1 shows the positions of the basic amino acids available for thepositive charges. All the charges are fixed. The same ten charge distributions are used foreach of the charge partitionings as were used by Wanasundara and Thachuk [35]. Theywere chosen as the lowest energy configurations based upon short molecular dynamics re-laxations. For details of the screening method see Ref. 35. The residue numbers of thecharged amino acids are listed in Table 3.1 for each of the 20 charge distributions (residue1 is the N-terminus).22Figure 3.1: Snapshot of a ground state monomer showing the available basic amino acidresidues (and the N-terminus) in black, and all other amino acids in green.23Table 3.1: The residue numbers of the charged amino acids on each monomer for the tenM8/M2 and ten M5/M5 charge distributions.M8 M2c1 1 25 27 70 72 74 89 111 64 74c2 1 25 27 70 72 74 89 111 72 89c3 1 25 27 70 72 74 89 111 74 89c4 1 25 27 70 72 74 89 111 74 127c5 1 25 27 70 72 74 89 111 89 127c6 1 25 27 70 72 74 89 111 89 131c7 1 25 27 64 74 89 111 129 89 131c8 1 25 27 64 70 72 89 111 72 86c9 1 25 27 64 70 74 89 111 89 129c10 1 25 64 70 72 74 89 111 72 86M5 M5c1 25 64 72 89 111 25 64 72 89 111c2 25 64 72 89 111 25 64 74 89 111c3 25 64 72 89 111 25 64 89 111 131c4 25 64 72 89 111 25 72 89 111 127c5 25 64 72 89 111 64 72 89 111 127c6 25 70 74 89 111 64 74 89 111 129c7 25 70 74 89 111 25 64 89 111 129c8 25 70 74 89 111 25 64 72 89 111c9 25 64 70 89 111 25 64 72 89 111c10 25 64 74 89 111 64 72 74 89 11124The reaction coordinate used for the dissociation of the dimer is the distance betweenthe centers of mass of each of the monomers. The center of mass (COM) distance for theground state dimer in the MARTINI force field is 1.4 nm and in the all-atom study, whichemployed the OPLS-AA/L force field, the ground state COM distance was 2 nm [35].In addition to regularly bonded structures, trajectories were also propagated for thesesame structures supplemented with local elastic network bonds. The effect of these bondson the results will be considered.In the present case, 20 local elastic network bonds were created in the regions of themonomer with extended structure. The atoms involved in the elastic network bonds areshown in Fig. 3.2. Each monomer has two regions containing elastic network bonds but eachsuch bond connects atoms only within its region. That is, no elastic network bonds connectatoms between the two regions. For complexes with elastic network bonds it was possible touse a time step of 30 fs while the same time step caused instabilities when the elastic bondnetwork was removed. In the latter case, trajectories would propagate without incident forshort times but for longer times would reach a point where their energies and temperatureswould start to grow uncontrollably. This was traced to a fast motion occurring within theextended structure in which the elastic network was removed. At some point, an eventoccurs whereby two beads approach too closely (since the time step of 30 fs is too large tocorrectly account for the fast motion), a bad contact occurs and the energy increases toorapidly for the thermostat to compensate. This event is rare but for long simulation timesit would occur in approximately 30% of the trajectories. Thus, elastic networks effectivelyremove fast motions arising in the extended structure regions. It is possible these fastmotions are a difficulty only for gas-phase calculations where no solvent is present to dampthem. To avoid the problem the time step was reduced to 2 fs, which decreased significantlythe computational speed of the simulations when no elastic bond networks were included.Three different types of trajectories were propagated - continuous pulling, stepwisepulling, and umbrella sampling. Then results of all three methods can be compared. Thetemperature was kept constant at 300K for all trajectories.GROMACS parameters were set to produce simulations with continuous pulling. Thatis, the SHAKE algorithm [53] was used to constrain the center of mass distance to a 0.000001nm tolerance, and this distance was increased by a constant amount at each time step. Theresulting pull force needed to maintain each center of mass distance was also collected forlater analysis. Continuous pulling simulations were run for a total time of 660 ns using apull rate of 0.00001 nm/ps. The pull forces are more sensitive than structural parametersand it was necessary to investigate their convergence, as detailed below.The convergence of the pull force results was gauged by comparing the predictions froma number of different methods. More specifically, for the M5/M5 charge distribution, three25inFigure 3.2: Snapshots of the ground state dimer with one monomer shown in yellow and theother in green. (A) From the all-atom structure. (B) A stick model of the coarse-grainedstructure showing the atoms involved in elastic network bonds in black.26simulations were performed to calculate the pull force. The pull force is the amount offorce needed to keep the monomer at a particular COM distance. The integral of the pullforce gives the potential of mean force. The first case used the constraint forces from thecontinuous pulling simulations, as detailed above. Simulations with a faster pull rate of0.001 nm/ps were also performed for comparison purposes.The second case used a stepwise pull method. That is, starting from the ground state,the COM distance was increased in steps of 0.15 nm by pulling at a rate of 0.01 nm/psfor 15 ps. Each such pull was followed by a 600 ps simulation during which the center ofmass was constrained at a fixed value using the SHAKE algorithm with a 0.000001 nmtolerance. The resulting constraint forces from the last 400 ps (the first 200 ps was allowedfor relaxation to occur) of these simulations were time averaged to produce a single forcevalue at that COM distance. This was repeated for 44 steps until the COM distance reached7.88 nm. In addition to the pull force, the radius of gyration and the RMSD were calculatedfor the stepwise pull. This method was chosen because the all-atom studies [1, 35, 36]used a stepwise pull method, although the continuous pull is closer to the dissociation anexperimental protein experiences.The third case used simulations employing umbrella sampling [70]. An umbrella po-tential force constant of 3300 kJ/(mol nm2) was used for simulations at a series of COMdistances spanning from the ground state dimer distance to 8.4 nm. Each simulation wasrun for 1800 ps with the last 1600 ps being used for data collection. The resulting his-tograms were analyzed with the WHAM algorithm as implemented in the GROMACSg wham utility program [71]. The pull force was then calculated by taking the derivativeof the WHAM-generated potential of mean force curve as a function of COM distance.3.3 Results and DiscussionThe pull force for the M5/M5 charge partitioning, shown in Fig. 3.3, was calculated by threedifferent methods in order to gauge its convergence. All three methods give the same valuesfor COM distances larger than 7 nm where the pull forces attain a slightly negative value asa result of the Coulomb repulsion between the separated monomers. However, differencesappear at smaller COM distances, especially for COM distances between 2 and 3 nm. Herethe continuous pull method with the faster pull rate of 0.001 nm/ps produced the largestpull forces while the umbrella method produced the smallest ones. This indicates there islikely some irreversible work present in this continuous pull calculation possibly because therate of pulling was too large in this COM distance range. This is supported by the fact thatthe stepwise pull calculation produced pull forces closer to the umbrella results. However,the continuous pull with the slower pull rate of 0.00001 nm/ps produced forces much closer271 2 3 4 5 6 7 8 9 10Center of Mass Distance (nm)-200-1000100200300400500600700800Pull Force (kJ/mol*nm)M5/M5 all atomM5/M5 elastic fastM5/M5 elastic slowM5/M5 elastic stepwiseM5/M5 elastic umbrellaM5/M5 no elastic fastFigure 3.3: The pull force as a function of center of mass distance for the M5/M5 dimers.The black dotted line is the all-atom pull force. The dashed line is the dimer withoutelastic network bonds. The solid lines are the dimers with elastic network bonds; the fastcontinuous pull, slow continuous pull, stepwise pull and umbrella sampling results are red,brown, green, and magenta respectively.to those of the stepwise pull. This slower continuous pull was then used for calculating thestructural parameters discussed later. The pull forces from the all-atom study are similarin peak magnitude to the umbrella results, but at distances larger than 5 nm the all-atompull force is more negative than the coarse-grained pull force. As discussed in Ref. 35, theall-atom pull force results contain some contributions from irreversible work which havenot been removed from the values plotted in Fig. 3.3. Thus, the correct all-atom result isexpected to be less than the curve shown. If one takes the umbrella result to be the best onefor the coarse-grained calculation, then the pull forces in this case are generally somewhatlarger than those corresponding to the all-atom study.The pull force calculated with the same methods (except not with the umbrella sampling)28for the M8/M2 dimer can be seen in Fig. 3.4. As with the M5/M5 dimer, the pull forcefor the slower (0.00001 nm/ps) pull rate is lower than for the 0.001 nm/ps pull rate. Thestepwise pull produces similar forces to the slower continuous pull. Unlike the pull force forthe M5/M5 dimer, the force for the M8/M2 dimer remains positive for COM distances upto 15 nm. The all-atom results are similar to the coarse-grained slower pull until a COMdistance of 4.5 nm. Trajectories which have dissociated have a very small pull force, whileunfolding trajectories which are not dissociated require a significant pull force to move tolarger COM distance. The percent of trajectories which have dissociated is discussed in thefollowing paragraphs and the unfolding of the M8 monomers can be seen in the radius ofgyration (Fig. 3.7). The pull force reported here is the average over both the dissociatedand unfolding trajectories, so for COM distances larger than 6 nm the higher pull force forthe all-atom method suggests that the all-atom simulations have a smaller percentage ofdissociated dimers implying that there is more unfolding in the all-atom simulations.Consider now the percentage of dissociated trajectories as a function of COM distance,as shown in Fig. 3.5. The ground state cytochrome c\u00E2\u0080\u00B2 dimer has a minimum distance ofapproximately 0.43 nm. Using the minimum separation it is possible to determine theproportion of the trajectories that have dissociated. When the minimum distance for a par-ticular trajectory was greater than 0.8 nm, that trajectory was considered to be dissociated.The neutral dimers all dissociate at a similar distance. Between 5.5 nm and 6.0 nm the per-cent dissociated changes rapidly from 0 to 100 percent. Neutral structures without elasticbond networks tend to dissociate at slightly smaller distances compared with those contain-ing elastic bond networks but the difference is quite small. The M5/M5 complexes begindissociating at approximately the same COM distance as the neutral ones, but the rate ofdissociation with COM distance is slightly less than for the neutral complexes. However,in both cases, the steep rise in the curves indicates that the resulting complexes dissociatequickly and with little structural changes.Contrasting this is the dissociation behavior for the M8/M2 complexes. These complexesbegin dissociating at 6 nm and show a strong increase up to a COM distance of 8 nm atwhich approximately 50% of the complexes are dissociated. However, after this point, therate of dissociation drops significantly so that at 14 nm only 80% of the M8/M2 complexesare dissociated. This implies there are two different behaviors in these complexes. Thefirst, which occurs about 50% of the time, involves dissociation with small but significantstructural changes. Complexes exhibiting this behavior have dissociated by the time theirCOM distances have reached 8 nm. The second, which occurs about 50% of the time,involves dissociation with large structural changes. Complexes exhibiting this behaviorlikely undergo significant amounts of unfolding during the dissociation process, and thuscan attain large COM distances before dissociating. Figure 3.5 also shows that while there291 2 3 4 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)0100200300400500600700800Pull Force (kJ/mol*nm)M8/M2 all atomM8/M2 elastic fastM8/M2 elastic slowM8/M2 elastic stepwiseM8/M2 no elastic fastM8/M2 no elastic slowFigure 3.4: The pull force as a function of center of mass distance for the M8/M2 dimers.The lines have the same meanings as in Fig. 3.3.304 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)020406080100Percent Dissociatedm5m5 stepwisem5m5 slow elasticm8m2 slow elasticm8m2 stepwiseneutral slow elasticFigure 3.5: The percentage of the 100 trajectories that have dissociated as a function ofcenter of mass distance. The solid, dotted, and dashed lines are the M5/M5, M8/M2, andneutral (M0/M0) charge partitionings, respectively. The black and red curves representslow pulling of the complexes with and without elastic network bonds, respectively. Theblue and green lines represent the results of the stepwise pulling.31is some variation, the dissociation behavior for the M8/M2 structures with and withoutelastic bond networks is similar.The behavior seen in Fig. 3.5 can be contrasted with that seen in the correspondingall-atom simulation results, as indicated in Table 1 and Fig. 1 of Ref. 36. For the M5/M5charge partitioning, the all-atom study showed complexes dissociated over a narrow rangeof center of mass distances centered at about 5.7 nm, while the neutral ones dissociated at6.3 nm. Given that the ground state dimer COM distance was 2.0 nm in that study, theneutral and M5/M5 dimers dissociated after being stretched by approximately 4.3 nm and3.7 nm, respectively. In the present case, the neutral and M5/M5 dimers do dissociate overa narrow range of COM distances but do so at a distance of approximately 6.0 nm whichrepresents a stretch of approximately 4.6 nm. In other words, the neutral complexes in thecoarse-grained model dissociate after about the same amount of stretching as in the all-atommodel. However, the M5/M5 complexes in the coarse-grained model must be stretched toa larger distance before dissociating. The contrast is more noticeable for the M8/M2 dimerdissociation. The all-atom results, as seen in Fig. 1 of Ref. 36 show a gradual dissociationprocess and do not give evidence for different dissociation behavior at short versus longCOM distances as seen in the present Fig. 3.5.The minimum distance between monomers is expected to increase linearly with an in-crease in the COM distance. However, for small COM distances, the minimum distance,as shown in Fig. 3.6, remains small and constant. This indicates the monomers are boundtogether as a dimer even as the center of masses are pulled apart, that is the monomersstretch as the COM distance increases. At some point, dissociation occurs and the mini-mum distance begins to increase. For the neutral and M5/M5 complexes, this occurs at aCOM distance just less than 6 nm, and for the M8/M2 complexes at about 6.5 nm. Thesedistances correspond nicely with the points in Fig. 3.5 at which the dissociation curvesbegin to rise. For the neutral and M5/M5 complexes, the minimum distance line has aslope greater than 1 just after dissociation at COM distances between 6 and 7 nm. A slopegreater than one implies the minimum distance is increasing faster than the rate at whichthe COM distances are increasing, in other words, the monomers recoil into a more com-pact form just after dissociation. Afterward, at about 7.2 nm, the minimum distance linebecomes straight and the slope goes to 1 indicating that no further changes in size occur.For the M8/M2 complexes with an elastic bond network, there is a dip in the minimumdistance curve at larger COM distances. This is caused by bound complexes in which oneof the monomers is unfolding. Such trajectories have a very small minimum distance valueand this lowers the average for the ensemble. As seen in Fig. 3.5, approximately 50% of thecomplexes undergo this unfolding behavior.The all-atom study [36] did not measure the minimum distance between monomers, but320246810 M5/M5 elastic stepwiseM5/M5 elastic slowM5/M5 no elastic slowM5/M5 all-atom0246810Minimum Distance (nm)M8/M2 elastic stepwiseM8/M2 elastic slowM8/M2 no elastic slowM8/M2 all-atom4 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)0246810 M0/M0 elastic slowM0/M0 no elastic slowFigure 3.6: The minimum distance separating the two monomers as a function of centerof mass distance. The thick dashed black line has a slope of 1. The black and red curvesrepresent slow pulling of the complexes with and without elastic network bonds, respectively.The blue and green lines represent the results of the stepwise pulling and the all-atom data,respectively33instead the minimum intermolecular residual pair distance (the smallest distance betweenamino acids on the two monomers) was calculated. The minimum residual pair distancehas slopes of less than 1 after dissociation indicating structural changes continuing at largeCOM distances. Most significantly, for no cases was a slope greater than 1 observed for anyof the dissociating complexes.Values of the radius of gyration are shown in Fig. 3.7. For the neutral and M5/M5 chargepartitioning, the radius of gyration of the coarse-grained monomers remains approximatelyconstant for all COM distances. In contrast, the all-atom monomers show an increasein radius of gyration for the M5/M5 case starting at a COM distance of 6 nm, that isafter dissociation. For the M8/M2 charge partitioning in both the coarse-grained and all-atom models, the monomer with more charges has a larger increase in radius of gyrationthan the monomer with fewer charges, resulting from the unfolding of the higher chargedmonomer. The lower charged monomer has a radius of gyration value essentially the sameas seen for the neutral and M5/M5 cases. Interestingly, the increase in radius of gyrationis larger for coarse-grained M8/M2 complexes with an elastic bond network compared withthose without. However, at COM distances larger than 12 nm, the M8/M2 complexes withelastic bond networks have a curve which begins to decrease. After some investigation it wasfound that this occurs because some trajectories which were unfolding in the predissociatedstate recoiled to a more compact structure after dissociation. This type of refolding afterdissociation is not observed in the corresponding all-atom case, and is observed to a lesserextent for M8/M2 complexes without elastic network bonds. The stepwise dissociation hadslightly higher values for the radius of gyration, but showed the same decrease at COMdistances larger than 12 nm.A similar conclusion is drawn by examining the values of the RMSD shown in Fig. 3.8.The M5/M5 charge configuration shows an increase in RMSD which then levels off for COMdistances larger than 6 nm (when the structure is no longer changing). This is differentthan the all-atom data which shows the RMSD increasing more after 6 nm. However,at large COM distances the limiting values are the same for the M5/M5 all-atom andMARTINI models. In the M8/M2 charge configuration, the monomer with more positivecharges has a larger increase in RMSD. The all-atom curves initially increase at a slowerrate than the corresponding coarse-grained ones but eventually reach comparable valuesat the largest COM distances. As seen in Fig. 3.7, the coarse-grained RMSD values forthe M8 monomers begin decrease sharply at larger COM distances, again indicating thatstructures are collapsing to a more compact structure. The neutral monomers have RMSDvalues slightly smaller than the M5/M5 charge configuration, however both these chargeconfigurations show the same trend.For all the structural parameters, standard deviations were calculated from the set of34123451234Radius of Gyration (nm)all atom m8all atom m2stepwise m8stepwise m2slow m8 elasticslow m2 elastic1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)1234M5/M5M8/M2neutralFigure 3.7: The radius of gyration as a function of center of mass distance for the M5/M5(top panel) and the M8/M2 (middle panel) charge partitionings, and the neutral complex(bottom panel). In all panels, the monomer with the higher and lower charges are shownin red and black, respectively. The solid, dashed, and dotted lines are the complexes withelastic network bonds, the complexes without elastic network bonds, and the all-atom data(M5/M5 and M8/M2), respectively. The green (higher charge) and blue (lower charge)curves are the stepwise pulling results. 35123123RMSD (nm)all atom m8all atom m2stepwise m8stepwise m2slow m8 elasticslow m2 elastic1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)0123M5/M5M8/M2neutralFigure 3.8: The root mean square deviation (RMSD) as a function of center of mass distancefor the M5/M5 (top panel) and the M8/M2 (middle panel) charge partitionings, and theneutral complex (bottom panel). The colors and line types have the same meanings as inFig. 3.7.36100 trajectories at each COM distance. Because there is an ensemble of structures thatcontribute to the protein, these standard deviations give a measure of the width of thedistribution of the properties. Since all the trajectories start from a relaxation of theground state structure, the standard deviation is small at small COM distances (on theorder of 5%). As the trajectories propagate, the monomers move away from the groundstate and the distributions become wider (standard deviations on the order of 30%).In the MARTINI force field, some secondary structural elements are imposed using thebackbone-backbone bond, angle, and dihedral force constants [45]. For helices these valuesare much larger than they are for other secondary structures so the force field providesa large potential to enforce helical structures. The cytochrome c\u00E2\u0080\u00B2 dimer has four helicesarranged in a bundle in the ground state. Unfolding can occur in principle through theunbundling of these helices or through the unraveling of individual helices. The former isnot hindered by the imposition of secondary structure while the latter is. In order to betterunderstand the effect of secondary structure on our results, a set of M8/M2 (with localelastic network bonds) simulations were performed with the values of the helical backbone-backbone bond, angle, and dihedral force constants set to 625, 350, and 200 kJ/(mol nm)respectively. These values are half the default ones.The results of these simulations are shown in Fig. 3.9. Generally, the results are qual-itatively the same. However, with the weaker helices, the M8/M2 dimers have a largerpercent dissociation and a smaller increase in radius of gyration and RMSD. The largerpercent dissociation at small distances (rising from 0% to 60%) implies a smaller fractionof monomers unfold upon dissociation. This moves the results in worse agreement with theall-atom ones which show a greater fraction of unfolding, even greater than that seen withthe default helical structure parameters. Note that the sharp decrease in RMSD and radiusof gyration at larger COM distances is still present in both cases, and the decreases arecomparable. Upon examining visually the structures at these large distances, it was foundthat just before dissociation, they consist of long extended structures, usually with one ofthe helices being stretched and partially unfolded. After dissociation, these structures com-pact to globular shapes. Comparable all-atom trajectories also show extended structuresbefore dissociation, some with stretched helical structures. However, after dissociation, thestructures remain extended. Caution is required when making comparisons here becausethe coarse-grained simulations speed up dynamic processes, and in this case, the force-fielddefined helical parameters additionally increase the rate of helix formation. It is quite likelygiven enough time, the all-atom extended structures would also collapse into globular struc-tures, akin to the coarse-grained ones. The all-atom simulations were not propagated intime long enough for this to happen. However, the fact the all-atom simulations show nosign of compaction while the coarse-grained ones show a great deal of compaction likely37indicates the rate of compaction in the coarse-grained force field is too high relative to theall-atom case.Taken together, the results in Fig. 3.9 indicate the monomers with weaker helices aremore compact, which is expected if the rigid helical structure is being somewhat relaxed.This compactness appears to be creating more stable structures, less prone to unfolding.Overall, the relaxation of the helical constraints does not change the qualitative behaviorof the complex, and in the case of the percentage dissociation, actually produces resultswhich are further from the all-atom ones. At least in this particular case, the defaulthelical parameters in MARTINI are producing satisfactory results, and do not seem tobe producing behavior which is inconsistent with the all-atom results, provided detailedquantitative comparisons are not made.The dissociation behavior of the protein complexes is affected by the balance betweenthe attractive forces within and between the monomers and the repulsive forces causedby the net charges. In order to quantify somewhat this balance, particular componentsof the potential energy were examined. In the first instance, as seen in Fig. 3.10, theLennard-Jones intramolecular energy, Coulomb intramolecular energy, and their sum werecompared. For all cases, the Lennard-Jones (LJ) intramolecular energy decreases sharplyup to a COM distance of 2 nm. Subsequently, the values for the neutral, M5, and M2monomers become either slowly decreasing or constant up to a distance of about 6 nm afterwhich point they become constant. In contrast, as a result of unfolding, the energies of theM8 monomers generally increase before becoming essentially constant at larger distances.The M5 and M2 monomers have higher LJ intramolecular energy than the neutral becausethe positive charges have caused the monomers to expand a bit thus increasing distancesand lowering LJ energy. They have lower LJ intramolecular energy than the M8 monomerbecause the M5 and M2 monomers have not unfolded while some fraction of the M8 oneshave, leading to less intramolecular contacts. The Coulomb intramolecular energy remainsapproximately constant in all cases except for the M8 monomers which show a slight decreasewith increasing center of mass distance. This is consistent with the fact that the neutral, M2,and M5 monomers show little structural changes during the dissociation while some of theM8 ones unfold, allowing them to lower their intramolecular Coulomb energy. Because theCoulomb contribution is almost constant, the behavior of the total intramolecular energy,as seen in the lower panel of Fig. 3.10, is qualitatively the same as the LJ contribution.Interestingly, at large COM distances the energies are lower than the corresponding valuesat small distances, implying that upon dissociation the structures of the monomers changeso as to increase their favorable LJ contacts. This is possible because they no longer havethe configurational constraints present as part of the dimer.The Lennard-Jones and Coulomb intermolecular energies are shown in Fig. 3.11. The LJ38020406080100Percent dissociated0123RMSD (nm)1234Radius of Gyration (nm)1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Center of Mass Distance (nm)036912Minimum separation (nm)M8/M2 weak helixM8/M2 default secondary structureM8/M2 all atomFigure 3.9: The effect of the secondary structure constraints. The red, black, and green linesrepresent simulations with the default parameters, simulations with weaker helix structures,and the all-atom simulations, respectively.39-5800-5600-5400-5200-5000-4800Lennard-Jones (kJ/mol)M5 no elastic M8 no elasticM2 no elasticneutral no elasticM5 slow elasticM8 slow elasticM2 slow elasticneutral slow elastic050010001500Coulomb (kJ/mol)1 2 3 4 5 6 7 8 9Center of Mass Distance (nm)-6000-5500-5000-4500-4000Sum (kJ/mol)Figure 3.10: The intramolecular contributions to the LJ (top panel) and Coulomb (middlepanel) energies, as well as their sum (bottom panel). The sum is scaled so that the groundstate energy is 0. In all panels, the solid, dash-dot, dotted, and dashed lines represent themonomers with +5, +8, +2, and 0 charges respectively. The black and red curves representcomplexes with and without elastic network bonds, respectively.40intermolecular energy is steepest at COM distances less than 2nm, and becomes flat at COMdistances greater than 6 nm. This is the same behavior seen for the LJ energies in Fig. 4 ofthe all-atom study [36]. There is very little difference in the LJ intermolecular energy plotsbetween the different charge partitionings. The Coulomb intermolecular energy is expectedto have a 1/r decay for the M5/M5 and M8/M2 complexes. This smooth decrease inCoulomb energy can be seen in the middle panel of Fig. 3.11. As expected, the M5/M5 hasa larger intermolecular Coulomb energy than the M8/M2 because the product of the chargeson the monomers is higher in the former case. The relative total intermolecular energiesare compared in the bottom panel of Fig. 3.11 where it can be seen that, as expected, theneutral complex has the highest barrier. This barrier is reduced as charges are added tothe complex with the M5/M5 one having the lowest barrier. Note that the coarse-grainedmodel used here does not use partial charges so its behavior upon dissociation is the sameas would be expected for two monopoles separating. The all-atom study did use partialcharges so the intermolecular Coulomb energy reported in Ref. 36 also contains energy fromdipole and higher moments.In addition to the quantitative comparisons given above it is possible to compare thecoarse-grained and all-atom results on a qualitative level by comparing structures fromtypical dissociation events. The coarse-grained results are shown in Fig. 3.12 for the M5/M5and M8/M2 complexes and these should be compared with the corresponding images inFig. 5 of Ref. 35 and Figs. 5 and 6 of Ref. 36. Generally speaking the structural changesare consistent. At a finer level, the dissociated monomers for the coarse-grained M5/M5complex have a more globular shape than the corresponding ones in the all-atom study.3.4 ConclusionsTaken in their entirety, the results show that the MARTINI force field predicts structuresand properties in qualitative agreement with the corresponding all-atom results [35, 36] andin many cases in semi-quantitative agreement. However, there are several differences worthnoting. The radius of gyration values for the dissociated M5/M5 complexes are the sameas those before dissociation for the coarse-grained case and don\u00E2\u0080\u0099t show the increase seen inthe all-atom case. The minimum distance data shows that the M5/M5 and neutral coarse-grained complexes contract just after dissociation, a behavior not seen in the all-atom case.This is confirmed by the images in Fig. 3.12 showing a more globular structure for thedissociated M5 monomers. In addition, the pull forces for the M5/M5 dimer dissociationprocess are consistently higher for the coarse-grained model compared with the all-atomone.All these differences indicate that the MARTINI force field appears to have a slightly41-2000-1500-1000-5000Lennard-Jones (kJ/mol)M5/M5 no elasticM8/M2 no elasticneutral no elasticM5/M5 elastic slowM8/M2 elastic slowneutral elastic slow02004006008001000Coulomb (kJ/mol)0 1 2 3 4 5 6 7 8 9Center of Mass Distance (nm)-2000-1500-1000-5000500Sum (kJ/mol)Figure 3.11: The intermolecular contributions to the LJ (top panel) and Coulomb (middlepanel) energies, as well as their sum (bottom panel). The sum is scaled so that the groundstate energy is 0. In all panels, the solid, dotted, and dashed lines are the M5/M5, M8/M2,and neutral (M0/M0) charge partitionings, respectively. The black and red curves representcomplexes with and without elastic network bonds, respectively.42Figure 3.12: Snapshots of the ground state, M5/M5 (dissociated), and M8/M2 (unfolded)dimers. These images were created using PyMOL [72]. In the M8/M2 dimer, the monomerwith more charges is shown on the left.different balance of attractive and repulsive forces compared to the OPLS-AA/L force field.For the monomers with low charges (neutral, M2, and M5) it is too attractive. This isevidenced by the decrease in radius of gyration after the dissociation and in the minimumdistance plot (Fig. 3.6) where immediately after dissociation the slope is greater than one.For the M8 monomers, the effect of the attractive forces is not seen at smaller distances,but is seen in the rapid decrease in radius of gyration (Fig. 3.7) and RMSD (Fig. 3.8) atCOM distances greater than 12 nm. Also, while the M8 monomers do show some unfoldingbehavior, the all-atom pull forces at larger COM distances suggest that the coarse-grainedmodel has less unfolding which indicates that the force field is too attractive. It is therepulsive charges that help monomers unfold, and lower the barrier for dissociation. TheMARTINI results are consistent with a model in which the charges are not having the degreeof influence as expected from the all-atom results, again indicating the attractive forces aretoo large.The elastic network bonds have only a small effect on the results. However, they signif-icantly increase the time step that can be used. A small number of elastic network bondsin the regions of the protein with the most extended structure have the effect of restrictingsome fast motions without changing the rest of the protein. As well, the default secondarystructure parameters in MARTINI which act to constrain helices especially, do not seem to43be producing inconsistent results. In other words, the dissociation mechanism is not highlydependent upon the details of the secondary structure changes in the proteins.The goal of the present study was to determine the suitability of the MARTINI force fieldfor use in molecular dynamics calculations examining the effect of charge migration uponprotein complex dissociation. Such a study does not require a highly accurate potential sincethe dissociation mechanism is expected to be fairly generic and not dependent upon fineinteractions. From this perspective, the MARTINI force field should be suitable for sucha qualitative study, although for a quantitative study improvements would be desirable.Increasing the number of charges in the system or decreasing the value of the relativedielectric constant would increase the repulsion and thus provide a correction for the balanceof attractive and repulsive forces.44Chapter 4A Charge Moving Algorithm forMolecular Dynamics Simulations ofGas-phase Proteins4.1 IntroductionThe monomer ejection pathway is one where one monomer leaves the complex typicallytaking close to half the total charge with. Because the monomers in a multimeric complexstart with the charges distributed evenly across the surface area, this pathway requirescharges to adopt low energy configurations as the monomer unfolds, hence charges must bemobile. A rigorous treatment of proton transfer involves quantum mechanical calculation ofthe potential energy during proton motion. Such an approach has been used to study protonmotion in water (for reviews see Refs. 73 and 74). In the present case, using a quantummechanical method to describe proton transfer in protein complexes is computationallyvery intensive. Further, the time scale for the monomer ejection pathway is much largerthan the time scale for proton transfer. Thus, the precise description of this transfer is notexpected to play a major role. It is not possible to include the details of proton motion ina coarse-grained potential, because the degrees of freedom associated with protons are notpresent.Donnini et al.[38] performed simulations where the protonation states were changed inorder to maintain constant pH. However, in an isolated (gas-phase) protein complex, thetotal charge remains constant rather than the pH.Lill and Helms created a method (Q-HOP MD) for including proton transport in molecu-lar dynamics (MD) simulations by modeling hopping rates using transition state theory[37].Their general procedure calculates energies and then uses those energies to determine thehopping probability. They used a multicopy method to enhance sampling at the protontransfer step. It would be technically challenging to implement this for charge hopping inprotein complexes.Instead, a simple method to incorporate moving charges into the simulation of a gas-phase protein complex that is compatible with a coarse-grained force field will be intro-45duced. This method can be used for gas-phase proteins (no solvent). The method involvesadding a charge hopping routine to the loop over MD steps, using simple instantaneoushopping between sites and a Monte Carlo like scheme. This hopping algorithm is appliedto transthyretin, a system for which experimental data is available[10, 75\u00E2\u0080\u009377].Results from some simulations of the homomeric tetramer are discussed to show thenature of the hops which occur and the effects of charge hopping. In addition to structuralparameters, properties specific to hops are studied, including the hopping rate, distributionsof the change in energy for a hop, and correlations between donor and acceptor sites.Conclusions follow at the end.4.2 MethodTo study the hopping, a transthyretin (TTR) tetramer was used as a model system withboth high (+20) and low (+8) total charge states. TTR is a small homotetramer whichhas been studied experimentally[10, 75\u00E2\u0080\u009377] using mass spectrometry employing collisionallyactivated dissociation with total charge states ranging from +8 to +15. In Chapter 3 theMARTINI force field was found to be a bit too attractive, therefore the maximum chargestate in the simulations was chosen to be +20 to increase, as an offset, the Coulomb repulsionamong the charges. Simulations were performed for temperatures of 300K, 400K, 500K, and600K. Increasing the temperature from 300K to 600K has the effect of adding energy tothe system giving it the energy needed to overcome the dissociation and unfolding barriers,mimicking the CID experiments. The temperature is being used to speed up the dynamics,so it does not give the true temperature dependence of the unfolding or an estimate ofthe experimental protein temperature. In the present work, simulations were run with \u00CE\u00B1= 1 and 0.023. The \u00CE\u00B1 = 0.023 value was chosen because at 300K this gives a hoppingprobability of 0.5 or greater for \u00E2\u0088\u0086E values less than +75 kJ/mol (approximate midpoint of\u00E2\u0088\u0086E distribution for the \u00CE\u00B1 = 1 simulations). A 30 fs time step was used for 75 ns simulations.In all of these simulations the N-terminus of chain A was allowed to become positivelycharged while the other three N-termini remained neutral. This biases the trajectoriestowards the unfolding of one chain as is expected from the monomer ejection pathway.Although this is less realistic than having all four N-termini as basic sites, it serves as a testof the hopping algorithm.464.3 Results and Discussion4.3.1 Hopping Rate and Charge DistributionsIn order to study the behavior of the hopping algorithm, a number of quantities wereexamined. Figure 4.1 shows the average hopping rate as a function of time. For eachtrajectory the simulation time was divided into bins of 150 ps length and the number ofhops was counted in each bin and averaged over all trajectories. This number was thendivided by 150 to calculate a hopping rate in units of hops per ps, and was normalized forthe number of charges in the system. Figure 4.1 shows that the hopping rate is similar forall the temperatures considered, but slightly lower at 300K than at higher temperatures. Itis higher at small time, and then after about 10 ns reaches a steady state value of about0.002 hops/ps/charge for the high charge state and about 0.1 hops/ps/charge for the lowcharge state when \u00CE\u00B1 = 1. This is much lower than transfer rates of 0.25 hops/ps/charge,0.11 hops/ps/charge and 0.46 hops/ps/charge reported for an excess proton in water inRefs. 37, 78 and 79, respectively. The hopping rate is higher for the low charge state becausethat state has more available acceptor sites so each donor site has a higher probability ofcoming close to an acceptor site. When \u00CE\u00B1 is decreased to 0.023, the hopping rate increasesto just above 0.1 hops/ps/charge for the high charge state and 0.4 to 0.6 for the low chargestate. The decrease in hopping rate at small time is expected because the initial chargeconfigurations were not optimized and the charges quickly rearrange to find lower energyconfigurations. However, at long times the hopping rate is not zero while the slope is veryclose to zero. This implies the system is moving among multiple charge distributions whichare quite close to the lowest energy. This is consistent with predictions from simulationswith static charges which imply that protein relaxation can stabilize many different chargeconfigurations within a small band of lowest energy[1]. More hopping would be expectedat higher temperatures, and the hopping rate does show a slight temperature dependence.The hopping algorithm used here is a simple approximation to the real proton motion.Protein self-solvation and any other local environment effects are not included. The useof rcutoff = 0.57 means that it is modeling direct proton transfer and any other possiblemechanism is not considered. In effect, this \u00CE\u00B1 parameter is allowing us to tune the hoppingrate without using a large increase in temperature which would change the dynamics of thewhole system. Decreasing \u00CE\u00B1 from 1 to 0.023 increases the hopping rate for the high chargestate from near zero to the same order of magnitude as the hopping rate reported for anexcess proton in water[37, 78, 79].The potential and kinetic energies averaged over all 100 trajectories are shown in Fig. 4.2for the high charge state. For each temperature the kinetic energy is constant, as expectedsince a thermostat was used. The potential energy decreases at the beginning as the charges4700.10.20.30.40.50.6Rate (hops/ps/charge)300K400K500K600K0 10 20 30 40 50 60 70Time (ns)00.10.20.30.40.50.6+8 charges+20 chargesFigure 4.1: A plot of the average hopping rate (hops/ps/charge) as a function of time forthe +20 (bottom) and +8 (top) charge states. The temperatures 300K, 400K, 500K, and600K are shown as black, red, green, and blue lines, respectively. The solid and dashedlines represent \u00CE\u00B1 = 1 and \u00CE\u00B1 = 0.023, respectively.48rearrange, and then becomes constant. This change in potential energy is expected as thecharges move to more favorable arrangements. The plateau occurs when the charges aresampling micro-states with the lowest energy. The decrease in potential energy matches thehigher hopping rate at small time, while the flattening out occurs at the same time as thehopping rate becomes constant. With the higher hopping rate, more states are accessed.This is because the increased hopping probability makes hops from arginine to lysine morefrequently accepted. Thus each charge can undergo more movement. This is seen in Fig. 4.2where the lower \u00CE\u00B1 simulations reach a lower potential energy.The average number of charges on each monomer chain is shown as a function of timein Fig. 4.3. For the high charge state at 300K, all the chains have an average of +5 charges(a symmetric charge distribution). At 600K, chain A increases in average charge to +6 andthe other chains decrease to +4.5 for the high charge state when \u00CE\u00B1 = 1. Once the chargeof chain A begins to increase, it does not return to lower charge values. This asymmetry incharge starts quickly (within the first 2 ns), and monotonically increases with the greatestchanges occurring at small times. Intermediate temperatures show an increasing asymmetryin the charge configuration. The charge accumulation on chain A is larger (+9.5 comparedto +6) when \u00CE\u00B1 is decreased to 0.023 from 1 (see Fig. 4.3) for the high charge state. Thisis accompanied by a complementary decrease in the number of charges on the other threechains. For the low charge state, the distributions are centered at +2 \u00C2\u00B11 at both 300K and600K and this does not change when \u00CE\u00B1 is changed. This change is on the same time scaleas changes in secondary protein structure. These changes easily fall within experimentaltime scales in typical time-of-flight mass spectrometry studies (on the order of ms). Thefluctuations in charge for each chain are small compared to the average charge.While Fig. 4.3 shows the average (over 100 trajectories) number of charges on eachmonomer, Fig. 4.4 shows the distribution of charges at 300K and 600K for \u00CE\u00B1 = 0.023.The last 100 configurations of each trajectory were counted for the charge distributionhistograms. At 300K, all the chains are equivalent and have distributions centered at +5 and+2 (for the high and low charge states respectively) with the majority of populations within\u00C2\u00B11 charge of the average. At 600K, the high charge state distributions are broader overallhaving widths of about \u00C2\u00B12 charges and there is asymmetry. One chain has a distribution ofcharges centered near +10, while the other three chains have lower charges (centered near+3).While these simulations used a +20 total charge state, the highest charge state as re-ported in experiments was +15. For a +15 TTR tetramer, CID experiments show fragmentswith an average charge state +8 \u00C2\u00B11 and trimers with the remaining charge[10, 76, 77]. Theexperiments do not have a well defined temperature, but increasing collision energies areexpected to correspond to higher internal temperatures. The 600K simulation results and49-20000-19000-18000-17000-16000-15000-14000-13000Potential Energy (kJ/mol)300K a=1400K500K600K300K a=0.023400K500K600K0 10 20 30 40 50 60 70Time (ns)30004000500060007000Kinetic Energy (kJ/mol)Figure 4.2: The potential and kinetic energies averaged over all 100 trajectories as a functionof time for the +20 charge state at different temperatures. The temperatures 300K, 400K,500K, and 600K are shown as black (brown), red (magenta), green (dark green), and blue(purple) lines for \u00CE\u00B1 = 1 (\u00CE\u00B1 = 0.023), respectively.5012345678910123456789Number of Charges on Monomer0 10 20 30 40 50 60 70Time (ns)123456789 300K400K500K600K\u00CE\u00B1=0.023, +20 charges\u00CE\u00B1=1, +20 charges\u00CE\u00B1=0.023, +8 chargesFigure 4.3: The average number of charges on each monomer is shown averaged over 100trajectories as a function of time for different temperatures for the +20 charge state using\u00CE\u00B1=1 (middle) and \u00CE\u00B1=0.023 (top), and the +8 charge state with \u00CE\u00B1=0.023 (bottom). Thecolors have the same meanings as in Fig. 4.1 (there are four lines at each temperature, onefor each chain).512 4 6 8 10 12Number of Charges0 1 2 3 4 5 6Number of Charges300K, +20 charges600K, +20 charges300K, +8 charges600K, +8 chargesFigure 4.4: A plot of the charge distributions in the last 100 steps of the trajectories for the+20 charge state (left) and the +8 charge state (right) both using \u00CE\u00B1=0.023 (the top panelis at 300K and the bottom panel is at 600K). Each chain is in a different color.52the experiments both show one monomer (out of the four) having half the total charge, butthe 600K simulations have a broader distribution of charge states than the experiments.This broader distribution could be caused by the simulation time being shorter than the ex-perimental time. If some of the trajectories have unfolded and have the asymmetric chargewhile others are caught in the middle of this process, it would make the distribution widerthan it would be after a longer time.4.3.2 Structural PropertiesPlots of the radius of gyration (averaged over 100 trajectories) of each chain are shown inFig. 4.5 as a function of time for different temperatures for the high charge state. Generally,the values stay relatively constant in time except for chain A which grows with time. Thisgrowth becomes more pronounced as temperature increases. Decreasing \u00CE\u00B1 increases theextent to which the radius of gyration increases. This large increase in radius of gyrationindicates structural changes. Figure 4.6 shows snapshots of the protein complex at 300Kand 600K demonstrating the unraveling of chain A. For the low charge state, the radius ofgyration does not change for any of the chains at any temperature indicating that there isno unfolding.The root mean square deviation (RMSD) was also calculated and shows the same qual-itative pattern as the radius of gyration with chain A increasing in size with increasingtemperature. Quantitatively, the increase is larger with the higher hopping rate (lower \u00CE\u00B1).4.3.3 Energy Distributions and Pair CorrelationsIn order to examine the hopping algorithm in more detail, statistics related to the hoppingprocess itself were examined for the +20 charge state. The first of these is shown in Fig. 4.7in which the values of \u00E2\u0088\u0086E of Eq.(2.15) are plotted as distributions for all attempted hops.Each trajectory was divided into three time domains. The beginning, middle, and endcorrespond to t \u00E2\u0089\u00A4 25 ns, 25 ns < t \u00E2\u0089\u00A4 50 ns, and t > 50 ns, respectively. All the instanceswith \u00E2\u0088\u0086E < 0 represent successful hops independent of temperature. For \u00E2\u0088\u0086E > 0, thetemperature dependent hopping probability is determined from \u00E2\u0088\u0086E using Eq.(2.17) so onlya fraction of hops will be successful in this case. The energy distributions have two peaks,one at approximately 30 kJ/mol and the other at about 100 kJ/mol for \u00CE\u00B1 = 1. Most ofthe \u00E2\u0088\u0086E values are positive, which is consistent with the low hopping rate. The lower \u00CE\u00B1simulations have an additional peak in the \u00E2\u0088\u0086E distributions (see Fig. 4.7) at approximately-100 kJ/mol which was not seen when \u00CE\u00B1 was 1.The vertical lines in Figs. 4.7 and 4.10 represent the differences in Ebinding among allpossible pairs of basic sites, as calculated from Table 2.1. Recall that \u00E2\u0088\u0086E has two contri-53123456781234567Radius of Gyration (nm)12345670 10 20 30 40 50 60 70Time (ns)1234567chain Achain Bchain Cchain D0 10 20 30 40 50 60 70Time (ns)600K500K400K300K+20 charges +8 chargesFigure 4.5: The radius of gyration averaged over 100 trajectories as a function of timefor trajectories run at different temperatures for the +20 charge state (left) and the +8charge state(right). The first (bottom), second, third, and fourth (top) panels show resultsat 300K, 400K, 500K, and 600K, respectively. Chains A, B, C, and D are represented byblack, red, green, and blue lines respectively. The solid and dashed lines correspond to \u00CE\u00B1= 1 and \u00CE\u00B1 = 0.023, respectively.54int_mdaos(s) :amlasn(z)estnra(sauccL :amlasn(z)estnra(sapccLFigure 4.6: Snapshots of the ground state tetramer and the final structures at the end oftwo trajectories for the +20 charge state with \u00CE\u00B1 = 0.023 (one at 300K and the other at600K). Each chain is in a different color, and charged beads are represented in black.55Beginning300K400K500K600KMiddle-150 -100 -50 0 50 100 150 200\u00E2\u0088\u0086 E (kJ/mol)EndFigure 4.7: Distributions of \u00E2\u0088\u0086E for all attempted hops taken at times near the beginning,middle, and end of the +20 charge state trajectory propagation. The vertical lines show thedifferences in proton binding energies among all possible pairs of residue types, as calculatedfrom the values in Table 2.1. The colors have the same meaning as in Fig. 4.1. The toppanel shows data collected from the first 25 ns, the middle panel the next 25 ns, and the lastpanel the remaining 25 ns of the trajectories. The solid and dashed lines represent resultsfor \u00CE\u00B1 = 1 and \u00CE\u00B1 = 0.023, respectively. 56butions: i) from the differences in Ebinding and ii) from the differences in Coulomb energyresulting from moving a charge between two basic sites. If the difference in Ebinding domi-nated \u00E2\u0088\u0086E one would expect to see values of \u00E2\u0088\u0086E peaked near the vertical lines in Fig. 4.7.The peaks are not located at the same values as the differences in binding energy (butthey are the same order of magnitude) and they have a peak width of about 40 kJ/mol,indicating the importance of the Coulomb energy.Plotting the energy distributions separated by the donor site type produces the graphseen in Fig. 4.8. The energy distributions are further broken down by acceptor site type(data shown in Fig. 4.9). The arginine proton binding energy is approximately 90 kJ/mollower than that of histidine or lysine. Thus, if the electrostatic environment was the samefor all hops one would expect that it would take 90 kJ/mol more energy to move a protonfrom an arginine than to move a proton from a lysine or histidine. This difference in bindingenergy is the source of the peaks seen in Fig. 4.7. The hops with \u00E2\u0088\u0086E closer to 0 are fromhopping pairs where the donor and acceptor have the same type (or lysine/histidine pairs),and the other peaks are when the donor and acceptor are different.In the first few steps the charges tend to hop to arginine. In the low hopping ratesituation, those charges remain on the arginine so most of the attempted hops seen inFig. 4.7 are unsuccessful hops with an arginine donor. The high charge state has morecharges than there are arginines, so there are still some charges on lysines and histidineswhich can hop to other lysines or histidines. In the high hopping rate situation, the chargesdo hop from arginine to other basic sites, thus there are more attempted hops with non-arginine donors (and more with arginine acceptors). The \u00E2\u0088\u0086E values of the successful hopscan be seen in Fig. 4.10.The discussion above relates to the high charge state \u00E2\u0088\u0086E distributions. The low chargestate \u00E2\u0088\u0086E distributions have peaks at the same \u00E2\u0088\u0086E values but with different peak heights.The hopping pairs graph shown in Fig. 4.11 plots the bead number of the donor siteon the x -axis and of the corresponding acceptor site on the y-axis for all successful hops.Data is shown for 300K and 600K. Overall, Fig. 4.11 shows the correlations between donorand acceptor sites counted over all successful hops in the 100 trajectories. The number ofhops in all 100 trajectories between any pair of sites is indicated by the color: black is 1 to999, red 1000 to 9999, and green 10,000 to 49,999. Chains A, B, C, and D are representedby bead numbers 1-254, 255-508, 509-762, and 763-1014, respectively. Most of the hops arebetween residues in the same monomer. Hops between different monomers do occur, butthey are less frequent. Many of the hops are between pairs that are close together in thesequence. These appear as points near the diagonal. The plot is approximately symmetricabout the line y=x. This means that if a hop from residue A to residue B occurs, then thehop from residue B to residue A will often occur with a similar frequency (not necessarily57Beginning300KMiddle-200 -100 0 100\u00E2\u0088\u0086 E (kJ/mol)EndArg donorLys donorHis donor600K-200 -100 0 100 200\u00E2\u0088\u0086 E (kJ/mol)Figure 4.8: Energy distributions for all the hop attempts of the +20 charge state separatedby the type of the donor site (arginine, lysine, and histidine donors are shown in black,red, and green respectively). The left column is at 300K and the right column is at 600K.Each trajectory was split into three time regions as in Fig. 4.7. The solid and dashed linesrepresent results for \u00CE\u00B1 = 1 and \u00CE\u00B1 = 0.023, respectively.58Arginine Donor300K 600KLysine Donor-200 -100 0 100\u00E2\u0088\u0086 E (kJ/mol)Histidine DonorArg acceptorLys acceptorHis acceptor-200 -100 0 100 200\u00E2\u0088\u0086 E (kJ/mol)Figure 4.9: Energy distributions for the hop attempts of the +20 charge state in the lasttime division of Fig. 4.8 separated by the type of the donor site (arginine, lysine, andhistidine donors are shown in the top, middle, and bottom rows respectively) and also byacceptor site (arginine, lysine, and histidine acceptors are shown as black, red, and greenlines respectively). The left column is at 300K and the right column is at 600K. The solidand dashed lines represent results for \u00CE\u00B1 = 1 and \u00CE\u00B1 = 0.023, respectively.59Beginning300K400K500K600KMiddle-150 -100 -50 0 50 100 150\u00E2\u0088\u0086 E (kJ/mol)EndFigure 4.10: Energy distributions for the accepted hops. The black, red, green, and bluelines represent results at 300, 400, 500, and 600K respectively. The solid and dashed linesrepresent results with \u00CE\u00B1 = 1 and \u00CE\u00B1 = 0.023, respectively. The vertical lines represent thedifferences in binding energies among all the possible residue types.60exactly the same). This seems counterintuitive since according to the values of Ebinding hopsshould be preferred when moving charge to more basic sites and not vice versa. However,hops between sites of the same type would be equally likely no matter which one was thedonor.The number of pairs for which hops occur is much larger at 600K than at 300K. Inparticular, there are many more hops between different chains. At both temperatures thepairs with the most frequent hops are those between sites that are close together in thesequence (near the diagonal).All the possible basic sites have a positive charge at some point in the 100 trajectories.Not all sites are visited with the same frequency. The TTR tetramer has 16 arginineresidues, 16 histidine residues, and 24 lysine residues. Arginine is the most basic of thefour types of basic site considered here and the charges spend significantly more time onarginine than on any other residue. Lysine is the least basic, but it is the most commonso lysine has a larger percentage of the charges than histidine. The low occupation of theN-terminus is also explained by the number of residues of each type. As the temperatureincreases, the percent occupation for arginine increases significantly (from 57% to 77%) andthe occupation of the N-terminus increases slightly while the others decrease.4.3.4 Computational PerformanceAll the computations performed in this work were done using the orcinus computer (partof the WestGrid/Compute Canada computing consortium, www.westgrid.ca). For thetransthyretin tetramer with +20 charges, a simulation of 2,500,000 steps (75 ns) took 5.75hours with hopping and 5.5 hours with fixed charges. For the cytochrome c\u00E2\u0080\u0099 dimer with +10charges, a simulation of 2,500,000 steps (75 ns) took 1.8 hours with hopping and 1.75 hourswith fixed charges. There is only a small difference in computational time with and withoutthe hopping. The time needed for the first stage of the hopping algorithm depends on thenumber of donor and acceptor sites (not the total number of beads) and the later parts ofthe hopping algorithm are only calculated for potential hops. However, the calculations ofthe forces in the molecular dynamics program are performed pairwise for all the atoms inthe molecule and the updating of positions and velocities is also done for all the atoms, sothese take most of the computational time.4.4 ConclusionsCharge motion was studied in the +20 and +8 charge states of the TTR tetramer. Over-all, the comparison of the high and low charge states shows that the Coulomb repulsionamong the charges is a major contributor to unraveling a monomer in the complex, thereby610 254 508 762Donor Sites0254508762Acceptor Sites0 254 508 762Donor Sites0254508762Acceptor SitesChain A Chain B Chain C Chain DChain AChain BChain CChain D300KChain DChain CChain BChain AChain A Chain B Chain C Chain D600KFigure 4.11: For each successful hop in the +20 charge state, the bead number of the donorsite is plotted on the x -axis with the bead number of the corresponding acceptor site on they-axis. Results are shown for 300K (top) and 600K (bottom) both using \u00CE\u00B1=1. The numberof hops between any pair of beads is indicated by the color: black is 1 to 999, red 1000 to9999, green 10,000 to 49,999, and blue more than 50,000.62producing a state ready for dissociation. As seen in Fig. 4.5, monomer unfolding does notoccur for the low charge (+8) states, regardless of the temperature or the charge hoppingrate (that is, the value of \u00CE\u00B1). For the high charge (+20) state, charge enrichment of a singlemonomer and significant monomer unfolding can occur. The rate of these processes is veryslow when the temperature and hopping rate are low but they increase dramatically bothwith higher hopping rates and temperatures, as seen in Figs. 4.3 and 4.5. In this sense, thetemperature and value of \u00CE\u00B1 act to decrease the timescale for these events but do not changetheir qualitative behavior.This also speaks to the coupling between charge migration and monomer unfolding.With charge migration turned off, no monomer unfolding occurs regardless of the tem-peratures studied. Thus, it is the combination of charge migration, needed for chargeenrichment, and energy activation, accomplished by increasing temperature, that leads tomonomer unfolding. This is consistent with experimental studies[10, 75\u00E2\u0080\u009377].Because the Coulomb repulsion among the charges is the dominant factor for monomerunfolding, it is expected that particular microscopic details of the force field or proteinstructure are not critical for determining the qualitative behavior of the monomer unfoldingprocess. Rather it is a result of larger scale variations in the charge distribution. This is akey assumption in these calculations and in the interpretation of the results. For example,the MARTINI force field provides strong bonds to maintain secondary structures, suchas helices, so the particular way a monomer unfolds using the MARTINI force field willdiffer from that for an all-atom force field. In effect the MARTINI force field has a higherbarrier for unfolding than typical all-atom force fields. However, the protein is still able tounravel. The more flexible coils between the strong secondary structures allow movementwhich contributes to the ability of the protein to reach an extended structure. This is showngraphically in Fig. 4.6 where it can be seen that a monomer can extend quite significantly,even though there are pockets of structure still locked in by the strong secondary structurebonds. These pockets do not change the fact that charge is associating with the unfoldingmonomer, and that this charge is being moved farther from the remaining trimer complex.The drop in Coulomb repulsion still occurs as a result. For this reason, the qualitativesequence of events leading to monomer unfolding should be reasonable using the MARTINIforce field. In the simulations, the MARTINI bead types were not changed when the chargeon a bead was changed. Again, the small change in potential parameters associated withthe fitting for these different charge states is not expected to affect the qualitative behaviorof the unfolding process.Overall, the results of this study show that for the purposes of understanding chargemotions in large, charged, gas-phase protein complexes, a detailed description of protontransfer is not necessary since processes are dominated by Coulomb repulsion. This is con-63sistent with experimental observations, and with previous static charge simulations fromwhich the Coulomb repulsion model[1, 17, 33\u00E2\u0080\u009336] was developed. For highly charged com-plexes, as studied here, total charge is the main physical parameter governing the process.The method used here is general enough to be applied to simulations of different proteincomplexes. Since the total charge and the associated Coulomb repulsion govern the processrather than the detailed protein structure, the general trends will be relevant for manyprotein complexes (although the quantitative details would differ).64Chapter 5Controlling Dissociation Channelsof Gas-Phase Protein ComplexesUsing Charge Manipulation5.1 IntroductionFor positively charged proteins, the N-terminus is expected to be important for monomerunfolding. This unfolding can occur as a sequence of steps moving from one end of theprotein to the other similar to the zipper model for the folding of cytochrome c[80] discussedby Morozov et al.In this chapter, results from modifying the protonation of N-termini sites and adding atether to the protein complex are reported. This tether is a group of coarse-grained beadsthat is attached to the protein complex at a position of our choosing. The tether containsa basic bead that can accept a positive charge. By changing the number of basic sites ona monomer, one hopes to influence its unfolding behavior as this behavior determines thedissociation products seen in the mass spectra. Controlling the dissociation could providemore information about the structure of the complexes and lead to a better understanding oftheir biology[18, 81]. Our simulations do not directly model either CID or SID. Instead thesimulations use increasing temperature and/or high hopping rates to mimic energy beingtransferred to the system.5.2 MethodThe charge hopping algorithm described in Chapter 2 was used. The \u00CE\u00B1 parameter for thecharge hopping is set to 0.023 which leads to hopping probabilities centered around 0.5 forthe typical range of \u00E2\u0088\u0086E values (at 300 K) and the default values were used for the othermobile charge parameters. Simulations were run at 300 and 600 K with a 30 fs timestepand total simulation times of 75 or 300 ps. The 300 K temperature is similar to normalroom temperature, and the 600 K temperature corresponds to adding more energy morequickly (more like SID). All the charge configurations have a total charge of +20.65The transthyretin (TTR) homotetramer was used as a model system. Simulations wereperformed with and without an added tether. This tether consisted of two beads, one ofMARTINI bead type N0 (non-polar) and the other of type Qd. The Qd bead is includedin the list of basic sites. The hydrogen binding energy of the tether was set to -940 kJ/molwhich is comparable to the value for lysine. A few simulations were also run with thehydrogen binding energy set to -1200 kJ/mol. No significant differences were seen betweenthe two. The tether was attached to the cysteine sidechain at the first residue of chainA. For the case without the tether, simulations were also performed where the N-terminusfrom only one of the chains was allowed to accept a positive charge while the other threeN-termini remained neutral. This was done by removing all the N-termini except one fromthe donor-acceptor list, so the chemical environment of the neutral N-termini (number ofbeads and their interactions) was not changed.Several properties were calculated in order to examine the structural changes in theprotein complex. The number of charges on each monomer was measured as a function oftime, because the hopping algorithm allows charges to move from one monomer to another.Various distances were also calculated such as the minimum distance from the C- or N-terminus of one chain to the other three, as well as distances associated with positionswithin the length of the chain.5.3 Results and DiscussionSome snapshots of sample trajectories are shown in Figs. 5.1 and 5.2. Figure 5.1 shows aset of snapshots for the complex with all four N-termini accepting charges and without thetether. The top snapshot shows the initial structure, the snapshot in the middle is at theend of a trajectory at 300K, and the bottom is a snapshot from trajectories at 600K. In thiscase, after a 300K trajectory, two chains are showing some unraveling while the other twoare still compact showing the potential for asymmetric pathways. After a 600K trajectory,all four monomers are unraveling producing a symmetric result.Figure 5.2 shows results from complexes with tethers (small, shown in red) attachedto chain A (shown in grey). The top snapshot shows the initial structure, the snapshot inthe middle is at the end of a trajectory at 300K, and the bottom two are snapshots fromtrajectories at 600K. In the initial structure all four monomers are folded. After the 300Ktrajectory, chain A is unfolded and the end with the tether is farthest from the rest of thecomplex. After the 600K trajectories, more than one of the monomers (including chain A)have unfolded. Again the tether is at the point farthest away from the center of the complex.The chain with the tether is unraveling more than the others producing asymmetry.The number of charges per monomer and radius of gyration are shown in Figure 5.366Figure 5.1: Snapshots of the tetramer (with all four N-termini able to be protonated) areshown in the initial state (top), after a 300 ps trajectory at 300K (middle), and after a300 ps trajectory at 600K (bottom). Each chain is shown in a different color (grey, orange,green, and blue).67Figure 5.2: Snapshots of the tetramer with an attached tether are shown in the initial state(top), after a 300 ps trajectory at 300K (middle), and after a 300 ps trajectory at 600K(bottom). Each chain is shown in a different color (grey, orange, green, and blue, and thetether is shown in red attached to the end of the grey chain.)68for simulations run at 300 K and 600 K. When looking at the data plots, please keep inmind that the y-axis scales are not the same for different rows. As seen in the middlecolumn, when all four chains are identical, there is no accumulation of charge on one chainin preference to the others. The radius of gyration values are also the same for all fourchains. When chain A has a basic site at the N-terminus and the other chains do notaccept charges, as shown in the left column, there is charge accumulation on chain A (ata simulation temperature of 600K). There is also a corresponding increase in the radius ofgyration. When chain A has an added basic site on the tether (placed near the N-terminus)as seen in the right column, charge accumulation can also be seen for chain A (at both300K and 600K). In all cases, increasing the simulation temperature increases the radius ofgyration.Figures 5.4 and 5.5 show the minimum distances from beads on one monomer to theother three chains at simulation temperatures of 300K and 600K respectively. The minimumdistance is measuring how close the bead is to the rest of the complex, so if a chain isunfolding those beads involved in the unraveling will move farther away from the otherparts of the complex. The four beads used as reference points are the first bead for theN-terminus, the last bead for the C-terminus, a bead one third of the way along the chain,and a bead two thirds of the way along the chain. As seen in the middle column, when thefour N-termini are the same, the minimum distance between the N-terminus and the otherthree chains increases, while the other three distances remain the same. This increase isfaster and larger at 600K than at 300K. The left column shows that when one N-terminusis a basic site and the other three are not, at 300K the N-terminus minimum distance forchain A increases while the other distances do not. At 600K, the N-terminus minimumdistance for chain A increases much more. The minimum distances for the beads one thirdand two thirds of the way down the chain increase, but with progressively smaller valuesand later times. The C-terminus minimum distance does not change. In the case with thetether, seen in the right column, the increase is similar to the case with all four chains thesame, but chain A (the one the tether is attached to) has a slightly larger increase.In all cases, the N-terminus has the largest movement away from the rest of the complex,and as one moves towards the C-terminus the changes in distance from the complex aresmaller. It is clear that the N-terminus is leading the unraveling with the middle of theprotein monomer following. The increases in N-terminus minimum distance match theincreases in the radius of gyration.Unsurprisingly, when all four chains are the same, they all show the same behavior.When only one of the four chains has an N-terminus which can accept a positive charge,that chain increases in radius of gyration more than any of the other chains. Adding atether changes the behavior of the complex. The chain with the tether unravels more than6934567Charges per monomerChain AChain BChain CChain D12Radius of Gyration (nm)23456789Charges per monomer0 20 40 601234567Radius of gyration (nm)Figure 5.3: Averages over 100 trajectories run at 300K (top two rows) and 600K (bottomtwo rows). The top row of each pair shows the number of charges per monomer and thebottom row shows the radius of gyration (in nm). In the left column the N-terminus ofchain A could be positively charged and the other three N-termini could not. In the middlecolumn all four N-termini could accept a positive charge. In the right column a tether wasadded to chain A. The results for chains A, B, C, and D are shown in black, red, green, andblue, respectively. 70"@en . "Thesis/Dissertation"@en . "2014-11"@en . "10.14288/1.0165544"@en . "eng"@en . "Chemistry"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "Attribution-NonCommercial-NoDerivs 2.5 Canada"@en . "http://creativecommons.org/licenses/by-nc-nd/2.5/ca/"@en . "Graduate"@en . "Charge hopping in coarse-grained simulations of gas-phase protein complexes"@en . "Text"@en . "http://hdl.handle.net/2429/50811"@en .