UBC Theses and Dissertations

Comparison of formaldehyde with established crosslinkers to characterize transient protein structures… Srinivasa, Savita 2017

Comparison of Formaldehyde with Established Crosslinkers to Characterize Transient Protein Structures Using Mass Spectrometry

by

Savita Srinivasa

B.Sc., The University of Pittsburgh, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate and Postdoctoral Studies (Chemistry)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

June 2017

© Savita Srinivasa 2017

Abstract

Chemical cross-linking along with mass spectrometry can elucidate protein geometry by introducing stabilizing covalent linkages as distance constraints. Formaldehyde’s small size allows it to quickly permeate the cellular membrane without external manipulation and preserve close-proximity and transient protein interactions under physiological conditions. Despite its established uses in biology and compatibility with mass spectrometry, formaldehyde has not yet been applied to structural proteomics, which other cross-linkers have already accomplished. In this thesis, formaldehyde along with four other established cross-linkers (three N-hydroxysuccinimide ester cross-linkers and one zero-length cross-linker), varying in size and reactivity, were shown to capture Ribonuclease S and the Ca2+-free calmodulin-melittin, which are two weak protein complexes. It was demonstrated that the yield of close-proximity crosslinking from zero-length and formaldehyde cross-linkers reflected the dissociation constants of both transient complexes. A comparison between the identification of formaldehyde and established cross-linked species via first stage mass spectrometry (MS) and tandem mass spectrometry (MS/MS) provided insight into what evidence is sufficient to confirm formaldehyde cross-linked species. Cross-linked species from all cross-linkers were identified via MS/MS in the Ca2+-free calmodulin-melittin. These were used to impose different distance constraints to examine the unknown binding orientation of Ca2+-free calmodulin to melittin.
The relatively straightforward discovery of N-hydroxysuccinimide ester cross-linking was offset by its large size and ambiguous distance constraints that may not be suitable for small proteins. Although zero-length cross-linkers create close proximity linkages, the high abundance of its reactive sites in calmodulin-melittin produced diversified products, complicating mass spectrometric detection. The increased complexity in identifying formaldehyde reaction products via mass spectrometry was due to its reactivity with several amino acids. This work represents the first report of formaldehyde cross-links identified between non-covalently associated protein components, supporting formaldehyde’s ability to stabilize weak interactions. Four formaldehyde crosslinking sites were localized in calmodulin-melittin, and the mechanisms of the formation of these cross-links were revealed using in vivo-like conditions. The uniformity of formaldehyde crosslink localization reflected the uniform binding structure of calmodulin. Furthermore, the binding orientation of calmodulin and melittin captured by formaldehyde was shown to be most consistent with recent literature compared to the other cross-linkers.

Lay Summary

Biological processes are governed by the interactions of proteins in cells. Covalent linkages can be introduced via chemical cross-linking reagents to connect and freeze interacting proteins, providing a snapshot of the configurations of proteins in cells. Mass spectrometry can identify protein interacting partners and localize specific interacting regions by measuring the mass-to-charge ratio of species in a sample. Formaldehyde has been extensively used with biological material such as preserving clinically diagnosed tissues and identifying protein interacting partners.
Surprisingly, unlike other established cross-linking reagents, formaldehyde has yet to be applied to examine protein geometry in biologically relevant systems. In this work, cross-linkers of various lengths and reactivity were applied to obtain a comprehensive picture of two different protein interactions. Furthermore, formaldehyde was compared to established cross-linkers to reveal its potential to characterize structures of weak protein interactions using mass spectrometry for the first time.

Preface

This thesis project was proposed by my supervisor, Professor Juergen Kast. I was responsible for the experimental work, data analysis and literature searches. Dr. Nikolay Stoynov and Jason Rogalski performed the mass spectrometry.

Chapter 1, the Introduction, was adapted from the following publication:

Srinivasa, Savita, Xuan Ding, and Juergen Kast. "Formaldehyde cross-linking and structural proteomics: Bridging the gap." Methods 89 (2015): 91-98.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
1 Introduction
1.1 Structural Proteomics
1.2 Mass Spectrometry in Structural Proteomics
1.3 Mass Spectrometry and Chemical Cross-linking
1.4 Tandem Mass Spectrometric Fragmentation and Nomenclature of Cross-linked Species
1.5 Importance of Formaldehyde Cross-linking: Common Applications and Key Features
1.6 Formaldehyde Cross-linking in vivo
1.7 Formaldehyde Cross-linking in Model Proteins
1.8 Non-covalent Protein Complex Model Systems
1.9 Thesis Aims
2 Methods
2.1 Materials
2.2 Chemical Cross-linking Reactions
2.3 Tris-Tricine Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation
2.4 Trypsin Digestion
2.5 Reverse Phase High Performance Liquid Chromatography Tandem Mass Spectrometric Analysis
2.6 Mass Spectrometric Data Analysis
2.7 Analysis Based on Relative Abundance Calculations in the Calmodulin-Melittin System
2.8 Crystal Structure Distance Constraints for the Calmodulin-Melittin System
3 General Data Analysis of Cross-linked Calmodulin-Melittin and Ribonuclease S
3.1 Cross-linking of Calmodulin-Melittin and Ribonuclease-S Complexes with Various Cross-linkers
3.2 Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation
3.3 Mass Spectrometric Analysis
3.4 Moving toward the MS/MS Verification of Cross-Linked Species
4 Tandem Mass Spectrometric Fragmentation of Calmodulin-Melittin Cross-linked Species
4.1 Tandem Mass Spectrometric Verification and Fragmentation Rules for Cross-linked Species
4.2 Tandem Mass Spectrometric Fragmentation of Other Cross-linkers
4.3 Tandem Mass Spectrometric Fragmentation of Formaldehyde
4.4 Formaldehyde versus other Cross-linker Fragmentation
4.5 General Criteria for Evaluating Tandem Mass Spectrometric Patterns of Cross-linked Species
4.6 A Second Look at Trypsin Digestion of Cross-linked Residues
4.7 MS/MS Analysis of Formaldehyde Cross-linked Ribonuclease-S
5 Structural Characterization of Calmodulin-Melittin and Ribonuclease S Cross-linked Species
5.1 Trypsin Cleavage and Accessibility of Residues
5.2 Relative Abundance of Formaldehyde Cross-linking
5.3 Cross-linked Product Classification and Abundance in the Calmodulin-Melittin System
5.4 Crystal Structure Distance Constraints
6 Arising Limitations of Mass Spectrometric Data Analysis of Formaldehyde Cross-linked Species
6.1 Identification of Limitations Arising in Workflow
6.2 Mass Spectrometer Comparison
6.3 Assignment of Monoisotopic Masses
6.4 Complexity of Cross-linked Candidates Confirmed by Mass Spectrometry
6.5 Manual versus Software Identification of Calmodulin-Melittin Cross-links
6.6 Limitations in the MS/MS Analysis of Formaldehyde Cross-linked Ribonuclease-S
7 Conclusion and Future Outlook
7.1 Clarification of Formaldehyde Cross-linking Reaction Chemistry
7.2 Trypsin Digestion Efficiency of Formaldehyde Reaction Products
7.3 Establishing Tandem Mass Spectrometric Fragmentation Rules for Formaldehyde Cross-link Identification
7.4 The Structural Characterization of Ca2+-free calmodulin-melittin via Comprehensive Cross-linking
7.5 Revealing Complexity of Cross-linking Reaction Mixtures
7.6 Comparing Software Versus Manual Cross-link Identification
7.7 Moving toward Cellular, in vivo Systems
Bibliography
Appendices
A First Appendix
A.1 Confirming the Calcium-Free Calmodulin-Melittin System
A.2 Data Analysis Codes
A.3 Bruker Impact II Tandem Mass Spectrometric Analysis Method Details
A.4 Calmodulin-Melittin Cross-linked Candidates From First-Stage Mass Spectrometry
A.5 Ribonuclease S Cross-linked Candidates

List of Tables

2.1 Maximum cross-linking distances for every combination of possible reactive sites for each cross-linker are listed
3.1 A list of MS/MS confirmed peptides in the control calmodulin-melittin sample. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right.
3.2 A list of MS/MS confirmed calmodulin-melittin peptides in PFA treated sample without PFA modifications. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right.
3.3 A list of MS/MS confirmed calmodulin-melittin peptides in PFA treated sample with PFA modifications. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right. (+12) and (+30) denote a Schiff Base/Intrapeptide cross-link and methylol, respectively, localized on the residue before it.
3.4 Cross-linking and modification sites in calmodulin and melittin for each cross-linker
3.5 List of S-peptide, S-protein and RNaseA peptides in the control RNaseS sample. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right.
3.6 List of S-peptide, S-protein and RNaseA peptides in the PFA treated RNaseS sample. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right. (+12) and (+30) denote a Schiff Base/Intrapeptide cross-link and methylol, respectively, localized on the residue before it.
3.7 Cross-linking and modification sites in RNaseS for each cross-linker.
4.1 EDC Calmodulin-Calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.
4.2 EDC calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Reactive residues/possible cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.
4.3 sulfoDST calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.
4.4 BS3+ calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.
4.5 BS3+ calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.
4.6 SulfoEGS calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.
4.7 SulfoEGS calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.
4.8 PFA calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red.
4.9 PFA calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.
5.1 Percent abundances of cleaved trypsin cleavage sites observed in the control and PFA treated (in 14-19, 19-33 and > 33 kDa proteins) calmodulin-melittin samples
5.2 Percent abundances of cleaved trypsin cleavage sites observed in the control and PFA treated (> 12 kDa proteins) RNaseS samples
5.3 The percent abundance of PFA cross-linking sites in the unmodified, modified and cross-linked forms in PFA treated calmodulin-melittin; Note: additional decimal places are reported to clarify that values are > 0% or < 100%, as described in section 2.7
5.4 The calculated equilibrium constants for each cross-linking reaction step for each identified PFA cross-linking site in PFA treated calmodulin-melittin
5.5 The percent abundance of PFA modification sites in the unmodified and modified forms in PFA treated calmodulin-melittin; Note: additional decimal places are reported to clarify that values are > 0% or < 100%, as described in section 2.7
5.6 The calculated equilibrium constants for the modification reaction for each identified PFA modification site in PFA treated calmodulin-melittin
5.7 The percent abundance of PFA modification sites in the unmodified and modified forms and the calculated equilibrium constants for the modification reaction for each identified PFA modification site in PFA treated RNaseS
5.8 Relative abundance of calmodulin-calmodulin and calmodulin-melittin cross-linked peptides
5.9 Relative abundance of calmodulin-melittin interpeptide cross-links supporting the parallel and antiparallel binding orientation
5.10 The maximum distances between all MS/MS identified cross-linking sites and the respective binding orientation it supports for each cross-linker; Parallel orientations are shaded in white and antiparallel orientations are shaded in blue.
6.1 The number of identified calmodulin-melittin cross-linked candidates identified via MS and confirmed via MS/MS using the QStar and Impact II
6.2 Mascot MS/MS search results for the Impact II analyzed calmodulin sample with the highest scoring match for each peptide identified listed. The sequence position (starting and ending residue), observed m/z, experimental monoisotopic mass, theoretical monoisotopic mass, mass accuracy, number of missed cleavages, Mascot score, and sequence (trypsin cleavage site displayed in the beginning and end of the sequence as “R.” or “K.”) are listed left to right.
6.3 Mascot MS/MS search results for the QStar analyzed calmodulin sample with the highest scoring match for each peptide identified listed. The sequence position (starting and ending residue), observed m/z, experimental monoisotopic mass, theoretical monoisotopic mass, mass accuracy, number of missed cleavages, Mascot score, and sequence (trypsin cleavage site displayed in the beginning and end of the sequence as “R.” or “K.”) are listed left to right.
6.4 The percent of the total number of MS candidate cross-linked species with incorrectly assigned monoisotopic peaks by the software for each cross-linker
6.5 The percent of the total number of MS confirmed candidate cross-linked masses that correspond to modified peptides, undetermined species, species with insufficient MS/MS and cross-linked species for each cross-linker.
A.1 MS Candidate Cross-linked Species for EDC
A.2 MS Candidate Cross-linked Species for PFA
A.3 MS Candidate Cross-linked Species for PFA
A.4 MS Candidate Cross-linked Species for PFA
A.5 MS Candidate Cross-linked Species for sulfoDST
A.6 MS Candidate Cross-linked Species for BS3
A.7 MS Candidate Cross-linked Species for sulfoEGS
A.8 MS Candidate Cross-linked Species for sulfoEGS
A.9 Candidate Cross-linked Species for EDC
A.10 Candidate Cross-linked Species for PFA
A.11 Candidate Cross-linked Species for PFA
A.12 Candidate Cross-linked Species for sulfoDST
A.13 Candidate Cross-linked Species for BS3
A.14 Candidate Cross-linked Species for BS3
A.15 Candidate Cross-linked Species for sulfoEGS
A.16 Candidate Cross-linked Species for sulfoEGS

List of Figures

1.1 A summary of the bottom-up proteomic strategy, in which proteins are purified, separated via SDS-PAGE and enzymatically digested into peptides, which are separated through HPLC and eluted into the mass spectrometer. The schematic diagram of a Bruker Impact II QqTOF mass spectrometer is shown on the top right (adapted from reference [14] with permission, https://creativecommons.org/licenses/by/4.0/). At each time point an MS spectrum of the ions eluted is plotted.
For example at time = T1, a species with an m/z of 301.00 (z = 2) eluted and the signal appears as an isotopic cluster such that the charge is the inverse of the difference between each isotopic peak, i.e. 1 divided by 0.5. The MS/MS of the precursor ion produces fragment ion signals (b1, b2, ..., y1, y2, ...) which are typically the result of the cleavage of peptide bonds in CID. MS/MS spectra and MS spectra are matched to theoretical databases for identifying and sequencing peptides.
1.2 In general, cross-linkers introduce covalent bonds in proteins via a two-step reaction: modification of protein site 1 and cross-link formation between protein sites 1 and 2. Upon enzymatic digestion, a complex mixture of different types of peptides is produced.
1.3 In the sulfoDST cross-linking reaction scheme, protein site 1 (mass = m1) reacts with the sulfoDST to form a modification which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red). A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.
1.4 In the BS3 cross-linking reaction scheme, protein site 1 (mass = m1) reacts with the BS3 to form a modification which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red). A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.
1.5 In the sulfoEGS cross-linking reaction scheme, protein site 1 (mass = m1) reacts with the sulfoEGS to form a modification which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red).
A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.
1.6 In the EDC/sulfoNHS cross-linking reaction scheme, protein site 1 (mass = m1) reacts with the cross-linker. The intermediate can react with sulfoNHS to form an amine reactive sulfoNHS ester, which forms a cross-link with protein site 2 (mass = m2). The intermediate can also produce a stable N-acylisourea (m1 + 155 Da). Cross-linker bridges are highlighted in red. R1 and R2 represent two protein sites and specific reactive amino acids are defined in dashed line boxes. The mass of each species is provided in solid line boxes.
1.7 PFA reacts with protein site 1 (mass = m1), which forms a methylol modification (mass = m1 + 30 Da). This can dehydrate into a Schiff Base (mass = m1 + 12 Da), which can continue to react with protein site 2 (mass = m2) to form a methylene bridge. This produces a cross-linked species (mass = m1 + 12 Da + m2). Cross-linker bridges are highlighted in red. R1 and R2 represent two protein sites and specific reactive amino acids are defined in dashed line boxes. The mass of each species is provided in solid line boxes.
1.8 The PFA modification reaction generic mechanism (a) and for each reactive residue (b) (Adapted from reference [37], with permission)
1.9 The PFA cross-linking reaction generic mechanism (a) and for each reactive residue (b) (Adapted from reference [37], with permission)
1.10 (a) Cross-linked peptides (denoted as I and II) can fragment exclusively at the cross-linker (type 1), exclusively at the peptide backbone (type 2) and at both the cross-linker and peptide backbone (type 3). (b) Diagnostic ions/specific fragmentation for NHS Esters (c) Specific fragment ions produced from sulfoEGS fragmentations.
(a-c) Cross-linker bridges and cross-linked lysines from each protein site are highlighted in red.
1.11 (a) Amino acid sequence of the calmodulin N-terminal (teal) and C-terminal (blue) domains connected by a flexible linker (black) and of melittin (purple); All possible trypsin cleavage sites are denoted with red vertical bars. (b) Calmodulin binds to Ca2+, which induces the formation of a dumbbell-shaped conformation; Upon binding to melittin, a similar conformational change occurs for both Ca2+-saturated and Ca2+-free calmodulin. Melittin competitively binds to calmodulin, inhibiting calmodulin’s activity.
1.12 (a) Amino acid sequences of RNaseS components S-protein (blue) and the S-peptide (green). The S-protein contains 8 cysteines, which were reduced and alkylated in this model system. The S-peptide to S-protein binding sites are underlined. (b) The crystal structure (1RNU) [108] of RNaseS is shown with the S-protein (blue) to S-peptide (green) binding site highlighted in red.
2.1 Data Analysis workflow for cross-link identification, where items in parentheses are values that have been eliminated. All elimination and matching of monoisotopic masses using Mathematica and Excel was performed using a mass accuracy of + 0.2 Da.
2.2 PFA cross-linking equilibrium reaction steps, where K1, K2, and K3 are the respective equilibrium constants for the formation of a methylol, Schiff Base and methylene bridge, respectively. The notation for each reactant and product is defined. R1 and R2 represent protein sites 1 and 2, respectively.
2.3 The PFA modification equilibrium reaction step defines K1+2 in the case where a methylol modification was not identified. R1 and R2 represent protein sites 1 and 2, respectively.
3.1 SDS-PAGE of calmodulin-melittin cross-linking reaction mixtures with the protein marker (lane 1); control samples with EDC buffer conditions (lane 2), and control samples with all other cross-linker buffer conditions (lane 3); cross-linked samples EDC/sulfoNHS (lane 4), PFA (lane 5), sulfoDST (lane 6), BS3 (lane 7) and sulfoEGS (lane 8); Four approximate molecular weight categories of each protein/protein complex band are labelled with the type of crosslinking (if any) indicated.
3.2 SDS-PAGE of RNaseS cross-linking reaction mixtures with protein marker (lane 1); control samples with EDC buffer conditions (lane 2), and control samples with all other cross-linker buffer conditions (lane 3); cross-linked samples EDC/sulfoNHS (lane 4), PFA (lane 5), sulfoDST (lane 6), BS3 (lane 7) and sulfoEGS (lane 8); Four approximate molecular weight categories of each protein/protein complex band are labelled with the type of crosslinking (if any) indicated.
3.3 Literature study that compares the SDS-PAGE of calmodulin in the presence of EDTA, without Ca2+ (lanes 5 and 6) and in the presence of Ca2+, without EDTA (lanes 2 and 3). Lanes 1 and 4 are protein markers. The amounts of calmodulin used were 6 µg (lanes 2 and 5) and 12 µg (lanes 3 and 6). The concentrations of CaCl2 and EDTA were both 5 mM (adapted from reference [139], with permission).
3.4 Relative yield of cross-linked species (blue) versus non-crosslinked species (red) in the Calmodulin-Melittin complex measured via SDS-PAGE
3.5 Relative yield of cross-linked species (blue) versus non-crosslinked species (red) in the RNaseS complex measured via SDS-PAGE
3.6 In the dotted boxes, the structures of proteins with a lysine side chain modified four ways as indicated with their respective pKa values are shown. Cross-linked bridges are highlighted in red.
3.7 Trypsin’s catalytic triad consists of aspartic acid (D102), histidine (H57) and serine (S195). Aspartic acid and histidine increase the nucleophilicity of serine, which attacks the partially positive carbonyl carbon of the protein. The positively charged amino group on lysine increases the electrophilicity of the carbonyl carbon. The peptide bond is cleaved and the trypsin catalyst is regenerated. (adapted from reference [153])
4.1 The nomenclature used to annotate MS/MS spectra of cross-linked species for each type of fragment ion
4.2 Interpeptide EDC calmodulin cross-link at m/z 1100.54 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.3 Interpeptide EDC calmodulin cross-link at m/z 716.96 (z = 5) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.4 Interpeptide EDC calmodulin cross-link at m/z 875.76 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.5 Interpeptide EDC calmodulin-melittin cross-link m/z 621.58 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.6 Interpeptide EDC calmodulin-melittin cross-link m/z 616.66 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom) are shown
4.7 Interpeptide EDC calmodulin-melittin cross-link m/z 1101.26 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.8 Interpeptide sulfoDST calmodulin-melittin cross-link m/z 842.77 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.9 Interpeptide sulfoDST calmodulin-melittin cross-link m/z 493.92 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only.
4.10 Interpeptide BS3+ calmodulin cross-link m/z 1085.55 (z = 2) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.11 Interpeptide BS3+ calmodulin cross-link m/z 1079.51 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.12 Interpeptide BS3+ calmodulin cross-link m/z 588.96 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.13 Interpeptide BS3+ calmodulin-melittin cross-link m/z 850.46 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.14 Interpeptide BS3+ calmodulin-melittin cross-link m/z 732.39 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
4.15 Interpeptide BS3+ calmodulin-melittin cross-link m/z 699.36 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.
1234.16 Interpeptide BS3+ calmodulin-melittin cross-link m/z 733.35 (z= 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 1244.17 Interpeptide BS3+ calmodulin-melittin cross-link m/z 481.00 (z= 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 1254.18 Interpeptide BS3+ calmodulin-melittin cross-link m/z 520.02 (z= 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 1264.19 Interpeptide BS3+ calmodulin-melittin cross-link m/z 448.97 (z= 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 1274.20 Interpeptide sulfoEGS calmodulin cross-link m/z 956.96 (z = 4)proposed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. . . . . 1304.21 Interpeptide sulfoEGS calmodulin cross-link m/z 934.46 (z = 4)proposed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. MS/MSspectra is annotated such that I = II. . . . . . . . . . . . . . . . . 1314.22 Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 571.54(z = 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 1334.23 Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 635.81(z = 4) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 134xxiList of Figures4.24 Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 791.72(z = 3) proposed structures with fragment ion evidence (top) andMS/MS spectra (bottom); Cross-linker bridges are indicated in red. 
1354.25 Interpeptide PFA calmodulin cross-link m/z 666.33 (z = 3) pro-posed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. Note:Fragmentation indicated on the backbone of the peptide corre-sponds to type 3 ions only . . . . . . . . . . . . . . . . . . . . . 1404.26 Interpeptide PFA calmodulin cross-link m/z 736.59 (z = 4) pro-posed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. Note:Fragmentation indicated on the backbone of the peptide corre-sponds to type 3 ions only. . . . . . . . . . . . . . . . . . . . . . 1414.27 Interpeptide PFA calmodulin cross-link m/z 1192.23 (z = 3) pro-posed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. . . . . 1424.28 Degree of Modification: Bar graph depicting the DOM of each bion and y ion for 1GIGAVLK7 in cross-linked species m/z 744.73(z = 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1444.29 Degree of Modification: Bar graph depicting the DOM of each bion and y ion for 1GIGAVLK7 in cross-linked species m/z 484.46(z = 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1454.30 Degree of Modification: Bar graph depicting the DOM of each bion and y ion for 1GIGAVLK7 in cross-linked species m/z 588.52(z = 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1464.31 Interpeptide PFA calmodulin-melittin cross-link m/z 744.73 (z = 3)proposed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. Note:Fragmentation indicated on the backbone of the peptide corre-sponds to type 3 ions only . . . . . . . . . . . . . . . . . . . . . 
148xxiiList of Figures4.32 Interpeptide PFA calmodulin-melittin cross-link m/z 484.46 (z = 5)proposed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. Note:Fragmentation indicated on the backbone of the peptide corre-sponds to type 3 ions only . . . . . . . . . . . . . . . . . . . . . 1494.33 Interpeptide PFA calmodulin-melittin cross-link m/z 588.52 (z = 5)proposed structures with fragment ion evidence (top) and MS/MSspectra (bottom); Cross-linker bridges are indicated in red. Note:Fragmentation indicated on the backbone of the peptide corre-sponds to type 3 ions only . . . . . . . . . . . . . . . . . . . . . 1504.34 Interpeptide PFA calmodulin-melittin cross-link m/z 730.65 (z =4) proposed structures with fragment ion evidence; Cross-linkerbridges are indicated in red. . . . . . . . . . . . . . . . . . . . . 1514.35 Interpeptide PFA calmodulin-melittin cross-link m/z 730.65 (z = 4)MS/MS spectra; Cross-linker bridges are indicated in red. . . . . 1524.36 Reaction mechanisms of the PFA modification (i) and cross-linkingformation (ii) of melittin R24 to calmodulin R126 (a), calmodulinK77 to Q3 (b), melittin G1 to calmodulin Y99 (c) and melittin G1to calmodulin Q8 (d); Reactive regions are highlighted in red. R1and R2, and R3and R4, represent arbiturary sections of the modifiedand cross-linked proteins, respectively. . . . . . . . . . . . . . . . 1555.1 Identified cross-links were mapped on the Ca2+ -free unbound (a)and bound-state (b) calmodulin conformation. Orange and greylines represent inter-residue distances that do and do not agree withmaximum cross-linker distances, respectively. Cross-linking sitesare highlighted in red and calmodulin C and N terminal domainsare colored in blue and teal, respectively. . . . . . . . . . . . . . . 
182xxiiiList of Figures5.2 Calmodulin-melittin binding structures (two views of the samestructure) proposed by cross-linking distance constraints that sup-ported (a) parallel (yellow lines) and (b) antiparallel (orange lines)binding;. Orange/yellow and grey lines represent inter-residue dis-tances that do and do not agree with maximum cross-linker dis-tances, respectively. Cross-linking sites are highlighted in red,melittin is shown in purple, calmodulin C and N terminal domainsare shown in blue and teal, respectively. . . . . . . . . . . . . . . 1865.3 Calmodulin-melittin binding structures proposed by EDC (a) andPFA (b) distance constraints; W19 on melittin is highlighted inyellow. Orange and yellow lines represent inter-residue distancesthat support antiparallel and parallel binding, respectively. Cross-linking sites are highlighted in red, melittin is shown in purple,calmodulin C and N terminal domains are shown in blue and teal,respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1876.1 QStar acquired MS (middle) and MS/MS (bottom) spectrum of in-terpeptide BS3+ calmodulin-melittin cross-link m/z 896.93 (z =2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1936.2 Impact II acquired MS(middle) and MS/MS (bottom) spectrum ofinterpeptide BS3+ calmodulin-melittin cross-link m/z 896.93 (z =2); Proposed structure with the sequence fragment ion evidenceindicated on the backbone of the peptide that corresponds to type2 ions only (top). . . . . . . . . . . . . . . . . . . . . . . . . . . 1946.3 QStar acquired MS (middle) and MS/MS (bottom) of interpeptidePFA calmodulin cross-link m/z 603.24 (z = 3); Proposed structurewith the sequence fragment ion evidence indicated on the backboneof the peptide that corresponds to type 3 ions only (top). . . . . . 
1966.4 Impact II acquired MS (middle) and MS/MS (bottom) of inter-peptide PFA calmodulin cross-link m/z 666.33 (z = 3); Proposedstructure with the sequence fragment ion evidence indicated on thebackbone of the peptide that corresponds to type 3 ions only (top). 198xxivList of Figures6.5 (a) Monoisotopic peak assigned for MS Signal m/z 539.07 (z =5) by DeconMSn, SNAP, SumPeak and Apex peak picking meth-ods as indicated by the blue, red, purple and teal arrows, respec-tively. (b) Summary of peak picking methods for a PFA treatedcalmodulin-melittin sample . . . . . . . . . . . . . . . . . . . . . 2036.6 (a) Venn Diagram of MS/MS verified EDC cross-linked speciesidentified by each software (StavroX and pLink) and manualmethod, two methods (two region overlap), all methods (center re-gion overlap); (b) The calculated monoisotopic mass, cross-linkpeptide sequence (cross-linked residues highlighted in red), m/z,experimental monoisotopic mass, and mass accuracy for speciesidentified by StavroX and pLink. . . . . . . . . . . . . . . . . . 2116.7 An example of a EDC cross-linked species (m/z =580.84, z = 2)and its annotated MS/MS spectrum from StavroX . . . . . . . . . 2126.8 (a) Venn Diagram of MS/MS verified sulfoDST cross-linkedspecies identified by each software (StavroX and pLink) and man-ual method, two methods (two region overlap), all methods (centerregion overlap); (b) The calculated monoisotopic mass, cross-linkpeptide sequence (cross-linked residues highlighted in red), m/z,experimental monoisotopic mass, and mass accuracy for speciesidentified by StavroX and pLink. Cross-links that agreed with themanual detection are highlighted in purple. . . . . . . . . . . . . 2146.9 An example of a sulfoDST cross-linked species (m/z = 567.53, z =4) and its annotated MS/MS spectrum from StavroX . . . . . . . 
2156.10 (a) Venn Diagram of MS/MS verified BS3 cross-linked speciesidentified by each software (StavroX and pLink) and manualmethod, two methods (two region overlap), all methods (center re-gion overlap); (b) The calculated monoisotopic mass, cross-linkpeptide sequence (cross-linked residues highlighted in red), m/z,experimental monoisotopic mass, and mass accuracy for speciesidentified by StavroX and pLink. Cross-links that agreed with themanual detection are highlighted in purple. . . . . . . . . . . . . 218xxvList of Figures6.11 An example of a BS3 cross-linked species (m/z = 890.75, z = 4) andits annotated MS/MS spectrum from StavroX; In the cross-linkedstructure shown on top, the “m” in the peptide sequence refers toM(ox), i.e. an oxidized M residue. . . . . . . . . . . . . . . . . . 2196.12 (a) Venn Diagram of MS/MS verified sulfoEGS cross-linkedspecies identified by each software (StavroX and pLink) and man-ual method, two methods (two region overlap), all methods (centerregion overlap); (b) The calculated monoisotopic mass, cross-linkpeptide sequence (cross-linked residues highlighted in red), m/z,experimental monoisotopic mass, and mass accuracy for speciesidentified by StavroX and pLink. Cross-links that agreed with themanual detection are highlighted in purple. . . . . . . . . . . . . 2216.13 An example of a sulfoEGS cross-linked species (m/z = 936.46, z =4) and its annotated MS/MS spectrum from StavroX . . . . . . . 2226.14 (a) Venn Diagram of MS/MS verified PFA cross-linked speciesidentified by each software (MeroX/StavroX and pLink) and man-ual method, two methods (two region overlap), all methods (centerregion overlap); (b) The calculated monoisotopic mass, cross-linkpeptide sequence (cross-linked residues highlighted in red), m/z,experimental monoisotopic mass, and mass accuracy for speciesidentified by MeroX/StavroX and pLink. Cross-links that agreedwith the manual detection are highlighted in purple. . . . . . . . 
2276.15 The proposed structure of PFA cross-linked species (m/z = 701.01,z = 6) with insufficient MS/MS evidence from StavroX; In thecross-linked structure sequence, R(+12), K(+12), and M(ox)are de-noted as “&”, “$” and “m”, respectively. . . . . . . . . . . . . . 2286.16 The MS/MS spectra of PFA cross-linked species (m/z = 701.01, z= 6) with insufficient MS/MS evidence from StavroX; In the cross-linked structure sequence, R(+12), K(+12), and M(ox)are denoted as“&”, “$” and “m”, respectively. . . . . . . . . . . . . . . . . . . 229xxviList of Figures6.17 An example of a confirmed PFA cross-linked species (m/z =606.58, z = 4) and its annotated MS/MS spectrum from MeroX. Inthe dotted box, the fragment ion evidence of the same cross-linkedstructure (m/z = 484.46, z = 5) . . . . . . . . . . . . . . . . . . . 2306.18 An example of a confirmed PFA cross-linked species (m/z =588.52, z = 5) and its annotated MS/MS spectrum from MeroX;In the dotted box, the fragment ion evidence of the same cross-linked structure (m/z = 588.52, z = 5) is shown. In the cross-linkedstructure sequence, R(+12)is denoted as “&”. . . . . . . . . . . . . 231A.1 Excel Code for Elimination . . . . . . . . . . . . . . . . . . . . . 270A.2 Mathematica Code for Possible cross-linked Species . . . . . . . 271A.3 Mathematica Code for Candidate Cross-linked Species . . . . . . 272A.4 Collision Energy Table for Bruker Impact II LC-MS/MS . . . . . 
List of Abbreviations

General Abbreviations

PFA         formaldehyde
NMR         nuclear magnetic resonance
SDS-PAGE    sodium dodecyl sulfate polyacrylamide gel electrophoresis
HPLC or LC  reversed-phase high performance liquid chromatography
MS          mass spectrometry
MS/MS       tandem mass spectrometry
MSn         multistage mass spectrometry
CID         collision-induced dissociation
ECD         electron capture dissociation
ETD         electron transfer dissociation
ESI         nanoelectrospray ionization
MALDI       matrix-assisted laser desorption/ionization
QqTOF       quadrupole time of flight mass spectrometers
FT-ICR      Fourier transform ion cyclotron resonance
TOF         time of flight mass spectrometer
QStar       ABI QStar XL quadrupole time of flight mass spectrometer
Impact II   Bruker Impact II quadrupole time of flight mass spectrometer
m/z         mass to charge ratio
[M]         monoisotopic mass
EDTA        ethylenediaminetetraacetic acid
EDC         1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride
sulfoNHS    N-hydroxysulfosuccinimide
sulfoDST    disulfosuccinimidyl tartrate
BS3         bis(sulfosuccinimidyl) suberate
sulfoEGS    ethylene glycol bis(sulfosuccinimidyl succinate)
RNaseA      Ribonuclease A
RNaseS      Ribonuclease S
SNAP        sophisticated numerical annotation procedure
S/N         signal to noise ratio
DOM         degree of modification
SCX         strong cation exchange
SEC         size exclusion chromatography
IMS         ion mobility spectrometry

Amino Acids [1]

Name            Letter Symbol   Molecular Weight   pKa (Side chain)
Phenylalanine   F               165.19             -
Tryptophan      W               204.23             -
Tyrosine        Y               181.19             10.10
Alanine         A               89.09              -
Isoleucine      I               131.17             -
Leucine         L               131.17             -
Valine          V               117.15             -
Glycine         G               75.07              -
Proline         P               115.13             -
Asparagine      N               132.12             -
Cysteine        C               121.16             8.14
Glutamine       Q               146.15             -
Methionine      M               149.21             -
Serine          S               105.09             -
Threonine       T               119.12             -
Aspartic Acid   D               133.10             3.71
Glutamic Acid   E               147.13             4.15
Arginine        R               174.20             12.10
Histidine       H               155.16             6.04
Lysine          K               146.19             10.67

Typical Protein Modifications

Symbol     Modification                Mass Shift (Da)
(ac)       Acetylation                 +42.01
(ox)       Oxidation                   +15.99
(am)       Amidation                   -0.98
(dm)       Deamidation                 +0.98
(tm)       Trimethylation              +42.06
(cm)       Carbamidomethylation        +57.02
-H2O       Water loss                  -18.02
-NH3       Ammonia loss                -17.02
-CH3SOH    Methanesulfenic acid loss   -64.11

Acknowledgements

I would like to thank my supervisor, Professor Juergen Kast, for providing me with this project. I really appreciate how accommodating and flexible Dr. Kast has been throughout my Ph.D. education, allowing me to think independently and boldly as a researcher. I would like to acknowledge my supervisory committee for always being very encouraging, supportive and extremely understanding. I am indebted to the University of British Columbia for awarding me a Four-Year Doctoral Fellowship, an International Partial Tuition Award and the Gladys Estella Laird Research Fellowship to fund my Ph.D. Numerous individuals have gone out of their way to guide me. Dr. Nikolay Stoynov not only patiently provided me with the bulk of the mass spectrometric data that pushed my Ph.D. forward, but always tried his very best to accommodate my many requests, even if it meant transferring large data files remotely. He also spent numerous days setting up and running strong cation exchange and size exclusion chromatography experiments for me solely out of kindness. I also cannot thank Jason Rogalski enough for always going out of his way to accommodate sample requests for mass spectrometric analysis, no matter how challenging. Jason also helped meticulously proofread my proposal and other writing early in my Ph.D.
Finally, whether a part of our lab or not, he has always provided me with support, such as taking the time to answer my numerous questions, and his encouragement has really helped me to reach the final stages of my Ph.D. I would also like to thank Shujun Lin, who has provided me with training in mass spectrometry instrumentation and sample preparation and has run numerous samples for strong cation exchange chromatography and mass spectrometric analysis. Another individual that I am extremely grateful to is Ru Li, a fellow Ph.D. student in our lab. She taught me skills in running SDS-PAGE, sample preparation and more that have been vital for moving my project forward. Her hardworking and dedicated, yet extremely humble and patient nature has always inspired me. Having her support through the several obstacles faced throughout the Ph.D. has truly helped me. Several past lab members have also made an impact in the pursuit of my Ph.D. A fellow lab member, Xuan Ding, showed me the basics of the lab techniques and protocols for the experiment and helped me transition into taking over the project. Jane O’Hara and Cordula Klockenbusch, former postdoctoral fellows, helped me proofread and revise various written materials throughout the early years of my Ph.D. Another fellow lab member, Davin Carter, always enthusiastically helped me revise and format presentations and written materials for conferences. Iris Egner, a former postdoctoral fellow, patiently helped me refine my thesis outline, offered advice for presentations, and taught me several crucial strategies for improving SDS-PAGE. Finally, I would like to thank the rest of my fellow lab members, past and present, for their support and encouragement. My interest in chemistry was initiated by my high school chemistry teacher, Mr. Timothy Lattanzio, who provided an unforgettable foundation in chemistry that I have deeply relied on throughout my undergraduate and graduate schooling.
I also will never forget the kindness of my physical chemistry professor, Professor Geoffrey Hutchison, who not only gifted his students with a strong foundation in chemistry but went out of his way to direct me to my undergraduate research supervisor and supported my graduate school ambitions with reference letters and guidance. My undergraduate research supervisor, Professor Seth Horne, was one of the main reasons I was able to pursue a Ph.D. He not only meticulously provided me with my initial training in chemistry research, but guided and supported me through the transition from my final undergraduate year to the Ph.D. by providing reference letters, advice and encouragement. Finally, I only hope that one day I can do even a little bit of justice to the God-sent family that I am extremely blessed to have. My parents’ and my sister’s undying positive energy, faith and meticulous guidance fueled my Ph.D. Their strong passion for science and their experiences completing Ph.D.s in physics and chemistry inspired me to embark on my own Ph.D. journey. This Ph.D. education has not only taught me skills to study a specialized field but has provided me with a new perspective and training for life.

Dedication

To My Parents and Sister, Vanita

Chapter 1

Introduction¹

1.1 Structural Proteomics

Proteins are dynamic entities, constantly altering their structures based on their functions inside and outside the cell. The fundamental aim of structural proteomics is to define three-dimensional protein structures. A protein's conformation and reactivity are dictated by its sequence. Protein function is governed by the interactions of protein structures in stable multi-subunit complexes as well as in more transient assemblies. Such interactions are responsible for carrying out a multitude of tasks in cells, from binding to small molecules during storage, transport and cellular signaling to serving as molecular switches, structural supports and catalysts [3].
Irregularities in protein structure at any level, caused by amino acid mutations, denaturation, aggregation, or non-specific binding, can indicate the presence of disease [4–6]. A large collection of high resolution three-dimensional protein structures has been solved by X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy, many of which are housed in the Protein Data Bank (PDB). No other technique has been able to offer the same precision and detail of protein structural determination as X-ray crystallography. However, X-ray crystallography is limited to proteins for which crystals can be obtained. At body temperature, there is sufficient energy to cause protein motions that cannot be measured by X-ray crystallography, which represents most protein structures as static, average structures. A major advantage of NMR is that it can measure protein motions that occur down to the ps range and provide information regarding protein kinetics, although it requires a large amount of protein. Nonetheless, these methods are unsuitable for characterizing protein complexes or proteins with low solubility, large size, or reduced stability [7–10]. MS has an advantage over X-ray crystallography and NMR in its attomolar sensitivity, rapid measurement and ability to retain the physiological conditions of proteins, even though it cannot match the high resolution structural data of these techniques [11]. Therefore, combining the high resolution capabilities of X-ray crystallography and NMR with the sensitivity, speed and gentle nature of MS can serve as a powerful tool to examine proteins and protein complexes.

¹ The Introduction chapter was adapted from the author’s previous publication [2].

1.2 Mass Spectrometry in Structural Proteomics

MS measures the mass to charge ratio (m/z) of an ionized molecular species in a sample.
The analyte is ionized by an ionization source, and the resulting ions are accelerated through an electric field and separated according to their m/z in a mass analyzer. Finally, the intensity of each m/z is recorded by a detector. The mass spectrum is a plot of intensity versus m/z. MS provides the means to study the structure of proteins as dynamic entities, comparable to what is observed in solution in a physiological environment, making it the primary technology implemented in structural proteomics.

Bottom-up and top-down methodologies, based on the MS analysis of peptide digests and intact proteins, respectively, have both been employed, although bottom-up is more widely implemented [12].

In a typical bottom-up proteomics experiment (see Figure 1.1), proteins are first isolated from cell lysates or biological species. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) is routinely utilized for separating complex protein mixtures. SDS unfolds proteins and binds to them, adding negatively charged sulfate groups in approximate proportion to each protein's molecular weight. Samples are loaded into a polyacrylamide gel matrix and a high voltage is applied. The protein molecules migrate a distance inversely related to their size [13].
[Figure 1.1 panel labels: Purification, SDS-PAGE, Enzymatic Digestion, HPLC, QqTOF Mass Spectrometer, Database Search/Identification. Plots: ion chromatogram (intensity vs. retention time, min); MS at T1 and MS/MS of m/z 301.00 (2+) (intensity vs. m/z, with b2-b4 and y1-y4 fragment ions); [M] = 599.98 Da.]

Figure 1.1: A summary of the bottom-up proteomic strategy, in which proteins are purified, separated via SDS-PAGE and enzymatically digested into peptides, which are separated through HPLC and eluted into the mass spectrometer. The schematic diagram of a Bruker Impact II QqTOF mass spectrometer is shown on the top right (adapted from reference [14] with permission, https://creativecommons.org/licenses/by/4.0/). At each time point, an MS spectrum of the eluted ions is plotted. For example, at time = T1, a species with an m/z of 301.00 (z = 2) eluted and its signal appears as an isotopic cluster, such that the charge is the inverse of the spacing between adjacent isotopic peaks, i.e. 1 divided by 0.5. MS/MS of the precursor ion produces fragment ion signals (b1, b2, ..., y1, y2, ...), which are typically the result of the cleavage of peptide bonds in CID. MS/MS spectra and MS spectra are matched to theoretical databases for identifying and sequencing peptides.

Trypsin is a common choice of enzyme due to its high specificity: it cleaves mainly at the carboxyl side of lysine or arginine residues (except when either is followed by a proline), which are abundant in proteins.
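The trypsin cleavage rule stated above can be illustrated with a minimal in silico digest. This is only a sketch under simplifying assumptions: no missed cleavages are modelled, the function name tryptic_digest is hypothetical, and the melittin sequence used as input is the commonly quoted 26-residue sequence, which is not given in this section.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after every K or R, unless the next residue is P.

    Simplified model: complete digestion, no missed cleavages.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever remains after the final cleavage site is the C-terminal peptide.
        peptides.append(sequence[start:])
    return peptides


# Assumed melittin sequence (consistent with G1, K7, W19 and R24 cited in the figure list).
MELITTIN = "GIGAVLKVLTTGLPALISWIKRKRQQ"
print(tryptic_digest(MELITTIN))
# ['GIGAVLK', 'VLTTGLPALISWIK', 'R', 'K', 'R', 'QQ']
```

The first peptide, GIGAVLK, is the N-terminal melittin peptide (residues 1 to 7) referred to in the list of figures. In practice, missed cleavages at clustered basic residues such as KRKR would likely produce longer peptides than this idealized digest suggests.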
Following the enzymatic digestion of proteins in gel, the resulting peptides are extracted and separated through reversed-phase high performance liquid chromatography (HPLC, also referred to as LC). Peptides elute in order of their hydrophobicity into the mass spectrometer, and the MS signals (m/z) of precursor ions are recorded for each elution time point. The number of ions (intensity) is plotted versus retention time (the time at which each component elutes from the LC column) to construct ion chromatograms. Soft ionization methods such as nanoelectrospray ionization (ESI) and matrix assisted laser desorption ionization (MALDI) can ionize biomolecules without producing significant damage. Ionized tryptic peptides are normally protonated at the terminal lysine or arginine residue and at the peptide N-terminus, giving peptide ions a +2 charge. Additional basic residues within the peptide, such as histidine or lysines or arginines at missed cleavage sites, may result in higher charge states. Peptide signals appear as an isotopic cluster, and the charge can be calculated as the inverse of the m/z spacing between adjacent isotopic peaks. The monoisotopic mass ([M]) of each species can be derived from the m/z and charge. For example, a doubly charged species at m/z 301.00 corresponds to [M+2H]2+ such that ([M] + 2 × 1.01 Da)/2 = 301.00, which gives [M] = 599.98 Da [15][16]. Therefore, a resolution that can clearly resolve isotopic patterns is crucial for determining accurate monoisotopic masses. This makes time of flight (TOF) mass spectrometers particularly favorable for such analyses with their high resolution capabilities (~100,000) [14]. In TOF-MS, ions are accelerated through a flight tube and their flight times, which are proportional to the square root of m/z, are measured. A reflectron corrects for differences in kinetic energy so that ions of equal m/z arrive at the detector at the same time.
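The charge-state and monoisotopic-mass arithmetic described above can be written out as a short sketch. The function names are hypothetical, and the proton mass is rounded to 1.01 Da to match the worked example in the text (a more precise value would be about 1.00728 Da).

```python
PROTON_MASS = 1.01  # Da; rounded value used in the worked example in the text


def charge_from_spacing(isotope_spacing):
    """Charge state = inverse of the m/z spacing between adjacent isotopic peaks."""
    return round(1.0 / isotope_spacing)


def monoisotopic_mass(mz, charge, proton_mass=PROTON_MASS):
    """Neutral monoisotopic mass [M] of an [M + zH]z+ ion observed at the given m/z."""
    return charge * mz - charge * proton_mass


# Isotopic peaks spaced 0.50 m/z apart indicate a doubly charged ion.
z = charge_from_spacing(0.50)
print(z, round(monoisotopic_mass(301.00, z), 2))  # 2 599.98
```

Rounding the inverse spacing to the nearest integer is a design choice that tolerates small measurement errors in the peak spacing.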
Fourier transform ion cyclotron resonance (FT-ICR) mass analyzers also provide high resolution measurements (~100,000). In FT-ICR MS, a magnetic field and an orthogonal oscillating electric field accelerate ions in a circular motion. The time-stable superconducting magnetic field allows for high mass accuracy. The frequency of the ions is measured to derive the m/z; this frequency is independent of ion speed, and all m/z values can be measured at the same time, allowing for high S/N [17]. Orbitrap mass analyzers operate similarly to FT-ICR, except that the absence of the magnetic field has allowed for increased mass resolution (~300,000). However, unlike in TOF MS, the Orbitrap’s resolving power is compromised for speed in the MS/MS acquisition mode [18, 19].

Out of all the precursor ion signals recorded, abundant signals, usually chosen according to user-defined parameters, are selected within a chosen isolation window of m/z values for a second stage of MS (tandem MS or MS/MS). In quadrupole time of flight mass spectrometers (QqTOF), ions are selected in the first quadrupole (Q1), subjected to MS/MS in the second quadrupole or collision cell (q2), and separated in the TOF based on m/z. Quadrupoles are composed of four rods, with DC and RF voltages applied across each pair of rods to produce a varying electric field. For a given field, only ions of specific m/z values have stable oscillations and are transmitted. In the collision cell, collision induced dissociation (CID) occurs, in which precursor ions collide with a stationary inert gas and break into smaller fragment ions [20]. When the energy deposited by collisions is sufficient to overcome a bond's dissociation energy, the bond breaks. The most commonly observed peptide CID fragmentation occurs at the peptide bond such that the charge can be retained on the N-terminal or C-terminal fragment of the peptide, producing b ions or y ions, respectively.
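The b and y ion series produced by cleavage at the peptide bond can be sketched as a simple calculation. The following is illustrative only: it assumes singly charged fragments of an unmodified linear peptide, uses standard monoisotopic residue masses (not the average molecular weights tabulated in the front matter), and the function names are hypothetical. The peptide GIGAVLK is the melittin N-terminal peptide cited in the list of figures.

```python
# Standard monoisotopic residue masses (Da); residue = amino acid minus water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da
PROTON = 1.00728   # Da


def b_ions(peptide):
    """Singly charged b1..b(n-1): N-terminal fragments (sum of residues + proton)."""
    ions, running = [], 0.0
    for residue in peptide[:-1]:
        running += RESIDUE_MASS[residue]
        ions.append(running + PROTON)
    return ions


def y_ions(peptide):
    """Singly charged y1..y(n-1): C-terminal fragments (sum of residues + water + proton)."""
    ions, running = [], WATER
    for residue in reversed(peptide[1:]):
        running += RESIDUE_MASS[residue]
        ions.append(running + PROTON)
    return ions


for i, (b, y) in enumerate(zip(b_ions("GIGAVLK"), y_ions("GIGAVLK")), start=1):
    print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

A useful sanity check on any such assignment is that complementary fragments sum to the precursor: b_i + y_(n-i) equals [M] plus two proton masses.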
Fragmentation occurring between the alpha carbon and carbonyl carbon, or between the nitrogen and alpha carbon, is less likely under CID. Additional fragmentation of fragment ions via multiple stages of MS/MS (MSn) can provide more precise details about sequences and modified amino acids. However, signal intensities decrease with every subsequent stage of MS, so MSn relies on strong precursor ion signals. In-source fragmentation techniques have been combined with triple quadrupole mass spectrometry for pseudo MS3 [21].

Other fragmentation mechanisms utilized include electron capture dissociation (ECD) and electron transfer dissociation (ETD). ECD was designed for Fourier transform instruments, whereas ETD was designed for quadrupole-based mass analyzers. ECD and ETD differ from CID in that fragmentation occurs at low energy, producing c and z ions from the fragmentation of the N-Cα bond. Advantages of ECD and ETD over CID include the preservation of post-translational modifications and the fragmentation of disulfide bonds. However, ECD and ETD lack efficiency when fragmenting ions with low charge states (+1, +2). Recent combinations of CID and ECD/ETD have proven to be a powerful tool for obtaining complementary MS/MS fragment evidence [22].

Peptides can be identified by matching their monoisotopic masses to MS signals and confirming their sequence by matching theoretical peptide fragment ions to MS/MS signals. Large databases of proteins and MS or MS/MS based software such as GPM [23], Mascot [24] and SEQUEST [25] are available for automatic protein/peptide identification.

1.3 Mass Spectrometry and Chemical Cross-linking

MS has fostered an immense growth in proteomics research. LC–MS coupled with affinity purification analyses identifies components of protein complexes and networks [26].
Cutting-edge instrumentation in MS/MS maps proteins and their modifications at the amino acid residue level for high resolution geometry [27]. Native ion mobility MS characterizes protein conformation by measuring its cross-sectional area [28, 29]. Imaging MALDI-MS can be used to investigate the spatial arrangement of protein structures in intact tissues. Approaches combining these MS technologies with chemical methods such as limited proteolysis, chemical surface modification and hydrogen–deuterium exchange monitor the solvent accessibility of regions of a protein to observe conformational changes [12, 30, 31]. The extent of enzymatic cleavage, modification, or deuterium exchange at particular amino acids directly correlates with the exposure of the respective region to the solvent, and can therefore be used to determine structural changes [30, 31]. Chemical cross-linkers form covalent bonds in proteins, which preserve their cellular context and introduce distance constraints to map their structure.

Cross-linking occurs in two steps: first, the modification reaction and second, the cross-linking reaction. Upon cross-linking and enzymatic digestion of a protein complex, a mixture of unmodified, modified, intra-cross-linked (cross-link of two residues within one peptide) and inter-cross-linked (cross-link of two residues on two different peptides) peptides is produced (see Figure 1.2).

[Figure 1.2 panel labels: Protein 1 + Protein 2 + cross-linker; modified Protein 1, Protein 2; Protein 1 cross-linked to Protein 2; enzymatic digestion; complex digest of peptides (unmodified, modified, intra-cross-link, inter-cross-link).]

Figure 1.2: In general, cross-linkers introduce covalent bonds in proteins via a two-step reaction: modification of protein site 1 and cross-link formation between protein sites 1 and 2.
Upon enzymatic digestion, a complex mixture of different types of peptides is produced.

Unmodified peptides can be used to identify the protein components participating in the quaternary interaction holding the complex together. Measuring distances between both intra- and inter-cross-linking sites on peptides can offer a low-resolution picture of secondary and tertiary protein structures and monitor changes under different conditions. Localizing the cross-linking sites to specific amino acids can depict precise geometries and reaction interfaces of proteins at the primary structural level. Modified peptides can be used to observe fluctuations in conformation via solvent accessibility by tracking the variation in the degree of modification as a function of external factors. The mass of a cross-linked species should equal the sum of the masses of each component peptide, the bridge, and any additional modification. Over 100 different cross-linkers are commercially available. Cross-linkers of various sizes, solubilities, lengths and reactivities may be suitable depending on the location of the complex, the reactive sites and intermolecular distances of the protein components, or the information desired, such as structural or mechanistic properties. For proteomic research, amine-reactive cross-linkers are most widely used due to the abundance of primary amines (primarily the N-terminus and lysines), which are also more solvent and reagent accessible than the hydrophobic sites buried within the protein structure. Non-specific cross-linking technologies such as photoactivatable cross-linkers, which upon light irradiation form linkages that are mostly independent of the type of amino acid, can be applied to various proteins regardless of amino acid composition [12].
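The mass relationship stated above (cross-linked species = peptide 1 + peptide 2 + bridge + any additional modification) can be illustrated with a minimal Python sketch. The bridge mass shifts are the nominal values quoted in the reaction schemes of this section (Figures 1.3 to 1.7); the function name, the dictionary name and the two peptide masses are hypothetical and serve only as an illustration, not as the analysis workflow used in this thesis.

```python
# Nominal mass shifts (Da) of each cross-link bridge, as quoted in the
# reaction schemes of this section (Figures 1.3 to 1.7).
BRIDGE_SHIFT_DA = {
    "sulfoDST": 114.0,
    "BS3": 138.1,
    "sulfoEGS": 226.1,
    "EDC": -18.02,  # zero-length: condensation, loss of water
    "PFA": 12.0,    # methylene bridge
}


def cross_linked_mass(m1, m2, cross_linker, extra_modifications=0.0):
    """Expected mass of a cross-linked species.

    m1, m2: masses (Da) of the two component peptides.
    cross_linker: key of BRIDGE_SHIFT_DA.
    extra_modifications: summed mass (Da) of any additional modifications.
    """
    return m1 + m2 + BRIDGE_SHIFT_DA[cross_linker] + extra_modifications


# Hypothetical peptide masses, chosen only to show the arithmetic.
peptide_1 = 1000.50
peptide_2 = 1500.75

for name in BRIDGE_SHIFT_DA:
    mass = cross_linked_mass(peptide_1, peptide_2, name)
    print(f"{name:9s} M = {mass:.2f} Da")
```

In practice, candidate masses computed this way would be compared against observed MS signals within an instrument-dependent tolerance, which is not modeled here.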
Below, the cross-linking chemistries explored in this study are described.

1.3.1 Types of Chemical Cross-linkers

1.3.1.1 N-hydroxy Succinimide Ester Cross-linkers

N-hydroxy succinimide (NHS) esters are common homobifunctional cross-linkers, i.e. cross-linkers with the same reactive group on either end. They form cross-links specifically between primary amino groups present in lysine (K) side chains and the N-terminus. Since lysine residues are abundant and accessible at the surface of proteins, NHS esters are widely used for protein cross-linking. NHS ester cross-linkers are generally cellular membrane permeable and water insoluble, and are routinely used to stabilize intracellular protein complexes. With the addition of sulfonate groups, these cross-linkers become water soluble and membrane impermeable and are used to characterize cellular surface proteins. Various lengths of NHS ester cross-linkers are commercially available. Examples of water-soluble, membrane-impermeable cross-linkers include disulfosuccinimidyl tartrate (sulfoDST), bis(sulfosuccinimidyl) suberate (BS3), and ethylene glycol bis(sulfosuccinimidyl succinate) (sulfoEGS), with cross-link bridge lengths of 6, 12 and 16 Å, respectively. Figures 1.3, 1.4, and 1.5 depict the cross-linking reactions for sulfoDST, BS3, and sulfoEGS, respectively, which share similar cross-linking mechanisms. One major drawback of NHS ester cross-linkers is that they rapidly hydrolyze under cross-linking reaction conditions (pH > 7, 25-37 ºC), with a half-life on the scale of tens of minutes. This restricts cross-linking to short reaction times, making it difficult to increase product yield with longer reaction times in dilute protein solutions. Also, since NHS esters react only with basic sites on the protein, the overall positive charge is reduced, which can induce conformational changes, hinder trypsin cleavage, and reduce the ionization efficiency in MS.
Finally, the longer the cross-linker bridge, the higher the probability that two residues fall within its reach, especially if it is structurally flexible. Lysine’s long, flexible side chain makes it likely to move randomly within the cross-linking bridge distance of these larger cross-linkers. Therefore, distinguishing between specific and non-specific cross-linking with long cross-linker bridges can be challenging [32].

[Figure 1.3 scheme labels: sulfoDST cross-linker (548.43 g/mol); R1-NH2 and R2-NH2 = N-terminus or K; hydrolysis product, m1 + 132.0 Da; cross-linked product, M = m1 + m2 + 114.0 Da; leaving group R4-OH.]

Figure 1.3: In the sulfoDST cross-linking reaction scheme, protein site 1 (mass = m1) reacts with sulfoDST to form a modification, which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red). A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.

[Figure 1.4 scheme labels: BS3 cross-linker (572.43 g/mol); R1-NH2 and R2-NH2 = N-terminus or K; hydrolysis product, m1 + 156.1 Da; cross-linked product, M = m1 + m2 + 138.1 Da; leaving group R4-OH.]

Figure 1.4: In the BS3 cross-linking reaction scheme, protein site 1 (mass = m1) reacts with BS3 to form a modification, which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red). A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.

[Figure 1.5 scheme labels: sulfoEGS cross-linker (660.45 g/mol); R1-NH2 and R2-NH2 = N-terminus or K; hydrolysis product, m1 + 244.1 Da; cross-linked product, M = m1 + m2 + 226.1 Da; leaving group R4-OH.]

Figure 1.5: In the sulfoEGS cross-linking reaction scheme, protein site 1 (mass = m1) reacts with sulfoEGS to form a modification, which immediately reacts with protein site 2 (mass = m2) to form a cross-linking bridge (highlighted in red). A competing hydrolysis reaction product (m1 + bridge + H2O) can also occur instead of the cross-linked product (M = m1 + m2 + bridge). R1, R2 and R4 are defined in the dotted box.

1.3.1.2 Zero-length Cross-linkers

Zero-length cross-linkers add a cross-linker bridge the length of a single bond and thus join protein sites in very close proximity, despite the bulky size of the cross-linker reagent itself. A common zero-length cross-linking strategy uses 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC) in conjunction with N-hydroxysulfosuccinimide (sulfoNHS). As Figure 1.6 shows, these cross-linkers are heterobifunctional (i.e. they join two different reactive groups): they form amide (peptide) bonds between primary amino groups and carboxylic acid groups (glutamic acid (E), aspartic acid (D) and the C-terminus) via a condensation reaction, inducing an overall mass shift of -18.02 Da. EDC is first attacked by the carboxylate oxygen of a carboxylic acid group on the first protein site (m1) to form an unstable intermediate (O-acylisourea). This intermediate reacts with sulfoNHS to form an amine-reactive sulfoNHS ester, which then reacts with a primary amino group on the second protein site (m2) to form a peptide bond. The formation of the cross-link thus should not change the overall net charge, since both a negatively charged and a positively charged group are neutralized.
The O-acylisourea can also rearrange to form an N-acylurea (m1 + 155 Da), shown at the top of Figure 1.6. In this stable product, a positively charged modification replaces the negatively charged carboxylic group, which alters the overall net charge [33]. Due to the lack of an actual cross-link bridge, locating the cross-linking sites of zero-length cross-linkers via MS is difficult. These cross-linkers are water soluble but cellular membrane impermeable [34]. Cross-linking is performed at pH ~6.5 for the best results, and a major drawback is the inefficiency of EDC/sulfoNHS cross-linking (see footnote 2) under physiological pH conditions [33, 35].

Footnote 2: Throughout the text, the combined EDC and sulfoNHS cross-linking chemistry is referred to as “EDC” cross-linking.

[Figure 1.6 scheme labels: EDC (191.70 g/mol); sulfoNHS (217.13 g/mol); R1 = D, E, or C-terminus; R2-NH2 = N-terminus or K; rearranged stable product, m1 + 155 Da; cross-linked product, M = m1 + m2 - 18 Da.]

Figure 1.6: In the EDC/sulfoNHS cross-linking reaction scheme, protein site 1 (mass = m1) reacts with the cross-linker. The intermediate can react with sulfoNHS to form an amine-reactive sulfoNHS ester, which forms a cross-link with protein site 2 (mass = m2). The intermediate can also produce a stable N-acylurea (m1 + 155 Da). Cross-linker bridges are highlighted in red. R1 and R2 represent two protein sites, and specific reactive amino acids are defined in dashed-line boxes. The mass of each species is provided in solid-line boxes.

Formaldehyde (PFA; see footnote 3) can be thought of as a pseudo zero-length cross-linker due to its relatively small size (2.3 Å). Figure 1.7 shows the general PFA reaction scheme. Figures 1.8 and 1.9 show the mechanisms and reaction schemes for potential PFA modification and cross-linking sites, respectively.
Primary amino groups on the first protein site (mass = m1) nucleophilically attack PFA’s carbonyl carbon to form a methylol intermediate, adding +30 Da to the mass of the protein. Unlike those of NHS esters and EDC cross-linkers, PFA modification intermediates are stable in solution and have been detected via MS and MS/MS to probe accessibility in proteins [36]. The methylol intermediate can dehydrate into a stable Schiff base modified protein (mass = m1 + 12 Da). Without water, the equilibrium between the methylol and Schiff base intermediates would shift toward the right. However, in solution, the equilibrium is unaffected by the amount of water. In proteins, this equilibrium would be affected by the accessibility of water, which is dictated by the protein’s structure. In addition, the equilibrium may be driven to the right if the Schiff base is involved in a subsequent reaction. The Schiff base intermediate reacts with the second protein site (mass = m2), forming a methylene bridge between the protein sites and producing an overall mass shift of +12 Da. PFA is both membrane permeable and water soluble and can form cross-links rapidly under physiological conditions, making it particularly useful for examining weakly associated protein complexes in their native environment. In contrast to other cross-linkers, PFA’s semi-specific reactivity allows cross-link formation among a wide range of amino acids [37].

Footnote 3: As a solid, formaldehyde exists as paraformaldehyde, a polymerized form of formaldehyde. Paraformaldehyde is dissolved in solution to obtain the formaldehyde reagent used for cross-linking. The non-polymerized formaldehyde is referred to as “PFA” throughout the text.

[Figure 1.7 scheme labels: formaldehyde (30.03 g/mol); R1-NH2 = N-terminus, K, or R; R2-H = N-terminus, R, Q, H, Y, or N; methylol, m1 + 30 Da; Schiff base (after loss of H2O), m1 + 12 Da; cross-linked product, M = m1 + m2 + 12 Da.]

Figure 1.7: PFA reacts with protein site 1 (mass = m1), which forms a methylol modification (mass = m1 + 30 Da).
This can dehydrate into a Schiff base (mass = m1 + 12 Da), which can continue to react with protein site 2 (mass = m2) to form a methylene bridge. This produces a cross-linked species (mass = m1 + 12 Da + m2). Cross-linker bridges are highlighted in red. R1 and R2 represent two protein sites, and specific reactive amino acids are defined in dashed-line boxes. The mass of each species is provided in solid-line boxes.

[Figure 1.8 scheme labels: (a) generic mechanism with formaldehyde (30.03 g/mol), m1, methylol m1 + 30 Da, Schiff base m1 + 12 Da; (b) R1 = N-terminus, K, or R.]

Figure 1.8: The PFA modification reaction generic mechanism (a) and for each reactive residue (b) (Adapted from reference [37], with permission).

[Figure 1.9 scheme labels: (a) generic mechanism, Nu = nucleophile in protein R2, Schiff base m1 + 12 Da, m2, cross-linked product M = m1 + m2 + 12 Da; (b) R2-Nu-H = side chain of Q, N, Y, H, or R, or the N-terminus.]

Figure 1.9: The PFA cross-linking reaction generic mechanism (a) and for each reactive residue (b) (Adapted from reference [37], with permission).

1.3.2 Identification of Cross-linking and Structure Elucidation

The identification of cross-linked species generally follows the bottom-up proteomics method. However, to aid the MS detection of the low-abundance cross-linked species in the reaction mixture, isotopic labels and affinity or reporter groups have been incorporated.
Cross-linker bridges synthesized with bonds susceptible to cleavage, either chemically or via CID, release component peptides for subsequent MSn and have aided in determining cross-links. Unfortunately, cross-linkers such as EDC and PFA are unable to exploit such technologies due to their small size [38].

A variety of cross-link analysis software exists with capabilities to identify cross-linked species, localize cross-linking sites, sequence peptide components, distinguish intra- versus intermolecular cross-linking and evaluate cross-links using protein geometries [32, 34, 39–41]. These bioinformatics tools offer the versatility to customize searches based on characteristics of standard cross-linkers and identification features such as cleavable cross-linker bonds, isotopic labeling, or affinity tags used in experiments.

With recent advancements in cross-linking technologies, large-scale in vivo topology mapping via cross-linking site localization has now been performed. For example, the protein interaction reporter (PIR) is a cross-linker constructed with an affinity tag and a bond that is cleavable in MS/MS to assist in cross-link identification. Similar to PFA, this membrane-permeable cross-linker is applied to cells prior to lysis. PIR technology has been used to analyze various key membrane proteins in E. coli [42, 43] and even human cells [44]. These studies represent some of the largest identifications of cross-linked species. Although PIR technologies have successfully identified numerous cross-linked species and relevant interaction sites, the large size and bulky substituents of the cross-linker may hinder the capture of exclusively true interactions, interfere with reactivity or affect the physiological structure of proteins [45]. On the other hand, PFA’s short cross-linker bridge may not be suitable for protein interactions with intermolecular cross-linking sites that are separated by large distances.
Therefore, using multiple cross-linkers varying in reactivity and cross-linker bridge length in conjunction with PFA would be a powerful approach to obtain a comprehensive picture of biological systems and to further understand what interactions PFA can specifically capture with respect to other cross-linkers.

Over the last decade, cross-linking along with MS has complemented molecular modeling experiments to verify protein topologies and contribute information about the dynamics of proteins [32, 37, 39, 46–49]. However, PFA has yet to match conventional cross-linkers engineered to produce straightforward MS analyses of three-dimensional protein structures. The question remains: why has PFA’s potential for structural proteomics not been fully realized despite its compatibility with MS and the successful MS discovery of protein interactions in vivo?

1.4 Tandem Mass Spectrometric Fragmentation and Nomenclature of Cross-linked Species

Upon cross-linking and trypsin digestion of proteins, the resultant cross-linked peptides produce unique MS/MS fragmentation patterns under CID, in contrast to single peptides. Since cross-linked peptides contain two peptide backbones and an additional bridge, three possible types of fragmentation can occur, as illustrated in Figure 1.10a: at the bridge (type 1), at the peptide backbone (type 2) and at both the bridge and peptide backbone (type 3). Cross-linked component peptides are denoted as “I” and “II” and the cross-linker bridge is highlighted in red. Fragmentation at the cross-linker bridge generates whole peptide component fragment ions that can aid in the identification of each peptide. The sequence of each peptide component can be verified by the type 3 ions arising from the fragmentation of both the cross-link bridge and peptide backbone.
Type 3 along with type 2 fragment ions, in which the cross-linker bridge remains intact and the peptide backbone is fragmented, can localize the cross-linking sites [50–52].

[Figure 1.10 panel labels: (a) Type 1 (cross-linker), Type 2 (backbone), Type 3 (cross-linker + backbone), with example fragment ions of Peptides I and II; (b) X1 and X1K fragmentation, with R3 = C2O2H4 (sulfoDST), C6H12 (BS3), or C8O4H12 (sulfoEGS); (c) Peptide I + X3K (+82 Da) and Peptide II + X2K (+144 Da).]

Figure 1.10: (a) Cross-linked peptides (denoted as I and II) can fragment exclusively at the cross-linker (type 1), exclusively at the peptide backbone (type 2) and at both the cross-linker and peptide backbone (type 3). (b) Diagnostic ions/specific fragmentation for NHS esters. (c) Specific fragment ions produced from sulfoEGS fragmentation. (a-c) Cross-linker bridges and cross-linked lysines from each protein site are highlighted in red.

In addition, NHS ester cross-linked peptides can fragment either at the bond between the cross-linker carbonyl carbon and lysine nitrogen or at the peptide backbone bond, producing ions with modifications denoted as +X1 or +X1K, respectively (see Figure 1.10b) [53].
A double X1K type fragmentation at both cross-linked lysine peptide backbone bonds produces X12K, X1K and X1K+NH3 ions, which can serve as diagnostic ions to confirm the presence of the cross-linker [52]. Figure 1.10c illustrates the diagnostic ions specific to the sulfoEGS cross-linker, where fragment ions with +X3K (+82 Da) and +X2K (+144 Da) modifications can be generated from fragmentation at the bond between the carbonyl carbon and oxygen within the sulfoEGS cross-linker bridge. However, PFA and EDC, being pseudo zero-length and zero-length cross-linkers, respectively, are unlikely to produce diagnostic ions.

1.5 Importance of Formaldehyde Cross-linking: Common Applications and Key Features

PFA is one of the oldest and most widely applied cross-linkers in biological studies. Over a hundred years ago, PFA was discovered to be a suitable tissue fixative. Its widespread use has generated thousands, if not millions, of tissue samples with their structural and functional properties preserved with PFA. Histopathological studies performed on these formalin-fixed tissues have unlocked valuable information to characterize and diagnose diseases [54]. PFA treatment inactivates enzymes and destroys bacteria responsible for tissue degradation, allowing tissues to be sustained for a long period of time [55]. In the early 1920s, Ramon recognized PFA’s utility for developing vaccines, because under mild reaction conditions it inactivates toxins and viral proteins without destroying, and possibly while stabilizing, antigens [56–58]. Essentially, this demonstrated that PFA can suspend the function of proteins without permanently damaging the sites it modifies. PFA’s minimal effect on antigens has enabled its use in conjunction with immunoprecipitation methods, which rely on accessible antigen sites for antibody binding, to facilitate the investigation of protein–protein interactions.
The first application of PFA cross-linking to capture specific interactions between biological species in their native environment was the examination of protein–DNA binding [59, 60]. Unlike tissue fixation, which utilizes long reaction times and high concentrations of PFA to extensively cross-link and maintain tissues long-term, these experiments call for selective cross-linking using short reaction times (10–30 min) and low concentrations (1%) of PFA. In chromatin immunoprecipitation, PFA is directly introduced to cells to cross-link and maintain the spatial context of protein–DNA complexes, which are then isolated with an appropriate antibody-antigen interaction to obtain genomic binding sites [61]. These successful experiments promoted the expansion of the PFA cross-linking approach to enhance the detection of true, specific protein–protein interactions by maintaining their cellular environment and spatial constraints, using similar experimental conditions [37, 62]. Importantly, protein interaction analyses demonstrate the utilization of mass spectrometry (MS) to identify interacting proteins preserved by PFA [62]. Indeed, MS is a critical component in many current structural proteomics studies. Notably, the development of a polyacrylamide gel silver-staining procedure designed for subsequent MS analyses involved the incorporation of PFA, suggesting that PFA-treated material can be subjected to MS [37, 63]. Hence, MS is compatible with PFA-induced chemical changes, although these applications do not require the actual detection of those changes. Dimethyl labeling, a routine MS-based technology, exploits short reaction times and low PFA concentrations to produce Schiff base modifications. These are immediately reduced to dimethyl substituents with NaCNBH3 instead of forming methylene cross-link bridges [64].
Stable isotope dimethyl labeling has successfully quantified protein expression levels in cells by being applied to both peptide digests and intact proteins [65]. Minimal side reactions and the conservation of charge states in PFA-modified peptides and proteins were reported, indicating that these modifications should not significantly disrupt the chemical or physical properties of proteins. Dimethyl substituents were observed almost exclusively on lysines, illustrating the high specificity of the PFA modification in the first step of the reaction at low PFA concentrations and short incubation times [65, 66]. As described above, PFA cross-linking is suitable for in vivo biological applications and has been effectively utilized with MS, which are essential qualities for structural proteomics.

1.6 Formaldehyde Cross-linking in vivo: Mass Spectrometric Analyses of Formalin Fixed Tissues and Protein–Protein Interactions

PFA cross-linking in biological applications relies on one of two distinct strategies. The long-term preservation of protein structures in tissues employs high concentrations (>4%) of PFA and long incubation times (several hours to days) to form non-specific cross-links with a higher yield. However, to access proteins for functional, and possibly structural, information, the yield of cross-linking should be reduced to be compatible with MS analysis [67]. In contrast, less extensive PFA cross-linking conditions (0.05–1% PFA and 5–20 min incubation) have already routinely been implemented to capture relevant interactions through in vivo PFA cross-linking of protein complexes [37, 62, 68]. Such analysis does not depend on detecting cross-linked species and relies instead on unmodified peptides to identify interacting proteins.
Thus, maximizing PFA cross-linking yield to detect cross-linked species with MS without sacrificing specificity has yet to be achieved in vivo.

1.6.1 Formalin-fixed tissue: extensive cross-linking using long incubation times and high concentrations of Formaldehyde

The extensive range of formalin-fixed, paraffin-embedded tissues stored in archives over the years has triggered the recent enthusiasm for investigating specific protein structural and functional changes in these tissues with predetermined disease states. This structural analysis relies on MS-based detection of a sufficient amount of protein. It has been shown that heating can reverse cross-links and restore antigen reactivity for immunodetection, enabling effective protein extraction and MS analysis [55, 67, 69–71]. Notably, spectroscopy experiments with PFA cross-linked ribonuclease A showed that PFA preserves tertiary structure even when applying high PFA concentrations and long incubation times similar to tissue fixation cross-linking [71]. This suggests that protein structure may be recovered after PFA cross-link reversal in fixed tissues for subsequent MS investigations. The LC–MS/MS analysis of hundreds to even thousands of proteins in an expansive collection of formalin-fixed tissue samples has been achieved [67]. With the advancement of MS imaging, antigen retrieval and enzymatic digestion can be applied directly to intact formalin-fixed, paraffin-embedded tissue slides without disturbing the native environment of quaternary structures. This newly developed technology has defined targets for the early detection of cancer in various tissues [67, 69].
Taken together, the proven ability of PFA cross-linking to preserve quaternary structures in tissue and its compatibility with MS are key prerequisites for its use in comprehensive structural analysis, although the cross-links themselves have never been observed in these experiments.

1.6.2 Protein-Protein Interactions: specific, controlled cross-linking using short incubation times and low concentrations of Formaldehyde

For protein interaction studies, PFA is directly introduced to cells to freeze protein structures in their physiological environment prior to lysing the cells. The mild PFA cross-linking conditions utilized in these experiments generate a low yield of cross-linked complexes. Following enzymatic digestion, LC–MS/MS analysis identifies proteins via the large abundance of unmodified peptides present [68]. This has been successful in determining protein interacting partners in various species [37, 62, 68, 72–84]. For example, PFA’s capacity to quickly diffuse through and freeze protein geometries in living biological material has enabled large-scale cross-linking of even whole organisms through time-controlled transcardiac perfusion cross-linking [74, 76, 79]. It has also enabled the localization of transient and dynamic proteasome complexes in human leukemic cells for the first time. The stabilization induced by PFA cross-links prevented inter-compartment leakage and retained the location of proteasome complexes in cells [84]. Although PFA has been used to examine protein interactions in tissues and cells originating from a vast number of organisms, structural analysis derived from direct evidence of binding via reaction interface mapping is limited by the inability to detect and identify cross-linked species in vivo using MS.

1.7 Formaldehyde Cross-linking in Model Proteins

The apparent failure to observe PFA cross-linked species via MS in cells and tissues is remarkable.
In part, this can be explained by the incomplete understanding of PFA chemistry and MS behavior. Consequently, investigations of simpler protein model systems are needed to clarify these aspects.

1.7.1 Formaldehyde Modification: Reactive sites and potential for exploiting solvent accessibility

In model peptides and proteins treated with PFA for several days, similar to the extensive treatment of PFA in fixed tissues, modifications on the N-terminus and side chains of lysine (K), arginine (R), histidine (H), cysteine (C), tyrosine (Y), tryptophan (W), serine (S), threonine (T) and phenylalanine (F) were discovered [85–87]. In contrast, under reaction conditions derived for specific in vivo cross-linking in protein interaction studies (physiological pH buffer, 1% PFA concentration and 5–20 min incubation time), only the N-terminus and cysteine, lysine, and to a lesser extent, arginine side chains, were primary reactive sites in the PFA modification reaction [36, 85]. In model peptides, in which varying site accessibility due to protein folding would not be a concern, N-termini were more reactive than lysine side chains, and cysteines exhibited the highest reactivity using reaction times comparable to in vivo protein interaction cross-linking. Reactivity of PFA with cysteines in proteins has yet to be studied under these reaction conditions [85]. If observed, PFA could possibly also be used to investigate cysteine-containing proteins, in which the availability of free cysteines to react would depend on the protein’s cellular location and functions [88, 89]. The increase in the number of reactive residues with exposure to higher concentrations of PFA for longer times shows that the specificity of the modification reaction is reduced under these reaction conditions. This is consistent with dimethyl labeling methods that utilize mild PFA reaction conditions for specificity, as stated earlier in section 1.5 [66].
In general, these studies demonstrate the localization of modification sites, suggest that microenvironments of amino acids play a key role in PFA reactivity, and validate the specificity of the modification reaction [85].

At the protein level, PFA modification is also governed by protein folding, which affects the accessibility of reactive sites. MS studies conducted with myoglobin and lysozyme, using in vivo protein interaction reaction conditions, examined the effect of PFA modification on structure. With only one N-terminal residue and no cysteines, the primary reactive sites in the myoglobin model system were lysines and, to a lesser extent, arginines. When PFA-treated and untreated myoglobin samples were unfolded by decreasing the pH, both samples experienced similar increases in charge states, i.e. unfolding, irrespective of PFA treatment [36]. This agrees with the previously mentioned dimethyl labeling experiments reporting that PFA modification maintained the charge state of proteins and peptides [66]. A drastic rise in PFA modification resulted at the time points at which denaturant was added.
The degree of modification increased as the protein unfolded due to the rise in accessibility of reactive amino acids that were previously buried, demonstrating that PFA reactivity is dictated by a protein’s spatial constraints from folding. Furthermore, the low degree of modification of PFA-treated myoglobin in the absence of denaturant indicates that PFA itself does not induce protein unfolding, supporting its reliability for examining protein structural changes [36].

1.7.2 Formaldehyde Cross-linking: Reactive sites and formaldehyde cross-linked species in model systems

Studies with single amino acids, model peptides and proteins using long incubation times similar to tissue fixation conditions for extensive cross-linking claimed that N-terminal groups and the side chains of R, Y, H, W, asparagine (N) and glutamine (Q) residues are potential reactive sites in the second, cross-linking step, suggesting PFA’s capacity to probe reaction interfaces containing any of these residues [86, 87, 90–92]. However, it remains to be clarified whether the number of potential PFA cross-linking sites would decrease when using in vivo protein interaction PFA cross-linking protocols that have been optimized for specific, less extensive cross-linking [68]. Proteins are more suitable than peptides for deducing cross-linking reactive sites. The yield of non-specific, intermolecular cross-linking in solution is low since it relies on the random contact of two peptide or protein molecules. In peptide cross-linking, only non-specific or intermolecular cross-linking can occur due to the lack of tertiary structure. In solution, cross-linking that occurs within a protein or complex depends on its specific geometry and therefore should generate a higher yield of cross-linking than with peptides [85]. However, unlike with peptides, examining relative reactivity in proteins is a challenge due to the varying accessibility of sites from folding.
Furthermore, protein systems can introduce complexities such as the increased number of possible reactive sites, which are partially occupied, giving rise to heterogeneous products in the case of PFA [36].

A bovine insulin model system (51 amino acids, 5.7 kDa) was treated with PFA under mild reaction conditions similar to in vivo protein interaction studies and digested with endoproteinase Glu-C under reducing and non-reducing conditions. In this model system, interpeptide PFA cross-linked species and cross-linking of the N-terminus to N and Y, and between K and Y, were identified for the first time. The presence of fragment ions with and without the cross-link bridge intact exposed the semicleavable nature of PFA cross-link bridges subjected to CID in this study, which facilitated validation of cross-linked species and the regions of cross-linking. Along with fragmentation patterns, these studies also revealed that mild reaction conditions similar to in vivo protein interaction PFA cross-linking provide sufficient yield and specificity to detect PFA cross-linked species. Although this proved that detecting PFA cross-linked species via MS/MS is possible in very simple, small proteins, exploring a wide spectrum of different protein structures varying in complexity is required to eventually reach the complexity of detecting PFA cross-linking in cells and tissues [50].

1.8 Non-covalent Protein Complex Model Systems

1.8.1 Calmodulin-Melittin Complex

It has been illustrated that PFA cross-linking can stabilize transient protein interactions in cells and tissues; however, the detection of these cross-links for structural analysis has yet to be achieved. Weak transient interactions bind with a dissociation constant (Kd) in the μM range whereas those classified as strong transient interactions bind with a Kd in the nM range [93]. This present work aims to address the potential of PFA to capture weak transient interactions by examining the Ca2+-free calmodulin-melittin complex.
Calmodulin is a Ca2+-binding protein that functions as a calcium signal transducer in many key cellular pathways. It is a 16,779.78 Da protein composed of 148 amino acids (see Figure 1.11a) with an N-terminal domain (residues 1-76) and a C-terminal domain (residues 80-148) connected by a flexible linker (residues 77-79). Modifications on calmodulin include the removal of the N-terminal methionine (M), acetylation of the N-terminus (+42 Da mass modification), and trimethylation of K116 (+42 Da mass modification). With four EF hands that have a signature helix-loop-helix structure, calmodulin can bind up to four Ca2+. The apo calmodulin (Ca2+ unsaturated) adopts a dumbbell conformation upon Ca2+ binding, as illustrated in Figure 1.11b. Ca2+ binds sequentially to the C-terminal domain and then the N-terminal domain due to its higher affinity for the negatively charged C-terminal domain. In Ca2+-saturated calmodulin, buried hydrophobic pockets are exposed and this triggers complex formation. Calmodulin is also known to bind to proteins in the cell in the absence of Ca2+, such as actin-binding proteins, cytoskeletal and membrane proteins, enzymes, and receptors [94]. Typically, calmodulin forms a globular compact structure that surrounds its binding partner, wrapping the flexible loop around the target [95].

Melittin, the main constituent in bee venom, is a 2844.73 Da, 26 amino acid peptide (see Figure 1.11a) that can competitively bind to and inhibit calmodulin. It exists in solution in a tetrameric form. Each melittin molecule is composed of mostly hydrophobic amino acids (1-20) with positively charged C-terminal residues (20-24). The C-terminus of melittin is amidated (-1 Da mass modification). Upon binding, melittin becomes a bent rod structure: two alpha helices connected by the central proline residue [96] [97].

Melittin binds to Ca2+-saturated calmodulin with a Kd of 3 nM.
However, in the absence of Ca2+, melittin can still form a weaker complex with calmodulin with a Kd of 10 μM, as depicted in Figure 1.11b. NMR studies have shown that both these complexes exist with the same conformation [98]. Presently, complete X-ray crystal structural data of the calmodulin-melittin complex are not available [95]. Models have been constructed from distance constraints using various cross-linkers applied to Ca2+-saturated calmodulin bound to melittin. However, these studies confirmed cross-linked species only at the MS level, which may not provide sufficient verification and localization of cross-linking [99]. Cross-linking with disuccinimidyl suberate (DSS) combined with MS/MS verification has also been applied to a Ca2+-saturated calmodulin-melittin model system [100]. Neither of these studies has examined the weaker complex formed between Ca2+-free calmodulin and melittin. Two binding orientations between these components are possible: N-terminal domains are aligned with the C-terminal domains (antiparallel) or C-terminal domains are aligned with C-terminal domains and vice versa (parallel). Since the calmodulin C-terminal domain is more negatively charged than the N-terminal domain and melittin’s C-terminus is positively charged, it is more electrostatically favorable to assume a parallel orientation [101]. Previous MS-based cross-linking studies have observed both types of orientations [99]. MS-based EDC cross-linking and limited proteolysis experiments supported antiparallel binding [102]. However, recent NMR and MS/MS based DSS cross-linking studies have shown that calmodulin and melittin predominantly bind in a parallel orientation [103] [100]. It has been shown that the C-terminal domain of Ca2+-free calmodulin primarily interacts with the C-terminus of melittin and the N-terminal calmodulin domain is left free [95].
In addition, melittin induces a conformational change in Ca2+-free calmodulin upon binding to its C-terminal domain [95]. Specifically, spectroscopic experiments have shown conformational fluctuations involving Y99 and Y138 of calmodulin when binding to melittin, and melittin is oriented perpendicular to calmodulin, with its W19 becoming inaccessible upon binding [101].

[Figure 1.11, panel text]
(a) Melittin (residues 1-26): GIGAVLK|VLTTGLPALISWIK|R|K|R|QQ-NH2
(a) Calmodulin (residues 1-149): M1A(ac)DQLTEEQIAEFK|EAFSLFDK|DGDGTITTK|ELGTVMR|SLGQNPTEAELQDMINEVDADGGTIDFPEFLTMMARK|MK77|DTD80SEEEIR|EAFRVFDK|DGNGYISAAELR|HVMTNLGEK(TM)LTDEEVDEMIR|EADIDGDGQVNYEEFVQMMTAK
(b) Panel labels: Ca2+-free calmodulin; Ca2+ + calmodulin; N-term; C-term; melittin peptide; calmodulin-peptide complex; Kd = 3 nM; Kd = 10 µM

Figure 1.11: (a) Amino acid sequence of the calmodulin N-terminal (teal) and C-terminal (blue) domains connected by a flexible linker (black) and of melittin (purple). All possible trypsin cleavage sites are denoted with red vertical bars. (b) Calmodulin binds to Ca2+, which induces the formation of a dumbbell-shaped conformation. Upon binding to melittin, a similar conformational change occurs for both Ca2+-saturated and Ca2+-free calmodulin. Melittin competitively binds to calmodulin, inhibiting calmodulin’s activity.

1.8.2 Ribonuclease S Complex

Another transient protein complex examined in this project is the Ribonuclease S (RNaseS) complex. Ribonucleases are responsible for catalyzing the degradation of RNA molecules. Ribonuclease A (RNaseA) is cleaved by subtilisin to form RNaseS. RNaseS is composed of an S-peptide and S-protein, non-covalently associated, which retains the enzymatic activity of RNaseA when both components are present. Subtilisin is a non-specific enzyme that works like a serine protease, using a catalytic triad D-H-S to cleave such that the N-terminal primary amine is retained on the S-protein [104].
The cleavage of RNaseA can occur anywhere between residues 16 and 21, resulting in a mixture of S-proteins and S-peptides with various sequences [105, 106]. Figure 1.12 shows the masses and sequences of RNaseS components resulting from a cleavage after RNaseA residue 20. In Figure 1.12, the 11530.30 Da S-protein (residues 21-124) has eight cysteine residues engaged in disulfide bonding, which are reduced and stabilized with carbamidomethyl groups in the model system examined in this work. The 2165.01 Da S-peptide is composed of 20 amino acids (1-20). S-peptide and S-protein binding involves S-peptide residues 11-14 and S-protein residues 44-48. The binding affinity (Kd = 1 µM) [107] of S-peptide to S-protein is significantly less than that of nucleotide binding, which involves residues H12, H119, K41 and Q11. It is also known that R10, R33, F8 and M13 are involved in the stabilization of the complex [108]. Various cross-linkers have previously been applied to RNaseS [105, 109, 110]. Interestingly, the distance between the S-protein and S-peptide binding sites is about 3 Å, suggesting that PFA possesses a suitable cross-linker bridge length (2.3 Å). It remains to be seen whether PFA can cross-link to preserve the weak interaction between RNaseS components.

[Figure 1.12, panel text]
(a) S-peptide, 2165.01 Da (residues 1-20): KETAAAKFERQHMDSSTSAA
(a) S-protein, 11530.30 Da (residues 21-124): SSSNYCNQMMKSRNLTKDRCKPVNTFVHESLADVQAVCSQKNVACKNGQTNCYQSYSTMSITDCRETGSSKYPNCAYKTTQANKHIIVACEGNPYVPVHFDASV
(b) Panel labels: Binding Site; S-protein; S-peptide

Figure 1.12: (a) Amino acid sequences of RNaseS components S-protein (blue) and the S-peptide (green). The S-protein contains 8 cysteines, which were reduced and alkylated in this model system. The S-peptide to S-protein binding sites are underlined.
(b) The crystal structure (1RNU) [108] of RNaseS is shown with the S-protein (blue) to S-peptide (green) binding site highlighted in red.

1.9 Thesis Aims

Based on the recent promises of structural proteomics and the unique potential of PFA as a cross-linker that can capture close-proximity transient interactions, this thesis work has the following aims:

(1) Identify and localize PFA cross-linking in protein systems larger than synthetic peptides, amino acids or insulin.

Both the Ca2+-free calmodulin bound to melittin and RNaseS non-covalent protein complexes are chosen to examine PFA’s capabilities to stabilize transient protein interactions (under mild reaction conditions similar to those of in vivo protein interaction studies). Identifying cross-linking and modification sites can confirm the PFA cross-linking mechanism and reaction chemistry under mild, in vivo-like conditions, which have thus far been revealed only in small peptides and insulin. Confirming PFA cross-linking in these more biologically relevant systems can bring the field closer to exploring and understanding PFA cross-linking in cellular and in vivo environments.

(2) Examine the reaction chemistry, MS analysis and MS/MS fragmentation patterns of PFA cross-linking with respect to other established cross-linking chemistries.

In addition to PFA, other commercially available cross-linkers of different reactivity and lengths are chosen to be applied in parallel. The zero-length cross-linker, EDC, is selected due to its small size, comparable to that of PFA, and its heterobifunctional reactivity. NHS ester cross-linkers of various lengths (sulfoDST, BS3 and sulfoEGS) can be used to explore a variety of distance constraints and the effect of the cross-linker bridge, relying on their predictable reactive sites. SDS-PAGE can be used to estimate and compare the cross-linking yield across intact cross-linked proteins for each cross-linker.
Upon in-gel trypsin digestion and LC-MS/MS analysis, the complexity of analyzing cross-linked samples via MS and MS/MS can be examined for each cross-linker. The MS/MS fragmentation patterns that are established for EDC and the NHS ester cross-linkers can be used as guidelines to confirm PFA cross-links and derive MS/MS fragment ion evidence criteria to confirm cross-linking. Unlike these other cross-linkers, PFA cross-linking has yet to be identified with current cross-linking identification software. Therefore, the manual versus software identification of PFA cross-linking and of other established cross-linking can be compared in this study to highlight attributes for future software specifically tailored for PFA cross-link identification. The MS and MS/MS identification of cross-linking depends on chemical properties such as the trypsin cleavage efficiency of modified/cross-linked residues and formation of cross-linker-specific modifications vs cross-links, and also on instrument/software factors such as the sensitivity, accuracy and resolution of the mass spectrometer and MS peak picking algorithms. Thus, these were also chosen to be explored. Altogether, these aspects shall place PFA in perspective with other established cross-linkers that have already achieved what PFA cross-linking has yet to accomplish, i.e. cross-linking localization in biologically relevant systems and capturing three-dimensional protein structures.

(3) Apply cross-linking to explore the transient interaction between Ca2+-free calmodulin and melittin, a protein complex with an unknown binding orientation, and introduce distance constraints for mapping protein structure.

Although the Ca2+-saturated calmodulin-melittin system has been examined with established cross-linkers, the more transient Ca2+-free calmodulin-melittin complex has yet to be examined via cross-linking to my knowledge.
Therefore, using the distance constraints imposed by cross-linkers of various reactivity and length can potentially provide a comprehensive structural analysis of the calmodulin-melittin complex, an unresolved structure. Furthermore, this would reveal whether PFA cross-linking can map three-dimensional protein structure and match the capabilities of established cross-linkers.

Chapter 2

Methods

2.1 Materials

Purified calmodulin (≥ 95% purity by SDS-PAGE), from bovine brain, lyophilized from a 400 μL buffer containing 150 mM sodium chloride (NaCl), 50 mM Tris-Hydrochloride (Tris-HCl), 2 mM Ethylenediaminetetraacetic acid (EDTA), and melittin (>97% purity by HPLC), from Apis mellifica, were purchased from MilliporeSigma (Darmstadt, Germany). The 16% formaldehyde solution (PFA) ampules were purchased from Thermo Scientific (Waltham, MA). 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC), N-hydroxysulfosuccinimide (sulfoNHS), disulfosuccinimidyl tartrate (sulfoDST), bis(sulfosuccinimidyl) suberate (BS3), and ethylene glycol bis(sulfosuccinimidyl succinate) (sulfoEGS) were bought from CovaChem (Loves Park, IL). Purified Ribonuclease S (> 70% purity by UV), Trizma Base (Tris), 4-(2-Hydroxyethyl)piperazine-1-ethanesulfonic acid (HEPES), 2-(N-morpholino)ethanesulfonic acid (MES), sodium dodecyl sulfate (SDS), ammonium bicarbonate (NH4HCO3), glycerol, sodium hydroxide, tetramethylethylenediamine (TEMED), tricine, sodium chloride, potassium chloride and iodoacetamide were obtained from Sigma (St. Louis, MO). Sequencing grade modified trypsin, from porcine, was acquired from Promega (Madison, WI). Acetonitrile (ACN, HPLC grade), formic acid (FA, 88%), and acetic acid (glacial) were obtained from Fisher. Acrylamide, ammonium persulfate (APS), bromophenol blue, Coomassie brilliant blue R250, and gel casting and running systems were purchased from Bio-Rad (Hercules, CA). Dithiothreitol was purchased from BDH Chemicals (London, United Kingdom).
Deionized water was obtained using a Nanopure Ultrapure Water System, Barnstead (Dubuque, IA).

2.2 Chemical Cross-linking Reactions

Calmodulin (at a final concentration of 60 μM and containing EDTA at a final concentration of 807 μM, see Appendix A.1.1 for the calculation of EDTA concentration) and melittin (at a final concentration of 60 μM) were incubated for 20 minutes at 37 ºC in the respective cross-linking buffer (100 mM MES, pH = 6.5 for EDC samples and 20 mM HEPES, pH = 7.4 for other cross-linker samples) prior to the addition of each cross-linker reagent. The purified RNaseS sample was purchased with the components of the complex already present together, so incubation prior to addition of cross-linker reagent was not required. RNaseS cross-linking was performed at a final concentration of 50 μM in 100 mM MES buffer (pH = 6.5) for EDC samples, and in 10x PBS buffer (pH = 7.4) for other cross-linker samples.

All cross-linkers were dissolved in cross-linking buffers just before their addition to the protein complexes. EDC was added to the protein mixture, immediately followed by the addition of sulfoNHS, with final concentrations of 60 and 30 mM, respectively. An EDC control sample (referred to as “-EDC”) was prepared by replacing the combined EDC-sulfoNHS volume with MES buffer. For the NHS ester (sulfoDST, BS3 and sulfoEGS) and PFA cross-linked samples, final cross-linker concentrations of 3 mM and 330 mM were utilized, respectively. The control sample (denoted as “-others”) for NHS ester and PFA cross-linkers was prepared by replacing the cross-linker volume with HEPES and 10x PBS buffer for calmodulin-melittin and RNaseS samples, respectively. EDC and NHS ester cross-linking reactions were performed for 1 hour at 37 ºC. PFA cross-linking was carried out for 6 hours at 37 ºC.

EDC cross-linking reactions were quenched with dithiothreitol (DTT), using a final concentration of 40 mM.
NHS ester and PFA cross-linking reactions were quenched with Tris buffer, using final concentrations of 60 and 500 mM, respectively.

2.3 Tris-Tricine Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation

2.3.1 Casting Gels

Tris-Tricine SDS gels were cast using a 16% running and 4% stacking gel. The following protocol was used to cast two gels at a time. A Tris-HCl/SDS buffer with 3 M Tris-HCl and 0.3% SDS, and pH = 8.45, was prepared. The 16% running gel was prepared by first mixing 30% acrylamide/0.8% bisacrylamide (7.5 mL), 3 M Tris-HCl/SDS (5 mL, pH = 8.45), glycerol (1.5 mL) and H2O (1 mL) together. APS (0.2 mL) followed by TEMED (0.01 mL) was added and the running gel was immediately poured between gel casting glass plates (1 mm), followed by the addition of isopropyl alcohol to cover the top of the gel. The gel was left to polymerize for approximately 30 minutes.

The 4% stacking gel was prepared by first mixing 30% acrylamide/0.8% bisacrylamide (0.8 mL), 3 M Tris-HCl/SDS (1.5 mL, pH = 8.45), and H2O (3.7 mL) together. After removing the isopropyl alcohol and washing the top of the gels with H2O, the stacking gel was poured between the glass plates immediately after the addition of APS (0.08 mL) and TEMED (0.01 mL) to the stacking gel solution. A 10-well green comb was inserted between the glass plates. After 20 minutes at room temperature, gels were stored at 4 ºC overnight before use.

2.3.2 One Dimensional Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation

The Tris-Tricine gel running chamber was prepared by filling the inner chamber between the glass plates holding the polymerized gels with a Tris-Tricine cathode buffer (0.1 M Tris, 0.1 M Tricine, 0.1% SDS) and the outer chamber with a Tris-Tricine anode buffer (0.1 M Tris, pH = 8.9).
Reaction mixtures were mixed with 4X sample buffer (200 mM Tris-HCl, 2% SDS, 40% glycerol, 0.04% Coomassie brilliant blue R250) and incubated for 5 minutes at 65 ºC. Each sample was cooled, and a prestained protein marker (10 kDa - 250 kDa), -EDC, -others, +EDC, PFA, sulfoDST, BS3 and sulfoEGS samples were loaded into the gel wells from left to right. Electrophoresis was conducted at 25 mA until the gel bands passed through the stacking gel, then the current was increased to 40 mA until the gel bands reached the end of the plate. Gels were visualized with Coomassie brilliant blue R250 and gel images were obtained using an Odyssey infrared imaging system (LI-COR Biosciences, Lincoln, NE).

2.3.3 Analysis of Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation

Molecular weights of gel bands were estimated using ImageJ, an image processing program designed for scientific research (University of Wisconsin) [111]. Migration distances from the dye front to the midpoint of each gel band appearing on the SDS-PAGE were measured using ImageJ. A standard curve was prepared using the protein marker by plotting the log of the known molecular weights (kDa) of each band versus its migration distance (pixels) in Microsoft Excel 2007. The equation of the best fitting line for the calmodulin-melittin SDS-PAGE was $y = 50.911\,x^{-0.662}$ ($R^2 = 0.9899$), where $x$ is the migration distance and $y$ is the log of the molecular weight. For the RNaseS SDS-PAGE, this equation was $y = 73.13\,x^{-0.723}$ ($R^2 = 0.9910$). For the literature SDS-PAGE of Ca2+-free versus Ca2+-loaded calmodulin, this equation was $y = (3 \times 10^{-7})\,x^{2} - 0.0037\,x + 1.9277$ ($R^2 = 0.9909$). These equations were used to estimate the molecular weights of gel bands for each sample lane.

Intensities of each gel band were analyzed using ImageJ.
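As a minimal sketch, the standard-curve equations above could be applied to a measured migration distance as follows. This assumes that the first two fits are power-law trendlines of the form y = a·x^b, that "log" means base 10, and that molecular weights are in kDa. The function names and the example distance are hypothetical; the thesis itself reports using ImageJ and Microsoft Excel for this step.

```python
# Sketch: convert a band's migration distance (pixels) into an estimated
# molecular weight (kDa) using the fitted standard curves quoted in the text.
# Assumptions (not confirmed by the source): power-law form, base-10 log.

def log_mw_power(distance_px, a, b):
    """Power-law standard curve: log10(MW in kDa) = a * distance**b."""
    return a * distance_px ** b


def log_mw_quadratic(distance_px, c2, c1, c0):
    """Quadratic standard curve: log10(MW in kDa) = c2*d**2 + c1*d + c0."""
    return c2 * distance_px ** 2 + c1 * distance_px + c0


def mw_from_log(log_mw):
    """Undo the base-10 logarithm to recover the molecular weight in kDa."""
    return 10 ** log_mw


# Coefficients as reported in the text.
CAM_MELITTIN_FIT = (50.911, -0.662)       # calmodulin-melittin gel
RNASES_FIT = (73.13, -0.723)              # RNaseS gel
LITERATURE_FIT = (3e-7, -0.0037, 1.9277)  # literature calmodulin gel

if __name__ == "__main__":
    example_distance_px = 300.0  # hypothetical measurement
    estimate = mw_from_log(log_mw_power(example_distance_px, *CAM_MELITTIN_FIT))
    print(f"Estimated MW at {example_distance_px} px: {estimate:.1f} kDa")
```

Under these assumptions, a larger migration distance always maps to a smaller estimated molecular weight, which is the expected behavior for SDS-PAGE.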
Each lane in the SDS gel image was selected in ImageJ, and a density plot for each lane was obtained such that the intensities of bands from the top to bottom (highest to lowest molecular weight) of the gel were plotted left to right. The peak area was proportional to the intensity of each gel band in each lane. Cross-linked protein bands were distinguished from non-cross-linked protein bands using these density plots. The peaks corresponding to the unmodified protein in the control sample were used to determine the unmodified protein peak in the experimental samples. The relative yield of cross-linking was calculated by summing the intensities of the cross-linked protein bands and dividing by the total intensity of all the bands in each cross-linked sample lane. The relative yield of non-cross-linked species was calculated by summing the intensities of the non-cross-linked protein bands and dividing by the total intensity of all the bands in each cross-linked sample lane.

The isoelectric point of each protein component was determined using the ExPASy (Swiss Institute of Bioinformatics) ProtParam tool [112].

2.3.4 Excision and Washing of Gel Bands

Molecular weight categories to group relevant gel bands were determined based on the position and size of each gel band. For control (-EDC and -others) and sulfoDST cross-linked calmodulin-melittin samples, gel bands in each lane were excised and approximately grouped based on the following molecular weight categories: < 10 kDa, 10 - 20 kDa, and 20 - 40 kDa. For EDC, PFA, BS3 and sulfoEGS cross-linked calmodulin-melittin samples, where gel bands shifted 3.2, 4.2, 3.5, and 4.0 kDa below the actual molecular weight, respectively, the gel bands were approximately grouped based on the following molecular weight categories: < 14 kDa, 14-19 kDa, 19-33 kDa and > 33 kDa. For all RNaseS samples, gel bands were approximately grouped based on the following molecular weight categories: < 12 kDa, 12-20 kDa, and > 20 kDa.
It is important to note that these molecular weight categories are approximate and variation between extremities in each group most likely exists. Protein gel bands were washed with alternating cycles of acetonitrile and 100 mM NH4HCO3. For RNaseS, gel bands were reduced with 10 mM DTT at 56 ºC for 30 minutes in the dark and alkylated with 55 mM iodoacetamide (IAA) at 25 ºC for 1 hour in the dark.

2.4 Trypsin Digestion

An in-gel trypsin digestion was performed by incubating gel pieces with trypsin. Approximately 50 and 8 µg of calmodulin and melittin, respectively, were present in the reaction mixtures loaded onto the SDS gel. For the RNaseS complex, approximately 58 and 11 µg of the S-protein and S-peptide, respectively, were present in the loaded reaction mixtures. Gel bands were resuspended in 50 mM NH4HCO3 and trypsin was added to each gel band such that the protein-to-trypsin ratio was approximately 50:1 weight/weight (w/w). The digestion was performed overnight at 37 ºC and was quenched with 5% formic acid. Peptides were extracted with acetonitrile, purified with C18 stage tips, and resuspended in 0.035 mL of 5% formic acid.

2.5 Reverse Phase High Performance Liquid Chromatography Tandem Mass Spectrometric Analysis

2.5.1 Bruker Impact II Quadrupole Time-of-Flight Analysis For Cross-linked Calmodulin-Melittin and RNaseS Peptides

Purified trypsin digests were separated using a nano HPLC column, and an injection volume of 5 μL was used. The pre-column was 4 cm long, with a 100 μm diameter, and was packed with 5 μm Aqua C18 material (Phenomenex). The nano column was 40 cm long with a 75 μm diameter, and packed with 3 μm Reprosil-Pur C18-AQ material (Dr. Maisch GmbH, Ammerbuch, Germany). The mobile phase buffers used were Buffer A (0.1% formic acid) and Buffer B (80% acetonitrile and 0.1% formic acid). A flow rate of 250 nL/min was used.
A 90 minute run was used with a gradient as follows:

t = 0 mins: 95% Buffer A, 5% Buffer B
t = 30 mins: 83% Buffer A, 17% Buffer B
t = 73 mins: 65% Buffer A, 35% Buffer B
t = 75 mins: 0% Buffer A, 100% Buffer B

Samples were analyzed by the Bruker Impact II QqTOF (Bruker Daltonics, Billerica, MA). MS spectra were collected for an m/z range of 200 - 2200. The isolation window was varied based on the m/z as follows: 2 Th for m/z = 300 and 400, 3 Th over m/z = 500 - 800, 5 Th for m/z > 900, and interpolated for values falling between these ranges. The collision gas was nitrogen and the collision energy varied between 23 and 65 eV as a function of m/z and charge. Precursor ions were excluded for 0.4 min after being fragmented. MS/MS spectra were collected using data dependent acquisition over an m/z range of 50 - 2200. Appendix A.4 lists the collision energy used for each m/z and charge (intermediate values were interpolated).

2.5.2 ABI QStar XL Quadrupole Time of Flight Analysis For Mass Spectrometer Comparison

The nanoHPLC separation of the samples was performed using a lab-made nano-column that was 15 cm long, with a 75 μm diameter, and packed with 3 μm Reprosil-Pur C18-AQ material (Dr. Maisch GmbH, Ammerbuch, Germany). A 100 min gradient was used with mobile phase Buffer A (0.1% formic acid) and Buffer B (80% acetonitrile and 0.1% formic acid). Samples were run on the ABI QStar XL QqTOF (Applied Biosystems, Foster City, CA). MS/MS spectra were collected using information dependent acquisition (IDA). The collision gas was nitrogen and the collision energy was varied as a function of m/z and charge.

2.6 Mass Spectrometric Data Analysis

2.6.1 MaxQuant Verification of Unmodified/Modified Peptides via Tandem Mass Spectrometry

The raw LC-MS/MS Bruker Impact II QqTOF data files for the calmodulin-melittin and RNaseS samples (control and PFA treated) were directly loaded into MaxQuant (a proteomics software distributed by the Max Planck Society) [113].
The search was configured by adding custom protein FASTA files and PFA-specific modifications. FASTA files of calmodulin-melittin and RNaseS were prepared by inputting the protein sequences into Format Converter (HIV Sequence Database) [114]. For RNaseS, FASTA files were prepared with two S-peptide sequences (RNaseA residues 1-19 and 1-20), two S-protein sequences (RNaseA residues 20-124 and 21-124), and the RNaseA sequence (RNaseA 1-124). For the calmodulin-melittin searches, the variable modifications used were the following: amidation of the C-terminus, acetylation of the N-terminus, deamidation of N, oxidation of M and trimethylation of K. For the PFA-cross-linked sample, Schiff base and methylol modifications on the N-terminus, K residues and R residues were added as variable modifications. For RNaseS searches, the following variable modifications were used: deamidation of N, oxidation of M and carbamidomethyl on C. A minimum score cut-off of 40 and a mass accuracy cut-off of 40 ppm were used.

2.6.2 Manual Verification of Cross-linked Species via Tandem Mass Spectrometry

Figure 2.1a outlines the data analysis procedure used to identify and analyze cross-linked species.

For all samples analyzed with the Bruker Impact II QqTOF, the raw LC-MS/MS data were prepared as follows. Monoisotopic mass lists were acquired by using the sophisticated numerical annotation procedure (SNAP) algorithm to pick peaks with a signal to noise ratio (S/N) > 2 and the deconvolution feature in the Bruker Daltonics Compass Data Analysis 4.2 software. For samples analyzed with the ABI QStar XL QqTOF, monoisotopic mass lists were obtained using the Analyst QS 1.1 software and signals that had an S/N below 5 were omitted. For each cross-linked sample, m/z values < 400 and signals appearing in control samples were excluded to remove background signals.
Cross-linked species were assumed to have at least a +2 charge since they are composed of two tryptic peptides, each with at least a +1 charge from the terminal K or R residues. Using Microsoft Office Excel 2007, masses corresponding to signals in control samples were eliminated from the mass lists of cross-linked samples using a ±0.2 Da window. For EDC cross-linked samples, the “-EDC” control sample was used and for PFA, sulfoDST, BS3, and sulfoEGS the “-others” control sample was used.

A theoretical list of possible cross-linked masses specific for each cross-linker was derived assuming that a cross-linked mass was equal to the mass of each component peptide plus the cross-linker bridge mass plus any additional modification mass. ExPASy PeptideMass [115] was used to perform a theoretical trypsin digestion considering a maximum of four missed cleavages to produce a list of possible unmodified peptides. For calmodulin-melittin, the following fixed modifications were considered: amidation of the C-terminus, acetylation of the N-terminus, and trimethylation of K; and the following variable modifications were considered: deamidation of N and oxidation of M. The N-terminus of calmodulin is blocked by an acetyl modification and K115 is blocked with a trimethyl modification, so these sites were disregarded as modification and/or cross-linking sites for all cross-linkers. For RNaseS the following variable modifications were considered: deamidation of N, oxidation of M and carbamidomethyl on C. The number of cross-linking and modification sites in each peptide was determined for each cross-linker. In the case of PFA modified species, two modifications are possible at each site (N-terminal, R and K residues): Schiff base (+12 Da mass shift) and methylol (+30 Da mass shift). Also, intra- and interpeptide PFA cross-links produce a +12 Da mass shift and can form on N-terminal, R, H, Q, N and Y residues.
For EDC, a -18.02 Da mass shift for each cross-linking site (E/D to N-terminal/K residues) and a +155.00 Da mass shift for each modification site (D or E residues) were considered. For the NHS ester cross-linkers, mass shifts of 114.00, 138.07, and 226.05 Da for sulfoDST, BS3 and sulfoEGS, respectively, on both N-terminal and K residues were considered. In addition, modifications formed after the hydrolysis of sulfoDST, BS3 and sulfoEGS cross-linkers, with mass shifts of 132.01, 156.08, and 244.06 Da, respectively, were also considered for additional N-terminal or K residues on each peptide. The mass of every possible modified peptide involved in a cross-link was determined assuming each peptide could possess from zero up to one fewer than the maximum possible number of modifications. A list of possible unmodified and modified peptide masses for each cross-linker was created and Mathematica 10 (Wolfram Research Inc., Champaign, IL) [116] was used to find every possible combination of these masses to produce a theoretical cross-link mass list specific to each cross-linker. Every possible combination of unmodified peptides was also derived using Mathematica to produce a list of impossible cross-link masses. Using Microsoft Excel, unmodified/modified peptide and impossible cross-link masses were eliminated from the monoisotopic mass lists for each cross-linker to obtain a final mass list. Microsoft Excel and Mathematica codes can be found in Appendix A.2.

Final mass lists from each cross-linked sample were matched to theoretical lists for each cross-linker with Mathematica using a ±0.2 Da window to obtain the candidate list. For each candidate cross-linked species, the MS and MS/MS signals were inspected manually using the Bruker Daltonics Compass Data Analysis 4.2 or Analyst QS 1.1 software to visualize the LC-MS/MS data sets produced by the Bruker Impact II or ABI QStar XL, respectively.
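The construction of a theoretical cross-link mass list and its matching against observed masses can be sketched as below. This is an illustration only, not the Mathematica code of Appendix A.2: the peptide names and masses are hypothetical placeholders, the bridge masses are the nominal values quoted in the text, the tolerance is assumed to be symmetric (±0.2 Da), and pairing a peptide with itself is an assumption made here for completeness.

```python
# Sketch: cross-link mass = peptide 1 + peptide 2 + cross-linker bridge,
# then match observed masses to the theoretical list within a tolerance.
from itertools import combinations_with_replacement

# Bridge mass shifts (Da) as quoted in the text for each cross-linker.
BRIDGE_MASS_DA = {
    "PFA": 12.00,
    "EDC": -18.02,
    "sulfoDST": 114.00,
    "BS3": 138.07,
    "sulfoEGS": 226.05,
}


def theoretical_crosslink_masses(peptide_masses, bridge_mass):
    """Return (name1, name2, mass) for every unordered peptide pair.

    peptide_masses maps a peptide label to its (possibly modified) mass.
    """
    pairs = combinations_with_replacement(sorted(peptide_masses.items()), 2)
    return [(n1, n2, m1 + m2 + bridge_mass) for (n1, m1), (n2, m2) in pairs]


def match_candidates(observed_masses, theoretical, tolerance_da=0.2):
    """Keep observed masses lying within the tolerance of a theoretical mass."""
    return [
        (obs, n1, n2, theo)
        for obs in observed_masses
        for (n1, n2, theo) in theoretical
        if abs(obs - theo) <= tolerance_da
    ]


if __name__ == "__main__":
    # Hypothetical peptide masses (Da), for illustration only.
    peptides = {"pepA": 1000.50, "pepB": 1500.75}
    theo = theoretical_crosslink_masses(peptides, BRIDGE_MASS_DA["BS3"])
    for hit in match_candidates([2639.40, 2700.00], theo):
        print(hit)
```

Each match is only a candidate: as described in the text, it still has to be confirmed by manual inspection of the MS and MS/MS signals.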
The extracted ion chromatogram (EIC) was obtained for each candidate to confirm the elution timepoint, and the MS spectrum at that timepoint was examined to verify the presence of the signal. Signals that did not exist, that were at the noise level of the spectrum, or that were mixed with other signals were rejected to produce an MS Candidate List. For each MS candidate signal, the MS/MS spectrum was obtained. In the Bruker Daltonics Compass Data Analysis 4.2 software, a list of all MS/MS spectra for each MS spectrum is provided and the MS/MS spectrum corresponding to the MS signal of the candidate was searched and selected manually. If a species with the same m/z and charge eluted at different timepoints, the MS spectrum of the timepoint at which the software was able to obtain an MS/MS spectrum was selected. In the Analyst QS 1.1 software, the elution timepoint of the MS candidate was selected in the total ion chromatogram (TIC) of the MS/MS scans to obtain all of the MS/MS spectra collected for each species, which were summed together.

Theoretical fragment ions for the backbone fragmentation of unmodified peptides from calmodulin and melittin were obtained using Protein Prospector (University of California, San Francisco) MS-Product [117]. These were used to manually prepare theoretical cross-linked and/or modified peptide fragment ion databases specific to each cross-linker chemistry. In each peptide, for each fragment that contained a cross-linking or modification site, modified fragment masses were obtained by adding the mass of each cross-linker-specific modification. In the case of PFA, for fragment ions containing modification sites, a series of these fragment ion masses with multiples of +12 Da, multiples of +30 Da and every combination of +12 Da and +30 Da mass shifts was prepared depending on the number of modification sites.
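As an illustration of the PFA fragment series described above, the following sketch enumerates the mass shifts for a fragment with a given number of modification sites. It assumes that each site carries at most one modification (none, Schiff Base or methylol) and uses the nominal +12 and +30 Da shifts quoted in the text; the function name and the example fragment mass are hypothetical.

```python
SCHIFF_BASE_DA = 12.0  # nominal Schiff Base shift quoted in the text
METHYLOL_DA = 30.0     # nominal methylol shift quoted in the text


def pfa_modified_fragment_masses(fragment_mass, n_sites):
    """Fragment masses for every mix of +12 and +30 Da shifts on n_sites sites.

    Assumes at most one modification per site; the unmodified mass is included.
    """
    shifts = set()
    for n_schiff in range(n_sites + 1):
        for n_methylol in range(n_sites - n_schiff + 1):
            shifts.add(n_schiff * SCHIFF_BASE_DA + n_methylol * METHYLOL_DA)
    return sorted(fragment_mass + shift for shift in shifts)


# Hypothetical fragment ion of 500.0 Da containing two modification sites.
series = pfa_modified_fragment_masses(500.0, 2)
print(series)
```

For two sites this yields shifts of 0, 12, 24, 30, 42 and 60 Da, i.e. the multiples of +12 Da, the multiples of +30 Da and their mixed combination.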
For PFA fragment ions containing cross-linking sites, a series of fragment ion masses with multiples of +12 Da mass shifts was derived depending on the number of cross-linking sites. For EDC fragment ions containing modification sites, a series of fragment ion masses with multiples of +155.00 Da mass shifts was prepared. For sulfoDST, BS3 and sulfoEGS cross-linkers, a series of fragment ion masses with mass shifts of +132.01, +156.08, and +244.06 Da, respectively, for each modification site was derived. Each MS/MS spectrum of candidate cross-linked species was inspected manually for the presence of unmodified and/or modified backbone fragment ions. If the MS/MS signals matched the theoretical fragment ion signals of one peptide in the database, the mass of the precursor ion was used to deduce the mass of the possible second component peptide. The mass of the potential second peptide is equal to the mass of the precursor ion minus the mass of the first peptide minus the mass of the cross-linker bridge pertaining to the sample. In addition, cross-links between adjacent peptides were not considered as cross-linked species since they also matched single peptides with missed cleavages. If the deduced mass did not correspond to any peptide in the database but matched a potential modification (i.e. oxidation, deamidation, or cross-linker-specific modification) then this species was considered to be a single peptide and not a cross-link. If the mass matched another peptide then the expected backbone fragment ions of the second peptide were searched for manually in the MS/MS spectrum. Fragment ions corresponding to intact unmodified or modified peptides were also searched for in the MS/MS spectrum. In addition, a list of fragment ion masses equal to the backbone fragment ion mass of one peptide plus the cross-linker bridge plus the mass of the other peptide component was prepared and compared to the MS/MS spectrum of the potential cross-linked species.
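The deduction of the second peptide mass described above reduces to a single subtraction. The sketch below is illustrative only: the conversion from m/z to neutral mass with a proton mass of 1.007276 Da is an added assumption (the text states the subtraction in terms of masses), and the function name and example values are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # assumed charge carrier for positive-mode ESI


def second_peptide_mass(precursor_mz, charge, first_peptide_mass, bridge_mass):
    """Neutral precursor mass minus the first peptide minus the cross-linker bridge."""
    precursor_mass = (precursor_mz - PROTON_MASS_DA) * charge
    return precursor_mass - first_peptide_mass - bridge_mass


# Hypothetical BS3 cross-link (bridge 138.07 Da) observed as a 3+ ion.
deduced = second_peptide_mass(
    precursor_mz=880.780609, charge=3,
    first_peptide_mass=1000.50, bridge_mass=138.07,
)
print(round(deduced, 2))
```

The deduced mass would then be compared against the peptide database within the same mass window used for the precursor matching.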
All fragment ions with charge states up to one less than the charge state of the precursor ion were considered. If MS/MS signals did not match peptides in the database or matched one peptide with an unexplainable modification mass, these species were labelled as “undetermined.”

Cross-linking sites were localized by considering the residue reactivity of each cross-linker and examining modified type 3 and type 2 fragment ion evidence. If cross-linking sites were ambiguous, the degree of modification (DOM) was calculated using the following formula:

\[ \mathrm{DOM} = \frac{PA_0}{PA_0 + PA_1} + \frac{PA_1}{PA_0 + PA_1} \]

where PA_0 = relative peak area of the unmodified fragment ion and PA_1 = relative peak area of the singly modified fragment ion. Peak area values were normalized to the total peak area and were equated to the relative peak areas of each signal.

[Figure 2.1 workflow, tool or step followed by the resulting list:
Bruker Compass Data Analysis: Monoisotopic Mass List (- S/N < 2, m/z < 400, z = +1)
Excel: Final Mass List (- Control, Peptide, Impossible Cross-Link signals)
Mathematica: Candidate List
Manual Inspection: MS Candidate List (- Insufficient MS signals, i.e. low signals, mixtures)
Manual Inspection: MS/MS Confirmed Cross-Links (- Peptide, Insufficient MS/MS, Undeterminable Signals)
Crystal Structure (PyMOL) and SDS-PAGE: Verified and Classified Cross-Links]

Figure 2.1: Data analysis workflow for cross-link identification, where items in parentheses are values that have been eliminated. All elimination and matching of monoisotopic masses using Mathematica and Excel was performed using a mass accuracy of ±0.2 Da.

2.6.3 Mascot Analysis of Unmodified Calmodulin

The MS/MS data from a control calmodulin sample produced by the QStar and Bruker Impact II were analyzed using the Mascot MS/MS peptide search (Matrix Science, Boston, MA) [118]. The Mascot generic format (mgf) files were generated using the Bruker Daltonics Compass Data Analysis 4.2 or Analyst QS 1.1 software.
The MS/MS Ion Search was performed using a peptide mass tolerance and fragment mass tolerance of 0.2 Da, a significance threshold of p < 0.05 and a score cut-off of 20. Variable modifications were set as follows: Trimethyl (K), Oxidation (M), Acetyl (N-terminus), Deamidated (NQ) and Acetyl (Protein N-terminus). The database was set to UP_cow and the instrument was set to ESI-QUAD-TOF.

2.6.4 Automated Cross-link Identification

StavroX/MeroX (University of Halle-Wittenberg) [119, 120] and pLink (Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China) [121] were used to compare manual versus software identified cross-linked species. Parameters similar to the manual MS/MS search for cross-linked species were implemented in the software searches. For all the software, an MS/MS search was performed. The mgf files for each LC-MS/MS sample data set were prepared using the Bruker Daltonics Compass Data Analysis 4.2 software. Fasta files of calmodulin-melittin and RNaseS were prepared by inputting the protein sequences into a Format Converter (HIV Sequence Database) [114]. Minimum mass accuracies of 60 ppm for MS and 10 ppm for MS/MS signals, respectively, were used, based on the mass accuracy of manually identified cross-linked species. A maximum of four missed cleavages was accounted for and trypsin was set to cleave after cross-linked K and R residues to maximize the number of cross-linked species identified. For the calmodulin-melittin searches, the following fixed modifications were used: amidation of the C-terminus, acetylation of the N-terminus, and trimethylation of K; and the following variable modifications were used: deamidation of N and oxidation of M. For the PFA cross-linked sample, Schiff Base and methylol modifications on the N-terminus, K residues and R residues were added as variable modifications.
For sulfoDST, BS3 and sulfoEGS cross-linked samples, modifications due to the hydrolysis of each cross-linker were set as variable modifications. For EDC cross-linked samples, variable modifications from the formation of N-acylurea (+155.00 Da) were considered. The PFA cross-linker had to be added to the cross-linker database for both MeroX and pLink and was set to form cross-links from N-terminal, K and R residues to N-terminal, H, Y, Q, R and N residues with a bridge mass of 12 Da. For StavroX, a score cut-off of 50 was used, which corresponds to a false discovery rate (FDR) < 5% [120]. In pLink, only cross-linked species that fall below a 5% FDR are reported [122]. The minimum MS/MS fragment ion evidence established to confirm a cross-linked species manually was used to assess whether cross-links identified by the software existed (see section 4.5 for specific criteria). The m/z and charge of the software identified cross-linked species were examined in the MS spectrum using Bruker Daltonics Compass Data Analysis 4.2 software. If the MS signals of the cross-linked species were not present, they were classified as having insufficient MS evidence for confirmation. StavroX produces an annotated spectrum for each cross-linked species identified and this was used to evaluate each cross-link identification. The pLink software did not provide annotated MS/MS spectra and therefore signals were checked in the raw MS and MS/MS spectra manually using the Bruker Daltonics Compass Data Analysis 4.2 software.

2.7 Analysis Based on Relative Abundance Calculations in the Calmodulin-Melittin System

For all unmodified, modified and cross-linked species that were identified, abundances were equated to the normalized peak area of their MS signal. The peak area of each peptide was normalized by dividing it by the total peak area of all peptides in each sample.
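The normalization described above can be written as a short sketch. The function name, peptide labels and peak areas below are hypothetical and serve only to show the arithmetic.

```python
def normalize_peak_areas(peak_areas):
    """Divide each peptide peak area by the total peak area of the sample."""
    total = sum(peak_areas.values())
    return {peptide: area / total for peptide, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units) for one sample.
sample = {"peptide_A": 300.0, "peptide_B": 100.0}
normalized = normalize_peak_areas(sample)
print(normalized)
```

Because every value is divided by the same total, the normalized peak areas of a sample sum to one, which is what allows them to be compared between samples.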
Normalized peak areas were summed across all runs for each unique peptide. In order to make conclusions based on the abundance of species in each sample, a universal assumption was made in this study that no peptides were lost in the sample preparation, MS detection or identification processes, since it is not feasible to account for this loss. It is also assumed that there is uniformity in the MS response of all peptides in the sample and that all peptides were ionized with similar efficiency. Percent abundance values (> 1%) are reported with zero decimal places to reflect the accuracy of the MS peak area measurements. For percent abundance values < 1%, two or three decimal places are given to prevent reporting a 0% abundance. In addition, in order for the total percent abundance to equal 100% for a particular site, the percent abundance values > 99% are reported to a number of decimal places that matches the corresponding values < 1%.

2.7.1 Percent Trypsin Cleavage and Accessibility

The percent trypsin cleavage for the control and PFA treated samples in both the calmodulin-melittin and RNaseS systems was derived for all observed trypsin cleavage sites in these samples. A list of all trypsin cleavage sites and missed cleavage sites was prepared from the MaxQuant identified control, PFA unmodified and PFA modified peptide lists. A list of all trypsin cleavage sites and missed cleavage sites was also prepared from the PFA cross-linked peptides identified manually. For each peptide, N-terminal and C-terminal K or R residues (with the exception of the calmodulin C-terminal residue K148 and melittin N-terminal residue G1) were considered as trypsin cleavage sites.
Internal K or R residues were considered missed cleavage sites (with the exception of the calmodulin trimethylated K115). In addition, missed cleavage residues carrying either a +12 or +30 Da PFA modification were excluded as trypsin missed cleavage sites so that trypsin cleavage was examined only as a function of structural accessibility. In the case where the position of the PFA modification was ambiguous, the number of modifications per peptide was subtracted from the number of trypsin missed cleavage sites to obtain the number of trypsin missed cleavage sites that were not a result of a PFA modification.

The percent trypsin cleavage for calmodulin and melittin in control vs PFA treated samples was calculated for each cleavage site by dividing the normalized abundance of peptides that supported cleavage at a particular site by the total normalized abundance of all peptides containing the particular site (with or without a missed cleavage). This was performed for unmodified, modified and cross-linked peptides identified in the PFA treated sample and the unmodified peptides identified in the control sample.

2.7.2 Relative Abundances and Equilibrium of Formaldehyde Unmodified, Modified and Cross-linked Species

2.7.2.1 Percent Relative Abundance of Cross-linking and Modification

For each cross-linking site identified in the PFA treated calmodulin-melittin samples, the relative abundances of all the peptides that were unmodified, modified with a +12 Da modification, modified with a +30 Da modification and cross-linked at that particular cross-linking site were summed for each category. This determined the total unmodified abundance, modified abundance, and cross-linked abundance for each cross-linking site.
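The per-site summation described above can be sketched as follows. The record layout, the function name and the site labels are hypothetical; the only assumption taken from the text is that +12 Da and +30 Da peptides both count toward the modified abundance of a site.

```python
from collections import defaultdict

# Peptide forms named in the text, mapped to the three reported categories.
CATEGORY = {
    "unmodified": "unmodified",
    "+12": "modified",
    "+30": "modified",
    "cross-linked": "cross-linked",
}


def site_category_abundances(records):
    """Sum normalized abundances per cross-linking site and category.

    records: iterable of (site, form, normalized_abundance) tuples.
    """
    totals = defaultdict(
        lambda: {"unmodified": 0.0, "modified": 0.0, "cross-linked": 0.0}
    )
    for site, form, abundance in records:
        totals[site][CATEGORY[form]] += abundance
    return dict(totals)


# Hypothetical peptide records; site labels and values are illustrative only.
records = [
    ("site_1", "unmodified", 0.5),
    ("site_1", "+12", 0.125),
    ("site_1", "+30", 0.125),
    ("site_1", "cross-linked", 0.25),
    ("site_2", "unmodified", 0.75),
]
totals = site_category_abundances(records)
print(totals["site_1"])
```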
To determine the total percent of unmodified, modified and cross-linked protein, the abundances of each cross-linking site in its unmodified, modified and cross-linked form were each summed and divided by the total sum of abundances for all forms. It was assumed that all +12 Da modifications were Schiff Base modifications, both because previous PFA studies have confirmed that the formation of Schiff Base modifications is more likely than intrapeptide cross-links in proteins when using short reaction times similar to this current study [36, 85] and in order to account for all potential Schiff Base modifications.

2.7.2.2 Derivation of Equilibrium Constants for Formaldehyde Cross-linking Reaction

The equilibrium constants for the PFA cross-linking that occurred between the identified cross-linking sites in the calmodulin-melittin system were determined. For cross-linking sites that involved residues that were identified with both methylol (+30 Da) and Schiff Base (+12 Da) modified peptides, the equilibrium constants K1, K2 and K3 are defined in the reaction scheme depicted in Figure 2.2. However, for those cross-linking sites that involved residues that were identified with only +12 Da modified peptides, the equilibrium constant K1+2 is defined in the reaction scheme depicted in Figure 2.3.

[Figure 2.2 reaction scheme:
K1: R1-NH2 (m1) + CH2O <=> R1-NH-CH2OH (m1 + 30 Da)
K2: R1-NH-CH2OH (m1 + 30 Da) <=> R1-N=CH2 (m1 + 12 Da) + H2O
K3: R1-N=CH2 (m1 + 12 Da) + R2-H (m2) <=> R1-NH-CH2-R2 (M = m1 + m2 + 12 Da)]

Figure 2.2: PFA cross-linking equilibrium reaction steps, where K1, K2, and K3 are the equilibrium constants for the formation of a methylol, Schiff Base and methylene bridge, respectively. The notation for each reactant and product is defined. R1 and R2 represent protein sites 1 and 2, respectively.

[Figure 2.3 reaction scheme:
K1+2: R1-NH2 (m1) + CH2O <=> R1-N=CH2 (m1 + 12 Da) + H2O]

Figure 2.3: The PFA modification equilibrium reaction step defines K1+2 in the case where a methylol modification was not identified.
R1 and R2 represent protein sites 1 and 2, respectively.

The expressions for the equilibrium constants were derived. The equilibrium constant is defined as the concentration of products over the concentration of reactants (in dilute solutions):

\[ K = \frac{[\mathrm{Products}]}{[\mathrm{Reactants}]} \]

The total equilibrium constant is the product of the equilibrium constants (K1, K2, and K3) for each intermediate step of the reaction:

\[ K_1 = \frac{[\mathrm{R_1CH_2OH}]}{[\mathrm{R_1NH_2}][\mathrm{CH_2O}]} \]

\[ K_2 = \frac{[\mathrm{R_1NCH_2}][\mathrm{H_2O}]}{[\mathrm{R_1CH_2OH}]} \]

\[ K_3 = \frac{[\mathrm{R_1NHCH_2R_2}]}{[\mathrm{R_1NCH_2}][\mathrm{R_2H}]} \]

PFA was added in large excess relative to protein (5500X) and the amount of PFA that reacted with the protein is negligible. Therefore, the PFA concentration was assumed to be constant and was excluded from the equilibrium expression. Also, since cross-linking and subsequent analyses were performed in solution, water is also assumed to not affect the equilibrium and was thus excluded from the equilibrium expression. K'1 and K'2 are defined below, assuming PFA and water are constants in the equilibrium expressions:

\[ K'_1 = \frac{[\mathrm{R_1CH_2OH}]}{[\mathrm{R_1NH_2}]} \]

\[ K'_2 = \frac{[\mathrm{R_1NCH_2}]}{[\mathrm{R_1CH_2OH}]} \]

“A” is defined as the normalized abundance (i.e. the normalized MS peak area) of each species, where n is the identity of the species:

\[ A_n = \text{Abundance of species } n \]

The concentration of a species involved in the equilibrium reaction is proportional to its relative abundance. The relative abundance was calculated by dividing the normalized abundance of each product or reactant by the total normalized abundance of all products and reactants. This ensures that the equilibrium constant remains unitless in all three equilibrium constant equations. The relative abundance was substituted in the equilibrium constant expression and the definition of the equilibrium constant based on abundance, K_MS, is given below:

\[ K_{MS} = \frac{A_{\mathrm{product}} / \sum A}{A_{\mathrm{reactant}} / \sum A} \]
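The abundance-based definition above can be evaluated numerically with a small sketch. It assumes that, for a step with more than one reactant, the relative abundances of the reactants are multiplied in the denominator, as in a conventional equilibrium expression; the function names, species labels and abundance values are hypothetical.

```python
import math


def relative_abundances(abundances):
    """Divide each normalized abundance by the total of all species in the step."""
    total = sum(abundances.values())
    return {name: value / total for name, value in abundances.items()}


def k_ms(abundances, products, reactants):
    """Abundance-based equilibrium constant for one reaction step."""
    rel = relative_abundances(abundances)
    numerator = math.prod(rel[name] for name in products)
    denominator = math.prod(rel[name] for name in reactants)
    return numerator / denominator


# Hypothetical normalized abundances for the methylol step (one reactant, one product).
step_1 = {"m1": 0.5, "m1+30": 0.1}
k1_prime = k_ms(step_1, products=["m1+30"], reactants=["m1"])

# Hypothetical normalized abundances for the cross-linking step (two reactants).
step_3 = {"m2": 0.4, "m1+12": 0.05, "M": 0.02}
k3 = k_ms(step_3, products=["M"], reactants=["m1+12", "m2"])
print(k1_prime, k3)
```

With one reactant and one product the total abundance cancels, so the constant is a simple ratio of abundances; with two reactants the total remains as a multiplicative factor.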
For the PFA cross-linking reactions shown in Figures 2.2 and 2.3, the total normalized abundance of all products and reactants for each step of the reaction is defined as follows:

\[ \sum A_1 = A_{m_1} + A_{m_1+30} \]

\[ \sum A_2 = A_{m_1+12} + A_{m_1+30} \]

\[ \sum A_3 = A_{m_2} + A_{m_1+12} + A_M \]

The relative abundances were substituted in the equilibrium constant expressions to derive the following equations:

\[ K'_{1MS} = \frac{A_{m_1+30} / \sum A_1}{A_{m_1} / \sum A_1} = \frac{A_{m_1+30}}{A_{m_1}} \]

\[ K'_{2MS} = \frac{A_{m_1+12} / \sum A_2}{A_{m_1+30} / \sum A_2} = \frac{A_{m_1+12}}{A_{m_1+30}} \]

\[ K'_{(1+2)MS} = K'_{1MS} K'_{2MS} = \frac{A_{m_1+12}}{A_{m_1}} \]

\[ K_{3MS} = \frac{A_M / \sum A_3}{\left( A_{m_1+12} / \sum A_3 \right) \left( A_{m_2} / \sum A_3 \right)} = \frac{A_M}{(A_{m_1+12})(A_{m_2})} \left( \sum A_3 \right) \]

2.7.3 Relative Abundance of Different Types of Cross-linking

The relative abundance of calmodulin-calmodulin versus calmodulin-melittin and antiparallel vs parallel calmodulin-melittin cross-linking was calculated using the identified EDC, PFA, sulfoDST, BS3 and sulfoEGS cross-linked species. Cross-links were classified as calmodulin-calmodulin cross-linking if both cross-linking sites were from the calmodulin molecule. Each calmodulin-melittin cross-link was classified as “parallel” if the cross-linking occurred between residues from the N-terminal domain of both calmodulin and melittin. Each calmodulin-melittin cross-link was classified as “antiparallel” if the cross-linking occurred between residues from opposite domains of calmodulin and melittin (i.e. the N-terminal domain of one component and the C-terminal domain of the other component).
For each cross-linker, the sum of the normalized peak area values of all identified cross-linked species of each type (calmodulin-calmodulin, calmodulin-melittin, parallel calmodulin-melittin or antiparallel calmodulin-melittin) was divided by the sum of normalized peak area values of all identified cross-linked species.

2.8 Crystal Structure Distance Constraints for the Calmodulin-Melittin System

Crystal structures for unbound Ca2+-saturated calmodulin (3CLN) [123], unbound Ca2+-free calmodulin (1CFC) [124], calmodulin in its bound state (1PRW) [125] and melittin (2MLT) [126] were obtained from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) [127]. All distance measurements on each crystal structure were performed using Schrödinger PyMOL, a molecular visualization software [128] (Schrödinger, New York, NY). The Cα is defined as the carbon atom on the peptide backbone connected to the side chain. For intramolecular calmodulin cross-links, distances between the Cα atoms of each cross-linked amino acid (Cα-Cα distance) were measured on unbound calmodulin and bound calmodulin crystal structures. The maximum cross-linking distance is defined as the largest distance at which two amino acid reactive sites can exist in the protein structure and still form a cross-link. Table 2.1 lists the maximum cross-linking distances for each cross-linker for every possible combination of reactive sites. It was assumed that side chains can freely rotate about the Cα with a radius equal to the length of the side chain. Side chains that can come into contact with the cross-linker within this radius were considered cross-linked. This accounts for the side chain flexibility. To account for backbone flexibility, an additional 6 Å was added to the maximum cross-linking distance.
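The distance calculation can be sketched as the sum of the two side chain lengths, the spacer arm length and the 6 Å backbone allowance. In the sketch below the spacer arm lengths are those listed in Table 2.1, whereas the side chain lengths are not stated in the text: they are back-calculated here from the Table 2.1 entries by subtraction and should be read as an assumption. The function name is hypothetical.

```python
BACKBONE_FLEXIBILITY_A = 6.0  # allowance for backbone flexibility from the text

# Spacer arm lengths (angstroms) as listed in Table 2.1.
SPACER_ARM_A = {"EDC": 0.0, "PFA": 2.3, "sulfoDST": 6.4, "BS3": 11.4, "sulfoEGS": 16.1}

# Side chain lengths (angstroms, C-alpha to reactive atom). Assumption: these
# values are inferred from the Table 2.1 distances, not quoted in the text.
SIDE_CHAIN_A = {
    "N-term": 0.0, "K": 6.4, "R": 6.9, "D": 3.7, "E": 4.9,
    "Q": 3.9, "N": 2.5, "H": 4.7, "Y": 4.6,
}


def max_crosslink_distance(crosslinker, residue_a, residue_b):
    """Maximum C-alpha to C-alpha distance (angstroms) for a cross-linked pair."""
    total = (SIDE_CHAIN_A[residue_a] + SIDE_CHAIN_A[residue_b]
             + SPACER_ARM_A[crosslinker] + BACKBONE_FLEXIBILITY_A)
    return round(total, 1)


print(max_crosslink_distance("BS3", "K", "K"))
print(max_crosslink_distance("PFA", "R", "R"))
```

A measured Cα-Cα distance in a crystal structure would then be judged consistent with a cross-link if it does not exceed the value returned for that cross-linker and residue pair.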
All together, maximum cross-linking distances were calculated by summing the length of each amino acid side chain involved in the cross-link (measured from Cα to cross-linked atom), the cross-linker spacer arm length (bridge length), and 6 Å for backbone flexibility. This calculation was based on recent molecular dynamics simulations [129]. As shown in Table 2.1, the maximum cross-linking distance possible for EDC, PFA, sulfoDST, BS3, and sulfoEGS cross-linkers is 17.3, 22.1, 25.2, 30.2 and 34.9 Å, respectively.

Table 2.1: Maximum cross-linking distances for every combination of possible reactive sites for each cross-linker.

Cross-Linker   Spacer Arm    Cross-Linking   Modified Residues
               Length (Å)    Residues
EDC            0                             D         E
                             N-term          9.7 Å     10.9 Å
                             K               16.1 Å    17.3 Å
PFA            2.3                           N-term    K         R
                             N-term          8.3 Å     14.7 Å    15.2 Å
                             Q               12.2 Å    18.6 Å    19.1 Å
                             N               10.8 Å    17.2 Å    17.7 Å
                             R               15.2 Å    21.6 Å    22.1 Å
                             H               13 Å      19.4 Å    19.9 Å
                             Y               12.9 Å    19.3 Å    19.8 Å
                                             N-term    K
SulfoDST       6.4           N-term          12.4 Å    18.8 Å
                             K               18.8 Å    25.2 Å
BS3            11.4          N-term          17.4 Å    23.8 Å
                             K               23.8 Å    30.2 Å
SulfoEGS       16.1          N-term          22.1 Å    28.5 Å
                             K               28.5 Å    34.9 Å

The Cα-Cα distances were compared to the maximum cross-linking distance to evaluate intra-calmodulin cross-linked species. Calmodulin-melittin structures were derived using maximum cross-linking distances between all MS/MS verified cross-linked species.

Chapter 3

General Data Analysis of Cross-linked Calmodulin-Melittin and Ribonuclease S

3.1 Cross-linking of Calmodulin-Melittin and Ribonuclease-S Complexes with Various Cross-linkers

PFA along with various established cross-linkers (EDC/sulfoNHS, sulfoDST, BS3, and sulfoEGS) were applied to two transient protein complexes: Ca2+-free calmodulin-melittin and RNaseS (S-peptide and S-protein).

EDC/sulfoNHS, sulfoDST, BS3 and sulfoEGS cross-linking of these complexes was carried out using previously optimized protocols [99] with a one hour incubation time at 37 ºC. All cross-linking was performed at physiological pH (pH = 7.4) except EDC, which was performed at a pH = 6.5.
The activation step of EDC and sulfoNHS has been shown to be most efficient at this pH [33, 35]. Mild PFA cross-linking reaction conditions (physiological pH, 1% PFA, and 6 hour reaction time) were utilized to mimic in vivo protein interaction studies [50, 68]. The two control samples (one for EDC and one for other cross-linkers) were prepared by replacing the cross-linker reagent with an equal volume of buffer and served as negative controls. EDC’s zero-length cross-linker bridge and heterobifunctionality [33] served as a suitable comparison to PFA’s pseudo zero-length bridge and differing residue reactivity between the first and second step of the cross-linking reaction. On the other hand, sulfoNHS ester cross-linkers (sulfoDST, BS3 and sulfoEGS) are reactive with lysines and N-terminal residues in both steps of the cross-linking reaction [32], making their reactive sites much more predictable than those of PFA. With straightforward reactivity, the NHS ester cross-linkers of various lengths were chosen to examine the effect of cross-linker size on specificity and capturing protein structure.

3.1.1 Confirming Calcium-Free Calmodulin-Melittin

In this study, the Ca2+-free calmodulin-melittin system was analyzed. In order to confirm that the calmodulin-melittin system remained Ca2+-free regardless of potential Ca2+ contamination in the solution, the expected ratio of Ca2+-loaded calmodulin to Ca2+-free calmodulin was calculated. The level of Ca2+ reported in unpurified tap water at the University of British Columbia (the location of the lab) was < 2.5 mg/L (6.2×10^-5 M) [130]. However, the water purification system (Nanopure Ultrapure Water System, Barnstead) utilized in this experiment was shown to produce deionized water that contained < 1 ng/L (2.5×10^-11 M) of Ca2+ [131].
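The molar concentrations quoted above follow from dividing the mass concentration by the molar mass of calcium. A minimal sketch of this conversion is given below; the molar mass of 40.078 g/mol is a standard value that is not stated in the text, and the function name is hypothetical.

```python
CALCIUM_MOLAR_MASS_G_PER_MOL = 40.078  # standard atomic weight, not from the text


def calcium_molarity(grams_per_litre):
    """Convert a Ca2+ mass concentration (g/L) to a molar concentration (mol/L)."""
    return grams_per_litre / CALCIUM_MOLAR_MASS_G_PER_MOL


tap_water_molar = calcium_molarity(2.5e-3)    # 2.5 mg/L
ultrapure_molar = calcium_molarity(1.0e-9)    # 1 ng/L
print(f"{tap_water_molar:.1e} M, {ultrapure_molar:.1e} M")
```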
The presence of EDTA, a calcium chelating reagent, in the calmodulin sample can maintain a Ca2+-free complex by binding to any residual Ca2+ from water in the sample, as demonstrated in previous literature [132, 133]. EDTA and calmodulin were present in the sample at final concentrations of 8.1×10^-4 M and 6.0×10^-5 M, respectively. To determine whether the concentration of EDTA present in the sample was sufficient to prevent Ca2+-loaded calmodulin from forming to a significant extent, the equilibria of Ca2+ binding to EDTA and to calmodulin were considered simultaneously. EDTA binds to Ca2+ with a higher affinity than calmodulin and the concentration of EDTA is 13 times greater than that of calmodulin [134]. The pH-dependent binding affinity of EDTA to Ca2+ was taken into account for the two different buffer pH conditions utilized in this experiment, i.e. pH = 6.5 for EDC cross-linking and pH = 7.4 for the other (PFA, sulfoDST, BS3, and sulfoEGS) cross-linking reactions [135]. Previous studies have shown that upon binding to melittin, the binding affinity of calmodulin for Ca2+ increases 300 times [136]. The total concentration of Ca2+-loaded calmodulin-melittin was calculated to be 7.9×10^-13 M and 8.2×10^-12 M in the presence of EDTA under the EDC and other cross-linking buffer conditions, respectively. The ratio of Ca2+-loaded calmodulin-melittin to Ca2+-free calmodulin-melittin was calculated to be only 1.3×10^-8 and 1.4×10^-7 in the EDC and other cross-linking reaction mixtures, respectively (the details of the calculation are shown in Appendix A.1.3), which is negligible.

These ratios were calculated based on the calcium content in the ultrapure, deionized water utilized in this experiment.
However, if calcium contamination had occurred, it would require an initial Ca2+ concentration of at least 7.8×10^-5 M (3.3 mg/L) or 6.2×10^-5 M (2.5 mg/L) in the EDC and other cross-linking reaction mixtures, respectively, for even 1% of the calmodulin-melittin to be Ca2+-loaded (calculation shown in Appendix A.1.3). These Ca2+ concentrations are at or above the level reported for regular, unpurified tap water (< 2.5 mg/L). This suggests that even if some tap water had been introduced into the reaction mixture, Ca2+-loaded calmodulin-melittin would not form to a significant extent in the samples. All together, this supports that the majority of the calmodulin-melittin complexes should be Ca2+-free in this study.

3.2 Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis Separation

3.2.1 Running Conditions and Gel Band Appearance

Reaction mixtures were separated by one dimensional SDS-PAGE and visualized using Coomassie Blue to qualitatively assess cross-linking products and the presence of protein components. A 16% acrylamide gel with a Tris-Tricine buffer system (mass range of 1-100 kDa) was used since it is suitable for small proteins < 30 kDa such as calmodulin (~16.9 kDa) and the S-protein (~11.5 kDa) of RNaseS and also for peptides such as melittin (~2.9 kDa) and the S-peptide (~2.2 kDa) in RNaseS. In this buffer system, both the stacking and running gels have a pH = 8.45, which allows proteins to move more slowly for an improved separation of proteins with a narrow range of molecular weights in comparison to glycine-based buffer systems that decrease the pH to 6.8 for the stacking gel. Furthermore, a higher percent of
acrylamide was used to further enhance the resolution of small protein components by allowing them to migrate at a slower rate [137].

Since cross-linked species form at a relatively low yield, a higher protein concentration must be used in comparison to other protein experiments in order for them to be detected by LC-MS/MS after in-gel digestion and purification procedures [32]. Approximately 50 and 8 µg of calmodulin and melittin, respectively, were present in the loaded reaction mixtures. For the RNaseS complex, approximately 58 and 11 µg of the S-protein and S-peptide, respectively, were present in the loaded reaction mixtures. Loading a high protein concentration in the SDS-PAGE can cause poor resolution and streaking [138]. For the calmodulin-melittin complex (Figure 3.1), a poor resolution of bands was observed, especially for the PFA (lane 5), BS3 (lane 7) and sulfoEGS (lane 8) cross-linked samples. Both streaking and poor resolution were more prominent in the RNaseS SDS-PAGE (Figure 3.2) for the other cross-linker control sample (lane 3) and the PFA (lane 5) and BS3 (lane 7) cross-linked samples. Achieving an excellent resolution with cross-linked samples is challenging since unmodified, intra-cross-linked and modified proteins do not differ significantly in mass and also possess similar molecular shapes and amino acid sequences, thus appearing as overlapping bands.
[Figure 3.1 gel image: lanes 1 to 8; marker bands at 250, 130, 100, 70, 55, 40, 25, 20, 15 and 10 kDa; band categories: (4) > 40 kDa, intermolecular complex cross-linking; (3) 20-40 kDa, intermolecular calmodulin or complex cross-linking; (2) 10-20 kDa, intramolecular calmodulin or complex cross-linking; (1) < 10 kDa, inter- or intramolecular melittin cross-linking]

Figure 3.1: SDS-PAGE of calmodulin-melittin cross-linking reaction mixtures with the protein marker (lane 1); control samples with EDC buffer conditions (lane 2), and control samples with all other cross-linker buffer conditions (lane 3); cross-linked samples EDC/sulfoNHS (lane 4), PFA (lane 5), sulfoDST (lane 6), BS3 (lane 7) and sulfoEGS (lane 8). Four approximate molecular weight categories of each protein/protein complex band are labelled with the type of cross-linking (if any) indicated.

[Figure 3.2 gel image: lanes 1 to 8; marker bands at 250, 130, 100, 70, 55, 40, 25, 20, 15 and 10 kDa; band categories: (4) > 40 kDa, intermolecular complex cross-linking; (3) 20-40 kDa, intermolecular S-protein or complex cross-linking; (2) 10-20 kDa, intramolecular S-protein or complex cross-linking; (1) < 10 kDa, inter- or intramolecular S-peptide cross-linking]

Figure 3.2: SDS-PAGE of RNaseS cross-linking reaction mixtures with protein marker (lane 1); control samples with EDC buffer conditions (lane 2), and control samples with all other cross-linker buffer conditions (lane 3); cross-linked samples EDC/sulfoNHS (lane 4), PFA (lane 5), sulfoDST (lane 6), BS3 (lane 7) and sulfoEGS (lane 8). Four approximate molecular weight categories of each protein/protein complex band are labelled with the type of cross-linking (if any) indicated.

Figure 3.3: Literature study that compares the SDS-PAGE of calmodulin in the presence of EDTA, without Ca2+ (lanes 5 and 6) and in the presence of Ca2+, without EDTA (lanes 2 and 3). Lanes 1 and 4 are protein markers.
The amounts of calmodulin used were 6 µg (lanes 2 and 5) and 12 µg (lanes 3 and 6). The concentrations of CaCl2 and EDTA were both 5 mM (adapted from reference [139], with permission).

3.2.2 Unmodified Protein Gel Bands

First, the control samples (lanes 2 and 3) were examined to ensure the presence of the protein components in the reaction mixture and to observe the behavior of the unmodified protein complex under the conditions utilized in the SDS-PAGE. In Figure 3.1, control samples (lanes 2 and 3) contained a strong band for calmodulin (~17 kDa) and faint bands for melittin (< 10 kDa). An additional band in the control sample, more prominent in the EDC control (lane 2), appeared between 17 and 20 kDa. This band was also observed in literature [99, 140, 141] for calmodulin, suggesting that it is most likely an impurity associated with calmodulin. In this current study, the purchased calmodulin was 95% pure by SDS-PAGE, so it is possible that impurities are present. The buffer pH used for the EDC control (lane 2) and the control for other cross-linkers (lane 3) was 6.5 and 7.4, respectively. Higher pH promotes oligomerization of melittin due to its alpha amino group being deprotonated [142]. This may explain the bands for melittin oligomers (< 10 kDa bands) appearing in only the control for other cross-linkers.

The appearance of the unmodified calmodulin SDS-PAGE gel bands in this experiment was then compared with that of the SDS-PAGE gel bands produced by Ca2+-free calmodulin in literature. Upon binding to Ca2+, calmodulin adopts a more compact structure with a smaller Stokes radius than its Ca2+-free structure [143]. Therefore Ca2+-loaded calmodulin possesses a greater electrophoretic mobility during SDS-PAGE than Ca2+-free calmodulin [144].
It was observed previously that Ca2+-loaded calmodulin migrates further down the SDS-PAGE gel than Ca2+-free calmodulin, even under denaturing conditions in which the protein should be unfolded [145]. Figure 3.3 depicts a 17% SDS-PAGE gel from a previous study that examined the electrophoretic mobility of calmodulin in the presence of Ca2+ or EDTA [139]. As shown in Figure 3.3, the bands for Ca2+-free calmodulin (treated with EDTA) appeared at 14-18 kDa whereas the bands for Ca2+-loaded calmodulin (treated with CaCl2) appeared at 10-14 kDa, illustrating the faster migration of the more compact structure of Ca2+-loaded calmodulin. It is expected that Ca2+-calmodulin would produce a series of bands corresponding to calmodulin loaded with 1, 2, 3 or 4 Ca2+ ions and that Ca2+-free calmodulin would produce a single band. For both samples, the bands spread over a 4 kDa range. However, in the Ca2+-loaded calmodulin sample, more distinct bands corresponding to multiple forms of calmodulin loaded with different numbers of Ca2+ ions were observed, in contrast to the uniformly spread band in the Ca2+-free calmodulin sample. In this study, the bands for unmodified calmodulin were observed at ~17 kDa, as shown in Figure 3.1, which is similar to the Ca2+-free calmodulin sample in Figure 3.3. In addition, the spreading of the calmodulin band in this study resembled the spreading of the Ca2+-free calmodulin sample more than that of the Ca2+-loaded calmodulin sample in the literature study shown in Figure 3.3. Both attributes therefore support that the calmodulin in this experiment was Ca2+-free, based on comparison of the SDS-PAGE with what was observed in the literature [139]. This observation is also consistent with section 3.1.1, in which the presence of EDTA was shown to quench the majority of the Ca2+ in the sample, resulting in a nearly zero concentration of Ca2+-loaded calmodulin.

Figure 3.2 shows the SDS-PAGE of the cross-linked RNaseS samples. Since the cleavage site of RNaseA can be between residues 16 and 21, the resulting heterogeneous mixture of S-protein and S-peptide with various sequences may explain the appearance of a series of overlapping bands. The control samples (lanes 2 and 3) contained a strong band for the S-protein (11.4-11.6 kDa) and a faint band for RNaseA (13.7 kDa). A thin band appearing around 10 kDa could be an impurity associated with the sample, since the purchased RNaseS was > 70% pure. Although the Tris-Tricine buffer system was chosen specifically to capture low molecular weight proteins/peptides, the S-peptide (2.0-2.2 kDa) was still not observed. Other reports of RNaseS cross-linking have also been unable to observe the S-peptide on a gel due to its small size, which makes it difficult to retain in the gel [105]. The S-peptide (17-20 residues) is only slightly smaller than melittin (26 residues), which was observed as faint bands in the calmodulin-melittin SDS-PAGE. However, the S-peptide has an almost neutral isoelectric point (~6.8) in comparison to melittin (12.0) and may have migrated faster toward the end of the gel (toward the positive electrode), increasing the probability of escaping from the gel.

3.2.3 Estimation of Molecular Weight for Cross-linked Protein Gel Bands

Intermolecular cross-linking between two different calmodulin/S-protein (~33 kDa/~23 kDa) or calmodulin-melittin/RNaseS complex (> ~39 kDa/~27 kDa) molecules is expected to produce bands with at least double the mass of the unmodified calmodulin/S-protein or complex molecules. Intramolecular cross-linking within calmodulin-melittin or RNaseS would appear with a mass approximately equal to the mass of the complex (~20 kDa and ~14 kDa, respectively), since the mass change produced by the cross-linker bridges and/or modifications would be too small to observe on an SDS-PAGE gel.
However, intramolecular cross-linking within calmodulin/S-protein would produce a negligible mass difference in comparison to unmodified calmodulin/S-protein. This negligible mass increase would depend only on the number of cross-links and/or modifications formed and on the cross-linker bridge mass.

Using the protein marker bands of known molecular weights, a standard curve of molecular weight versus distance migrated from the top of the running gel was plotted and used to estimate the molecular weights of all gel bands for the calmodulin-melittin and RNaseS SDS-PAGE. Interestingly, cross-linked bands appeared below the expected mass, especially with EDC and PFA treated calmodulin-melittin. This may be due to the stabilizing nature of cross-linking, which resists the denaturation process of SDS-PAGE and maintains the compact protein structures. SDS-PAGE operates by unfolding proteins and binding negatively charged SDS to them in an amount approximately proportional to each protein’s molecular weight. Without a defined three-dimensional structure, the proteins migrate down the gel according to their molecular weight. However, if the structures of these proteins are preserved by cross-linking, they would travel down the gel as a function of their size/structure as well as their molecular weight. Since more compact or smaller structures travel further down the gel, their bands correspond to a lower apparent molecular weight than that of the unfolded, larger conformation. This observation is consistent with previous studies showing that non-reduced lysozyme migrated further than reduced lysozyme because its disulfide bonds retained its folded structure. Thus, the non-reduced lysozyme appeared with a molecular weight 14% lower than expected [146].
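The standard-curve estimate described above can be sketched as follows. This is a minimal illustration, assuming the commonly used linear relationship between log10(molecular weight) and migration distance; the thesis does not state the functional form of its fit. The marker masses are those labelled on the gels, whereas the migration distances, the band position and the function names `fit_standard_curve` and `estimate_mw` are hypothetical.

```python
import math

def fit_standard_curve(marker_kda, marker_mm):
    """Least-squares line through (distance, log10(MW)) for the marker bands."""
    log_mw = [math.log10(mw) for mw in marker_kda]
    n = len(log_mw)
    mean_x = sum(marker_mm) / n
    mean_y = sum(log_mw) / n
    sxx = sum((x - mean_x) ** 2 for x in marker_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(marker_mm, log_mw))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(distance_mm, slope, intercept):
    """Apparent molecular weight (kDa) of a band at the given distance."""
    return 10 ** (slope * distance_mm + intercept)

# Marker masses (kDa) as labelled on the gels in Figures 3.1 and 3.2.
MARKER_KDA = [250, 130, 100, 70, 55, 40, 25, 20, 15, 10]
# Hypothetical migration distances (mm); not measured values from this study.
MARKER_MM = [5.0, 12.0, 15.0, 20.0, 24.0, 29.0, 38.0, 43.0, 49.0, 57.0]

slope, intercept = fit_standard_curve(MARKER_KDA, MARKER_MM)
band_mw = estimate_mw(46.0, slope, intercept)  # hypothetical band position
print(f"apparent MW: {band_mw:.1f} kDa")
```

Because the curve is calibrated with fully denatured marker proteins, a cross-linked species that stays compact would be assigned a lower apparent mass than its true mass, which is the effect discussed in this section.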
Furthermore, other studies have shown that lysine acetylation, which neutralizes these basic sites in a manner similar to lysine cross-linking, blocks SDS binding and increases protein migration in SDS-PAGE [147]. The effect of cross-linking on SDS-induced protein unfolding may also be more pronounced in the present experiment because of the milder denaturing conditions used, i.e. incubating samples with SDS at 65 ºC rather than the 90-99 ºC used in other studies [99] prior to electrophoresis. Increasing the temperature to ensure complete denaturation, however, may reverse cross-links [68] and was therefore avoided. Thus, the location of the cross-linked gel bands was not based solely on molecular weight, but was also a function of the size or compactness of the protein, which depended on the extent of cross-linking and/or the cross-linker.

This phenomenon was more pronounced with calmodulin-melittin than with RNaseS. This may be because calmodulin is more acidic (pI ~4) than RNaseS (pI ~9), resulting in more repulsion of the negatively charged SDS and thus increased resistance to unfolding [148]. The displacement of cross-linked gel bands from the position corresponding to their actual molecular weight was estimated by measuring the difference in molecular weight between the non-cross-linked calmodulin or S-protein bands in each cross-linked sample lane and the calmodulin or S-protein band in the control sample lanes. It was determined that PFA, EDC, sulfoDST, BS3, and sulfoEGS cross-linked gel bands appeared approximately 3.2, 4.2, 0.1, 3.5, and 4.0 kDa, respectively, lower than their actual molecular weight. SulfoDST exhibited the smallest displacement in comparison to the other cross-linkers, suggesting that its cross-linking did not significantly block the SDS-induced denaturation of the three-dimensional protein structure.
For RNaseS, the calculated displacement in the molecular weight of protein gel bands was negligible for PFA, EDC, sulfoDST, BS3, and sulfoEGS (+0.2, +0.1, +0.4, -0.2, and -0.1 kDa, respectively).

3.2.4 Cross-linking Evidence

The series of overlapping bands reflects the range of different cross-linked products produced [149], which was most prominent with PFA treated calmodulin-melittin samples due to PFA’s diverse reactivity. As mentioned in section 3.2.3, cross-linked gel band positions appeared much lower than the actual molecular weight. In the EDC treated calmodulin-melittin sample (lane 4), the occurrence of intermolecular cross-linking is supported by bands appearing at ~30-40 kDa (corrected to ~34-44 kDa based on the calculated molecular weight displacement from section 3.2.3). It is possible that intramolecular calmodulin EDC cross-linking also occurred, supported by the strong band below 14 kDa (corrected to ~18 kDa). The 20-25 kDa band (corrected to 24-29 kDa) suggests that EDC cross-linking either occurred between calmodulin and melittin tetramers, or involved the impurities that were observed at 17-20 kDa in the EDC control sample. For the PFA treated sample (lane 5), the series of bands between 13 and 18 kDa (corrected to ~16-21 kDa) indicates the formation of intramolecular cross-linked species. Higher order complexes formed from extensive intermolecular cross-linking appeared as low intensity bands at higher molecular weights (> 30 kDa, corrected to > 33 kDa). These complexes are most likely artifacts generated by PFA cross-linking that do not reflect true interactions. This observation is consistent with PFA cross-linked protein complexes in cellular studies, in which such artifacts formed using PFA concentrations above 0.4% [73]. This consistency supports the use of the current non-covalent protein complex model system to reveal PFA cross-linking characteristics in future cellular studies.
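The band corrections quoted for the EDC and PFA lanes follow from adding the cross-linker-specific displacement from section 3.2.3 to the apparent band position. A minimal sketch of that arithmetic is given below; the displacement values are those reported for calmodulin-melittin, while the helper names `correct_band` and `correct_range` and the rounding to whole kDa are assumptions made for illustration.

```python
# Displacements (kDa) reported in section 3.2.3 for calmodulin-melittin.
DISPLACEMENT_KDA = {
    "PFA": 3.2,
    "EDC": 4.2,
    "sulfoDST": 0.1,
    "BS3": 3.5,
    "sulfoEGS": 4.0,
}

def correct_band(apparent_kda, cross_linker):
    """Shift an apparent gel mass up by the cross-linker's displacement."""
    return apparent_kda + DISPLACEMENT_KDA[cross_linker]

def correct_range(low_kda, high_kda, cross_linker):
    """Corrected band range, rounded to whole kDa (rounding is an assumption)."""
    return tuple(round(correct_band(v, cross_linker)) for v in (low_kda, high_kda))

print(correct_range(30, 40, "EDC"))  # intermolecular EDC bands
print(correct_range(13, 18, "PFA"))  # intramolecular PFA bands
```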
The appearance of unreacted melittin oligomers (bands < 10 kDa) and calmodulin (17 kDa) in the sulfoDST treated sample (lane 6), similar to the control sample (lane 3), suggests that cross-linking between calmodulin and melittin did not occur as extensively as with the other cross-linkers. This is consistent with previous observations in the literature [99]. For the BS3 cross-linked sample (lane 7), bands appearing between ~13 and 16 kDa (corresponding to ~17-22 kDa) reflect intramolecular cross-linking. A faint band appearing at approximately 30 kDa (corresponding to ~34 kDa) suggests the formation of intermolecular cross-linked species. For the sulfoEGS cross-linked sample (lane 8), strong bands appearing between ~13 and 16 kDa (corresponding to ~17-22 kDa) indicate the presence of intramolecular cross-linking. A very low intensity band appearing at about 30 kDa (corresponding to ~34 kDa) suggests that intermolecular cross-linking may have occurred.

Strong bands for intramolecular cross-linked RNaseS appeared for the PFA, BS3 and EDC cross-linked samples at ~14 kDa. Intermolecular cross-linking between multiple S-protein and complex molecules is marked by bands appearing above ~20 kDa. Faint bands between 20 and 30 kDa appeared in the EDC cross-linked sample, which correspond to these intermolecular cross-linked complexes. Similar to the calmodulin-melittin model, the PFA sample contained higher order cross-linked complexes marked by a series of bands between 25 and 100 kDa, which are also likely to be non-specific complexes formed by PFA cross-linking. Cross-linking evidence for sulfoDST was weaker than with the other cross-linkers, marked by strong bands for unreacted protein below 12 kDa, similar to the bands appearing in the control sample in lane 3. For the BS3 cross-linked sample, a series of bands between 12 and 25 kDa suggested that intramolecular cross-linking occurred and a series of faint bands above 20 kDa suggested that some intermolecular cross-linking also occurred.
Very faint bands between 12 and 20 kDa and above 20 kDa suggest that some intramolecular and intermolecular cross-linking, respectively, did occur with sulfoEGS and RNaseS.

To determine the relative yield of cross-linking for each cross-linker based on SDS-PAGE, the relative densities of the cross-linked protein bands were summed and compared to the relative density of the non-cross-linked protein band for each cross-linker. The bar graphs illustrating these values are shown in Figure 3.4 for the calmodulin-melittin system and Figure 3.5 for the RNaseS system. For the calmodulin-melittin model system, PFA displayed the highest cross-linking yield (77%), which is expected given its larger number of cross-linking sites in comparison to the other cross-linkers. The lowest yield of cross-linking (9%) occurred in the sulfoDST sample, which is consistent with the observation of unreacted protein bands in the SDS-PAGE and with the literature [99]. Being a Ca2+ binding protein, calmodulin contains many negatively charged D and E residues, which are also major reactive sites for EDC. Since the calmodulin-melittin system has a high content of EDC-specific reactive sites, the relative yield for EDC was also high (73%). Although sulfoDST, BS3 and sulfoEGS share the same reactive sites (K and N-terminal residues), which are fewer than those of EDC and PFA, BS3 exhibited a higher cross-linking yield (67%) than sulfoEGS (44%) and sulfoDST (9%). Since the cross-linker bridge is the main variable across these three NHS ester cross-linkers, the higher cross-linking yield for BS3 suggests that many of the K and N-terminal residues lie within the distance spanned by its cross-linker bridge.
Possible explanations for the lower cross-linking yield with sulfoEGS are that, as a larger cross-linker with a longer bridge, it formed more interprotein cross-linked complexes that precipitated and thus did not appear on the gel, or that it could not permeate the protein complex as easily as the smaller BS3 cross-linker.

For the RNaseS model system, as in the calmodulin-melittin system, the EDC and PFA cross-linked samples exhibited similar cross-linking yields (55% and 51%, respectively). This may have occurred because both cross-linkers form close-proximity cross-links with multiple types of residues. The observation of unreacted protein bands in the sulfoDST sample lane supported sulfoDST’s relatively low cross-linking yield of 7%. The BS3 cross-linked samples displayed the highest cross-linking yield of 80%. The sulfoEGS cross-linked samples had a cross-linking yield of 45%. Even though the NHS ester cross-linkers have the same residue reactivity, the varying cross-linking yields illustrate that the cross-linker length plays a crucial role in forming cross-links specific to each protein structure, as observed with the calmodulin-melittin system. In both RNaseS and calmodulin-melittin, BS3 displayed the highest cross-linking yield of the NHS ester cross-linkers, suggesting that K and N-terminal residues have the highest probability of lying within the length of the BS3 bridge and/or that its bridge has the optimal balance of length and flexibility to form such cross-links in these protein complexes. The higher cross-linking yield of PFA in the calmodulin-melittin system than in the RNaseS system suggests that there are more PFA-reactive residues in close proximity in calmodulin-melittin, since PFA possesses a small cross-linker bridge.
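The relative yields discussed here can be illustrated with a short densitometry sketch. It assumes that the yield for a lane is the summed density of the cross-linked bands divided by the total density of cross-linked plus non-cross-linked bands, which is consistent with each pair of bars in Figures 3.4 and 3.5 summing to 100%; the exact normalization used in this study is not stated. The band densities below are hypothetical values chosen only so that the result matches the reported 77%/23% split for PFA, and `relative_yield` is an illustrative name.

```python
def relative_yield(crosslinked_densities, non_crosslinked_density):
    """Return (cross-linked %, non-cross-linked %) for one gel lane."""
    crosslinked = sum(crosslinked_densities)
    total = crosslinked + non_crosslinked_density
    if total <= 0:
        raise ValueError("lane has no measurable band density")
    pct_crosslinked = 100.0 * crosslinked / total
    return pct_crosslinked, 100.0 - pct_crosslinked

# Hypothetical relative densities (arbitrary units) for a single lane:
# three cross-linked bands and one non-cross-linked band.
xl_pct, free_pct = relative_yield([0.41, 0.22, 0.14], 0.23)
print(f"cross-linked: {xl_pct:.0f}%, non-cross-linked: {free_pct:.0f}%")
```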
The lower cross-linking yield in RNaseS for PFA samples may be due to PFA-reactive residues being inaccessible or not in close proximity. Likewise, the lower cross-linking yield in RNaseS versus calmodulin-melittin for EDC samples suggests that fewer reactive sites are accessible to EDC or within zero-length proximity of each other. Nevertheless, these hypotheses must be confirmed via MS and MS/MS analysis of the cross-linked samples.

[Figure 3.4: bar graph of relative yield (0-100%) by cross-linker. Cross-linked: EDC 73%, PFA 77%, sulfoDST 9%, BS3 67%, EGS 44%. Non-cross-linked: EDC 27%, PFA 23%, sulfoDST 91%, BS3 33%, EGS 56%.]

Figure 3.4: Relative yield of cross-linked species (blue) versus non-cross-linked species (red) in the calmodulin-melittin complex measured via SDS-PAGE

[Figure 3.5: bar graph of relative yield (0-100%) by cross-linker. Cross-linked: EDC 55%, PFA 51%, sulfoDST 7%, BS3 80%, EGS 45%. Non-cross-linked: EDC 45%, PFA 49%, sulfoDST 93%, BS3 20%, EGS 55%.]

Figure 3.5: Relative yield of cross-linked species (blue) versus non-cross-linked species (red) in the RNaseS complex measured via SDS-PAGE

3.2.5 Cross-linking Yield and Protein Complex Dissociation Constant

The percentages of cross-linked and non-cross-linked protein complex are hypothesized to reflect the amounts of bound calmodulin-melittin ([CM]) and of unbound calmodulin ([C]) and melittin ([M]), respectively, which were calculated using the known dissociation constant (Kd) of 10 µM [98]. The dissociation constant is defined as follows:

\[ K_d = \frac{[C][M]}{[CM]} = 10\ \mu\mathrm{M} \]

where [C], [M], and [CM] are the equilibrium concentrations of calmodulin, melittin and the calmodulin-melittin complex, respectively. The initial concentration of calmodulin ([C]0) and melittin ([M]0) used was 60 µM in the reaction mixture. However, this is the concentration of the calmodulin and melittin samples, which were 95% and 97% pure, respectively. A more accurate estimate of the initial concentrations of calmodulin and melittin present in the reaction mixture would be 95% and 97% of 60 µM, respectively. Therefore, the equilibrium concentrations are defined as follows:

\begin{align*}
[C] &= [C]_0 - [CM] = (95\%)(60\ \mu\mathrm{M}) - [CM] = 57\ \mu\mathrm{M} - [CM] \\
[M] &= [M]_0 - [CM] = (97\%)(60\ \mu\mathrm{M}) - [CM] = 58\ \mu\mathrm{M} - [CM]
\end{align*}

Substituting 58 µM - [CM] and 57 µM - [CM] for the concentrations of melittin and calmodulin, respectively, in the Kd expression above and solving for [CM] gives the following:

\begin{align*}
K_d &= \frac{(57\ \mu\mathrm{M} - [CM])(58\ \mu\mathrm{M} - [CM])}{[CM]} = 10\ \mu\mathrm{M} \\
[CM] &= 38\ \mu\mathrm{M} \\
[C] &= 57\ \mu\mathrm{M} - [CM] = 57\ \mu\mathrm{M} - 38\ \mu\mathrm{M} = 19\ \mu\mathrm{M} \\
[M] &= 58\ \mu\mathrm{M} - [CM] = 58\ \mu\mathrm{M} - 38\ \mu\mathrm{M} = 20\ \mu\mathrm{M}
\end{align*}

The total concentration of the sample was set to 60 µM:

\begin{align*}
\%\,\text{Bound calmodulin-melittin} &= \frac{38\ \mu\mathrm{M}}{60\ \mu\mathrm{M}} = 63\% \\
\%\,\text{Unbound calmodulin} &= \frac{19\ \mu\mathrm{M}}{60\ \mu\mathrm{M}} = 32\% \\
\%\,\text{Unbound melittin} &= \frac{20\ \mu\mathrm{M}}{60\ \mu\mathrm{M}} = 33\%
\end{align*}

Therefore, the theoretical percentage of bound calmodulin-melittin is 63%, and that of unbound calmodulin and melittin is 32% and 33%, respectively, in the 95% and 97% pure calmodulin and melittin samples. This is fairly close to the percent yield of cross-linking observed for the PFA, EDC and BS3 cross-linkers (77%, 73% and 67%, respectively). This suggests that cross-linking with these cross-linkers can reflect the amount of bound calmodulin-melittin expected and preserve this interaction.

This analysis was also performed for the RNaseS system (with Kd = 1 µM [107]), in which Kd is defined as follows:

\[ K_d = \frac{[S_{pep}][S_{pro}]}{[RNaseS]} = 1\ \mu\mathrm{M} \]

where [Spep], [Spro], and [RNaseS] are the equilibrium concentrations of S-peptide, S-protein and the RNaseS complex, respectively. Unlike the calmodulin and melittin samples, which were 95% and 97% pure, the purchased RNaseS sample was only 70% pure. Therefore, it is assumed that only 70% of the initial concentration corresponds to RNaseS components.
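The calmodulin-melittin calculation earlier in this section can be checked numerically. The sketch below rearranges the Kd expression into the quadratic x^2 - ([C]0 + [M]0 + Kd)x + [C]0[M]0 = 0, with x = [CM], and takes the smaller root, which is the only one that does not exceed the initial concentrations. The function name `complex_concentration` is illustrative; the input values are the purity-corrected concentrations and Kd stated above.

```python
import math

def complex_concentration(a0_um, b0_um, kd_um):
    """Equilibrium complex concentration (µM) for A + B <-> AB.

    Solves Kd = (a0 - x)(b0 - x) / x for x, i.e.
    x^2 - (a0 + b0 + Kd) * x + a0 * b0 = 0, returning the smaller root.
    """
    s = a0_um + b0_um + kd_um
    discriminant = s * s - 4.0 * a0_um * b0_um
    return (s - math.sqrt(discriminant)) / 2.0

# Purity-corrected initial concentrations (µM) and Kd (µM) from the text.
cm = complex_concentration(57.0, 58.0, 10.0)
free_c = 57.0 - cm
free_m = 58.0 - cm
pct_bound = 100.0 * cm / 60.0  # relative to the nominal 60 µM sample

print(f"[CM] = {cm:.0f} µM, [C] = {free_c:.0f} µM, [M] = {free_m:.0f} µM")
print(f"bound: {pct_bound:.0f}%")
```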
The initial concentration of the RNaseS sample used was 50 µM and the equilibrium concentrations are defined as follows:

\begin{align*}
[S_{pro}] &= [S_{pro}]_0 - [RNaseS] \\
[S_{pep}] &= [S_{pep}]_0 - [RNaseS] \\
[S_{pro}] = [S_{pep}] &= (70\%)(50\ \mu\mathrm{M}) - [RNaseS] = 35\ \mu\mathrm{M} - [RNaseS]
\end{align*}

Substituting 35 µM - [RNaseS] for the concentrations of S-peptide and S-protein in the Kd expression above and solving for [RNaseS] gives the following:

\begin{align*}
K_d &= \frac{(35\ \mu\mathrm{M} - [RNaseS])^2}{[RNaseS]} = 1\ \mu\mathrm{M} \\
[RNaseS] &\approx 29.6\ \mu\mathrm{M} \\
[S_{pro}] = [S_{pep}] &\approx 35\ \mu\mathrm{M} - 29.6\ \mu\mathrm{M} = 5.4\ \mu\mathrm{M}
\end{align*}

\begin{align*}
\%\,\text{Bound RNaseS} &= \frac{29.6\ \mu\mathrm{M}}{50\ \mu\mathrm{M}} \approx 59\% \\
\%\,\text{Unbound S-peptide} = \%\,\text{Unbound S-protein} &= \frac{5.4\ \mu\mathrm{M}}{50\ \mu\mathrm{M}} \approx 11\%
\end{align*}

This gives a theoretical percentage of the RNaseS complex of 59% and of unbound S-protein and S-peptide of 11% each in the 70% pure sample. Interestingly, the percentage of bound RNaseS is fairly close to the percent yield of cross-linking for EDC, PFA, and sulfoEGS (55%, 51% and 45%, respectively). This suggests that these cross-linkers are suitable to capture the RNaseS interaction and also provides evidence that the amount of cross-linking observed relates to the expected amount of protein complex formed. EDC and PFA cross-linking yields supported this relationship for both the calmodulin-melittin and RNaseS complexes.

3.3 Mass Spectrometric Analysis

3.3.1 Calmodulin-Melittin System

3.3.1.1 Control Peptide Analysis via MaxQuant

MS/MS spectra were automatically acquired on the most intense MS signals. MaxQuant was used to perform an MS/MS search of the control calmodulin-melittin sample (-others) to confirm the presence of the expected tryptic peptides without cross-linker-specific modifications and cross-links. Table 3.1 lists the 34 calmodulin and melittin peptides identified in the control sample with their m/z, experimental monoisotopic mass ([M]exp), calculated monoisotopic mass ([M]calc), mass accuracy, sum of the normalized peak areas of all their signals (Norm.
Peak Area), molecular weight of the protein gel band of origin, sequence (with position in protein and modifications shown) and number of trypsin missed cleavages. A 100% sequence coverage of calmodulin and melittin was obtained in the control samples within a mass accuracy of 20 ppm for the peptide matches. The average number of missed cleavages for the identified peptides was one, suggesting that the trypsin digestion was efficient. The following expected modifications were confirmed: trimethylated K115 (+42.06 Da), acetylated calmodulin N-terminus (+42.01 Da) and amidated melittin C-terminus (-0.98 Da), which are denoted as (tm), (ac), and (am), respectively. A subset of calmodulin peptides containing oxidized M residues (at positions 36, 51, 71, 72, 76, 109, 124, 144 and 145) and deamidated asparagines (at positions 97 and 111) was observed, producing mass shifts of +15.99 and +0.98 Da and denoted as (ox) and (dm), respectively. Peptides with both unmodified and modified methionines and asparagines were observed. Oxidation of methionines can occur during sample preparation, such as the extensive vacuum drying of samples [150], and the deamidation of glutamine/asparagine is common during trypsin digestion [151]. Overall, the control analysis confirmed that experimental processing itself did not result in significant protein loss and showed which peptides and modifications are expected to be present in cross-linked samples. Based on these findings, cross-linking or modification on K115, the acetylated calmodulin N-terminal residue, and the amidated melittin C-terminal residue was not considered.

The position of each SDS-PAGE gel band was used to estimate the molecular weight of the protein in which each peptide was identified, following in-gel trypsin digestion and MS analysis. The molecular weights of the proteins from which all melittin and calmodulin peptides originated were < 10 kDa and 10-20 kDa, respectively.
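The quantities tabulated for each peptide (neutral monoisotopic mass and mass accuracy in ppm) can be related to the measured m/z and charge as sketched below. This is an illustrative calculation only: it assumes protonated ions, uses the standard proton mass, and treats mass accuracy as an absolute relative error. The function names are hypothetical, and because the tabulated values are rounded to two decimal places, recomputing from them will not reproduce the listed ppm values exactly.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton

def neutral_mass(mz, z):
    """Neutral monoisotopic mass (Da) of an [M + zH]z+ ion."""
    return z * (mz - PROTON_MASS)

def ppm_error(m_exp, m_calc):
    """Absolute mass error in parts per million."""
    return abs(m_exp - m_calc) / m_calc * 1e6

def within_tolerance(m_exp, m_calc, tol_ppm=20.0):
    """True if the match falls inside the ppm window (20 ppm for the control)."""
    return ppm_error(m_exp, m_calc) <= tol_ppm

# Hypothetical doubly charged ion, for illustration only.
m = neutral_mass(500.0, 2)
print(f"[M] = {m:.4f} Da")
print(within_tolerance(1000.01, 1000.00))  # 10 ppm error
print(within_tolerance(1000.03, 1000.00))  # 30 ppm error
```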
This agrees with the expected molecular weight of melittin tetramers (~6 kDa) and a single calmodulin molecule (~17 kDa). In addition, this suggests that calmodulin and melittin exist as unbound species in the control sample.

Table 3.1: A list of MS/MS confirmed peptides in the control calmodulin-melittin sample. The m/z, charge (z), experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band of origin, sequence and number of missed cleavages for each peptide are listed left to right.

m/z | z | [M]exp (Da) | [M]calc (Da) | Mass Accuracy (ppm) | Norm. Peak Area | Protein MW Origin (kDa) | Peptide Sequence | Missed Cleavages
Calmodulin Peptides
782.37 | 2 | 1562.75 | 1562.74 | 8.3 | 4.02% | 10 - 20 | 1A(ac)DQLTEEQIAEFK13 | 0
478.74 | 2 | 955.48 | 955.47 | 13.7 | 0.03% | 10 - 20 | 14EAFSLFDK21 | 0
855.42 | 2 | 1708.84 | 1708.84 | 4.0 | 2.94 x 10^-3% | 10 - 20 | 22DGDGTITTKELGTVM(ox)R37 | 1
411.21 | 2 | 820.41 | 820.42 | 10.6 | 6.96% | 10 - 20 | 31ELGTVM(ox)R37 | 0
403.22 | 2 | 804.42 | 804.42 | 4.5 | 3.50% | 10 - 20 | 31ELGTVMR37 | 0
1030.21 | 4 | 4116.82 | 4116.84 | 3.9 | 0.30% | 10 - 20 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)AR74 | 0
1058.24 | 4 | 4228.92 | 4228.93 | 2.3 | 0.01% | 10 - 20 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)MARK75 | 1
1062.24 | 4 | 4244.92 | 4244.93 | 3.5 | 6.43 x 10^-3% | 10 - 20 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)ARK75 | 1
901.82 | 5 | 4504.05 | 4504.07 | 3.3 | 0.01% | 10 - 20 | 38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTM(ox)M(ox)ARKM(ox)K77 | 2
1131.2 | 4 | 4520.05 | 4520.07 | 4.4 | 0.01% | 10 - 20 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)ARKM(ox)K77 | 2
992.48 | 2 | 1982.94 | 1982.94 | 1.6 | 0.14% | 10 - 20 | 75KMKDTDSEEEIREAFR90 | 3
451.54 | 3 | 1351.59 | 1351.59 | 1.9 | 17.38% | 10 - 20 | 76MKDTDSEEEIR86 | 1
684.8 | 2 | 1367.59 | 1367.59 | 1.8 | 1.70% | 10 - 20 | 76M(ox)KDTDSEEEIR86 | 1
619.29 | 3 | 1854.84 | 1854.84 | 1.0 | 1.57% | 10 - 20 | 76MKDTDSEEEIREAFR90 | 2
464.72 | 4 | | | | 1.74% | | |
624.62 | 3 | 1870.84 | 1870.84 | 1.7 | 6.19% | 10 - 20 | 76M(ox)KDTDSEEEIREAFR90 | 2
547.24 | 2 | 1092.46 | 1092.46 | 2.6 | 11.22% | 10 - 20 | 78DTDSEEEIR86 | 0
798.86 | 2 | 1595.71 | 1595.71 | 2.3 | 1.27% | 10 - 20 | 78DTDSEEEIREAFR90 | 1
532.91 | 3 | | | | 1.48% | 10 - 20 | |
506.27 | 2 | 1010.52 | 1010.52 | 0.0 | 4.78 x 10^-3% | 10 - 20 | 87EAFRVFDK94 | 1
877.94 | 2 | 1753.86 | 1753.86 | 2.0 | 1.88% | 10 - 20 | 91VFDKDGNGYISAAELR106 | 1
585.63 | 3 | | | | 8.46% | 10 - 20 | |
878.43 | 2 | 1754.85 | 1754.86 | 7.1 | 0.23% | 10 - 20 | 91VFDKDGN(dm)GYISAAELR106 | 1
585.96 | 3 | | | | 6.23% | | |
633.31 | 2 | 1264.60 | 1264.60 | 0.1 | 1.04% | 10 - 20 | 95DGNGYISAAELR106 | 0
633.8 | 2 | 1265.59 | 1265.60 | 12.6 | 0.10% | 10 - 20 | 95DGN(dm)GYISAAELR106 | 0
601.05 | 4 | 2400.17 | 2400.12 | 19.6 | 0.52% | 10 - 20 | 107HVMTNLGEK(tm)LTDEEVDEMIR126 | 1
801.39 | 3 | 2401.15 | 2401.12 | 12.9 | 0.17% | 10 - 20 | 107HVMTN(dm)LGEK(tm)LTDEEVDEMIR126 | 1
601.29 | 4 | | | | 0.02% | | |
1209.9 | 2 | 2416.16 | 2416.12 | 17.4 | 0.53% | 10 - 20 | 107HVM(ox)TNLGEK(tm)LTDEEVDEMIR126 | 1
605.05 | 4 | | | | 3.50% | | |
806.72 | 3 | 2417.15 | 2417.12 | 10.8 | 0.01% | 10 - 20 | 107HVM(ox)TN(dm)LGEK(tm)LTDEEVDEMIR126 | 1
605.29 | 4 | | | | 0.39% | | |
811.73 | 3 | 2432.16 | 2432.12 | 15.2 | 1.72% | 10 - 20 | 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126 | 1
609.05 | 4 | | | | 7.91% | | |
1245.54 | 2 | 2489.08 | 2489.07 | 2.8 | 0.22% | 10 - 20 | 127EADIDGDGQVNYEEFVQMMTAK148 | 0
836.02 | 3 | 2505.07 | 2505.07 | 0.6 | 2.55% | 10 - 20 | 127EADIDGDGQVNYEEFVQM(ox)MTAK148 | 0
841.36 | 3 | 2521.07 | 2521.07 | 1.0 | 1.49% | 10 - 20 | 127EADIDGDGQVNYEEFVQM(ox)M(ox)TAK148 | 0
Melittin Peptides
657.43 | 1 | 656.43 | 656.42 | 14.8 | 0.41% | < 10 | 1GIGAVLK7 | 0
756.46 | 2 | 1510.91 | 1510.91 | 1.5 | 0.01% | < 10 | 8VLTTGLPALISWIK21 | 0
504.64 | 3 | | | | 4.82% | | |
556.68 | 3 | 1667.01 | 1667.01 | 2.0 | 0.04% | < 10 | 8VLTTGLPALISWIKR22 | 1
442.27 | 5 | 2206.37 | 2206.35 | 10.7 | 6.80 x 10^-3% | < 10 | 8VLTTGLPALISWIKRKRQQ(am)26 | 3

3.3.1.2 PFA Modified Peptide Analysis via MaxQuant

A MaxQuant MS/MS search was used to determine the modified and unmodified peptides in the PFA treated sample. This was used to shed light on the reaction products of the first (modification) step of PFA cross-linking (see Figure 1.6) and to aid in clarifying cross-linking mechanisms.
Unlike established cross-linkers, the modification step in PFA cross-linking produces two stable intermediates: a methylol (+30 Da mass shift), which dehydrates into a Schiff Base (+12 Da mass shift). In the calmodulin-melittin system, R, K and N-terminal melittin residues can potentially form PFA modifications under the mild PFA reaction conditions used (physiological pH, 1% PFA, and a 6 hour reaction time). Using MaxQuant, a total of 41 unmodified (Table 3.2) and 30 PFA-modified (Table 3.3) unique peptides were identified in the PFA treated calmodulin-melittin sample with a mass accuracy within 30 ppm. Similar to the control sample, the PFA peptides produced 100% sequence coverage of calmodulin and melittin with the following modifications: trimethylated K115, acetylated calmodulin N-terminus, amidated melittin C-terminus, oxidation of M and deamidation of N. In the modified peptides, the following calmodulin residues appeared with a +12 Da mass shift: K75, K77, K94, K148, R106, R74, R86, and R90. A +12 Da mass shift indicates either that a Schiff Base formed and did not proceed to the second step of the cross-linking reaction or that an intrapeptide cross-link formed.

The following calmodulin residues appeared with a +30 Da mass shift: K30, K75, K77, and K94. A +30 Da mass shift indicates that a methylol modification formed, which means that these residues were modified in the first step of the cross-linking reaction and did not proceed to form a cross-link in the second step. Interestingly, no R residues were observed with a +30 Da modification, suggesting that R is less reactive in the first (modification) step of the reaction. This is consistent with previous PFA studies in which R was shown to be less reactive than K residues in a myoglobin model protein system [36] and in model peptides [85].
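The +30 Da and +12 Da mass shifts described above can be turned into expected peptide masses with a short calculation. The sketch below uses the monoisotopic mass of CH2O for the methylol and of one carbon atom for the Schiff Base or intrapeptide methylene bridge (CH2O minus H2O); these are standard values rather than numbers taken from this study. The function name `pfa_modified_mass` is illustrative, and the example masses are the calculated masses listed for the unmodified peptides in Table 3.2.

```python
METHYLOL_SHIFT = 30.010565     # Da, addition of CH2O
SCHIFF_BASE_SHIFT = 12.000000  # Da, addition of C (CH2O minus H2O)

def pfa_modified_mass(unmodified_mass, n_methylol=0, n_schiff_base=0):
    """Expected monoisotopic mass (Da) of a PFA-modified peptide.

    n_schiff_base also counts intrapeptide cross-links, which give the
    same +12 Da shift and cannot be distinguished by mass alone.
    """
    return (unmodified_mass
            + n_methylol * METHYLOL_SHIFT
            + n_schiff_base * SCHIFF_BASE_SHIFT)

# 75KMKDTDSEEEIR86 with one methylol and one Schiff Base (Table 3.3 row).
print(f"{pfa_modified_mass(1479.69, n_methylol=1, n_schiff_base=1):.2f}")
# 91VFDKDGNGYISAAELR106 with a single +12 or a single +30 modification.
print(f"{pfa_modified_mass(1753.86, n_schiff_base=1):.2f}")
print(f"{pfa_modified_mass(1753.86, n_methylol=1):.2f}")
```

The three results agree with the calculated masses listed for the corresponding modified peptides in Table 3.3 (1521.70, 1765.86 and 1783.87 Da).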
If the +12 Da modified R residues are Schiff Bases and not intrapeptide cross-links, another possibility is that methylol-modified R is significantly less stable than Schiff Base modified R, favoring the immediate dehydration of the methylol-modified R residue to a Schiff Base. For melittin, a +12 Da mass shift was localized on G1 in the peptides 1GIGAVLKVLTTGLPALISWIK21 and 1GIGAVLK7, indicating that either a Schiff Base or an intrapeptide cross-link formed in these peptides. The melittin peptide 8VLTTGLPALISWIKRKRQQ(am)26 appeared with +24 and +36 Da modifications. Since the C-terminal region of the peptide contains four consecutive potential PFA modification sites (21KRKR24), it was not possible to localize the modifications and/or intrapeptide cross-links to individual residues.

For the PFA treated sample, the molecular weight ranges of the proteins from which the peptides originated are listed in Tables 3.2 and 3.3. These were estimated from the position of the SDS-PAGE gel band in which each peptide was identified following in-gel trypsin digestion and MS analysis. In addition, the molecular weight correction factor (+3.2 kDa) calculated in section 3.2.3 was applied. No peptides from proteins < 14 kDa were identified, suggesting that no isolated melittin tetramers or unbound melittin peptides were present, which is consistent with the lack of a gel band at this molecular weight position in the SDS-PAGE. There were 30 out of 71 peptides (14 with PFA modifications) originating from 14-19 kDa proteins, which could correspond to unmodified or modified calmodulin, intramolecular cross-linked calmodulin or intermolecular cross-linked melittin tetramers. A total of 45 out of 71 peptides (19 with PFA modifications) came from 19-33 kDa proteins, which were most likely intramolecular cross-linked calmodulin-melittin complexes (~20 kDa) or intermolecular cross-linked calmodulin molecules (~34 kDa).
Finally, 41 out of 71 peptides (15 with PFA modifications) came from proteins > 33 kDa, which were most likely protein complexes with intermolecular cross-links between multiple calmodulin or calmodulin-melittin complex molecules. Therefore, peptides with the same identity appeared across proteins with different molecular weights, i.e. different types of cross-linking. In addition, both unmodified and modified calmodulin peptides were identified in the digests of the 14-19, 19-33 and > 33 kDa protein samples. Although unmodified melittin peptides were identified in the digests of the 14-19, 19-33 and > 33 kDa protein samples, modified melittin peptides were only identified in the digests of the 14-19 and 19-33 kDa protein samples. Overall, this supports the identification of calmodulin-melittin peptides from cross-linked species via MS/MS, providing assurance for the subsequent MS/MS identification of cross-linked peptides.

Table 3.2: A list of MS/MS confirmed calmodulin-melittin peptides in the PFA treated sample without PFA modifications. The m/z, charge (z), experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band of origin, sequence and number of missed cleavages for each peptide are listed left to right.

m/z | z | [M]exp | [M]calc | Mass Accuracy (ppm) | Norm. Peak Area | Protein MW Origin (kDa) | Sequence | Missed cleavages
Calmodulin Peptides
782.38 | 2 | 1562.76 | 1562.76 | 0.0 | 11.51% | 14-33 | 1A(ac)DQLTEEQIAEFK13 | 0
478.74 | 2 | 955.48 | 955.47 | 13.7 | 7.10% | 14-19 | 14EAFSLFDK21 | 0
570.62 | 3 | 1708.85 | 1708.83 | 12.8 | 0.17% | > 33 | 22DGDGTITTKELGTVM(ox)R37 | 1
403.22 | 2 | 804.43 | 804.42 | 18.1 | 28.47% | < 19, > 33 | 31ELGTVMR37 | 0
411.21 | 2 | 820.43 | 820.41 | 17.7 | 45.90% | 14-19, > 33 | 31ELGTVM(ox)R37 | 0
1026.21 | 4 | 4100.86 | 4100.83 | 7.1 | 0.23% | 14-19 | 38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTM(ox)M(ox)AR74 | 0
1030.21 | 4 | 4116.85 | 4116.82 | 7.1 | 5.47% | > 19 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)AR74 | 0
1030.46 | 4 | 4117.84 | 4117.81 | 7.1 | 0.05% | 14-33 | 38SLGQNPTEAELQDM(ox)INEVDADGN(dm)GTIDFPEFLTM(ox)M(ox)AR74 | 0
1062.24 | 4 | 4244.95 | 4244.92 | 6.9 | 0.92% | 19-33 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)ARK75 | 1
901.82 | 5 | 4504.09 | 4504.05 | 8.1 | 0.13% | 14-19 | 38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTM(ox)M(ox)ARKM(ox)K77 | 2
905.02 | 5 | 4520.09 | 4520.05 | 8.0 | 0.77% | > 14 | 38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)ARKM(ox)K77 | 2
494.24 | 3 | 1479.71 | 1479.69 | 14.8 | 0.24% | 14-33 | 75KMKDTDSEEEIR86 | 2
499.57 | 3 | 1495.70 | 1495.68 | 14.6 | 0.16% | 14-19 | 75KM(ox)KDTDSEEEIR86 | 2
496.74 | 4 | 1982.97 | 1982.94 | 14.7 | 1.32% | > 33 | 75KMKDTDSEEEIREAFR90 | 3
500.74 | 4 | 1998.96 | 1998.93 | 14.6 | 18.77% | > 14 | 75KM(ox)KDTDSEEEIREAFR90 | 3
676.8 | 2 | 1351.61 | 1351.59 | 10.8 | 2.54% | > 19 | 76MKDTDSEEEIR86 | 1
684.8 | 2 | 1367.60 | 1367.59 | 10.6 | 3.41% | > 33 | 76M(ox)KDTDSEEEIR86 | 1
619.29 | 3 | 1854.86 | 1854.84 | 11.8 | 1.15% | > 19 | 76MKDTDSEEEIREAFR90 | 2
936.43 | 2 | 1870.85 | 1870.84 | 7.8 | 19.40% | > 19 | 76M(ox)KDTDSEEEIREAFR90 | 2
722.35 | 5 | 3606.73 | 3606.69 | 10.1 | 0.43% | > 14 | 76M(ox)KDTDSEEEIREAFRVFDKDGNGYISAAELR106 | 4
547.24 | 2 | 1092.47 | 1092.46 | 13.3 | 10.65% | 14-33 | 78DTDSEEEIR86 | 0
798.86 | 2 | 1595.72 | 1595.71 | 9.1 | 12.38% | > 33 | 78DTDSEEEIREAFR90 | 1
696 | 3 | 2084.99 | 2084.97 | 10.5 | 0.11% | > 19 | 78DTDSEEEIREAFRVFDK94 | 2
833.9 | 4 | 3331.59 | 3331.56 | 8.7 | 3.80% | > 14 | 78DTDSEEEIREAFRVFDKDGNGYISAAELR106 | 3
753.38 | 3 | 2257.13 | 2257.11 | 9.7 | 2.41% | > 14 | 87EAFRVFDKDGNGYISAAELR106 | 2
753.71 | 3 | 2258.12 | 2258.10 | 9.7 | 5.46% | 14-19, > 33 | 87EAFRVFDKDGN(dm)GYISAAELR106 | 2
877.94 | 2 | 1753.88 | 1753.86 | 8.3 | 101.38% | > 14 | 91VFDKDGNGYISAAELR106 | 1
878.43 | 2 | 1754.86 | 1754.85 | 8.3 | 7.53% | > 14 | 91VFDKDGN(dm)GYISAAELR106 | 1
633.31 | 2 | 1264.62 | 1264.60 | 11.5 | 19.10% | > 33 | 95DGNGYISAAELR106 | 0
633.8 | 2 | 1265.60 | 1265.59 | 11.5 | 1.94% | < 33 | 95DGN(dm)GYISAAELR106 | 0
601.05 | 4 | 2400.20 | 2400.17 | 12.1 | 3.70% | 14-19 | 107HVMTNLGEK(tm)LTDEEVDEMIR126 | 1
601.29 | 4 | 2401.18 | 2401.15 | 12.1 | 0.21% | > 19 | 107HVMTN(dm)LGEK(tm)LTDEEVDEMIR126 | 1
605.05 | 4 | 2416.19 | 2416.16 | 12.0 | 52.08% | 14-33 | 107HVM(ox)TNLGEK(tm)LTDEEVDEMIR126 | 1
605.29 | 4 | 2417.17 | 2417.15 | 12.0 | 0.39% | > 14 | 107HVM(ox)TN(dm)LGEK(tm)LTDEEVDEMIR126 | 1
609.05 | 4 | 2432.19 | 2432.16 | 12.0 | 49.32% | 19-33 | 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126 | 1
609.29 | 4 | 2433.17 | 2433.14 | 12.0 | 13.26% | > 14 | 107HVM(ox)TN(dm)LGEK(tm)LTDEEVDEM(ox)IR126 | 1
839.36 | 3 | 2515.07 | 2515.05 | 8.7 | 0.40% | > 14 | 127EADIDGDGQVNYEEFVQM(ox)MTAK148 | 0
Melittin Peptides
657.43 | 1 | 656.43 | 656.42 | 12.8 | 3.29% | 14-33 | 1GIGAVLK7 | 0
756.46 | 2 | 1510.93 | 1510.91 | 9.6 | 1.56% | > 19 | 8VLTTGLPALISWIK21 | 0
834.51 | 2 | 1667.03 | 1667.01 | 8.7 | 8.36% | 14-19, > 33 | 8VLTTGLPALISWIKR22 | 1
599.38 | 3 | 1795.13 | 1795.11 | 12.2 | 0.15% | 14-33 | 8VLTTGLPALISWIKRK23 | 2

Table 3.3: A list of MS/MS confirmed calmodulin-melittin peptides in the PFA treated sample with PFA modifications. The m/z, charge (z), experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band of origin, sequence and number of missed cleavages for each peptide are listed left to right. (+12) and (+30) denote a Schiff Base/intrapeptide cross-link and a methylol, respectively, localized on the residue before it.

m/z | z | [M]exp | [M]calc | Mass Accuracy (ppm) | Norm. Peak Area | Protein MW Origin (kDa) | Sequence | Missed cleavages
Calmodulin Peptides
575.29 | 3 | 1722.87 | 1722.85 | 11.6 | 0.09% | > 33 | 22DGDGTITTK(+30)ELGTVMR37 | 1
1133.02 | 4 | 4528.08 | 4528.05 | 6.6 | 0.53% | 14-33 | 38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMAR(+12)K(+12)MK77 | 0
754.85 | 2 | 1507.70 | 1507.68 | 13.3 | 1.06% | > 14 | 75K(+12)M(ox)KDTDSEEEIR86 | 1
508.24 | 3 | 1521.72 | 1521.70 | 13.1 | 0.04% | 14-19 | 75K(+30)MK(+12)DTDSEEEIR86 | 1
513.57 | 3 | 1537.71 | 1537.69 | 13.0 | 1.18% | > 14 | 75K(+30)M(ox)K(+12)DTDSEEEIR86 | 1
499.74 | 4 | 1994.96 | 1994.94 | 10.0 | 4.01 x 10^-3% | 14-19 | 75K(+12)MKDTDSEEEIREAFR90 | 2
669.99 | 3 | 2006.97 | 2006.94 | 14.9 | 0.06% | 14-19 | 75K(+12)MK(+12)DTDSEEEIREAFR90 | 2
503.74 | 4 | 2010.96 | 2010.93 | 14.9 | 0.92% | 19-33 | 75K(+12)M(ox)KDTDSEEEIREAFR90 | 2
403.6 | 5 | 2013.00 | 2012.95 | 24.8 | 7.86% | > 33 | 75KM(ox)K(+30)DTDSEEEIREAFR90 | 2
506.74 | 4 | 2022.96 | 2022.93 | 14.8 | 2.74% | 14-33 | 75K(+12)M(ox)K(+12)DTDSEEEIREAFR90 | 2
511.24 | 4 | 2040.96 | 2040.94 | 9.8 | 0.96% | > 33 | 75K(+30)MK(+12)DTDSEEEIREAFR90 | 3
455.54 | 3 | 1363.62 | 1363.59 | 22.0 | 0.07% | 14-33 | 76MK(+12)DTDSEEEIR86 | 3
461.54 | 3 | 1381.62 | 1381.60 | 14.5 | 3.19% | > 14 | 76MK(+30)DTDSEEEIR86 | 3
627.29 | 3 | 1878.87 | 1878.84 | 16.0 | 0.16% | > 14 | 76MK(+12)DTDSEEEIR(+12)EAFR90 | 3
628.62 | 3 | 1882.86 | 1882.84 | 10.6 | 0.13% | 14-19, > 33 | 76M(ox)K(+12)DTDSEEEIREAFR90 | 3
472.22 | 4 | 1884.88 | 1884.85 | 15.9 | 3.00% | 14-19 | 76MK(+30)DTDSEEEIREAFR90 | 3
632.62 | 3 | 1894.86 | 1894.84 | 10.6 | 2.06% | 14-33 | 76M(ox)K(+12)DTDSEEEIR(+12)EAFR90 | 3
634.62 | 3 | 1900.86 | 1900.85 | 5.3 | 0.71% | > 19 | 76M(ox)K(+30)DTDSEEEIREAFR90 | 1
727.15 | 5 | 3630.73 | 3630.69 | 9.9 | 0.44% | > 19 | 76M(ox)KDTDSEEEIREAFR(+12)VFDK(+12)DGNGYISAAELR106 | 1
839.9 | 4 | 3355.60 | 3355.56 | 11.9 | 0.64% | > 14 | 78DTDSEEEIR(+12)EAFRVFDK(+12)DGNGYISAAELR106 | 4
761.38 | 3 | 2281.14 | 2281.11 | 13.2 | 5.29% | 14-33 | 87EAFR(+12)VFDK(+12)DGNGYISAAELR106 | 2
589.63 | 3 | 1765.89 | 1765.86 | 17.0 | 5.37% | > 19 | 91VFDK(+12)DGNGYISAAELR106 | 2
892.94 | 2 | 1783.88 | 1783.87 | 5.6 | 12.48% | > 14 | 91VFDK(+30)DGNGYISAAELR106 | 0
839.41 | 5 | 4192.05 | 4192.01 | 9.5 | 0.45% | > 14 | 91VFDK(+12)DGNGYISAAELR(+12)HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126 | 4
837.7 | 3 | 2510.10 | 2510.07 | 12.0 | 0.11% | > 14 | 127EADIDGDGQVNYEEFVQMMTAK(+12)148 | 2
Melittin Peptides
669.43 | 1 | 668.43 | 668.41 | 30.1 | 2.04% | 19-33 | 1G(+12)IGAVLK7 |
0721.45 3 2161.352161.32 13.9 8.95x10-4 % 14-19 1G(+12)IGAVLKVLTTGLPALISWIK21 0659.41 3 1975.231975.21 10.1 1.04% 14-19 8VLTTGLPALISWI[KRKR](+24)24 4744.44 3 2230.342230.33 4.5 0.18% 14-33 8VLTTGLPALISWI[KRKR] (+24) QQ(am)26 3748.44 3 2242.342242.33 4.5 0.02% 14-19 8VLTTGLPALISWI[KRKR] (+36) QQ(am)26 2813.3. Mass Spectrometric Analysis3.3.1.3 Trypsin Cleavage After Modified and Cross-linked SitesAll cross-linked samples were digested with trypsin, which specifically cleaves af-ter residues that also are potential modification and cross-linking sites for EDC,PFA, sulfoDST,BS3,and sulfoEGS .i.e. K and R residues. Previous cross-linkingexperiments with NHS ester cross-linkers and EDC have claimed that trypsin rarelycleaves modified and/or cross-linked K residues, resulting in missed cleavages[34, 38, 152]. However, whether this applies to PFA cross-linked species mustbe examined to effectively predict products formed by cross-linking.The cleavage of trypsin is triggered by the electrophilic center induced by thepositively charged primary amino group on K and R residues under physiologicalpH conditions, which favors the nucleophilic attack by serine in the active site oftrypsin (see Figure 3.7)[153]. Figure 3.6 displays four structures of K upon mod-ification or cross-linking and their associated pKa values. The formation of PFAinduced Schiff bases (pKa ~ 4 [154]) reduces the electrophilic nature of K residues,which diminishes serine’s tendency to attack and cleave its peptide bond. This issupported by the lack of Schiff Base and/or methylol modifications on terminalK and R residues in peptides (with the exception of the calmodulin C-terminus,K148) observed in this study (see Table 3.3), suggesting that trypsin did not cleaveafter modified R and K residues, as expected. 
Furthermore, the average number of missed cleavages increased from 1.2 for the peptides without PFA modifications (Table 3.2) in the PFA-treated samples to 2.1 for the PFA-modified peptides, suggesting that the overall trypsin cleavage efficiency is reduced at PFA-modified sites. It is important to note that trypsin cleavage is not expected to occur at calmodulin K115 since it is trimethylated.

However, trypsin cleavage after PFA cross-linked K residues may still be possible, since trypsin cleaves after monomethylated K residues, and both the PFA cross-linked and the monomethylated K side chains resemble a secondary amine functional group [155]. This hypothesis is supported by the negligible difference in the pKa values of primary and secondary amines (both with a pKa ~ 10.7), which suggests that both groups are protonated under the trypsin digestion buffer conditions (ammonium bicarbonate buffer, pH = 7.8). Furthermore, it has been observed that PFA treatment does not change the charge state of proteins, suggesting that the electrophilicity of K residues required to activate trypsin cleavage should not be significantly affected [36]. In contrast, when NHS esters react to form cross-links with K residues, the primary amine group is replaced by an amide, which has a much lower pKa value (-0.5). This results in a drastic reduction of the electrophilicity of lysine, hampering trypsin cleavage [156]. However, another factor to consider is that cross-linked multimers occupy a larger surface area than unmodified monomers, and for trypsin to cleave, the cleavage site must be able to fit in the active site of trypsin. It remains to be seen whether PFA cross-links are small enough to fit in the active site of trypsin. Therefore, the possibility of terminal K or R residues in tryptic peptides forming PFA cross-links was not ruled out in this study.
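The protonation argument above can be illustrated numerically. The following is a minimal sketch, assuming the Henderson-Hasselbalch relation for a basic group and using only the pKa and pH values quoted in the text; the function name and dictionary labels are hypothetical and introduced for illustration only.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a basic group present in its protonated (charged) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


DIGESTION_PH = 7.8  # ammonium bicarbonate buffer pH quoted in the text

# pKa values as quoted in the text; the labels are illustrative only
PKA_VALUES = {
    "primary or secondary amine": 10.7,  # unmodified, monomethylated or PFA cross-linked K
    "Schiff base": 4.0,                  # PFA Schiff base modified K
    "amide": -0.5,                       # NHS ester cross-linked K
}

fractions = {
    name: protonated_fraction(pka, DIGESTION_PH) for name, pka in PKA_VALUES.items()
}

for name, fraction in fractions.items():
    print(f"{name}: protonated fraction = {fraction:.2e}")
```

Under these assumptions, an amine with pKa ~ 10.7 is almost entirely protonated at pH 7.8, whereas the Schiff base and amide forms are essentially uncharged, which is the contrast the cleavage argument relies on.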
However, based on the list of PFA-modified peptides containing only internal modifications, it was assumed that trypsin does not cleave after Schiff base or methylol modifications.

[Figure 3.6 structure panels: PFA cross-linked lysine, pKa ~ 10.7; monomethylated lysine, pKa ~ 10.7; Schiff base modified lysine, pKa ~ 4; NHS ester cross-linked lysine, pKa ~ -0.05]

Figure 3.6: In the dotted boxes, the structures of proteins with a lysine side chain modified in four ways are shown with their respective pKa values. Cross-linked bridges are highlighted in red.

[Figure 3.7 reaction scheme: the trypsin catalytic triad (D102, H57, S195) acting on the protein substrate, yielding the cleaved N-terminal protein segment and the cleaved C-terminal protein segment. R'1, R1 = Protein 1; R2 = Protein 2]

Figure 3.7: Trypsin's catalytic triad consists of aspartic acid (D102), histidine (H57) and serine (S195). Aspartic acid and histidine increase the nucleophilicity of serine, which attacks the partially positive carbonyl carbon of the protein. The positively charged amino group on lysine increases the electrophilicity of the carbonyl carbon. The peptide bond is cleaved and the trypsin catalyst is regenerated. (Adapted from reference [153].)

3.3.1.4 Cross-link Identification via Manual Data Analysis

In addition to unmodified and modified peptides, the trypsin digestion of cross-linked proteins also produces cross-linked peptides, which were evaluated manually via MS and MS/MS. Based on the identified unmodified/modified peptides, the amidation of the C-terminus, acetylation of the N-terminus, and trimethylation of K were set as fixed modifications, and the deamidation of N and oxidation of M were considered as variable modifications.
Four missed cleavages were accounted for since peptides with up to four missed cleavages were observed. A theoretical digestion using ExPASy PeptideMass [115] produced a list of 50 calmodulin and 17 melittin peptides without cross-linker-specific modifications or cross-links. Table 3.4 lists each cross-linker's potential modification and cross-linking sites specific to calmodulin and melittin.

Cross-linked masses were prepared assuming that the mass of a cross-linked species is equal to the sum of the masses of the two peptide components plus the mass of the bridge and any extra modification.

Table 3.4: Cross-linking and modification sites in calmodulin and melittin for each cross-linker

Cross-Linker | Melittin Modification Sites | Melittin Cross-Linking Sites | Calmodulin Modification Sites | Calmodulin Cross-Linking Sites | Peptide Components | Possible Cross-linked Species
EDC | 0 | 4 (3K + 1 Nterm) | 38 (17D + 21E) | 7 (K) | 353 | 123684
PFA | 6 (3K + 1 Nterm + 2R) | 5 (1 Nterm + 2R + 1Q) | 13 (6K + 6R) | 20 (6R + 6Q + 6N + 2Y) | 439 | 191796
SulfoDST | 4 (3K + 1 Nterm) | 4 (3K + 1 Nterm) | 6 (K) | 6 (K) | 160 | 24675
BS3 | 4 (3K + 1 Nterm) | 4 (3K + 1 Nterm) | 6 (K) | 6 (K) | 160 | 24675
SulfoEGS | 4 (3K + 1 Nterm) | 4 (3K + 1 Nterm) | 6 (K) | 6 (K) | 160 | 24675

In melittin, there are a total of five and six potential PFA modification and cross-linking reactive sites, respectively. In calmodulin, there are a total of 13 and 20 potential PFA modification and cross-linking reactive sites, respectively. To illustrate the strategy used to derive a list of PFA-modified peptides for preparing a list of theoretical cross-linked masses, the peptide VFDKDGNGYISAAELR is discussed. For this peptide, K and R are potential modification sites and N, Y and R are potential cross-linking sites. In other words, K and R can each contain a +12 Da or +30 Da mass shift, and N, Y and R can contain a +12 Da mass shift.
Since trypsin is not expected to cleave after PFA-modified residues, the terminal R residue is considered only as a potential cross-linking site in this case. PFA cross-linked masses must contain at least one +12 Da mass shift corresponding to the cross-linker bridge, and when searching for interpeptide cross-linking, it is assumed that one of the reactive sites on each peptide is cross-linked to another peptide. Therefore, in the case of the peptide VFDKDGNGYISAAELR, only three of the four sites can be modified or be occupied in intrapeptide cross-linking. When preparing the list of possible cross-linked masses, only the total modification mass between both peptide components is considered. Since possible modifications are also considered for the second peptide component, one fewer than the total number of reactive sites is considered for each peptide. Therefore, seven total forms of the peptide VFDKDGNGYISAAELR are included, i.e. the mass of the peptide plus 0, 12, 24, 30, 36, 42 and 54 Da. The presence of a methylol indicates that the residue is modified and not cross-linked. Thus, cross-linked masses that correspond to the mass of two peptides with only a +30 Da total mass shift are not possible.

In melittin, there are a total of zero and four potential EDC modification and cross-linking reactive sites, respectively. In calmodulin, there are a total of 38 and seven potential EDC modification and cross-linking reactive sites, respectively. Therefore, calmodulin contained the highest number of possible EDC modification sites. Nevertheless, the number of EDC cross-linking sites was equal to that of NHS esters. EDC can potentially also form +155 Da modifications on D or E residues. Although this EDC modification is rare under short EDC incubation times (1 hr), it was still considered [33, 34]. There are a total of four and seven cross-linking/modification sites in melittin and calmodulin, respectively, for NHS ester cross-linkers.
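The PFA enumeration rule described above for VFDKDGNGYISAAELR, and the way two peptide components are combined into a cross-linked mass, can be sketched as follows. This is an illustrative reading rather than the Mathematica procedure used in this work: it assumes that K94 may carry +12 or +30 Da, that N97, Y99 and the terminal R106 may carry only +12 Da, and that at most three of the four sites are occupied, because this reading reproduces the seven forms listed in the text. The function names are hypothetical, and the peptide masses are the calculated values from Table 3.2.

```python
from itertools import combinations, product


def total_shifts(site_shifts, max_occupied):
    """Distinct total mass shifts (Da) when at most `max_occupied` sites carry a shift."""
    totals = {0}
    sites = list(site_shifts)
    for k in range(1, max_occupied + 1):
        for chosen in combinations(sites, k):
            # every allowed combination of shifts on the chosen sites
            for combo in product(*(site_shifts[s] for s in chosen)):
                totals.add(sum(combo))
    return sorted(totals)


def crosslinked_mass(mass_1, mass_2, bridge, extra=0.0):
    """Mass of a cross-linked species: both peptide components, the bridge,
    and any extra modification mass."""
    return mass_1 + mass_2 + bridge + extra


# Assumed allowed nominal shifts per reactive site of VFDKDGNGYISAAELR (residues 91-106)
VFDK_SITES = {"K94": (12, 30), "N97": (12,), "Y99": (12,), "R106": (12,)}

# One site is reserved for the interpeptide cross-link, so at most three are occupied
vfdk_forms = total_shifts(VFDK_SITES, max_occupied=len(VFDK_SITES) - 1)
print(vfdk_forms)

# Example pairing with melittin GIGAVLK through a single +12 Da PFA bridge
example_mass = crosslinked_mass(1753.86, 656.42, bridge=12.0)
print(round(example_mass, 2))
```

Pairing every form of every calmodulin peptide with every form of every melittin peptide in this way is what makes the PFA search space so much larger than that of cross-linkers with fewer reactive sites.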
Because EDC and the NHS ester cross-linkers have fewer reactive sites, the number of possible cross-linked masses for these cross-linkers was significantly lower than for PFA.

Applying a similar approach to each peptide of calmodulin and melittin, possible modified peptide masses were derived. This was performed for PFA, EDC, and the NHS ester cross-linkers (sulfoDST, BS3, and sulfoEGS) to generate lists of 439, 353, and 160 peptide components, respectively, to make up cross-linked peptides. Mathematica was used to find every combination of all possible modified and unmodified peptide masses to create lists of possible cross-linked masses of 191,796, 123,684 and 24,675 species for PFA, EDC, and the NHS ester cross-linkers (sulfoDST, BS3, and sulfoEGS), respectively.

The total numbers of calmodulin-melittin cross-linked candidates identified at the MS level for EDC, PFA, sulfoDST, BS3 and sulfoEGS were 160, 335, 62, 77, and 158, respectively (listed in Appendix A.4). These findings are consistent with the relative cross-linking yield observed in the SDS-PAGE. PFA and EDC demonstrated the highest cross-linking yield and also the highest number of MS candidate cross-links. In addition, NHS ester cross-linkers demonstrated a lower cross-linking yield and a lower number of candidates, which increased in proportion to their length.

3.3.2 Ribonuclease-S System

3.3.2.1 Control Peptide Analysis via MaxQuant

To determine the composition of a tryptic digest of RNaseS without cross-linking, the RNaseS control sample peptides were analyzed using MaxQuant. Although cleavage of RNaseA can occur between residues 16 and 21 to form RNaseS, the protein system would be highly complex if five forms of the S-peptide and five forms of the S-protein were considered.
Therefore, only the two major forms of each protein component based on literature findings were considered, which are produced by cleavage after residues 19 and 20 of RNaseA [105, 106, 157, 158]. In addition, a band corresponding to RNaseA was observed in the SDS-PAGE of RNaseS and therefore RNaseA peptides were also considered. A maximum of four missed cleavages was accounted for and the following variable modifications were searched: deamidation of N, oxidation of M and carbamidomethylation of C.

Table 3.5 lists the RNaseS peptides identified by MaxQuant. A 92% sequence coverage for the RNaseS complex was obtained within a mass accuracy of 20 ppm. Two small S-protein peptides, 62NVAC(cm)K66 and 86ETGSSK91, were missing from the identified peptides, giving a sequence coverage of less than 100%. The MS signals corresponding to these peptides were also not identified manually. Therefore, it is hypothesized that these small peptides eluted too early from the HPLC column and/or did not ionize efficiently due to their short length and low charge state, and thus were not detected by MS. An average of one and three missed cleavages were observed for the S-protein and S-peptide, respectively. Non-reduced RNaseS possesses disulfide bonds between the following cysteine (C) residues: C26 to C84, C58 to C110, C40 to C95, and C65 to C72 [159]. Carbamidomethyl modifications from the reduction and alkylation of disulfide bonds were identified on C26, C72, C84, and C110. In addition, non-alkylated (reduced, without carbamidomethyl modifications) C26, C95, C58, and C40 residues were also identified.
This suggests that the disulfide bonds in RNaseS were reduced; however, the alkylation was inefficient for C40, C58, and C95.

MaxQuant identified 22 unique peptides, of which five corresponded to RNaseA, two were products of the RNaseA cleavage after residue 19 (one S-peptide and one S-protein tryptic peptide), and 15 were products of the RNaseA cleavage after residue 20 (two S-peptides and 13 S-protein tryptic peptides). Therefore, the majority of the identified RNaseS peptides supported cleavage after RNaseA residue 20.

The position of the SDS-PAGE gel band in which each peptide was identified was used to estimate the molecular weight of the protein from which each peptide originated. S-peptide and S-protein tryptic peptides were identified in < 12 kDa proteins, suggesting that the S-peptide (~2.2 kDa) and S-protein (~11.5 kDa) remained unbound in the control samples. The peptides would have been identified in ~13.7 kDa proteins if the S-peptide and S-protein were in their bound, RNaseS complex form. The RNaseA peptides were identified in 12-15 kDa proteins, as expected based on the molecular weight of RNaseA (~13.7 kDa).

Table 3.5: List of S-peptide, S-protein and RNaseA peptides in the control RNaseS sample. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right.

m/z | z | [M]exp | [M]calc | Mass Accuracy (ppm) | Norm. Peak Area | Protein MW Origin (kDa) | Sequence | Missed cleavages
S-peptide Peptides
722.68 | 3 | 2165.04 | 2165.02 | 10.1 | 0.16% | < 12 | 1KETAAAKFERQHMDSSTSAA20 | 3
1019.47 | 2 | 2036.94 | 2036.92 | 7.1 | 0.28% | < 12 | 2ETAAAKFERQHMDSSTSAA20 | 2
699.00 | 3 | 2094.00 | 2093.98 | 10.4 | 0.54% | < 12 | 1KETAAAKFERQHMDSSTSA19 (*from S-peptide 1-19) | 3
S-protein Peptides
646.76 | 2 | 1291.51 | 1291.50 | 11.3 | 1.51% | < 12 | 21SSSNYCNQMMK31 | 0
662.75 | 2 | 1323.50 | 1323.49 | 11.0 | 1.17% | < 12 | 21SSSNYCNQM(ox)M(ox)K31 | 0
691.26 | 2 | 1380.53 | 1380.51 | 10.5 | 0.27% | < 12 | 21SSSNYC(CM)NQM(ox)M(ox)K31 | 0
755.02 | 3 | 2262.05 | 2262.03 | 9.7 | 0.82% | < 12 | 21SSSNYCNQMMKSRNLTKDR39 | 3
832.05 | 3 | 2493.16 | 2493.13 | 8.8 | 0.38% | < 12 | 21SSSNYCNQMMKSRNLTKDRCK41 | 4
675.55 | 5 | 3372.74 | 3372.70 | 10.8 | 0.45% | < 12 | 32SRNLTKDRCKPVNTFVHESLADVQAVCSQK61 | 4
373.71 | 2 | 745.42 | 745.41 | 19.5 | 0.08% | < 12 | 24NLTKDR39 | 1
535.67 | 5 | 2673.34 | 2673.30 | 13.6 | 0.79% | < 12 | 38DRCKPVNTFVHESLADVQAVCSQK61 | 2
801.73 | 3 | 2402.19 | 2402.17 | 9.1 | 6.87% | < 12 | 40CKPVNTFVHESLADVQAVCSQK61 | 1
481.44 | 5 | 2402.21 | 2402.17 | 15.1 |  |  | (same peptide) | 1
601.55 | 4 | 2402.20 | 2402.17 | 12.1 |  |  | (same peptide) | 1
1151.46 | 2 | 2300.92 | 2300.91 | 6.3 | 8.33% | < 12 | 67NGQTNC(CM)YQSYSTM(ox)SITDC(CM)R85 | 0
429.69 | 2 | 857.39 | 857.37 | 17.0 | 11.88% | < 12 | 92YPNCAYK98 | 0
937.47 | 3 | 2809.41 | 2809.39 | 7.8 | 0.99% | < 12 | 99TTQANKHIIVACEGNPYVPVHFDASV124 | 1
742.03 | 3 | 2223.10 | 2223.08 | 9.8 | 66.01% | < 12 | 105HIIVAC(CM)EGNPYVPVHFDASV124 | 0
855.73 | 3 | 2564.19 | 2564.17 | 8.5 | 0.17% | < 12 | 20ASSSNYCNQMMKSRNLTKDRCK41 (*from S-protein 20-124) | 4
RNaseA Peptides
1337.61 | 3 | 4009.84 | 4009.81 | 5.4 | 0.06% | 12-20 | 2ETAAAKFERQHMDSSTSAASSSNYCNQMMKSRNLTK37 | 4
1237.56 | 3 | 3709.67 | 3709.65 | 5.9 | 0.31% | 12-20 | 8FERQHMDSSTSAASSSNYCNQMMKSRNLTKDR39 | 4
1228.22 | 3 | 3681.66 | 3681.64 | 5.9 | 0.69% | 12-20 | 1KETAAAKFERQHMDSSTSAASSSNYCNQMMKSR33 | 4
769.97 | 3 | 2306.92 | 2306.90 | 9.5 | 0.90% | 12-20 | 11QHMDSSTSAASSSNYCNQMMK31 | 0
851.02 | 2 | 2550.06 | 2550.04 | 8.6 | 0.10% | 12-20 | 1QHMDSSTSAASSSNYCNQMMKSR33 | 1

3.3.2.2 PFA-Treated Peptide Analysis via MaxQuant

A MaxQuant MS/MS search was used to determine the modified and unmodified peptides in the PFA-treated sample, in order to understand the modification reaction products formed from the PFA cross-linking of RNaseS. A total of eight unique peptides without PFA modifications and four unique peptides with PFA modifications were identified by MaxQuant within a mass accuracy of 15 ppm and are listed in Table 3.6. A 95% sequence coverage was obtained for the S-protein. Similar to the control sample, S-protein peptide 62NVACK66 was missing from the identification. Among the identified peptides corresponding to the S-protein, the peptide 21SSSNYC(CM)NQM(ox)M(ox)K(+12)SR(+12)NLTK37 supported cleavage after RNaseA residue 20, and no peptides supported cleavage of RNaseA after residue 19. Carbamidomethyl modifications from the reduction and alkylation of RNaseS were identified on C26, C40, C58, C72, C84, C95, and C110, i.e. all expected C residues except C65. RNaseS peptides containing C65 were not identified. Only C95 also appeared in the reduced form without alkylation.

The average number of missed cleavages for the S-protein was one, which was consistent with that observed in the control samples. Consistent with the PFA-modified calmodulin-melittin peptides, trypsin cleavage was not observed after PFA-modified residues in the RNaseS system. A +12 Da modification was localized to S-protein K31, R33, K91, and K98. A +30 Da modification was only localized on S-protein K98. This confirms that K98, K31, and K91 are modified in the first step of the cross-linking reaction of RNaseS. The absence of +30 Da modified R residues in RNaseS is consistent with the observations in the calmodulin-melittin system.
Since +12 Da could correspond to either a Schiff base or an intrapeptide cross-link, it is ambiguous whether R33 (both a potential cross-linking and modification site) was modified or cross-linked.

No RNaseA peptides or S-peptide peptides were identified by MaxQuant. This is consistent with the SDS-PAGE of PFA-treated RNaseS, where no band for RNaseA or for the S-peptide was observed. A band corresponding to unmodified S-peptide did not appear in the PFA-treated sample but appeared in the control sample. However, a band for the RNaseS complex at ~14 kDa was observed for the PFA-treated sample. This suggested that the S-peptide molecules are engaged in intramolecular PFA cross-links with the S-protein. Previous kinetics experiments that examined trypsin cleavage in RNaseS demonstrated that the S-peptide is not accessible for trypsin cleavage while bound to the S-protein in the complex. If the S-peptide is cleaved, then the interaction between the S-peptide and S-protein is diminished [160]. Thus, if cross-linking truly preserved the RNaseS complex and partially preserved the complex structure through SDS denaturation, then trypsin digestion products from the S-peptide would be expected to be rare and only cross-linked S-peptides would be expected. This may explain the absence of the S-peptide in the MaxQuant identified peptides from the PFA-treated sample.

Table 3.6 lists the molecular weight of the protein each peptide originated from, which was estimated by SDS-PAGE. No peptides from < 12 kDa proteins were identified, supporting the absence of individual S-protein or S-peptide molecules. Nine out of the twelve S-protein peptides (two of them being PFA-modified peptides) were from 12-20 kDa proteins, which are likely from intramolecular cross-linked RNaseS (~14 kDa).
Seven out of the twelve S-protein peptides (two of them being PFA-modified peptides) were from 20-30 kDa proteins, which are likely from intermolecular cross-linked S-protein (~23 kDa) or RNaseS (~28 kDa) molecules. However, these hypotheses must be verified by identifying PFA cross-linked RNaseS peptides via MS/MS.

Table 3.6: List of S-peptide, S-protein and RNaseA peptides in the PFA treated RNaseS sample. The m/z, experimental mass, calculated mass, mass accuracy, normalized peak area, molecular weight of the protein gel band origin, sequence and number of missed cleavages for each peptide are listed left to right. (+12) and (+30) denote a Schiff base/intrapeptide cross-link and a methylol, respectively, localized on the residue before it.

m/z | z | [M]exp | [M]calc | Mass Accuracy (ppm) | Norm. Peak Area | Protein MW Origin (kDa) | Sequence | Missed cleavages
S-protein Peptides
558.48 | 5 | 2787.38 | 2787.34 | 13.1 | 1.98% | 12-30 | 38DRC(CM)KPVNTFVHESLADVQAVC(CM)SQK61 | 1
767.98 | 3 | 2300.93 | 2300.91 | 9.5 | 68.89% | 12-20, > 30 | 67NGQTNC(CM)YQSYSTM(ox)SITDC(CM)R85 | 0
964.40 | 3 | 2890.20 | 2890.18 | 7.6 | 2.24% | 12-20 | 67NGQTNC(CM)YQSYSTM(ox)SITDC(CM)RETGSSK91 | 1
502.23 | 3 | 1503.69 | 1503.67 | 14.5 | 20.68% | > 12 | 86ETGSSKYPNC(CM)AYK104 | 1
537.76 | 4 | 2147.02 | 2147.00 | 13.6 | 8.01% | 20-30 | 86ETGSSKYPNC(CM)AYKTTQANK104 | 2
520.25 | 3 | 1557.75 | 1557.72 | 14.0 | 37.34% | 12-30 | 92YPNC(CM)AYKTTQANK104 | 1
717.61 | 4 | 2866.44 | 2866.41 | 10.2 | 2.22% | 12-20 | 99TTQANKHIIVAC(CM)EGNPYVPVHFDASV124 | 1
742.03 | 3 | 2223.10 | 2223.08 | 9.8 | 26.45% | > 12 | 105HIIVAC(CM)EGNPYVPVHFDASV124 | 0
S-protein Peptides with PFA Modifications
702.31 | 3 | 2103.94 | 2103.91 | 10.4 | 6.44% | 12-20 | 21SSSNYC(CM)NQM(ox)M(ox)K(+12)SR(+12)NLTK37 | 2
526.50 | 4 | 2102.00 | 2101.97 | 13.8 | 0.51% | 20-30 | 86ETGSSK(+12)YPNCAYKTTQANK104 | 2
524.25 | 3 | 1569.75 | 1569.72 | 13.9 | 9.64% | 20-30 | 92YPNC(CM)AYK(+12)TTQANK104 | 1
530.25 | 3 | 1587.76 | 1587.74 | 13.8 | 115.59% | > 12 | 92YPNC(CM)AYK(+30)TTQANK104 | 1

3.3.2.3 Cross-link Identification via Manual Data Analysis

To determine RNaseS cross-linked species, the trypsin digests of cross-linked RNaseS samples
were analyzed manually via MS using a data analysis procedure similar to that used for the calmodulin-melittin system. To simplify the theoretical cross-link search space, only peptides from S-proteins and S-peptides formed by the cleavage of RNaseA after residue 20 were considered, since these represent the majority of the peptides identified by MaxQuant. ExPASy PeptideMass [115] was used to perform a theoretical trypsin digestion to produce 80 S-protein and 10 S-peptide peptides. Since control peptides with up to four missed cleavages were observed in the MaxQuant search results, four missed cleavages were considered. The following variable modifications were considered: deamidation of N, oxidation of M and carbamidomethylation of C. Carbamidomethylation of C was set as a variable modification because peptides with both non-alkylated and alkylated cysteines were identified by MaxQuant. The number of possible unmodified peptides in the RNaseS system was 23 more than in the calmodulin-melittin system, making it a slightly more complex model. Table 3.7 lists each cross-linker's modification and cross-linking sites specific to RNaseS.

Table 3.7: Cross-linking and modification sites in RNaseS for each cross-linker.

Cross-Linker | S-peptide Modification Sites | S-peptide Cross-Linking Sites | S-protein Modification Sites | S-protein Cross-Linking Sites | Peptide Components | Possible Cross-linked Species
EDC | 3 (1D + 2E) | 3 (2K + 1Nterm) | 7 (4D + 3E) | 9 (8K + 1Nterm) | 356 | 125811
PFA | 4 (2K + 1Nterm + 1R) | 3 (1R + 1Q + 1Nterm) | 12 (8K + 1Nterm + 3R) | 26 (3R + 6Q + 1Nterm + 10N + 6Y) | 911 | 828996
SulfoDST | 3 (K + 1Nterm) | 3 (K + 1Nterm) | 9 (8K + 1Nterm) | 9 (8K + 1Nterm) | 354 | 124391
BS3 | 3 (K + 1Nterm) | 3 (K + 1Nterm) | 9 (8K + 1Nterm) | 9 (8K + 1Nterm) | 354 | 124391
SulfoEGS | 3 (K + 1Nterm) | 3 (K + 1Nterm) | 9 (8K + 1Nterm) | 9 (8K + 1Nterm) | 354 | 124391

Masses of all possible modified and unmodified forms of each peptide were calculated to derive a list of possible unmodified/modified peptide components that could be cross-linked.
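The theoretical digestion that seeds these peptide lists can be sketched as follows. This is a minimal illustration, not the ExPASy PeptideMass implementation: it assumes cleavage after K or R except when the next residue is P (a common convention, adopted here as an assumption), and it is demonstrated on melittin, whose sequence can be read from the peptides in Tables 3.2 and 3.3. The function names are hypothetical. Counts from this sketch need not match those reported above, since the tool applies its own settings.

```python
def cleavage_fragments(sequence):
    """Fully cleaved tryptic fragments: cut after K or R unless followed by P (assumed rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal fragment without K/R
    return fragments


def tryptic_peptides(sequence, max_missed=4):
    """All peptides with 0..max_missed missed cleavages, as (peptide, missed) pairs."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        for missed in range(max_missed + 1):
            if i + missed < len(fragments):
                # joining `missed + 1` adjacent fragments skips `missed` cleavage sites
                peptides.append(("".join(fragments[i:i + missed + 1]), missed))
    return peptides


MELITTIN = "GIGAVLKVLTTGLPALISWIKRKRQQ"

melittin_fragments = cleavage_fragments(MELITTIN)
melittin_peptides = tryptic_peptides(MELITTIN, max_missed=4)

print(melittin_fragments)
print(len(melittin_peptides))
```

Each peptide produced this way would then be expanded into its modified forms before being paired into candidate cross-linked masses.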
This was performed for PFA, EDC, and the NHS ester cross-linkers (sulfoDST, BS3, and sulfoEGS) to generate lists of 911, 356, and 354 peptide components, respectively. In the S-peptide, there are four and three PFA modification and cross-linking sites, respectively. In the S-protein, there are 12 and 26 PFA modification and cross-linking sites, respectively. The total number of PFA reactive sites in the RNaseS system is eight more than in the calmodulin-melittin system. Although RNaseS is only slightly more complex than the calmodulin-melittin system, the number of theoretical PFA cross-linked species increased from 191,796 to 828,996. However, with the other cross-linkers, the increase in possible cross-linked species was not as large since they have fewer possible reactive sites. For EDC, four and three potential modification and cross-linking sites, respectively, are present in the S-peptide. In the S-protein, there are seven and nine EDC modification and cross-linking sites, respectively. The total number of possible EDC cross-linked species was 125,811, which is 2127 more than in the calmodulin-melittin system. For the NHS ester cross-linkers, there are three and nine cross-linking/modification sites in the S-peptide and S-protein, respectively. The total number of possible sulfoDST, BS3 and sulfoEGS cross-linked species was 124,391, which is 99,716 more than in the calmodulin-melittin system.

The total numbers of RNaseS candidates for EDC, PFA, sulfoDST, BS3 and sulfoEGS cross-links were 220, 607, 208, 471 and 463, respectively (listed in Appendix A.5). These are the candidate cross-linked species in the RNaseS system for which further manual inspection of the MS signals and the subsequent MS/MS spectra is required to verify cross-linked species. Similar to the calmodulin-melittin system, PFA generated the highest number of candidates in the RNaseS system.
However, since EDC reactive sites are less abundant in the RNaseS system, EDC did not produce the second highest number of candidates. Finally, among the NHS ester cross-linkers, BS3 produced the highest number of candidates in the RNaseS model system, which is consistent with the relatively high cross-linking yield observed in the SDS-PAGE.

3.4 Moving toward the MS/MS Verification of Cross-Linked Species

In this chapter, a preliminary evaluation of PFA, EDC, sulfoDST, BS3, and sulfoEGS cross-linking in Ca2+-free calmodulin-melittin and RNaseS was performed in order to facilitate the identification and localization of cross-linked species via MS/MS. First, the intact protein complexes produced by cross-linking were examined via SDS-PAGE to determine what type of cross-linking occurred. This supported the formation of both intra- and inter-molecular PFA, EDC, BS3, and sulfoEGS cross-linked species for both Ca2+-free calmodulin-melittin and RNaseS. The relative cross-linking yields for PFA, EDC and BS3 species were comparable to the theoretical yields of each Ca2+-free calmodulin-melittin and RNaseS complex derived from their respective dissociation constants, suggesting that these cross-linkers captured relevant non-covalent interactions. For intact sulfoDST cross-linked species, only intra-cross-links were observed in the SDS-PAGE for both protein model systems. In addition, sulfoDST produced a lower relative yield than the other cross-linkers. Second, PFA-modified and unmodified peptide analysis confirmed the MS detection of peptides from cross-linked species for Ca2+-free calmodulin-melittin and RNaseS, although fewer peptides were identified in the RNaseS model system. PFA-modified peptides revealed modification sites to aid in cross-link localization.
PFA modifications were observed primarily on N-terminal and K residues and, to a lesser extent, R residues, which confirmed the specificity of the modification reaction under the mild, in vivo-like reaction conditions utilized, similar to PFA cellular cross-linking. Furthermore, the observation of only internal PFA modifications on peptides demonstrated that trypsin cleavage after PFA modifications (Schiff base or methylol) is unlikely. However, the mechanism of trypsin digestion suggests that trypsin can potentially cleave after cross-linked residues produced by PFA, unlike those produced with established cross-linkers. Therefore, PFA cross-linking at terminal residues is considered a possibility. Finally, a list of MS-confirmed candidate cross-linked species was obtained. Consistent with the relative yield measured in the SDS-PAGE, PFA generated the largest number of MS-confirmed candidate cross-linked species in both protein model systems out of all the cross-linkers, due to its ability to react with several amino acids and its potential to produce multiple modifications and cross-links. EDC reactive sites are abundant in Ca2+-free calmodulin-melittin, and EDC therefore produced the second highest number of MS-confirmed candidate cross-linked species. The MS-confirmed candidate cross-linked species produced by NHS ester cross-linkers of different lengths suggested that the number of candidates increases with cross-linker bridge length, since a longer bridge is more likely to span the distance between two reactive sites.
However, MS/MS verification of these candidates is required to confirm cross-linked structures and localize cross-linking sites, which is explored in the following chapter.

Chapter 4
Tandem Mass Spectrometric Fragmentation of Calmodulin-Melittin Cross-linked Species

4.1 Tandem Mass Spectrometric Verification and Fragmentation Rules for Cross-linked Species

In Chapter 3, a total of 160, 335, 62, 77, and 158 calmodulin-melittin cross-linked candidates for EDC, PFA, sulfoDST, BS3, and sulfoEGS, respectively, were identified at the MS level. The MS/MS spectra for each calmodulin-melittin MS candidate were analyzed to verify cross-linking. Theoretical fragment ions for the backbone fragmentation of unmodified peptides from calmodulin and melittin were obtained using Protein Prospector MS-Product [117]. These were used to manually prepare theoretical cross-linked and/or modified peptide fragment ion databases specific to each cross-linker chemistry. In each peptide, for each fragment that contained a cross-linking or modification site, modified fragment masses were obtained by adding the mass of each cross-linker-specific modification (i.e. +12 or +30, +155.00, +132.01, +156.08, and +244.06 Da for PFA, EDC, sulfoDST, BS3, and sulfoEGS, respectively).

The MS/MS spectrum of each MS cross-linked candidate was inspected manually for the three types of fragment ions that can be generated when a cross-linked species undergoes CID (illustrated in Figure 1.10): whole peptide component ions (type 1), peptide backbone ions with the cross-linker intact (type 2), and peptide backbone ions with the cross-link bridge fragmented (type 3). In addition, residue reactivity (Figures 1.3, 1.4, 1.5, 1.6, and 1.7) was also considered in evaluating cross-linked species. Cross-linked peptides are represented in the text as (peptide I)^(peptide II), where “^” symbolizes the cross-linker bridge.
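The construction of a modified fragment ion list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the thesis (which relied on Protein Prospector MS-Product and manual editing): the residue masses are standard monoisotopic values, the helper names `peptide_mass` and `fragment_ions` are hypothetical, only singly charged b and y ions are produced, and the modification masses are the ones quoted in the text.

```python
# Standard monoisotopic residue masses in Da (assumed, not taken from the thesis).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # H+

# Modification masses in Da as quoted in the text; PFA has two options.
MOD_SHIFT = {
    "PFA": (12.0, 30.0), "EDC": (155.00,), "sulfoDST": (132.01,),
    "BS3": (156.08,), "sulfoEGS": (244.06,),
}


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[a] for a in seq) + WATER


def fragment_ions(seq, site=None, shift=0.0):
    """Singly charged b and y ions.

    Fragments that contain the modified residue `site` (1-based position
    within the peptide) carry the extra mass `shift`.
    """
    n = len(seq)
    ions = {}
    for i in range(1, n):
        b = sum(RESIDUE[a] for a in seq[:i]) + PROTON
        y = sum(RESIDUE[a] for a in seq[n - i:]) + WATER + PROTON
        if site is not None:
            if site <= i:          # b_i covers residues 1..i
                b += shift
            if site > n - i:       # y_i covers residues n-i+1..n
                y += shift
        ions[f"b{i}"] = b
        ions[f"y{i}"] = y
    return ions


# Melittin peptide 1GIGAVLK7, unmodified and with a BS3 modification on G1.
plain = fragment_ions("GIGAVLK")
modified = fragment_ions("GIGAVLK", site=1, shift=MOD_SHIFT["BS3"][0])
for name in ("b2", "y6"):
    print(name, round(plain[name], 4), round(modified[name], 4))
```

With the modification placed on G1, every b ion is shifted while y1 to y6 stay unmodified, which is the kind of pattern used later in this chapter to localize a cross-linking site.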
Figure 4.1 illustrates the nomenclature used to annotate the fragment ion peaks in the MS/MS spectra of each cross-linked species, where “ABCDE” represents a generic sequence of each peptide component. The peaks labelled as “[M]” in these spectra correspond to the precursor ion of the intact cross-linked species.

Figure 4.1: The nomenclature used to annotate MS/MS spectra of cross-linked species for each type of fragment ion. The diagram shows peptide I and peptide II joined by the cross-linker bridge, with type 1 fragment ions (I, II), type 2 fragment ions (e.g. Ib2+II to Ib4+II and Iy1 to Iy4), and type 3 fragment ions (e.g. Ib2 to Ib4, Iy1 to Iy4, IIb2 to IIb4, and IIy1 to IIy4).

EDC, sulfoDST, BS3, and sulfoEGS are classified as “other cross-linkers”, and their established fragmentation patterns are compared to those of PFA cross-links to clarify what fragment ion evidence is acceptable to confirm PFA cross-linked species. Cross-links formed by other cross-linkers were compared to the literature; however, these studies were based on Ca2+-bound calmodulin complexed with melittin, unlike the Ca2+-free calmodulin-melittin complex used in this study. According to NMR and fluorescence experiments in the literature, both complexes exhibit similar conformations and thus cross-linking sites should be comparable [95]. Nevertheless, it remains to be seen whether this is supported by the MS/MS-verified cross-linking identified in the current study.
It is hypothesized that Ca2+ binding, which occurs at negatively charged carboxyl groups, will create variations in cross-linking sites, especially with the carboxyl-group-reactive EDC cross-linking.

Cross-linking between two calmodulin peptides and between calmodulin and melittin peptides was observed for all cross-linkers except sulfoDST, for which only cross-linking between calmodulin and melittin was identified. Previous studies have shown that calmodulin can form two different structures (antiparallel or parallel binding modes) with melittin, suggesting that cross-linked species signals could represent multiple conformations [99, 100]. However, in both modes of binding, the conformation of calmodulin itself should be uniform, and consequently cross-linking within calmodulin is expected to be consistent in the major cross-linked products. This was considered while deducing cross-linking sites via MS/MS.

4.2 Tandem Mass Spectrometric Fragmentation of Other Cross-linkers

4.2.1 EDC

4.2.1.1 Calmodulin-Calmodulin Cross-linked Peptides

The MS/MS fragmentation patterns of EDC cross-linked species were examined. Three calmodulin interpeptide cross-links were discovered and are listed in Table 4.1, which gives the m/z, charge, monoisotopic mass (experimental and calculated), mass accuracy (ppm), and the mass and sequence of each component peptide (highlighted residues correspond to cross-linked sites).

The cross-linked structure 91VFDKDGNGYISAAELR106^1A(ac)DQLTEEQIAEFK13 appeared in two different charge states, as a triply charged species at m/z 1100.54 and as a quadruply charged species at m/z 825.66 (Figure 4.2).
Type 1 ions of each whole peptide component confirmed the cross-linked structure composition. A series of IIy1 to IIy7 and IIb3 to IIb5 ions for 1A(ac)DQLTEEQIAEFK13, and Iy1 to Iy3, Iy5 to Iy7, and Iy10 to Iy12 ions for 91VFDKDGNGYISAAELR106, matched the unmodified sequences for each peptide. A series of type 2 ions IIy9-I to IIy12-I, with 91VFDKDGNGYISAAELR106 attached to 2DQLTEEQIAEFK13, 3QLTEEQIAEFK13, 4LTEEQIAEFK13, and 5TEEQIAEFK13, along with the unmodified IIy1 to IIy7 ions, localized the cross-link to 5TE6. In this segment, E6 is the only residue with a carboxylic group for cross-linking. Since EDC forms cross-links between carboxylic groups and primary amino groups, the cross-link formed between E6 and K94, the only lysine residue in 91VFDKDGNGYISAAELR106. The N-terminal residue A1 is acetylated and was thus excluded as a potential cross-linking site.

The species at m/z 716.96, with a charge of five, corresponded to 14EAFSLFDKDGDGTITTK30^91VFDKDGNGYISAAELR106 (Figure 4.3). IIy1 to IIy7, IIb2, and IIb3 ions confirmed part of the unmodified sequence of 91VFDKDGNGYISAAELR106. Iy1 to Iy16 ions along with a type 1 ion confirmed the unmodified sequence of 14EAFSLFDKDGDGTITTK30. Type 2 ions Ib3+II and Ib6+II to Ib8+II suggested that E14 was modified and cross-linked to K94, the only lysine residue in 91VFDKDGNGYISAAELR106. Similar to this structure, 91VFDKDGNGYISAAELR106^22DGDGTITTK30 was detected as a triply charged species at m/z 875.76 (Figure 4.4). Iy1 to Iy9, Ib2, and Ib3 ions confirmed the unmodified sequence of 91VFDKDGNGYISAAELR106. Type 2 ions Ib4-II to Ib10-II localized the cross-link to 94KDGNGYI100. The 22DGDGTITTK30 peptide was identified exclusively by its mass, since its respective type 1 or backbone fragment ions did not appear in the MS/MS spectrum. The mass of the cross-linked species and the type 2 ions Ib4-II to Ib10-II suggest that two cross-links formed, each resulting in an 18.02 Da decrease.
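The mass bookkeeping behind this argument can be checked with a short calculation. The sketch below assumes that each EDC cross-link removes one water molecule from the summed peptide masses (monoisotopic H2O, 18.0106 Da, which the text rounds to 18.02 Da); the function name `edc_crosslinked_mass` is hypothetical, and the component masses are the rounded values listed in Table 4.1.

```python
WATER = 18.0106  # monoisotopic H2O in Da (standard value)


def edc_crosslinked_mass(peptide_masses, n_links=1):
    """Neutral mass of peptides joined by zero-length EDC cross-links.

    Each cross-link is an amide bond formed with loss of one water.
    """
    return sum(peptide_masses) - n_links * WATER


# 22DGDGTITTK30 (906.43 Da) + 91VFDKDGNGYISAAELR106 (1753.86 Da)
one_link = edc_crosslinked_mass([906.43, 1753.86], n_links=1)
two_links = edc_crosslinked_mass([906.43, 1753.86], n_links=2)
print(round(one_link, 2), round(two_links, 2))
```

Only the two-link value agrees with the calculated mass of 2624.27 Da listed for the species at m/z 875.76, which is why a second water loss (intrapeptide cross-link or side-chain dehydration) has to be invoked.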
This indicates that K30 and either D22 or D24 should have formed an intrapeptide cross-link. With K30 occupied in the intrapeptide cross-link, 22DGDGTITTK30 must have been modified at either D22 or D24 to form the interpeptide cross-link to K94, the only lysine residue in 94KDGNGYI100. It was noted in section 3.3.1.3 that trypsin is unlikely to cleave after EDC cross-links, which is contradicted by the formation of the intrapeptide cross-link with the terminal K30. Without peptide backbone fragment ions from 22DGDGTITTK30, it is not possible to determine whether the extra -18.02 Da shift corresponds to the proposed intrapeptide cross-link or to the loss of H2O from an amino acid side chain such as threonine. It is hypothesized that the intrapeptide bond indeed formed and sterically hindered the peptide backbone fragmentation of 22DGDGTITTK30. Nevertheless, the species at m/z 716.96 and 875.76, with charges of five and three, respectively, both supported cross-linking between similar regions of calmodulin. However, the different cross-linking sites indicate that they represent two different reaction products. This is the first MS/MS identification of EDC cross-link formation between calmodulin peptides 14EAFSLFDKDGDGTITTK30 and 91VFDKDGNGYISAAELR106. Furthermore, this is the first report of MS/MS-verified EDC cross-linking between E6 and K94.

Table 4.1: EDC calmodulin-calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red.
For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.

Cross-linked species                                         Calmodulin peptide 1              Calmodulin peptide 2
m/z       z  [M]exp (Da)  [M]calc (Da)  Mass accuracy (ppm)  [M] (Da)  Sequence               [M] (Da)  Sequence
875.76    3  2624.28      2624.27       3                    906.43    22DGDGTITTK30          1753.86   91VFDKDGNGYISAAELR106
825.66    4  3298.62      3298.60       6                    1562.75   1A(ac)DQLTEEQIAEFK13   1753.86   91VFDKDGNGYISAAELR106
*1100.54  3
716.96    5  3579.8       3579.74       18                   1753.86   91VFDKDGNGYISAAELR106  1843.88   14EAFSLFDKDGDGTITTK30

Figure 4.2: Interpeptide EDC calmodulin cross-link at m/z 1100.54 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.
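The relation between the tabulated m/z, charge, and mass columns, and the definition of mass accuracy in ppm, can be illustrated as follows. This is a minimal sketch assuming protonated ions of the form [M + zH]z+ and the usual definition of relative mass error; the function names are hypothetical. Because the table entries are rounded to two decimals, recomputed values agree with the tabulated ones only approximately.

```python
PROTON = 1.00728  # mass of H+ in Da (standard value)


def neutral_mass(mz, z):
    """Neutral monoisotopic mass M from the m/z of an [M + zH]z+ ion."""
    return z * (mz - PROTON)


def ppm_error(m_exp, m_calc):
    """Relative mass error in parts per million."""
    return (m_exp - m_calc) / m_calc * 1e6


# Second row of Table 4.1: m/z 825.66, z = 4.
m = neutral_mass(825.66, 4)
print(round(m, 2))
print(round(ppm_error(3298.62, 3298.60), 1))
```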
Figure 4.3: Interpeptide EDC calmodulin cross-link at m/z 716.96 (z = 5): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.
Figure 4.4: Interpeptide EDC calmodulin cross-link at m/z 875.76 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

4.2.1.2 Calmodulin-Melittin Cross-linked Peptides

Table 4.2 lists the identified EDC cross-linked species composed of calmodulin and melittin peptides. MS/MS spectra of all of these cross-linked peptides produced type 1 ions for the melittin peptide, confirming its presence. For 14EAFSLFDKDGDGTITTK30^1GIGAVLK7, appearing at m/z 621.58 with a charge of four (Figure 4.5), type 2 ions Ib2+II to Ib4+II localized the cross-link to E14. Therefore, the calmodulin peptide was modified and cross-linked to a primary amino group (G1 or K7) in the melittin peptide.
An almost complete unmodified y ion series for 1GIGAVLK7 and 14EAFSLFDKDGDGTITTK30 confirmed the sequence of each component peptide, and the absence of unmodified b ions suggested that G1 was cross-linked. Since K7 is a terminal lysine residue, it is more likely that the N-terminus of G1 formed the cross-link.

The triply charged species at m/z 616.66 corresponded to the structure 1A(ac)DQLTEEQIAEFK13^23KR24 (Figure 4.6). Unmodified Ib2 to Ib5 and Iy1 to Iy2 ions verified the sequence of the calmodulin peptide, and type 2 ions Iy3+II to Iy12+II localized the cross-link to E11. A type 1 ion and a IIy2 ion verified the presence of the melittin peptide. Therefore, calmodulin E11 was cross-linked to K23, the only reactive residue in the melittin peptide.

The cross-linked species 38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTM(ox)M(ox)AR74^23KR24 appeared with a signal at m/z 1097.26, with a charge of four, and also with an additional oxidized M residue (+15.99 Da mass shift), producing signals at m/z 881.21, with a charge of five, and m/z 1101.26, with a charge of four (Figure 4.7). Unmodified Iy1 to Iy12 and Ib2 to Ib5 ions verified the sequence of the calmodulin peptide, and a series of type 2 ions (Ib18+II to Ib32+II, Iy21+II to Iy24+II, and Iy27+II) localized the cross-linking site to E54. A type 1 ion corresponding to the melittin peptide confirmed its presence. Thus, calmodulin E54 was cross-linked to melittin K23, the only EDC-reactive residue in the melittin peptide. Similarly, for the form of this cross-linked species without the additional oxidation (m/z 1097.26, with a charge of four), unmodified fragment ions Ib2 to Ib5 and Iy1 to Iy11 confirmed the sequence of the calmodulin peptide and the unmodified type 1 ion verified the presence of the melittin peptide. Type 2 ions (Ib23+II to Ib25+II, Ib27+II, Ib28+II, Iy22+II, and Iy23+II) supported cross-linking at calmodulin E54
and thus provided evidence for cross-linking between melittin K23 and calmodulin E54.

The calmodulin-melittin EDC cross-linking sites E14 to G1 and E11 to K23 are consistent with previous studies that identified EDC cross-linking using a combination of MS and Edman degradation [102]. In that previous work, EDC cross-linking was identified in Ca2+-saturated calmodulin-melittin. Ca2+ binding occurs at negatively charged carboxyl groups that are also EDC reactive. However, the consistency of cross-linking sites between Ca2+-free (used in this study) and Ca2+-saturated calmodulin-melittin suggests that Ca2+ binding does not significantly affect EDC cross-linking. Overall, the present study allowed further validation and localization of specific cross-linking sites with the use of MS/MS for the first time. This is also the first report of EDC cross-linking between calmodulin E54 and melittin K23.

Table 4.2: EDC calmodulin-melittin interpeptide cross-linked species, listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Reactive residues/possible cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.

Cross-linked species                                         Melittin peptide     Calmodulin peptide
m/z       z  [M]exp (Da)  [M]calc (Da)  Mass accuracy (ppm)  [M] (Da)  Sequence   [M] (Da)  Sequence
621.58    4  2482.33      2482.30       15                   656.42    1GIGAVLK7  1843.88   14EAFSLFDKDGDGTITTK30
616.66    3  1846.97      1846.94       13                   302.21    23KR24     1562.75   1A(ac)DQLTEEQIAEFK13
1097.26   4  4385.04      4385.04       1                    302.21    23KR24     4100.84   38SLGQNPTEAELQDMINEVDADGNGTIDFPEFLTM(ox)M(ox)AR74
*1101.26  4  4401.05      4401.05       1                    302.21    23KR24     4116.85   38SLGQNPTEAELQDM(ox)INEVDADGNGTIDFPEFLTM(ox)M(ox)AR74
881.21    5
Figure 4.5: Interpeptide EDC calmodulin-melittin cross-link m/z 621.58 (z = 4): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

Figure 4.6: Interpeptide EDC calmodulin-melittin cross-link m/z 616.66 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom) are shown.
Figure 4.7: Interpeptide EDC calmodulin-melittin cross-link m/z 1101.26 (z = 4): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

No simultaneous fragmentation at the bridge and backbone, i.e. type 3 fragmentation, was observed in EDC cross-linked species. A possible reason is that, since the EDC cross-link bridge is a peptide bond, it has a bond energy equal to that of the backbone peptide bonds and would thus require the same collision energy to fragment. In addition, fragmentation of the backbone of the smaller component peptide was not as extensive as that of the larger component peptide, which is a common phenomenon [51].

4.2.2 SulfoDST

4.2.2.1 Calmodulin-Melittin Cross-linked Peptides

Cross-linking with sulfoDST involving only calmodulin was not detected in this study.
Two calmodulin-melittin sulfoDST cross-linked species were identified and are listed in Table 4.3, which gives the m/z, charge, monoisotopic mass (experimental and calculated), mass accuracy (ppm), and the mass and sequence of each component peptide (highlighted residues correspond to cross-linked sites). The cross-linked species 91VFDKDGN(dm)GYISAAELR106^1GIGAVLK7 appeared as a triply charged species at m/z 842.77 (Figure 4.8). A full y ion series of 1GIGAVLK7 (IIy1 to IIy6), together with Iy1 to Iy12 and Ib2 to Ib3 ions of 91VFDKDGN(dm)GYISAAELR106, confirmed the sequences of the cross-linked peptides. The deamidation of N97 was marked by the +0.98 Da mass shift on Iy10 to Iy12. This deamidated N97 was also previously observed in other calmodulin studies [161] and in control samples (see Table 3.1). Type 2 ions Ib4+II to Ib6+II, Iy13+II to Iy15+II, and IIb6+I localized the cross-linking between G1 and K94.

The cross-linked candidate at m/z 493.92, with a charge of three, corresponded to the structure 22DGDGTITTK30^22RKR24 (Figure 4.9). A type 1 ion, type 3 ions Iy1 to Iy4, and an Ib2 ion confirmed the sequence of 22DGDGTITTK30. Only a IIb2 ion appeared from 22RKR24; however, poor fragmentation of the smaller peptide in a cross-linked species is typical [51]. Although no type 2 ions were present, with only one reactive residue in each peptide, cross-linking must have occurred between K30 from calmodulin and K23 from melittin. The unmodified Iy1 to Iy4 ions suggest that the cross-linker bridge and peptide backbone were simultaneously fragmented. SulfoDST cross-linking between calmodulin K30 and melittin K23 is consistent with previous MS-based identification of sulfoDST cross-linking between calmodulin segment 1-37 and melittin K23 [99]. However, the sulfoDST cross-link formation at K30 contradicts previous claims that trypsin cleavage at cross-linked K is unlikely, as discussed in section 3.3.1.3.

Table 4.3: sulfoDST calmodulin-melittin interpeptide cross-linked species, listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.

Cross-linked species                                       Melittin peptide     Calmodulin peptide
m/z     z  [M]exp (Da)  [M]calc (Da)  Mass accuracy (ppm)  [M] (Da)  Sequence   [M] (Da)  Sequence
842.77  3  2525.30      2525.28       7                    656.42    1GIGAVLK7  1754.86   91VFDKDGN(dm)GYISAAELR106
493.92  3  1478.75      1478.73       12                   458.31    22RKR24    906.43    22DGDGTITTK30

Figure 4.8: Interpeptide sulfoDST calmodulin-melittin cross-link m/z 842.77 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.
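The way the +0.98 Da shift on Iy10 to Iy12 points to N97 follows directly from which backbone fragments contain that residue. The sketch below is only an illustration: the function name `shifted_fragments` is hypothetical, residue numbering follows the calmodulin numbering used in the text, and the deamidation mass is the standard monoisotopic value (0.98402 Da).

```python
DEAMIDATION = 0.98402  # Da, N -> D (standard monoisotopic shift)


def shifted_fragments(start, end, site):
    """Indices of the b and y ions that contain residue `site`.

    `start` and `end` are the protein residue numbers of the peptide
    termini; an ion b_i covers the first i residues of the peptide and
    y_i covers the last i residues.
    """
    length = end - start + 1
    pos = site - start + 1                       # 1-based position in peptide
    b_idx = list(range(pos, length))             # b_pos .. b_(length-1)
    y_idx = list(range(length - pos + 1, length))
    return b_idx, y_idx


# Peptide 91VFDKDGNGYISAAELR106 with deamidated N97.
b_idx, y_idx = shifted_fragments(91, 106, 97)
print("shifted b ions:", b_idx)
print("shifted y ions:", y_idx)
```

For this peptide the first shifted y ion is y10, so unshifted Iy1 to Iy9 followed by shifted Iy10 to Iy12 is the pattern expected for deamidation at N97.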
Figure 4.9: Interpeptide sulfoDST calmodulin-melittin cross-link m/z 493.92 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only.

A diagnostic ion X1K (m/z = 200) appeared in the MS/MS spectrum of 91VFDKDGNGYISAAELR106^1GIGAVLK7, which verified the presence of the cross-linker and indicated that fragmentation occurred at the bond connecting the cross-linker bridge to melittin G1 and at the peptide backbone bond of calmodulin K94. In addition, the production of type 3 ions from 22DGDGTITTK30^22RKR24 signifies simultaneous fragmentation of the bridge and peptide backbone.

4.2.3 BS3

4.2.3.1 Calmodulin-Calmodulin Cross-linked Peptides

Three BS3 interpeptide calmodulin cross-linked species were discovered and are listed in Table 4.4, which gives the m/z, charge, monoisotopic mass (experimental and calculated), mass accuracy (ppm), and the mass and sequence of each component peptide (highlighted residues correspond to cross-linked sites). The BS3 cross-linked structure 91VFDKDGNGYISAAELR106^76MK77 appeared in two different charge states, as a doubly charged species at m/z 1085.55 (Figure 4.10) and as a triply charged species at m/z 724.04. A full unmodified y ion series (Iy1 to Iy12) and Ib2 to Ib3 ions for 91VFDKDGNGYISAAELR106 confirmed its sequence. Type 2 ions (Ib4+II, Ib5+II, and Iy13+II to Iy15+II) verified that cross-linking occurred between K77 and K94.
The 76MK77 peptide was confirmed by the appearance of its type 1 ion.

A candidate species appearing as a triply charged species at m/z 1079.51 matched the mass of a BS3 cross-link between two identical peptides, 75KMKDTDSEEEIR86^75KMKDTDSEEEIR86, with two cross-linker bridges (Figure 4.11). Unmodified type 3 ions (Iy1 to Iy4, Iy6, Iy8, Iy9) confirmed the sequence of the peptide components. Although type 2 ions were absent, type 3 b ions attached to the BS3 bridge (Ib3+X1 to Ib7+X1) localized one cross-linking site to K77. With only two K residues in each peptide, either two K77 to K75 interpeptide cross-links or one K77 to K77 and one K75 to K75 interpeptide cross-link formed.

The triply charged species at m/z 588.96 and the doubly charged species at m/z 882.94 corresponded to 75KMKDTDSEEEIR86 cross-linked to a single K residue (Figure 4.12). It was not possible to determine whether the K was K75 or K77 from calmodulin or K23 from melittin. Both 91VFDKDGNGYISAAELR106^76MK77 and 75KMKDTDSEEEIR86^K suggest that trypsin cleaved after a cross-linked lysine, which is not expected, as explained in section 3.3.1.3.

Table 4.4: BS3 calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an “*”.

Cross-linked species                                         Calmodulin peptide 1               Calmodulin peptide 2
m/z       z  [M]exp (Da)  [M]calc (Da)  Mass accuracy (ppm)  [M] (Da)  Sequence                 [M] (Da)  Sequence
*1085.55  2  2169.10      2169.08       9                    277.15    76MK77                   1753.86   91VFDKDGNGYISAAELR106
724.04    3
1079.51   3  3235.54      3235.51       10                   1479.69   75KMKDTDSEEEIR86         1479.69   75KMKDTDSEEEIR86
*588.96   3  1763.88      1763.86       10                   146.11    75K or 23K (melittin)    1479.69   75KMKDTDSEEEIR86
882.94    2
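The calculated masses of these BS3 species can be reproduced by adding the mass of the intact cross-linker bridge once per cross-link. The sketch below is a hedged illustration: the bridge mass of 138.0681 Da (C8H10O2) is the standard value for a BS3 cross-link and is not quoted in this chapter, the function name `nhs_crosslinked_mass` is hypothetical, and the peptide masses are the rounded values from Table 4.4.

```python
BS3_BRIDGE = 138.0681  # Da, C8H10O2 added per BS3 cross-link (standard value)


def nhs_crosslinked_mass(peptide_masses, n_bridges=1, bridge=BS3_BRIDGE):
    """Neutral mass of peptides joined by NHS ester cross-linker bridges."""
    return sum(peptide_masses) + n_bridges * bridge


# 76MK77 + 91VFDKDGNGYISAAELR106, one bridge
single = nhs_crosslinked_mass([277.15, 1753.86])
# 75KMKDTDSEEEIR86 dimer, two bridges
double = nhs_crosslinked_mass([1479.69, 1479.69], n_bridges=2)
print(round(single, 2), round(double, 2))
```

The dimer matches its tabulated calculated mass only when two bridges are included, which is consistent with the two cross-links proposed for the species at m/z 1079.51.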
Figure 4.10: Interpeptide BS3 calmodulin cross-link m/z 1085.55 (z = 2): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

Figure 4.11: Interpeptide BS3 calmodulin cross-link m/z 1079.51 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

Figure 4.12: Interpeptide BS3 calmodulin cross-link m/z 588.96 (z = 3): proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom). Cross-linker bridges are indicated in red.

4.2.3.2 Calmodulin-Melittin Cross-linked Peptides

Eight calmodulin-melittin BS3 cross-linked peptides were identified, which are listed in Table 4.5. The BS3 cross-linked structure 91VFDKDGNGYISAAELR106^1GIGAVLK7 appeared as a triply charged species at m/z 850.46 and as a doubly charged species at m/z 1275.19 (Figure 4.13). A complete series of unmodified Iy1 to Iy12 and Ib2 to Ib3 ions confirmed the calmodulin peptide component. Type 2 ions Ib4+II to Ib12+II and Iy13+II to Iy15+II localized the cross-link to K94. A full unmodified y ion series for the melittin peptide verified its sequence, and type 2 ion IIb2+I localized cross-linking to G1.

91VFDKDGNGYISAAELR106^23KR24, appearing as a triply charged species at m/z 732.39, was identified (Figure 4.14). A full y ion series and type 1 ion for the calmodulin peptide and a modified melittin peptide type 1 ion verified the components of the cross-linked structure. Consecutive type 2 ions Ib4+II to Ib12+II and Iy13+II to Iy15+II localized the cross-linking site to K94, which was cross-linked to K23, the only lysine in the melittin peptide.

The species at m/z 699.36, with a charge of four, corresponded to the cross-linked structure 75KM(ox)KDTDSEEEIREAFR90^1GIGAVLK7 (Figure 4.15).
A complete unmodified y ion series of both component peptides and a type 1 ion of the melittin peptide confirmed the cross-linked species sequence. Iy15(ox) confirmed M76 oxidation, a modification also observed in control samples (see Table 3.1). Type 2 ions IIb2+I to IIb6+I localized the cross-linking to G1. Ib3+II, Ib4+II, and Ib14+II localized the cross-link to 75KM(ox)K77. Since Iy15 and Iy14 appeared unmodified by the cross-linker, calmodulin K75 likely formed the cross-link with melittin G1.

The species with a charge of four appearing at m/z 733.35 (Figure 4.16) and at m/z 741.35 corresponded to 127EADIDGDGQVNYEEFVQMMTAK148^23KR24, where the latter differed due to oxidized methionines. Unmodified Ib2 to Ib16 ions confirmed the sequence of the calmodulin peptide, and type 2 ions Iy1+II to Iy11+II localized the cross-linking to calmodulin K148 and melittin K23. A type 1 ion (II+X1) for the melittin peptide confirmed its presence. Likewise, for the oxidized species with a charge of four at m/z 741.35, unmodified Ib2 to Ib11 ions confirmed the calmodulin peptide and the type 1 ion (II+X1) verified the melittin peptide. Type 2 ions Iy1+II to Iy12+II localized the cross-linking to calmodulin K148 and melittin K23.

Cross-linking between calmodulin K77 and melittin K23 was supported by three cross-linked species appearing with a charge of four at m/z 481.00, 520.02, and 448.97, with structures 75KMKDTDSEEEIR86^23KR24, 75KMKDTDSEEEIR86^22RKR24, and 76MKDTDSEEEIR86^23KR24, respectively (Figures 4.17, 4.18, and 4.19). In all three species, an unmodified type 1 ion of the melittin peptide, a series of unmodified y ions for the calmodulin peptide, and type 2 b ions of the calmodulin peptide connected to the whole melittin peptide verified
Tandem Mass Spectrometric Fragmentation of Other Cross-linkersboth peptide components and confirmed that calmodulin K77 was cross-linked tomelittin K23.Table 4.5: BS3+ calmodulin-melittin interpeptide cross-linked species are listedand classified as capturing antiparallel (shaded in blue) or parallel (white) binding.Cross-linking sites are highlighted in red. For species appearing with two differentcharge states, annotated MS/MS spectra is shown for the m/z marked with an “*”.Cross-Linked Species Melittin Peptide Calmodulin Peptidem/z z [M]exp(Da)[M]Calc(Da)Mass Accuracy(ppm)[M](Da)Sequence [M](Da)Sequence*732.39 3 2194.16 2194.14 10 302.21 23KR24 1753.86 91VFDKDGNGYISAAELR106549.54 4699.36 4 2793.46 2793.43 11 656.42 1GIGAVLK7 1998.94 75KM(ox) KDTDSEEEIREAFR90*733.35 4 2929.39 2929.35 14 302.21 23KR24 2489.07 127EADIDGDGQVNYEEFVQMMTAK148977.46 3741.35 4 2961.38 2961.35 12 302.21 23KR24 2521.07 127EADIDGDGQVNYEEFVQM(ox) M(ox) TAK148481.00 4 1919.99 1919.96 15 302.21 23KR24 1479.69 75KMKDTDSEEEIR86520.02 4 2076.07 2076.06 6 458.31 22RKR24 1479.69 75KMKDTDSEEEIR86448.97 4 1791.90 1791.87 17 302.21 23KR24 1351.59 76MKDTDSEEEIR86*850.46 3 2548.38 2548.35 9 656.42 1GIGAVLK7 1753.86 91VFDKDGNGYISAAELR1061275.19 21204.2. Tandem Mass Spectrometric Fragmentation of Other Cross-linkersIy6Iy3 IIy51.5 x 105Ia21 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16V F D K D G N G Y I S A A E L R16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2 3 4 5 6 7G I G A V L Ky 7 6 5 4 3 2 1CalmodulinPeptide IMelittinPeptide IIb 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16V F D K D G N G Y I S A A E L Ry 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2 3 4 5 6 7G I G A V L Ky 7 6 5 4 3 2 1IIy1IIy3Ib3IIy4Iy4Iy5IIy6Iy1IIy2Iy2Ib2Ib4+IIIb5+IIIb6+IIIy70. 
200 300 400 500 600 700Iy9Iy7Ib7+IIIb8+IIIb9+IIIy8 Ib10+IIIb11+IIIy13+IIIy12Iy14+IIIy15+IIIy12Ib12+IIIIb1+I[M]3+0 800 900 1000 1100 12001.5 x 105[M-NH3]3+II+X1X1KM/ZIntensityFigure 4.13: Interpeptide BS3+ calmodulin-melittin cross-link m/z 850.46 (z = 3)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red. 1214.2. Tandem Mass Spectrometric Fragmentation of Other Cross-linkersIa2 Ib3Ib2Iy1 Iy2Iy4Iy3 II+X1Iy6Iy5[Ib5+II]2+[Ib6+II]2+[Ib7+II]2+[Ib8+II]2+[I]2+3.0 x 104b 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16V F D K D G N G Y I S A A E L Ry 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 1CalmodulinPeptide IMelittinPeptide IIb 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16V F D K D G N G Y I S A A E L Ry 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 10.0 200 300 400 500 600Iy7Iy8Iy9Ib4+IIIb5+IIIb6+IIIy10 Iy11 Iy12[Iy13+II]2+[Iy14+II]2+[Iy15+II]2+[Ib9+II]2+[M-NH3]3+[M]3+[Ib10+II]2+[Ib11+II]2+[Ib12+II]2+0.0 700 800 900 1000 1100 12003.0x 104M/ZIntensityFigure 4.14: Interpeptide BS3+ calmodulin-melittin cross-link m/z 732.39 (z = 3)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red. 1224.2. 
Tandem Mass Spectrometric Fragmentation of Other Cross-linkers3.3 x 104IIb2[Iy8]2+[Iy10]2+[Iy15(ox)]2+ Iy5Iy4Iy3[Iy11]3+[Iy12]3+[Iy13]3+[IIb4+I]3+[IIb2+I]3+[IIb3+I]3+[Ib3+II]3+II[M]4+1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16K M K D T D S E E E I R E A F R16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 11 2 3 4 5 6 7G I G A V L K7 6 5 4 3 2 1CalmodulinPeptide IMelittinPeptide IIIntensityb 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16K M K D T D S E E E I R E A F Ry 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2 3 4 5 6 7G I G A V L Ky 7 6 5 4 3 2 1Iy1IIy1 [Ib6]2+Iy14II3y[Iy7]2+Iy2 [Iy14]2+[IIb5+I]3+[Ib6+I]3+IIy5[Ib14]3+0.0 200 300 400 500 600 700 800 900M/ZFigure 4.15: Interpeptide BS3+ calmodulin-melittin cross-link m/z 699.36 (z = 4)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red.1234.2. Tandem Mass Spectrometric Fragmentation of Other Cross-linkers9.0x 104Ib3Ib6Ib7Iy1+IIIy2+IIIIy3+II[Iy6+II]2+[Iy7+II]2+[Iy8+II]2+[Ib13]2+QVNYEE-28b 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22E A D I D G D G Q V N Y E E F V Q M M T A Ky 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 1Calmodulin Peptide IMelittinPeptide IIb 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22E A D I D G D G Q V N Y E E F V Q M M T A Ky 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 11.6x 104M/ZIntensityIb9Ib10Ib11Ib12 Ib13 Ib14[Ib15]2+[IIy11+II]2+[Iy10+II]2+0.00800 900 1000 1100 1200 1300 1400 1500[Ib4+II]2+Ib5+IIIb6+II Ib7+IIIb2Ib4Ib5Ib8[Ib12]2+II + X1X1K[II+X1K]2+[Iy1+II]2+ [Iy2+II]2+Ia2Ib4-NH3IIy1[Iy3+II]2+[Iy4+II]2+[Iy5+II]2+[Iy9+II]2+[Ib14]2+0 200 300 400 500 600 700 800Figure 4.16: Interpeptide BS3+ calmodulin-melittin cross-link m/z 733.35 (z = 4)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red. 1244.2. 
Tandem Mass Spectrometric Fragmentation of Other Cross-linkers3.6x 103Iy1Iy2Iy4[Ib5+II]2+II + X1[II + X1K]2+IIIb2Iy4-H2OIb4-H2O +XII + X1K1 2 3 4 5 6 7 8 9 10 11 12K M K D T D S E E E I R12 11 10 9 8 7 6 5 4 3 2 11 2K R2 1X1K+NH3CalmodulinPeptide IMelittinPeptide IIIntensityb 1 2 3 4 5 6 7 8 9 10 11 12K M K D T D S E E E I Ry 12 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 1Iy3Iy5Iy6[Ib4+II]2+Ib2+II[Ib6+II]2+Iy6-H2OIb30 200 300 400 500 600 700Iy1-NH3 Iy2-NH3Iy5-NH3M/ZFigure 4.17: Interpeptide BS3+ calmodulin-melittin cross-link m/z 481.00 (z = 4)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red.1254.2. Tandem Mass Spectrometric Fragmentation of Other Cross-linkers1.2x 104Iy1,IIy1 Iy2Iy3Iy4IIy2+X1IIy2II+X1KIIy2+X1K[Ib4+IIy2]2+[Ib3+II]2+[Ib4+II]2+[Ib5+IIy2]2+IIb2+X1Ib4-H2O +X12 3 4 5 6 7 8 9 10 11 12M K D T D S E E E I R11 10 9 8 7 6 5 4 3 2 1b 1 2 3R K Ry 3 2 1X1K+NH3CalmodulinPeptide IMelittinPeptide IIIntensityb 1 2 3 4 5 6 7 8 9 10 11 12K M K D T D S E E E I Ry 12 11 10 9 8 7 6 5 4 3 2 1b 1 2 3R K Ry 3 2 1Iy5Iy6 Iy7Ib2+II[Ib4+II]2+[Ib5+II]2+IIIIb20.0 200 300 400 500 600 700 800M/ZFigure 4.18: Interpeptide BS3+ calmodulin-melittin cross-link m/z 520.02 (z = 4)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red.1264.2. 
Tandem Mass Spectrometric Fragmentation of Other Cross-linkers1.7x 104Iy2Iy1Iy3II+X1Iy4II[Ib4+II]2+ Ib2+II Ib3+II[Ib3+II]2+ II+X1KIy4-H2OIb6+X1KII+X1K1 2 3 4 5 6 7 8 9 10 11M K D T D S E E E I R11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 1CalmodulinPeptide IMelittinPeptide IIIntensityb 1 2 3 4 5 6 7 8 9 10 11M K D T D S E E E I Ry 11 10 9 8 7 6 5 4 3 2 1b 1 2K Ry 2 1Iy5 Iy6[Ib2+II]2+0.0 200 300 400 500 600 700 800 900[Ib4+II]2+ Ib3+X1KX1K+NH3X1KM/ZFigure 4.19: Interpeptide BS3+ calmodulin-melittin cross-link m/z 448.97 (z = 4)proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom);Cross-linker bridges are indicated in red.Previous MS [99] and MS/MS [100] experiments confirmed BS3 cross-linkingof calmodulin K77 to melittin G1. Similar regions were shown to cross-link in thispresent study, which localized cross-linking between calmodulin K75 to melittinG1. In addition, MS/MS verified cross-linking of calmodulin K94 to melittin G1and K23, calmodulin K77 to melittin K23, and calmodulin K148 to melittin K23were discovered for the first time.Diagnostic ions (See Figure 1.10b) X1K, X1K+NH3, and X12K(m/z = 222.14, 239.17 and 305.22 for BS3, respectively), which have1274.2. Tandem Mass Spectrometric Fragmentation of Other Cross-linkerspreviously been reported [52], appeared in the MS/MS spectra of75KMKDTDSEEEIR86^75KMKDTDSEEEIR86 and K^75KMKDTDSEEEIR86.The appearance of at least one diagnostic ion indicates the presence of theBS3 cross-linker. Only one diagnostic ion, X1K+NH3, appeared in the MS/MSspectrum of 76MK77^91VFDKDGNGYISAAELR106. The MS/MS spectra of allthree species 75KMKDTDSEEEIR90^23KR24, 75KMKDTDSEEEIR90^22RKR24and 76MKDTDSEEEIR90^23KR24 contained diagnostic ion X1K+NH3 and atype 1 ion corresponding to the melittin peptide modified with a (+ X1K),indicating that fragmentation occurred at the peptide bond of K77. 
The unmodified type 1 ion of the melittin peptide in all three cross-linked species suggests that fragmentation also occurred at the amide bond between K23 and the BS3 cross-linker bridge. 91VFDKDGNGYISAAELR106^1GIGAVLK7 contained both type 1 ion II+X1 and diagnostic ion X1K, indicating that fragmentation occurred at the amide bond connecting calmodulin K94 to the cross-linker bridge and at the calmodulin backbone peptide bond of K94. Diagnostic ion X1K and type 1 ion II+X1K in the MS/MS spectrum of 127EADIDGDGQVNYEEFVQMMTAK148^23KR24 suggest that fragmentation occurred at the peptide bond of K148. However, diagnostic ions were not detected in the MS/MS spectra of either 91VFDKDGNGYISAAELR106^23KR24 or 75KM(ox)KDTDSEEEIREAFR90^1GIGAVLK7. Generally, simultaneous fragmentation of the cross-linker bridge and peptide backbone (type 3 ions) was not observed, which is consistent with literature [51]. However, an exception is 75KMKDTDSEEEIR86^75KMKDTDSEEEIR86. It remains to be seen whether this is a characteristic behavior of cross-linked identical peptides under CID, due to their equal bond energies.

4.2.4 SulfoEGS

4.2.4.1 Calmodulin-Calmodulin Cross-linked Peptides

Two calmodulin interpeptide cross-links were identified in sulfoEGS-treated samples, as shown in Table 4.6, which lists the m/z, charge, monoisotopic mass (experimental and calculated), mass accuracy (ppm), and the mass and sequence of each component peptide (highlighted residues correspond to cross-linked sites). The cross-linked structure 14EAFSLFDKDGDGTITTK30^91VFDKDGNGYISAAELR106 appeared with a charge of four at m/z 956.96 and with a charge of five at m/z 765.77 (Figure 4.20). Fragment ions Iy1 to Iy9 and Ib2 to Ib7, and IIy1 to IIy12, IIb2 and IIb3 confirmed the sequence of each component peptide, respectively.
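The sequence confirmations in this chapter rest on matching observed peaks to backbone b and y ions. As a minimal sketch of how such reference values can be generated, the following computes singly charged, unmodified b and y ion m/z values from standard monoisotopic residue masses. The function names are illustrative and are not taken from the software used in this work; modifications such as oxidation, acetylation or a cross-linker bridge are deliberately left out.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da
PROTON = 1.00728   # Da


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in sequence) + WATER


def b_ions(sequence):
    """Singly charged b ion m/z values, b1 to b(n-1), N-terminal side."""
    values, running = [], 0.0
    for residue in sequence[:-1]:
        running += RESIDUE_MASS[residue]
        values.append(running + PROTON)
    return values


def y_ions(sequence):
    """Singly charged y ion m/z values, y1 to y(n-1), C-terminal side."""
    values, running = [], WATER
    for residue in reversed(sequence[1:]):
        running += RESIDUE_MASS[residue]
        values.append(running + PROTON)
    return values


if __name__ == "__main__":
    peptide = "VFDKDGNGYISAAELR"  # calmodulin residues 91 to 106
    print(f"[M] = {peptide_mass(peptide):.2f} Da")
    for i, mz in enumerate(y_ions(peptide), start=1):
        print(f"y{i}: {mz:.3f}")
```

For the calmodulin peptide 91VFDKDGNGYISAAELR106 the computed neutral mass agrees with the 1753.86 Da listed in the tables of this section, which is a quick sanity check on the residue mass table.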
The following type 2 ions localized the cross-linking sites to K21 and K94: Ib13+II to Ib15+II, Iy12+II, Iy14+II, Iy15+II, IIb5+I to IIb10+I, IIy13+I and IIy14+I. The cross-linked species with a charge of four at m/z 934.46 corresponded to the following structure composed of identical peptides: 91VFDKDGNGYISAAELR106^91VFDKDGNGYISAAELR106 (Figure 4.21). Fragment ions I/IIy1 to I/IIy12, I/IIb2 and I/IIb3 confirmed the peptide sequence. Type 2 ion I/IIb8+II/I verified cross-linking. Since K94 is the only reactive residue in each peptide, cross-linking must have occurred at this site. The presence of type 2 ions and lack of type 1 ions indicated that the cross-linker bridge remained intact, unlike the BS3 cross-link between identical peptides.

Table 4.6: SulfoEGS calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red. For species appearing with two different charge states, the annotated MS/MS spectrum is shown for the m/z marked with an "*".

m/z | z | [M]exp (Da) | [M]calc (Da) | Mass accuracy (ppm) | Peptide 1 [M] (Da) | Peptide 1 sequence | Peptide 2 [M] (Da) | Peptide 2 sequence
*956.96 | 4 | 3823.85 | 3823.80 | 13 | 1753.86 | 91VFDKDGNGYISAAELR106 | 1843.88 | 14EAFSLFDKDGDGTITTK30
765.77 | 5 | | | | | | |
934.46 | 4 | 3733.83 | 3733.77 | 14 | 1753.86 | 91VFDKDGNGYISAAELR106 | 1753.86 | 91VFDKDGNGYISAAELR106

Figure 4.20: Interpeptide sulfoEGS calmodulin cross-link m/z 956.96 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.

Figure 4.21: Interpeptide sulfoEGS calmodulin cross-link m/z 934.46 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. The MS/MS spectrum is annotated such that I = II.

4.2.4.2 Calmodulin-Melittin Cross-linked Peptides

Three calmodulin-melittin sulfoEGS cross-linked species were discovered and are listed in Table 4.7. The species with a charge of four appearing at m/z 571.54 corresponded to the structure 91VFDKDGNGYISAAELR106^23KR24 (Figure 4.22). This structure and sequence were confirmed with a type 1 ion of the melittin peptide and a series of unmodified y ions (Iy1 to Iy8), an Ib2 ion and an Ib3 ion of the calmodulin peptide. With only one lysine in each peptide, cross-linking must have occurred between calmodulin K94 and melittin K23. Type 2 ions Iy15+II, Ib8+II and Ib11+II were also consistent with this conclusion. The cross-linked species with a charge of four at m/z 635.81 matched the following structure: 76MKDTDSEEEIREAFR90^22RKR24 (Figure 4.23). An unmodified type 1 ion for the melittin peptide, a modified type 1 ion (I+X1) of the calmodulin peptide, and an almost complete y ion series (Iy1, Iy3, Iy5, Iy6 to Iy13) of the calmodulin peptide confirmed the cross-linked structure.
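The site localizations above follow a simple logic: a fragment that carries the partner peptide must contain the cross-linked residue, and a fragment observed unmodified must not. A minimal sketch of that reasoning is given below; the function name and its arguments are hypothetical, and the ion lists are transcribed from the descriptions in this chapter rather than read from the raw spectra.

```python
def candidate_sites(sequence, first_residue, modified_b=(), modified_y=(),
                    unmodified_b=(), unmodified_y=()):
    """Residues consistent with the observed modified/unmodified fragments.

    b_k covers the first k residues; y_k covers the last k residues.
    A modified fragment must contain the site; an unmodified one must not.
    """
    n = len(sequence)
    candidates = set(range(n))          # 0-based positions in the peptide
    for k in modified_b:
        candidates &= set(range(k))
    for k in modified_y:
        candidates &= set(range(n - k, n))
    for k in unmodified_b:
        candidates -= set(range(k))
    for k in unmodified_y:
        candidates -= set(range(n - k, n))
    return [f"{sequence[i]}{first_residue + i}" for i in sorted(candidates)]


if __name__ == "__main__":
    peptide = "VFDKDGNGYISAAELR"  # calmodulin residues 91 to 106

    # BS3 species at m/z 850.46: Ib4+II to Ib12+II, Iy13+II to Iy15+II,
    # with unmodified Iy1 to Iy12 and Ib2 to Ib3.
    print(candidate_sites(peptide, 91,
                          modified_b=range(4, 13), modified_y=range(13, 16),
                          unmodified_b=[2, 3], unmodified_y=range(1, 13)))

    # SulfoEGS species at m/z 571.54: Iy15+II, Ib8+II, Ib11+II,
    # with unmodified Iy1 to Iy8, Ib2 and Ib3.
    print(candidate_sites(peptide, 91,
                          modified_b=[8, 11], modified_y=[15],
                          unmodified_b=[2, 3], unmodified_y=range(1, 9)))
```

With the extensive BS3 ion series the candidates collapse to K94 alone. With the sparser sulfoEGS series the fragments only narrow the site to residues 94 to 98, and the conclusion that K94 is the site then relies on it being the only lysine in that window, as argued in the text.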
SulfoEGS cross-linking was determined to occur between calmodulin K77 and melittin K23, the only possible reactive sites, and type 2 ions Ib11+II and Iy15+II also supported this finding. 14EAFSLFDKDGDGTITTK30^23KR24 appeared as a triply charged species at m/z 791.72 (Figure 4.24). Unmodified ions Iy1 to Iy9 and Ib2 to Ib7 confirmed the calmodulin peptide sequence, and a type 1 ion verified the presence of the melittin peptide. Type 2 ions Iy10+II to Iy16+II, Ib8+II, Ib9+II, Ib11+II and Ib12+II localized cross-linking between calmodulin K21 and melittin K23. This is the first report of MS/MS-verified sulfoEGS cross-linking in the calmodulin-melittin complex.

Table 4.7: SulfoEGS calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.

m/z | z | [M]exp (Da) | [M]calc (Da) | Mass accuracy (ppm) | Melittin [M] (Da) | Melittin sequence | Calmodulin [M] (Da) | Calmodulin sequence
571.54 | 4 | 2282.15 | 2282.12 | 14 | 302.21 | 23KR24 | 1753.86 | 91VFDKDGNGYISAAELR106
635.81 | 4 | 2539.25 | 2539.20 | 20 | 458.31 | 22RKR24 | 1854.84 | 76MKDTDSEEEIREAFR90
791.72 | 3 | 2372.17 | 2372.14 | 13 | 302.21 | 23KR24 | 1843.884 | 14EAFSLFDKDGDGTITTK30

Figure 4.22: Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 571.54 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.

Figure 4.23: Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 635.81 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.

Figure 4.24: Interpeptide sulfoEGS calmodulin-melittin cross-link m/z 791.72 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.

Diagnostic ions (see Figure 1.10b) X1K and X1K+NH3, at m/z = 312 and 330, respectively, were observed and confirmed the presence of the cross-linker. Figure 1.10c summarizes the fragmentation patterns that have been previously observed for sulfoEGS cross-linked species.
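Screening a spectrum for the diagnostic ions named above amounts to a tolerance search over the peak list. The sketch below uses the diagnostic m/z values quoted in this chapter for BS3 (222.14, 239.17, 305.22) and sulfoEGS (312, 330). The function name, the 0.5 m/z tolerance and the example peak list are assumptions chosen for illustration and do not come from the acquisition or search software used here.

```python
# Diagnostic ion m/z values as quoted in the text; the sulfoEGS values are
# given there only to nominal mass, hence the wide default tolerance.
DIAGNOSTIC_IONS = {
    "BS3": {"X1K": 222.14, "X1K+NH3": 239.17, "X12K": 305.22},
    "sulfoEGS": {"X1K": 312.0, "X1K+NH3": 330.0},
}


def find_diagnostic_ions(peaks_mz, cross_linker, tolerance=0.5):
    """Return {ion name: closest observed m/z} for diagnostic ions found."""
    found = {}
    for name, target in DIAGNOSTIC_IONS[cross_linker].items():
        matches = [mz for mz in peaks_mz if abs(mz - target) <= tolerance]
        if matches:
            found[name] = min(matches, key=lambda mz: abs(mz - target))
    return found


if __name__ == "__main__":
    # Hypothetical peak list, not taken from any spectrum in this thesis.
    peaks = [175.12, 222.15, 239.16, 302.21, 500.30]
    hits = find_diagnostic_ions(peaks, "BS3")
    print(hits if hits else "no diagnostic ions detected")
```

Because several confirmed cross-linked species in this chapter showed no diagnostic ions at all, an empty result from such a screen should be read as inconclusive rather than as evidence against a cross-link.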
Fragmentation occurred at the peptide bond of the cross-linked lysine residue on 23KR24, giving rise to II+X1K. Fragmentation also occurred within the cross-linker connected to 23KR24 at the C-O bond, giving rise to II+X2K, which is commonly observed with sulfoEGS cross-linking [162]. The MS/MS spectra of 76MKDTDSEEEIREAFR90^22RKR24 and 14EAFSLFDKDGDGTITTK30^23KR24 contained diagnostic ion X1K. Fragmentation of 91VFDKDGNGYISAAELR106^23KR24 produced diagnostic ions II+X2K-NH3 and II+X2K. Neither calmodulin cross-linked species produced type 1 or type 3 ions under CID. However, all calmodulin-melittin cross-linked species generated a type 1 ion for the melittin peptide, and 76MKDTDSEEEIREAFR90^22RKR24 produced type 1 ions for both peptide components under CID.

4.2.5 Summary of Other Cross-linker Fragmentation Patterns

Common fragmentation patterns were observed across most EDC, sulfoDST, BS3 and sulfoEGS cross-linked peptides. In general, type 1 ions of at least the smaller component peptide and unmodified backbone fragment ions of at least one peptide (usually y ions of the larger peptide) are expected. Type 2 ions, in which the cross-linker is attached to a whole component peptide, are less important for localizing NHS ester cross-links, since cross-linking occurs almost exclusively between K residues. Nevertheless, most cross-linked species produced extensive type 2 ions in their MS/MS spectra, which further confirmed cross-linking. Type 3 ions were rarely observed across all cross-linkers. The observation of more extensive backbone fragments from the larger peptide and intact smaller peptide ions in the MS/MS of NHS ester cross-linked species is consistent with literature [52]. NHS ester cross-link bridges allow for additional fragment ions generated by dissociation of bonds within the bridge. Both sulfoEGS and BS3 produced X1K fragmentation at the peptide backbone bond of melittin K23, and both sulfoDST and BS3 produced X1K fragmentation at the peptide backbone bond of calmodulin K94, suggesting that this type of fragmentation may depend not on cross-linker length but on the specific environment of the cross-linked lysine. Diagnostic ions reveal the presence of the cross-linker bridge; however, they were not detected in every identified cross-linked species, which has also been observed previously in literature [52, 53].

4.3 Tandem Mass Spectrometric Fragmentation of Formaldehyde

4.3.1 Calmodulin-Calmodulin Cross-linked Peptides

The MS/MS spectra of PFA cross-linked calmodulin-melittin were examined. A total of three interpeptide cross-linked species involving calmodulin were identified and confirmed via MS/MS, as shown in Table 4.8, which lists the m/z, charge, monoisotopic mass (experimental and calculated), mass accuracy (ppm), and the mass and sequence of each component peptide (highlighted residues correspond to cross-linked/modified sites).

The mass of the triply charged species at m/z 666.33 corresponded to the mass of calmodulin peptides 75KM(ox)K77 and 1A(ac)DQLTEEQIAEFK13 plus the 12 Da bridge (Figure 4.25). Type 3 (Iy1 to Iy10 and Ib2 to Ib6) and type 1 (I-NH3) ions confirmed 1A(ac)DQLTEEQIAEFK13. Type 3 (IIb2(ox), IIy1, IIy2(ox)) and type 1 (II(ox)+12) ions verified peptide 75KM(ox)K77. Together, these ions confirmed the structure of the cross-linked species. The acetylated N-terminus of 1A(ac)DQLTEEQIAEFK13 prevented Schiff base, methylol and cross-link formation at this site. In this cross-linked structure, the residues that can form cross-links in the second step, i.e. Q3 and Q8, are only present in 1A(ac)DQLTEEQIAEFK13.
Therefore, it can be concluded that 75KM(ox)K77 was modified first at either K75 or K77 and subsequently reacted with Q3 or Q8 in 1A(ac)DQLTEEQIAEFK13. Modified type 3 ions Ib3+12, Ib5+12 and Ib6+12 suggested that Q3 was involved in the cross-link, and the type 3 ion IIy2(ox)+12 suggested that K77 was modified, supporting cross-linking between Q3 and K77. However, the absence of type 2 ions prevented further validation of cross-linking sites.

The species with a charge of four at m/z 736.59 corresponded to 1A(ac)DQLTEEQIAEFK13^76M(ox)KDTDSEEEIR86 (Figure 4.26). In the MS/MS spectra, unmodified type 3 (Iy1 to Iy10, Ib2 to Ib7 and IIy1 to IIy9) ions for both component peptides confirmed the sequence of the cross-linked peptides. Type 1 ions for both peptides were present. The type 1 ion, II(ox)+12-CH3SOH, resulted from the loss of methanesulfenic acid (-64 Da) observed in M residue containing peptides under CID [163]. Fragmentation occurred such that the +12 Da methylene bridge remained attached to peptide 76M(ox)KDTDSEEEIR86, as indicated by the II(ox)+12 type 1 ion. Interestingly, +12 Da modified type 3 ions (IIb2(ox)+12, IIb3(ox)+12, IIb5(ox)+12, and IIb6(ox)+12) also appeared only for the peptide containing K77 and localized this mass shift to K77. Therefore, K77 was modified and cross-linked to either Q3 or Q8, similar to the previous cross-linked species discussed, but the lack of type 2 ions prevented further validation of cross-linking sites.

The mass of the triply charged species at m/z 1192.23 matched the following structure: 75KM(ox)KDTDSEEEIREAFR90^1A(ac)DQLTEEQIAEFK13 (Figure 4.27). The MS/MS spectra contained only signals pertaining to the whole component peptides due to exclusive cross-link bridge fragmentation. The lack of type 3 and type 2 ions prevented peptide backbone sequencing and cross-link site localization.
Similar to the previous two cross-linked species, type 1 ion I(ox)+12 indicated that the +12 Da bridge remained attached to the peptide containing K77.

These three cross-linked species suggest that K77 was modified in the first step of the reaction and retained the PFA cross-link bridge during type 1 fragmentation. All three PFA cross-linked calmodulin species involve the same region of the protein and differ only in missed cleavage sites at K75 and K77. In the MaxQuant unmodified peptide analysis in chapter 3 (Table 3.1), missed cleavages after K75 and K77 were observed, suggesting that missed cleavage at these residues can occur even without PFA modification. Interestingly, the signal intensity of the cross-linked species composed of 75KM(ox)K77 and 1A(ac)DQLTEEQIAEFK13 was 16 times stronger than that of 75KM(ox)KDTDSEEEIREAFR90^1A(ac)DQLTEEQIAEFK13, indicating that the latter formed in a much smaller quantity. It is not clear whether all cross-linked peptides represented one cross-link formation between the same two sites or a mixture of all combinations of proposed cross-linking sites. Under these mild cross-linking conditions, it is likely that only major cross-link reaction products are detected. Therefore, evidence supports cross-linking between K77 and Q3 as the major species. In chapter 3, +12 and +30 Da modifications were localized on K77 in modified peptides identified via MaxQuant.
This provides further evidence that K77 was modified in the first step of the reaction and formed a cross-link with Q3 in the second step.

Table 4.8: PFA calmodulin interpeptide cross-linked species, in which cross-linking sites are highlighted in red.

m/z | z | [M]exp (Da) | [M]calc (Da) | Mass accuracy (ppm) | Peptide 1 [M] (Da) | Peptide 1 sequence | Peptide 2 [M] (Da) | Peptide 2 sequence | Total modification mass
666.33 | 3 | 1996.00 | 1995.99 | 7 | 421.24 | 75KM(ox)K77 | 1562.75 | 1A(ac)DQLTEEQIAEFK13 | 12
736.59 | 4 | 2942.37 | 2942.34 | 9 | 1367.59 | 76M(ox)KDTDSEEEIR86 | 1562.75 | 1A(ac)DQLTEEQIAEFK13 | 12
1192.23 | 3 | 3573.69 | 3573.68 | 3 | 1998.97 | 75KM(ox)KDTDSEEEIREAFR90 | 1562.75 | 1A(ac)DQLTEEQIAEFK13 | 12

Figure 4.25: Interpeptide PFA calmodulin cross-link m/z 666.33 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only.

Figure 4.26: Interpeptide PFA calmodulin cross-link m/z 736.59 (z = 4) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only.

Figure 4.27: Interpeptide PFA calmodulin cross-link m/z 1192.23 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red.

4.3.2 Calmodulin-Melittin Cross-linked Peptides

Four PFA cross-links between calmodulin and melittin were identified, as shown in Table 4.9. The candidate with a triple charge at m/z 744.73 matched the mass of melittin peptide 1GIGAVLK7 cross-linked to calmodulin peptide 1A(ac)DQLTEEQIAEFK13 (Figure 4.31). Type 3 ions (Iy1 to Iy10 and Ib2 to Ib10) confirmed the sequence of 1A(ac)DQLTEEQIAEFK13.
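The mass match for the m/z 744.73 candidate can be reproduced with simple arithmetic: the two component peptide masses plus the +12 Da methylene bridge give the calculated neutral mass, which is then converted to an m/z at the observed charge and compared with the experimental value. The sketch below uses the component masses listed in Table 4.9; the function names are illustrative only.

```python
PROTON = 1.00728          # Da
METHYLENE_BRIDGE = 12.0   # Da, net addition of one carbon per PFA cross-link


def cross_link_mass(peptide_masses, bridge=METHYLENE_BRIDGE, extra=0.0):
    """Neutral mass of a cross-linked species: peptides + bridge + extras."""
    return sum(peptide_masses) + bridge + extra


def mz_from_mass(mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and charge state."""
    return charge * (mz - PROTON)


def ppm_error(experimental, calculated):
    """Mass accuracy in parts per million."""
    return (experimental - calculated) / calculated * 1e6


if __name__ == "__main__":
    # 1GIGAVLK7 (656.42 Da) cross-linked to 1A(ac)DQLTEEQIAEFK13 (1562.75 Da)
    calc = cross_link_mass([656.42, 1562.75])
    print(f"[M]calc = {calc:.2f} Da")
    print(f"m/z at z = 3: {mz_from_mass(calc, 3):.2f}")
    print(f"mass accuracy: {ppm_error(2231.19, calc):.0f} ppm")
```

The `extra` argument is a placeholder for additional mass shifts on the same species, for example the +48 Da of oxidation reported for the m/z 730.65 candidate in Table 4.9.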
A full series of +12 Da modified type 3 b and y ions and a type 1 ion confirmed 1GIGAVLK7 with a modification at either G1 or K7. The lack of MS/MS evidence for modified calmodulin K13 suggested that melittin G1 or K7 was modified and cross-linked to calmodulin Q3 or Q8. It is possible that both reactions occurred, producing structural isomers with different cross-linking sites. If cross-linking within calmodulin is assumed to be uniform across major reaction products, it is unlikely that calmodulin residues occupied in cross-links to other calmodulin residues will form cross-links with melittin residues. Cross-linking between G1 or K7 and Q8 is most likely, since Q3 formed a cross-link to K77 within calmodulin. However, this hypothesis must be further validated. Nevertheless, type 3 ion Ib8+12 provides evidence that Q8 was involved in the cross-link formation. Stronger signals for modified G1 in comparison to K7 suggested that cross-linking between melittin G1 and calmodulin Q8 represented the major product. To determine whether G1 or K7 represented the major modification site, the degree of modification (DOM) was calculated, as described in section 2.6.2.

The plot of the DOM of 1GIGAVLK7 (Figure 4.28) revealed a drastic increase in modification for IIb2 and a negligible rise in DOM for the y ions, supporting that G1 was the main site for the modification. This is consistent with the observation of +12 Da modified melittin G1 in the PFA modified peptides identified in chapter 3.

Figure 4.28: Degree of Modification: Bar graph depicting the DOM of each b ion and y ion for 1GIGAVLK7 in cross-linked species m/z 744.73 (z = 3)

Proposed structures of the species with a charge of five at m/z 484.46 and 588.52 are 91VFDKDGNGYISAAELR106^1GIGAVLK7 (Figure 4.32) and 87EAFR(+12)VFDKDGNGYISAAELR106^1GIGAVLK7 plus an additional Schiff base modification (Figure 4.33), respectively. For the species with a charge of five at m/z 484.46, the calmodulin peptide 91VFDKDGNGYISAAELR106 was confirmed with the type 3 ions Ib2, Ib4, Ib5, Ib8 and Ib9, and the melittin peptide 1GIGAVLK7 was confirmed with a +12 Da modified type 1 ion and type 3 ions IIb2 to IIb5, IIy1 to IIy3, and IIy5. For the species with a charge of five at m/z 588.52, the calmodulin peptide 87EAFR(+12)VFDKDGNGYISAAELR106 was confirmed with type 3 ions Ib2 to Ib10 and Ib13, and the melittin peptide 1GIGAVLK7 was confirmed with a +12 Da modified type 1 ion and type 3 ions IIb2 to IIb6, IIy1 to IIy3, and IIy5. The species at m/z 588.52 differs from the species at m/z 484.46 only in that the calmodulin peptide has an additional missed cleavage site and there is an extra modification. Therefore, it is likely that the modification occurred at the missed cleavage site, R90. This is supported by the series of +12 Da modified Ib4 to Ib12 ions (with only Ib11+12 missing) in the MS/MS spectra of the species with a charge of five at m/z 588.52. Possible cross-linking mechanisms include: (1) melittin G1 or K7 was modified and cross-linked to calmodulin N97, Y99 or R106; (2) calmodulin K94 or R106 was modified and cross-linked to melittin G1.
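The degree-of-modification plots referred to in this section compare, for each fragment ion, how much of the signal carries the +12 Da shift. The exact definition is given in section 2.6.2 and is not reproduced here, so the sketch below assumes a common form, the modified intensity divided by the sum of modified and unmodified intensities. That formula, the function names and the example intensities are all assumptions made for illustration, not the procedure or data of this work.

```python
def degree_of_modification(unmodified, modified):
    """Assumed DOM: fraction of a fragment's signal carrying the mass shift.

    Returns None when neither form of the ion was detected, since the
    ratio is undefined in that case.
    """
    total = unmodified + modified
    if total <= 0:
        return None
    return modified / total


def dom_profile(intensities):
    """Map each ion label to its DOM given (unmodified, modified) pairs."""
    return {ion: degree_of_modification(u, m)
            for ion, (u, m) in intensities.items()}


if __name__ == "__main__":
    # Hypothetical intensities for fragments of GIGAVLK (arbitrary units).
    example = {
        "b2": (500.0, 4500.0),   # N-terminal fragments mostly modified
        "b3": (400.0, 3600.0),
        "y1": (3800.0, 200.0),   # C-terminal fragments mostly unmodified
        "y2": (2850.0, 150.0),
    }
    for ion, dom in dom_profile(example).items():
        print(f"{ion}: DOM = {dom:.2f}")
```

Under this assumed definition, a profile in which the b ions sit near 1 from b2 onward while the y ions stay near 0 would point to the N-terminal residue as the main modification site, which is the kind of pattern the text describes for G1.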
The presence of a modified type 1 ion for the melittin peptide supports that 1GIGAVLK7 was modified first and cross-linked to calmodulin 91VFDKDGNGYISAAELR106. Modified fragment ions (IIb2+12 to IIb5+12) support a modified G1. Weaker signals for IIy1+12 and IIy2+12 fragment ions support a modified K7. Similar to the triply charged species at m/z 744.73 discussed above, DOM plots localized the +12 Da modification to G1 for both cross-linked species at m/z 484.46 and 588.52 (see Figures 4.29 and 4.30, respectively).

Figure 4.29: Degree of Modification: Bar graph depicting the DOM of each b ion and y ion for 1GIGAVLK7 in cross-linked species m/z 484.46 (z = 5)

Figure 4.30: Degree of Modification: Bar graph depicting the DOM of each b ion and y ion for 1GIGAVLK7 in cross-linked species m/z 588.52 (z = 5)

In the MS/MS spectra of the species at m/z 484.46, Iy8+12 and Iy9+12 ions suggest that Y99 is likely to be involved in a cross-link. Similar to the MS/MS spectrum of the species at m/z 588.52, IIb2+12 to IIb5+12 support a modified G1, and weaker signals for IIy1+12 and IIy2+12 fragment ions support a modified K7. Assuming that cross-linking sites are consistent among most cross-linked complex molecules involving similar regions, pooling information from the MS/MS spectra of both species at m/z 484.46 and 588.52 indicates that cross-linking occurred primarily between G1 and Y99, with an extra modification on R90. This is consistent with modified peptides identified in chapter 3 with +12 Da modifications localized on melittin G1 and calmodulin R90.

The species with a charge of four at m/z 730.65 corresponded to the mass of the following structure: 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126^22RKR24 with an additional mass of +48 Da.
Figure 4.34 displays the proposed structure with the fragment ion evidence and Figure 4.35 shows the MS/MS spectrum. Only type 1 ions corresponding to each peptide component, with very few type 3 ions matching 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126, were identified in the MS/MS spectrum. Type 1 ion I+60 suggests that the +12 Da bridge along with an additional +48 Da modification was retained on peptide 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126. Type 1 ion I(ox)-H2O and type 3 ions (Iy3(ox) to Iy6(ox) and Ib4(ox)+12) suggest that M residues were oxidized. The +48 Da corresponds to the mass of three oxygens, which can be explained by one methionine oxidized to methionine sulfoxide (+16 Da) and the other oxidized to methionine sulfone (+32 Da) [164]. Since K115 was blocked in the calmodulin peptide, cross-linking could have occurred by the following possible mechanisms: (1) melittin R22, K23 or R24 was modified and cross-linked to calmodulin H107, N111 or R126; (2) calmodulin R126 was modified and cross-linked to melittin R22 or R24. Type 2 ion Iy8(ox)+II+12 suggests that R126 is cross-linked and Iy14(2ox)+IIy1+12 suggests that R126 is connected to R24.
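As a hedged illustration of the mass bookkeeping behind this assignment, the sketch below sums the component peptide masses listed in Table 4.9 for the m/z 730.65 species with the +12 Da bridge and the two methionine oxidations (+16 Da and +32 Da), then compares the result with the measured mass. The function names are hypothetical and the inputs are the rounded values reported in the text and table; because of that rounding, the ppm value it returns (about 58 ppm) differs slightly from the 59 ppm tabulated.

```python
# Illustrative sketch only: function names are hypothetical, and the inputs
# are the rounded values reported in Table 4.9 and in the text above.

def crosslink_mass(peptide_masses, modification_masses):
    """Sum the component peptide masses and all added modification masses (Da)."""
    return sum(peptide_masses) + sum(modification_masses)

def ppm_error(measured, calculated):
    """Mass accuracy in parts per million relative to the calculated mass."""
    return (measured - calculated) / calculated * 1e6

# m/z 730.65 (z = 4) species: melittin 22RKR24 and calmodulin 107-126 peptide
# masses as listed in Table 4.9.
peptides = [458.31, 2400.12]
# +12 Da PFA bridge, methionine sulfoxide (+16 Da), methionine sulfone (+32 Da)
modifications = [12.0, 16.0, 32.0]

calc = crosslink_mass(peptides, modifications)
error = ppm_error(2918.60, calc)  # 2918.60 Da is the measured mass in Table 4.9
print(f"calculated mass = {calc:.2f} Da, error = {error:.1f} ppm")
```

The calculated mass reproduces the 2918.43 Da listed in Table 4.9 only when the extra +48 Da is included, which is the arithmetic basis for assigning one sulfoxide and one sulfone.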
The involvement of melittin R24 in this cross-link is consistent with the +36 Da (3 Schiff bases) localized on melittin 21KRKR24 in modified peptides identified in chapter 3.

Table 4.9: PFA calmodulin-melittin interpeptide cross-linked species are listed and classified as capturing antiparallel (shaded in blue) or parallel (white) binding. Cross-linking sites are highlighted in red.

Cross-linked species                                  Melittin peptide       Calmodulin peptide                                    Total PFA
m/z     z  [M]exp (Da)  [M]calc (Da)  Accuracy (ppm)  [M] (Da)  Sequence     [M] (Da)  Sequence                                    modification mass (Da)
744.73  3  2231.19      2231.17       9               656.42    1GIGAVLK7    1562.75   1A(ac)DQLTEEQIAEFK13                        12
730.65  4  2918.60      2918.43       59              458.31    22RKR24      2400.12   107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126      12
484.46  5  2422.30      2422.29       6               656.42    1GIGAVLK7    1753.86   91VFDKDGNGYISAAELR106                       12
588.52  5  2937.60      2937.53       22              656.42    1GIGAVLK7    2257.11   87EAFRVFDKDGNGYISAAELR106                   24

Figure 4.31: Interpeptide PFA calmodulin-melittin cross-link m/z 744.73 (z = 3) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only. (Peptide I: calmodulin 1A(ac)DQLTEEQIAEFK13; Peptide II: melittin 1GIGAVLK7; spectrum axes: m/z versus intensity)

Figure 4.32: Interpeptide PFA calmodulin-melittin cross-link m/z 484.46 (z = 5) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only. (Peptide I: calmodulin 91VFDKDGNGYISAAELR106; Peptide II: melittin 1GIGAVLK7; spectrum axes: m/z versus intensity)

Figure 4.33: Interpeptide PFA calmodulin-melittin cross-link m/z 588.52 (z = 5) proposed structures with fragment ion evidence (top) and MS/MS spectra (bottom); Cross-linker bridges are indicated in red. Note: Fragmentation indicated on the backbone of the peptide corresponds to type 3 ions only. (Peptide I: calmodulin 87EAFRVFDKDGNGYISAAELR106; Peptide II: melittin 1GIGAVLK7; spectrum axes: m/z versus intensity)

Figure 4.34: Interpeptide PFA calmodulin-melittin cross-link m/z 730.65 (z = 4) proposed structures with fragment ion evidence; Cross-linker bridges are indicated in red. (Peptide I: calmodulin 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126; Peptide II: melittin 22RKR24; panels: type 1 and 3 ions, type 2 ions)

Figure 4.35: Interpeptide PFA calmodulin-melittin cross-link m/z 730.65 (z = 4) MS/MS spectra; Cross-linker bridges are indicated in red. (spectrum axes: m/z versus intensity)

The labile nature of PFA's cross-linker bond was illustrated in the MS/MS of the cross-linked species 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126^22RKR24 and 75KM(ox)KDTDSEEEIREAFR90^1A(ac)DQLTEEQIAEFK13, in which exclusive fragmentation at the cross-linker bridge generated fragment ions corresponding to intact peptide components. Although neither type 3 nor type 2 ion evidence was abundant, high intensity signals pertaining to component peptides and modified component peptides in the MS/MS spectra were deemed sufficient to confirm cross-linking.

Type 2 ions were observed only for cross-linked species 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126^22RKR24.
It is hypothesized that this species required a higher CID energy to fragment both the cross-link bridge and backbone. Thus, at the lower CID energy utilized, this cross-linked species produced mainly type 1 ions along with some type 2 ions.

The PFA bridge fragmented such that the peptide that was modified in the first step of the reaction retained the +12 Da mass shift on its subsequent fragment ions. Since the specific subset of amino acids that can form modifications is known, this could potentially clarify the cross-linking mechanism for the elucidation of cross-linking sites. The exception was the cross-linking observed between melittin R24 and calmodulin R126, suggesting that this trend is also a function of the identity of the amino acids involved in the cross-link, which would dictate the bond energy of the cross-linking bridge. Therefore, MS/MS spectra of cross-linked species from more protein complexes with different cross-linking sites are required to explore this trend.

The cross-linking sites localized are consistent with the modified peptides identified in chapter 3. Peptides with a +12 Da modification localized on calmodulin K77 and melittin G1, and with +36 Da (3 Schiff bases) localized on melittin 21KRKR24, confirm that these residues/regions were modified in the first step of the reaction. In addition, peptides with a +12 Da modification on R90 were also observed, and this is consistent with the extra modification localized on this site in the cross-link 87EAFR(+12)VFDKDGNGYISAAELR106^1GIGAVLK7. Figure 4.36 depicts the proposed reaction mechanisms for the formation of cross-links of melittin R24 to calmodulin R126 (a), calmodulin K77 to Q3 (b), melittin G1 to calmodulin Y99 (c) and melittin G1 to calmodulin Q8 (d).

Figure 4.36: Reaction mechanisms of the PFA modification (i) and cross-linking formation (ii) of melittin R24 to calmodulin R126 (a), calmodulin K77 to Q3 (b), melittin G1 to calmodulin Y99 (c) and melittin G1 to calmodulin Q8 (d); Reactive regions are highlighted in red. R1 and R2, and R3 and R4, represent arbitrary sections of the modified and cross-linked proteins, respectively.

PFA cross-linking of calmodulin K77 to Q3 and of melittin G1 to calmodulin Q8 represent the first interpeptide PFA cross-links identified in proteins involving glutamine, and PFA cross-linking between melittin R24 and calmodulin R126 represents the first interpeptide PFA cross-linking identified between two R residues in proteins under mild in vivo-like reaction conditions. These were previously only observed with long and extensive PFA cross-linking conditions using model peptides and small proteins [86, 87, 90–92]. PFA cross-linking between melittin G1 and calmodulin Y99 is consistent with previous studies that observed cross-linking between these amino acids in small model proteins under mild in vivo-like reaction conditions [50]. In general, this is the first study that identified PFA cross-linking within a non-covalent protein complex.

4.4 Formaldehyde versus other Cross-linker Fragmentation

Placing the observed fragmentation patterns of PFA cross-linked species in the context of those of other established cross-linkers in this study aided in establishing fragmentation rules to confirm the presence of PFA cross-linked species. With other cross-linkers, type 1 fragment ions (i.e. fragment ions corresponding to the intact peptide component) for the smaller peptide component were most often detected, and type 1 evidence of none or both components was unusual, which was also observed with PFA cross-linking. Type 3 ions (i.e. fragment ions from the fragmentation of the cross-linker bridge and peptide backbone) were rarely observed, and type 2 ions (i.e. fragment ions from the fragmentation of the peptide backbone with the cross-linker intact) facilitated the majority of the cross-linking identification with other cross-linkers. In contrast, type 2 ions were uncommon for PFA cross-linked species, and type 3 ions were used instead to verify cross-linked structures. Therefore, type 1 ion evidence of at least one component peptide along with type 3 ion evidence of the second component peptide was accepted as the minimum cross-linking evidence required to confirm a candidate PFA cross-linked species detected in the MS spectrum. This is based on studies of fragmentation patterns of other cross-linkers, which established that it is common to observe extensive backbone fragmentation of only the larger peptide component in a cross-linked species [51, 53]. The lack of type 2 evidence was justified by PFA's labile cross-linker bond under CID fragmentation, which has been previously reported to fragment simultaneously with the peptide backbone [50]. Diagnostic ions were not observed, and PFA's cross-linker bridge is too small to generate fragments within the linker, unlike the large NHS ester cross-linkers.

In general, the evidence for localizing PFA cross-linking is much more ambiguous than with other cross-linkers, especially since multiple structural isomers are possible. This was particularly evident in cross-links involving 1GIGAVLK7, where evidence supported cross-linking at both G1 and K7. However, DOM calculations consistently showed that G1 was the major cross-linking site across all these PFA calmodulin-melittin cross-links.
Furthermore, all PFA cross-links within calmodulin involved the same regions. Despite the diverse reactivity of PFA, the consistency in cross-link site localization observed across different cross-linked peptides illustrates that PFA has the potential to capture specific interactions.

4.5 General Criteria for Evaluating Tandem Mass Spectrometric Patterns of Cross-linked Species

Typically, cross-link identification software derives scores for cross-linked species based on the percent of expected backbone fragment ions present (i.e. the number of expected y and b ions divided by the total length of the peptide) [120, 165, 166]. Therefore a similar approach was used to summarize the general criteria for confirming cross-linking manually. It was observed that the larger peptide fragmented extensively while the shorter peptide tended to remain intact under CID fragmentation, which is consistent with previous observations of cross-link fragmentation in the literature [51, 53]. Nevertheless, the appearance of two intact peptide ion signals in the MS/MS spectrum from exclusive cross-link bridge fragmentation shows that two peptides exist; together with their combined mass plus the cross-linker bridge mass matching the mass of the cross-linked species, this was also considered sufficient evidence to confirm cross-linking.
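As a hedged illustration of the percent-of-expected-ions idea described above, the following sketch counts the distinct b and y ions observed for one peptide and divides by the number of ions that single backbone cleavages could produce. The function names and the choice of denominator, 2(n - 1) for a peptide of n residues, are assumptions made for illustration and not the scoring formula of any particular software; the example ions are the type 3 ions reported in section 4.3 for 1GIGAVLK7 in the m/z 484.46 species.

```python
# Illustrative sketch only: names are hypothetical, and the denominator
# (n - 1 possible b ions plus n - 1 possible y ions) is an assumption.

def expected_backbone_ions(peptide_length):
    """Number of b and y ions from single backbone cleavages of a linear peptide."""
    return 2 * (peptide_length - 1)

def ion_coverage(peptide_length, observed_b, observed_y):
    """Fraction of expected b and y ions that were observed (duplicates ignored)."""
    valid = range(1, peptide_length)  # ion indices 1 .. n-1
    b = {i for i in observed_b if i in valid}
    y = {i for i in observed_y if i in valid}
    return (len(b) + len(y)) / expected_backbone_ions(peptide_length)

# 1GIGAVLK7 in the m/z 484.46 species: IIb2 to IIb5, IIy1 to IIy3 and IIy5
coverage = ion_coverage(7, observed_b=[2, 3, 4, 5], observed_y=[1, 2, 3, 5])
print(f"coverage = {coverage:.0%}")
```

A coverage value computed this way can be compared against whatever acceptance threshold is adopted for each peptide component.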
Candidates whose fragment ion evidence met none of the requirements listed below were not considered cross-linked species:

(1) > 20% of the expected y and b ions from the backbone fragmentation (type 2) or backbone plus cross-linker fragmentation (type 3) of each peptide component;
(2) > 20% of the expected y and b ions from the backbone fragmentation (type 2) or backbone plus cross-linker fragmentation (type 3) of one peptide component plus type 1 ions of the other peptide component from the exclusive cross-linker bridge fragmentation;
(3) type 1 ions of both peptide components from the exclusive cross-linker bridge fragmentation.

In addition, cross-links between adjacent peptides were classified as single peptides with missed cleavages and not as cross-linked species. Overall, this provides objective guidelines to evaluate cross-linking when interpreting the MS/MS of candidate cross-linked species manually.

4.6 A Second Look at Trypsin Digestion of Cross-linked Residues

In chapter 3, it was shown that trypsin cleavage did not occur after PFA modified residues, based on the lack of modifications localized on terminal K and R residues, i.e. trypsin cleavage sites. Theoretically, since PFA cross-linking does not affect the electrophilicity of K and R residues, and forms small cross-link bridges that could potentially fit in the active site of trypsin, the possibility of trypsin cleaving after PFA cross-links was not excluded. Using MS/MS, cross-linking was localized on terminal calmodulin K77 and R126 and melittin R24 in the following PFA cross-linked species: 75KM(ox)K77 and 1A(ac)DQLTEEQIAEFK13 and 107HVM(ox)TNLGEK(tm)LTDEEVDEM(ox)IR126^22RKR24.
These findings suggest that trypsin can potentially cleave after PFA cross-linked residues.

Based on the mechanism of peptide bond hydrolysis by trypsin and previous literature [34, 38, 152], it was concluded that cross-linked K residues formed by EDC/sulfoNHS, sulfoDST, BS3 and sulfoEGS would not be cleaved. However, trypsin cleavage was observed after BS3 cross-linked K77. It is hypothesized that the electron withdrawing nature of the adjacent oxidized methionine increased the partial positive charge on the carbonyl carbon connected to the cross-linked lysine to favor trypsin cleavage. Cleavage after sulfoDST and EDC cross-linked K30 was also observed. Whether the lack of a missed cleavage is sufficient to eliminate these cross-linked species when MS/MS provides evidence of cross-linking is unknown. To clarify this, it may be useful to explore trypsin cleavage efficiency after cross-linked residues with various microenvironments.

4.7 MS/MS Analysis of Formaldehyde Cross-linked Ribonuclease-S

Although the MS/MS verification of PFA cross-linked species was successful in the calmodulin-melittin system, similar analysis could not be performed for the RNaseS system. The limitations associated with the PFA cross-linked RNaseS system are discussed in chapter 6.
Nevertheless, insight into the structure and stabilization of the PFA treated RNaseS complex is examined in the next chapter.

Chapter 5

Structural Characterization of Calmodulin-Melittin and Ribonuclease S Cross-linked Species

5.1 Trypsin Cleavage and Accessibility of Residues

5.1.1 Comparing Cross-linker Reagent versus Literature Trypsin Accessibility in Calmodulin-Melittin

Examining trypsin cleavage sites in calmodulin-melittin can provide insight into the conformation of calmodulin and melittin, since the extent of trypsin cleavage at a particular site conveys the extent of the site's exposure to the trypsin reagent, which is dictated by the conformation of the protein. Upon binding, both calmodulin and melittin undergo a change in conformation and would thus differ in reagent accessibilities in their bound versus unbound state. In this study, the Ca2+-free calmodulin-melittin structure is analyzed via cross-linking. NMR and fluorescence experiments in the literature have shown that both Ca2+-free calmodulin-melittin and Ca2+-saturated calmodulin-melittin complexes exhibit similar conformations [95]. Therefore structural characteristics are comparable between both complexes.

A previous limited proteolysis experiment examining a Ca2+-saturated calmodulin-melittin system revealed that major trypsin cleavage sites were R37, R74, K75, K77 and R106 in unbound calmodulin. Upon complex formation, only K77 and to a smaller extent K75 and R74 were trypsin cleavage sites in calmodulin [102]. This suggests that K77 is accessible in the complex conformation of calmodulin, allowing access to cross-linking reagents. This is consistent with the observation in this study of cross-linking at K77 with PFA, BS3 and sulfoEGS, i.e. cross-linkers that vary in size and membrane permeation ability, supporting that K77 was also accessible in these cross-linked structures.
Similarly, the limited proteolysis experiment in the literature showed that for melittin, cleavage sites K7, K21, R22, K23 and R24 in the unbound molecule became inaccessible upon binding to calmodulin, except for R24. PFA cross-linking was observed at R24, which is adjacent to K23, a major cross-linking site for other cross-linkers in the present study, which also supports the accessibility of this region in the cross-linked structures. Furthermore, the limited proteolysis experiment in the literature concluded that terminal amino acids in melittin and the flexible loop (specifically, residues 76-78 and 87-92) in calmodulin were most accessible in the complex conformation. This may explain why only terminal segments of melittin (1-7 and 22-24) were observed in all identified cross-linked species and further supports the extensive cross-linking localized within this flexible loop (K75 and K77) of calmodulin in this study. Finally, the limited proteolysis experiment in the literature demonstrated that calmodulin E54 interacts with the melittin C-terminus, which is consistent with the EDC cross-linking observed between calmodulin E54 and melittin K23 in this study [102]. Altogether, the accessibility of the calmodulin-melittin complex conformation observed in the limited proteolysis experiment in the literature agrees with the cross-linked structures observed in this study. This supports that cross-linking potentially captured the binding of calmodulin to melittin.

5.1.2 Comparing Trypsin Accessibility in Control versus PFA Treated Calmodulin-Melittin

It is hypothesized that in the absence of stabilizing cross-linking, the transient calmodulin-melittin interaction would not be preserved. In other words, it is expected that control samples would contain calmodulin and melittin molecules in the unbound state and cross-linked samples would contain calmodulin-melittin molecules in the bound state conformation.
It is important to note that in this experiment, trypsin digestion was performed in gel overnight after denaturing proteins with SDS, whereas the trypsin digestion in the limited proteolysis experiment in the literature was performed in solution without denaturing the proteins over a 15 min to four hour time course [102]. Therefore trypsin cleavage sites observed in the literature experiment may not be comparable to trypsin cleavage sites in this experiment. Nevertheless, for cross-linked samples, it was observed that SDS was inefficient in denaturing cross-linked proteins. This was supported by the proteins migrating further down the gel than expected, suggesting that the protein was not fully unfolded. However, control protein gel bands migrated to expected positions, suggesting that the protein was efficiently denatured by SDS. Therefore, it is hypothesized that trypsin cleavage would vary between control and cross-linked samples based on differing accessibility of the structures captured in these samples.

The trypsin cleavage sites observed in the control versus PFA cross-linked samples were compared. In the list of identified unmodified peptides originating from control samples (see Table 3.1), trypsin cleavage was observed for calmodulin at K13, K21, K30, R37, R74, K75, K77, R86, R90, K94, R106, and R126 and for melittin at K7, K21, and R22, suggesting that these residues were accessible. Interestingly, all residues that were shown to be accessible in the unbound conformation of calmodulin and melittin in the literature [102] were also trypsin cleavage sites in the control sample, supporting that the transient interaction between calmodulin and melittin was not preserved in the control sample after SDS denaturing.
This is consistent with section 3.3.1.1, where control peptides were identified in a single calmodulin molecule (~17 kDa) or melittin tetramers (~6 kDa), suggesting that calmodulin and melittin exist as separate species in the control sample.

Assessing the accessibility in cross-linked calmodulin-melittin samples was more complicated since variations in the extent and type of cross-linking can occur that may preserve the calmodulin-melittin interaction differently. Upon PFA cross-linking of calmodulin-melittin, SDS-PAGE separation, and the subsequent in-gel trypsin digestion, all intramolecular calmodulin cross-links and the intramolecular calmodulin-melittin cross-link at m/z 730.65 (z = 4) were identified in 14-19 kDa proteins, and the intramolecular calmodulin-melittin cross-links at m/z 744.73 (z = 3), 484.46 (z = 5) and 588.52 (z = 5) were identified in 19-33 kDa proteins via MS/MS (see chapter 4, Tables 4.8 and 4.9). Although the SDS-PAGE provided evidence for intermolecular cross-linking between multiple calmodulin or calmodulin-melittin complex molecules (> 33 kDa), no cross-linked peptides were confirmed via MS/MS in their respective gel bands. Regardless, PFA unmodified and modified peptides were identified in proteins from all three molecular weight categories via MaxQuant (see section 3.3.1). It is hypothesized that the intramolecular cross-linked proteins (14-33 kDa) would be stabilized and retain or partially retain the calmodulin-melittin binding structure. Since no cross-linked peptides were identified in proteins > 33 kDa, it is unknown whether only intermolecular cross-links formed or whether intramolecular cross-links preserving the calmodulin-melittin interaction also formed.
In addition, it is possible that the 14-19 kDa molecular weight protein gel band also contained unmodified calmodulin (~17 kDa) or calmodulin with only modifications and no cross-links. These proteins are hypothesized to not retain structural properties of calmodulin-melittin, similar to the control sample. Nevertheless, the 14-19 kDa band may also contain calmodulin-melittin cross-linked species, which would preserve their binding structure, since distinguishing between the 19 kDa cut-off and 19.6 kDa (i.e. the mass of a calmodulin-melittin complex) is difficult on the SDS-PAGE gel.

To gain insight into whether cross-linking preserved or partially preserved cross-linked structures in all three molecular weight categories, the accessibility was examined via trypsin cleavage. Using the PFA unmodified and modified peptides identified in chapter 3 (from Tables 3.2 and 3.3) and the PFA cross-linked peptides identified in chapter 4 (from Tables 4.8 and 4.9), the percent trypsin cleavage of PFA treated calmodulin-melittin samples was compared to the control calmodulin-melittin sample.

The percent trypsin cleavage for calmodulin and melittin in control versus PFA treated samples was calculated for each cleavage site by dividing the abundance (i.e. normalized MS signal peak areas) of peptides that supported cleavage at a particular site by the abundance of all peptides containing the particular site (with or without a missed cleavage). This was performed for unmodified, modified and cross-linked peptides identified in the PFA treated sample and the unmodified peptides identified in the control sample. It was assumed that no significant loss of peptides occurred prior to or during the MS detection and identification, and that normalized peak areas of identified peptides accounted for the variation in MS response. Since a modification at a trypsin cleavage site may hinder cleavage, missed cleavage sites that contained a PFA modification or a trimethyl group (i.e. calmodulin K115) were disregarded to observe trypsin cleavage only as a function of protein conformation and not modification. In melittin, the positions of the PFA modifications were ambiguous for the segment 21KRKR24, which appeared with +24 and +36 Da modifications. Since the +24 Da (2 Schiff base modifications) and +36 Da (3 Schiff base modifications) modified peptides exhibited three and four missed cleavages, respectively, one of the missed cleavages in each of these peptides was not a result of the PFA modification. Thus, one missed cleavage site was counted for this peptide segment.

As shown in Table 5.1, there was a decrease in percent trypsin cleavage at calmodulin K75, K77, R86, R90, and K94, and melittin 21KRKR24 in all peptides identified in the PFA treated samples versus the control sample. This is consistent with the reduced trypsin cleavage at these sites observed in bound calmodulin in the literature. However, bound calmodulin is also expected to experience reduced trypsin cleavage at calmodulin R37, R106, and to a smaller extent, R74, and melittin K7 [102]. In contrast, 100% trypsin cleavage was observed at these sites in all three PFA treated samples in this study. Altogether, this suggests that PFA cross-linking partially preserved the structure of calmodulin-melittin after the SDS-PAGE denaturation process. In the gel bands with 19-33 kDa proteins, only calmodulin-melittin cross-links were identified. Thus, it was expected that this sample would provide the most accurate picture of whether PFA cross-linking between calmodulin and melittin preserved the binding structure even after SDS-PAGE denaturation. Since these trends in percent trypsin cleavage were fairly consistent across peptides originating from proteins of all three molecular weight categories, it suggests that similar structures were preserved with different types of PFA cross-linking.
Nevertheless, this also supports that PFA intermolecular cross-linked species (> 33 kDa proteins) that likely formed may have been lost in the sample processing or detection and thus were not identified. This may have caused a slight discrepancy in the percent trypsin cleavage at calmodulin K77, where > 33 kDa proteins did not exhibit as great a decrease in trypsin cleavage as 14-33 kDa proteins treated with PFA. The peptides originating from 14-19 kDa proteins in the PFA treated samples matched the percent trypsin cleavage patterns observed in the 19-33 kDa proteins more closely than those in the control samples. This suggested that PFA cross-linked proteins were the majority of the 14-19 kDa proteins and not PFA-unmodified proteins or proteins with only PFA modifications. This was further supported in section 3.2.4, where a high yield of PFA cross-linked versus non-cross-linked proteins was observed. The > 19 kDa proteins did not display as great a decrease in trypsin cleavage as 14-19 kDa proteins at calmodulin R90. It is possible that the 14-19 kDa proteins contained mostly intramolecular calmodulin cross-linked proteins and thus would differ from > 19 kDa proteins that most likely contained calmodulin-melittin cross-linked proteins as the major species.
In conclusion, this supports that PFA cross-linking partially preserved the calmodulin-melittin structure even after SDS-PAGE denaturation, which was not retained in the absence of cross-linking in the control samples.

Table 5.1: Percent abundances of cleaved trypsin cleavage sites observed in the control and PFA treated (in 14-19, 19-33 and > 33 kDa proteins) calmodulin-melittin samples

Trypsin          Percent Abundance of Cleaved Residue
Cleavage Site    Control Sample   PFA Treated Samples
                                  14-19 kDa   19-33 kDa   > 33 kDa
Calmodulin Residues
K13              100%             100%        100%        N/A
K21              100%             100%        N/A         N/A
K30              100%             100%        N/A         100%
R37              100%             100%        100%        100%
R74              95%              100%        100%        100%
K75              50%              23%         44%         40%
K77              99%              9%          8%          29%
R86              71%              33%         23%         43%
R90              100%             39%         98%         87%
K94              6%               0%          1%          0%
R106             100%             100%        100%        100%
R126             100%             100%        100%        100%
Melittin Residues
K7               100%             100%        100%        N/A
K21 to R24       99%              35%         49%         54%

5.1.3 Comparing Trypsin Accessibility in Control versus PFA Treated Ribonuclease S

In order to gain insight into whether PFA preserved or partially preserved the RNaseS complex, even after SDS denaturation procedures prior to trypsin digestion, the percent trypsin cleavage was examined in both control and PFA treated RNaseS samples. Similar to the calmodulin-melittin analysis in section 5.1.2, the percent cleavage was calculated for each cleavage site using the peptides identified by MaxQuant in control (Table 3.5) and PFA treated (Table 3.6) RNaseS samples. The analysis was performed using only S-protein and S-peptides produced from the cleavage of RNaseA at residue 20 for consistency. Also, in order to observe trypsin cleavage as a function of accessibility and not the chemical modification of a cleavage site, sites with PFA modifications were excluded as missed cleavage sites, similar to the analysis of calmodulin-melittin.
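As a hedged illustration of the percent cleavage calculation described in section 5.1.2 and reused here, the sketch below divides the abundance of peptides showing cleavage at a site by the abundance of all peptides containing that site, while disregarding missed cleavages at modified sites. The record layout, the rule that a peptide supports cleavage when it ends at the site, and the abundance values are all assumptions made for illustration; they are not taken from the MaxQuant output or the tables of this thesis.

```python
# Illustrative sketch only: the record layout, the rule for "supporting"
# cleavage, and the abundance values are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Peptide:
    start: int          # first residue number (1-based, inclusive)
    end: int            # last residue number (inclusive)
    abundance: float    # normalized MS peak area
    modified_sites: set = field(default_factory=set)  # residues carrying a modification

def percent_cleavage(peptides, site):
    """Percent of the abundance containing `site` that shows cleavage after it."""
    cleaved = 0.0
    missed = 0.0
    for p in peptides:
        if p.end == site:
            # Peptide ends at the site, so trypsin cleaved after it.
            cleaved += p.abundance
        elif p.start <= site < p.end:
            # Site is internal: a missed cleavage, unless the site is modified,
            # in which case the peptide is disregarded for this site.
            if site not in p.modified_sites:
                missed += p.abundance
    total = cleaved + missed
    return None if total == 0 else 100.0 * cleaved / total

# Hypothetical peptides around a cleavage site at residue 33
peptides = [
    Peptide(22, 33, 30.0),          # cleaved after residue 33
    Peptide(22, 39, 60.0),          # missed cleavage at residue 33
    Peptide(32, 39, 10.0, {33}),    # missed cleavage at a modified residue 33
]
print(percent_cleavage(peptides, 33))
```

Returning None for a site with no supporting peptides mirrors the N/A entries used in the tables, rather than reporting a misleading 0%.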
It is assumed that the majority of the peptides identified in the PFA treated RNaseS samples were digestion products of cross-linked proteins since all peptides originated from proteins > 12 kDa (i.e. greater than the molecular weight of a single S-protein or S-peptide) according to the molecular weight measured by SDS-PAGE. It is important to note that trypsin cleavage was performed in-gel after proteins were denatured with SDS, and the proteins are therefore expected to be unfolded. However, whether the RNaseS interaction was diminished in the control sample and partially preserved in PFA treated samples, as observed in the calmodulin-melittin system, was examined in the RNaseS system.

Table 5.2 lists the percent trypsin cleavage of each site observed in the control and PFA treated RNaseS samples. In control RNaseS, trypsin cleavage was observed at S-protein R39, K61, K66, R85, K91, K98, and K104, and to a smaller extent K31 and R33. The near absence of trypsin cleavage at K41 (7% cleaved) was due to the adjacent proline residue, and the minimal cleavage (~5%) observed at K37 suggested that this residue was not accessible. No trypsin cleavage was observed within the S-peptide. For the PFA treated sample, a significant decrease in percent trypsin cleavage was observed at K31, R33, and K98. Previous limited MS-based oxidation studies have shown that S-protein residues 96-100 are blocked when bound to the S-peptide while S-protein residues 39, 85-95 and 101-104 remain accessible for trypsin cleavage [167]. This is consistent with the lack of trypsin cleavage observed at K98, while trypsin cleavage was observed at R39, R85 and K104 in the S-protein for PFA treated RNaseS. Therefore, this suggests that PFA cross-linking may have preserved the S-protein to S-peptide binding interaction.
Other studies have shown that S-protein R33 plays a key role in stabilizing RNaseS via a salt bridge to S-peptide D14 [168], which also agrees with the reduced trypsin cleavage at R33 observed in PFA treated samples and supports the stabilization of RNaseS via PFA cross-linking. As mentioned in section 3.3.2.2, previous kinetics experiments that examined trypsin cleavage in RNaseS demonstrated that the S-peptide is not accessible for trypsin cleavage while bound to the S-protein [160]. Thus, the absence of single S-peptide peptides in PFA treated RNaseS samples observed in this study supports the preservation of the RNaseS complex. The identification of S-peptides in the control RNaseS sample suggests that the S-peptide was accessible and unbound to S-protein. This was also supported in section 3.3.2.1, where S-peptide and S-protein peptides were identified in < 12 kDa proteins, suggesting that they were not bound in a RNaseS complex (~13.7 kDa). Overall, these findings support that PFA preserved the RNaseS interaction, which was lost in the control sample due to the absence of cross-links.
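The percent cleavage values discussed above can be illustrated with a short Python sketch. It assumes, as a simplification that is not taken from this thesis, that the percent cleavage at a site is the summed peak area of peptides cleaved at that site divided by the summed peak area of cleaved plus missed-cleavage peptides, with missed cleavages at PFA modified sites left out of the count as described above. The function name, the record layout and the peak areas are hypothetical.

```python
def percent_cleaved(peptides, site):
    """Percent of `site` observed in its cleaved form (illustrative only).

    Each peptide is a dict with:
      'cleaved_at'   - set of sites at which the peptide ends (cleaved)
      'missed_at'    - set of internal, uncleaved sites
      'pfa_modified' - set of sites carrying a PFA modification
      'area'         - normalized MS peak area (hypothetical values below)
    """
    cleaved = 0.0
    missed = 0.0
    for pep in peptides:
        if site in pep["cleaved_at"]:
            cleaved += pep["area"]
        elif site in pep["missed_at"] and site not in pep["pfa_modified"]:
            # Missed cleavages at PFA-modified sites are skipped, so the
            # result reflects accessibility rather than chemical blocking.
            missed += pep["area"]
    total = cleaved + missed
    if total == 0:
        return None  # site not observed; reported as N/A
    return 100.0 * cleaved / total


# Hypothetical peptides covering one site, labeled "K98" for illustration.
example = [
    {"cleaved_at": {"K98"}, "missed_at": set(), "pfa_modified": set(), "area": 14.0},
    {"cleaved_at": set(), "missed_at": {"K98"}, "pfa_modified": set(), "area": 86.0},
    {"cleaved_at": set(), "missed_at": {"K98"}, "pfa_modified": {"K98"}, "area": 50.0},
]
k98_percent = percent_cleaved(example, "K98")
```

In this sketch the third peptide does not count toward the missed total because its site is PFA modified, and a site with no qualifying peptides returns `None`, which corresponds to an N/A entry.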
Table 5.2: Percent abundances of cleaved trypsin cleavage sites observed in the control and PFA treated (> 12 kDa proteins) RNaseS samples

Trypsin            Percent Abundance of Cleaved Residue
Cleavage Site      Control Sample    PFA Treated Sample
S-Protein Residues
K31                40%               0%
R33                34%               0%
K37                5%                0%
R39                99%               100%
K41                7%                N/A
K61                100%              N/A
K66                100%              100%
R85                100%              85%
K91                100%              93%
K98                100%              14%
K104               100%              99%
S-Peptide Residues
K1                 0%                N/A
K7                 0%                N/A
R8                 0%                N/A

5.2 Relative Abundance of Formaldehyde Cross-linking

5.2.1 Percent Abundance and Equilibrium of Formaldehyde Cross-linking Sites in Calmodulin-Melittin

PFA cross-linking in the calmodulin-melittin system was localized to the following cross-linking sites: calmodulin K77 to calmodulin Q3, melittin G1 to calmodulin Y99, melittin G1 to calmodulin Q8, and melittin R24 to calmodulin R126. All of these cross-linked peptides appeared in proteins that were 14-33 kDa, suggesting that these represented intramolecular cross-links within one calmodulin molecule or calmodulin-melittin complex. The abundance of each amino acid involved in the cross-link in its unmodified, modified (+12 or +30 Da modification) and cross-linked form was determined. Unmodified and modified peptides identified by MaxQuant in the PFA treated sample, listed in Tables 3.2 and 3.3, and cross-linked species identified manually, listed in Tables 4.8 and 4.9, were used. As mentioned in section 2.7, the abundance of each unique peptide was equated to the sum of normalized MS peak areas of all identical peptides. In addition, it was assumed that species were not lost during sample preparation, MS detection or identification, and that the MS response of these species was uniform.

The percentages of each cross-linking site in its unmodified, modified and cross-linked forms are listed in Table 5.3. Melittin G1 (36%) had the highest percent in the modified form, followed by calmodulin K77 (32%).
Only 14% of melittin R24 was identified in its modified form. The cross-linked residues Q3, Y99, Q8 and R126 were not identified with a modification, which is consistent with the proposed mechanism in which these residues formed cross-links in the second step of the reaction. For the calmodulin intramolecular cross-link, 2% of K77 was cross-linked to Q3 and 2% of Q3 was cross-linked to K77. About 5% of melittin G1 was cross-linked to Y99 and 0.2% of Y99 was cross-linked to G1. Only 0.5% of G1 was cross-linked to Q8 and 0.05% of Q8 was cross-linked to G1. Finally, 2% of R24 was cross-linked to R126 and only 0.02% of R126 was cross-linked to R24. Overall, this shows that the majority of the residues involved in the identified cross-links remained in their unmodified form, with only a small percent engaged in cross-linking. The percent abundance in the cross-linked form was similar for K77 and Q3, which reflects the uniformity predicted for intramolecular calmodulin cross-linked species. In contrast, the percent abundance of cross-linked forms between melittin and calmodulin residues varied significantly across the identified cross-linking sites, which suggests that multiple conformations may exist, dispersing the abundance among them.
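The percent abundance calculation described in this section can be sketched in Python as follows. The sketch follows the stated approach of summing normalized MS peak areas over identical species and dividing each form by the total for that residue; the function name and the peak areas are hypothetical and are not measured values from this study.

```python
from collections import defaultdict


def percent_by_form(observations):
    """Percent of a residue found in each form.

    observations: iterable of (form, normalized_peak_area) pairs, one per
    identified peptide that contains the residue of interest.
    """
    totals = defaultdict(float)
    for form, area in observations:
        # Identical species are pooled by summing their normalized areas.
        totals[form] += area
    grand_total = sum(totals.values())
    return {form: 100.0 * area / grand_total for form, area in totals.items()}


# Hypothetical normalized peak areas for one cross-linking site.
observations = [
    ("unmodified", 40.0),
    ("unmodified", 27.0),
    ("modified", 20.0),
    ("modified", 11.0),
    ("cross-linked", 2.0),
]
percents = percent_by_form(observations)
```

As in the analysis above, this treats the MS response of all forms as uniform, so the percentages are only as reliable as that assumption.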
Table 5.3: The percent abundance of PFA cross-linking sites in the unmodified, modified and cross-linked forms in PFA treated calmodulin-melittin; Note: additional decimal places are reported to clarify that values are > 0% or < 100%, as described in section 2.7

Modification Site (R1 NH2)                                  Cross-Linking Site (R2H)
Site              Percent      Percent    Percent           Site               Percent      Percent    Percent
                  Unmodified   Modified   Crosslinked                          Unmodified   Modified   Crosslinked
Calmodulin K77    67%          32%        2%                Calmodulin Q3      98%          0%         2%
Melittin G1       59%          36%        5%                Calmodulin Y99     99.8%        0%         0.2%
Melittin G1                               0.5%              Calmodulin Q8      99.95%       0%         0.05%
Melittin R24      84%          14%        2%                Calmodulin R126    99.98%       0%         0.02%

The equilibrium constants for the formation of methylol and/or Schiff Base modified and cross-linked species were examined for each identified PFA cross-linking site. Figures 2.2 and 2.3 depict the equilibrium reaction for PFA cross-linking. The equilibrium constant expressions based on the abundance of each species, measured by the MS peak area, were derived in section 2.7.2.2. In these expressions, K1MS', K2MS' and K3MS are the equilibrium constants for the formation of methylol modifications, Schiff Base modifications, and cross-links, respectively. For the formation of cross-links at sites in which a methylol intermediate was not identified, K(1+2)MS' was denoted as the equilibrium constant for the formation of Schiff Base modifications.

Out of all the PFA cross-linking sites, only K77 was identified with both a methylol (+30 Da) and a Schiff Base (+12 Da) intermediate. In the current analysis, it is not possible to distinguish between Schiff Base modifications and intrapeptide cross-links, since both produce a +12 Da mass shift. However, in proteins where cross-linking is a function of structural restraints, it is less likely that cross-linking will be localized between residues on the same tryptic peptide, especially with short reaction times that promote more specific cross-linking. Previous PFA studies have confirmed that the formation of Schiff Base modifications is more likely than intrapeptide cross-links in proteins when using short reaction times similar to those of the current study [36]. Furthermore, previous studies have shown that although a +30 Da modified form is typically identified along with a +12 Da modified form at each modification site, it is possible to identify a +12 Da modified form without a +30 Da counterpart [85]. Thus, to account for the maximum number of modified forms for each cross-linking site, it is assumed that all +12 Da modifications were Schiff Base modifications.

Table 5.4 lists the equilibrium constants for each step of the PFA cross-linking reaction for each identified cross-linking site. Interestingly, all calculated equilibrium constants were less than one, suggesting that the equilibrium lies toward the unmodified form for each modification site and toward the modified over the cross-linked form for each site. Theoretically, the equilibrium should favor the formation of the more stable methylene bridge cross-link structure over the Schiff Base structure [86]. Although the unmodified amino group is more stable than a Schiff Base modified protein amino group, the formation of the stable methylene bridge is expected to drive the modification reaction forward. However, the equilibrium constants calculated in this study are not only a reflection of the stability of each reaction product but are also governed by other factors introduced by the protein complex and its structure. It was shown in section 3.2.5 that only 63% of calmodulin and melittin are expected to be bound together, i.e. be cross-linked together, based on the dissociation constant of the complex. Therefore, the total percent of reactive sites in their cross-linked form should be less than 63%.
Another factor to consider is the possibility of multiple structural isomers produced from PFA cross-linking, which would disperse the MS intensity across different products. This was apparent in the two different cross-links that were identified at melittin G1, i.e. one cross-link to calmodulin Y99 and the other to calmodulin Q8, indicating the presence of two structural isomers. Consequently, the percent of Y99 and Q8 in their cross-linked form and their cross-link formation equilibrium constants were lower than for other cross-linking sites. Finally, the variation in the equilibrium constants across different cross-linking sites may be indicative of the accessibility of each cross-linking site, which is dictated by the structural restraints of the calmodulin-melittin complex. Altogether, this demonstrates that the equilibrium depends not only on the reactivity of individual amino acids, but also on the specific protein structure and its attributes.

Table 5.4: The calculated equilibrium constants for each cross-linking reaction step for each identified PFA cross-linking site in PFA treated calmodulin-melittin

Modification Site    Cross-linking Site    K1MS'    K2MS'    K(1+2)MS'    K3MS
Calmodulin-Calmodulin Cross-links
K77                  Q3                    0.3      0.6                   0.1
Calmodulin-Melittin Cross-links
G1                   Y99                                     0.6          0.1
G1                   Q8                                      0.6          0.01
R24                  R126                                    0.2          0.1

5.2.2 Percent Abundance and Equilibrium of Formaldehyde Modification Sites in Calmodulin-Melittin

In the previous section, the percent abundance and equilibrium of PFA cross-linking sites in their unmodified, modified and cross-linked forms were examined. For calmodulin K77, it was shown that the unmodified form is favored over the methylol form and the methylol form is favored over the Schiff Base form. In addition, for melittin R24 and G1, it was demonstrated that the unmodified form is favored over the Schiff Base form.
Whether this trend also holds for sites where only modification and no cross-linking was observed was examined next. As shown in section 3.3.1.2, both methylol and Schiff Base modifications were localized on calmodulin K75 and K94 via MaxQuant. Only Schiff Base modifications were identified on calmodulin K148, R106, R74, R86, and R90, and only a methylol modification was localized on calmodulin K30 via MaxQuant. Percent abundance and equilibrium constant calculations were performed analogously to the previous section. These calculations could not be performed for the modified melittin segment 21KRK23, since the two Schiff Base modifications identified on this segment via MaxQuant could not be localized to specific residues, as mentioned in section 3.3.1.2. As shown in Table 5.5, the percent of all modification sites in their modified form was < 33%, suggesting that the majority of these residues remained unmodified. The percent abundances of all R residues in their modified form (< 8%) were significantly lower than for K residues, further supporting that R residues are less reactive than K residues in the PFA modification reaction, as mentioned in section 3.3.1.2 and in the literature [36, 85].

As displayed in Table 5.6, all equilibrium constants were less than one. This indicates that the equilibrium lies toward the unmodified form over the modified form for all identified PFA modification sites, similar to the equilibrium constants of the PFA cross-linking sites shown in the previous section. The equilibrium constants for the formation of a methylol (K1MS') and Schiff Base (K2MS') at K75 were 0.1 and 2, respectively. This suggests that Schiff Base formation is favored over the methylol intermediate and that methylol formation is not favored over the unmodified form for K75. This is consistent with the majority of modification sites identified by MaxQuant, i.e. K148, R106, R74, R86, and R90, containing only Schiff Base and no methylol modifications. Nevertheless, exceptions to the assumption that all +12 Da modifications correspond to Schiff Bases may exist. It is hypothesized that the large K2MS' value may be due to K75 existing in an intrapeptide cross-link, which would drive the reaction toward the formation of a stable cross-link bridge. For K94, equilibrium constants K1MS' and K2MS' were 0.1 and 0.5, respectively, suggesting that the unmodified form is favored over the methylol form and the methylol form is favored over the Schiff Base form. This is consistent with the trend observed at the cross-linking site K77. Cross-linking sites (calmodulin K77, melittin G1, melittin R24) modified in the first step of the reaction exhibited a higher average percent in their modified form (27% versus 15%) and larger average K1MS'/K(1+2)MS' values (0.4 versus 0.1) than the modification sites at which cross-links were not identified (calmodulin K75, K94, K30, K148, R106, R74, R86, and R90). This is hypothesized to reflect the stability of the PFA cross-linking bridge, which drives the modification reaction forward when cross-link formation follows.

Overall, nine out of the 13 potential PFA modification sites in calmodulin and approximately four out of the five potential PFA modification sites in melittin were actually identified with PFA modifications. Calmodulin K13, K21, and R37 and potentially one of the melittin residues in 21KRK23 were not identified with PFA modifications, suggesting that these were not accessible to PFA in the calmodulin-melittin structure. Although R126 was a potential modification site, no modifications and only cross-linking was identified at this site. This suggests that R126 was still accessible to PFA.
In contrast to the modification sites, only four of the 20 potential PFA cross-linking sites and none of the potential melittin cross-linking sites were observed to have formed cross-links in the second step of the PFA cross-linking reaction. The small subset of PFA reactive sites that were identified with cross-links demonstrates the specificity of PFA cross-linking, which is a function of close-proximity interactions within the calmodulin-melittin structure.

Table 5.5: The percent abundance of PFA modification sites in the unmodified and modified forms in PFA treated calmodulin-melittin; Note: additional decimal places are reported to clarify that values are > 0% or < 100%, as described in section 2.7

Modification Site        Percent       Percent
(Calmodulin Residue)     Unmodified    Modified
K30                      67%           33%
R74                      93%           7%
K75                      80%           20%
R86                      92%           8%
R90                      94%           6%
K94                      86%           14%
R106                     99.7%         0.3%
K148                     79%           21%

Table 5.6: The calculated equilibrium constants for the modification reaction for each identified PFA modification site in PFA treated calmodulin-melittin

Modification Site
(Calmodulin Residue)     K1MS'    K2MS'    K(1+2)MS'
K30                      0.5
R74                                        0.1
K75                      0.1      2
R86                                        0.1
R90                                        0.1
K94                      0.1      0.5
R106                                       0.003
K148                                       0.3

5.2.3 Percent Abundance and Equilibrium of Formaldehyde Modification Sites in Ribonuclease-S

The percent PFA modification of the S-protein in the RNaseS system was examined, using a similar approach as section 5.2.2. Four PFA modification sites were identified via MaxQuant in the RNaseS system: S-protein R33, K31, K91 and K98. Since unmodified forms of R33 and K31 were not identified, either a 100% modification of these sites occurred or, most likely, not enough PFA peptides were identified to comment on the extent of their modification. Therefore, the analysis was continued with K91 and K98.
As listed in Table 5.7, the percent of K91 and K98 in their modified forms was 28% and 26%, respectively, which suggests that the majority of the identified PFA modification sites remained unmodified. Nevertheless, the similar percent modification at both sites demonstrates consistency in reactivity across these PFA modification sites. Similar to the calmodulin-melittin system, the equilibrium constants for the formation of a modification at K91 and K98 were determined and are displayed in Table 5.7. For K98, both methylol and Schiff Base forms were identified, with K1MS' and K2MS' values of 0.2 and 0.7, respectively. With no identified methylol form of K91, the equilibrium constant K(1+2)MS' for Schiff Base formation on K91 was determined to be 0.4. Similar to the calmodulin-melittin system, all equilibrium constants for the formation of a PFA modification in the RNaseS system were below one, suggesting that the equilibrium favored the unmodified forms of K91 and K98. Furthermore, even though RNaseS contains 12 potential PFA modification sites, PFA modifications were only localized on four sites. Altogether, this suggests a reduced accessibility of modification sites to PFA due to the structural constraints of RNaseS. Nevertheless, the small number of PFA modified peptides identified in the RNaseS system does not allow a proper analysis of accessibility in RNaseS.

Table 5.7: The percent abundance of PFA modification sites in the unmodified and modified forms and the calculated equilibrium constants for the modification reaction for each identified PFA modification site in PFA treated RNaseS

Modification Site        Percent       Percent
(S-Protein Residue)      Unmodified    Modified    K1MS'    K2MS'    K(1+2)MS'
K91                      72%           28%                           0.4
K98                      74%           26%         0.2      0.7
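The relationship between the tabulated percent abundances and the equilibrium constants can be illustrated with a small Python sketch. It assumes that each MS-based constant is simply the ratio of the abundance of the product form to the abundance of its precursor form; the actual expressions are those derived in section 2.7.2.2, which this sketch does not reproduce. Under this assumed reading, the rounded percent abundances in Tables 5.3 and 5.7 give values that agree with the reported constants to one significant figure. The function name is hypothetical.

```python
def ratio_constant(product_percent, precursor_percent):
    """Assumed form of an MS-based equilibrium constant: product / precursor."""
    return product_percent / precursor_percent


# S-protein K91 (Table 5.7): 72% unmodified, 28% Schiff Base modified.
k12_k91 = ratio_constant(28.0, 72.0)      # compare with K(1+2)MS' = 0.4

# Melittin G1 (Table 5.3): 59% unmodified, 36% modified, 5% cross-linked to Y99.
k12_g1 = ratio_constant(36.0, 59.0)       # compare with K(1+2)MS' = 0.6
k3_g1_y99 = ratio_constant(5.0, 36.0)     # compare with K3MS = 0.1
```

Because the inputs are rounded percentages, small differences from the tabulated constants are expected; the sketch is meant to show the direction of the comparison (values below one favor the precursor form), not to replace the original calculation.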
5.3 Cross-linked Product Classification and Abundance in the Calmodulin-Melittin System

In chapter 4, it was stated that uniform cross-linking is expected within calmodulin but not between calmodulin and melittin, due to the proposed existence of two different conformations of the complex. To validate this claim, the relative abundances of conformations captured by EDC, PFA, sulfoDST, BS3 and sulfoEGS cross-linked species were examined. MS/MS spectra confirmed different types of cross-linked products: interpeptide calmodulin-calmodulin and calmodulin-melittin cross-linked species, in which antiparallel (N-terminal domains aligned) or parallel (C-terminal and N-terminal domains aligned) binding conformations occurred. In chapter 4, Tables 4.2, 4.3, 4.5, 4.7, and 4.9 highlight EDC, sulfoDST, BS3, sulfoEGS and PFA cross-linked species, respectively, that support an antiparallel (blue) and parallel (white) binding orientation of calmodulin to melittin. Interpeptide calmodulin-calmodulin cross-links for EDC, BS3, sulfoEGS and PFA are displayed in Tables 4.1, 4.4, 4.6, and 4.8, respectively. Relative abundances were calculated by dividing the summed normalized peak areas of the identified cross-linked species of each type (calmodulin-calmodulin, calmodulin-melittin, parallel calmodulin-melittin or antiparallel calmodulin-melittin) by the sum of normalized peak areas of all identified cross-linked species. These values are listed in Tables 5.8 and 5.9.

Table 5.8: Relative abundance of calmodulin-calmodulin and calmodulin-melittin cross-linked peptides

Interaction Captured       Cross-Linker
                           EDC     PFA     SulfoDST    BS3     SulfoEGS
Calmodulin-Calmodulin      5%      94%     0%          6%      51%
Calmodulin-Melittin        95%     6%      100%        94%     49%
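The relative abundance calculation described above can be sketched in Python as shown below. The records and peak areas are hypothetical. The orientation shares are computed here within the calmodulin-melittin cross-links only, which is an assumption made for this sketch (it yields shares that sum to 100% across the two orientations); the helper name `shares` is also hypothetical.

```python
def shares(records, key):
    """Percent of the total normalized peak area carried by each value of `key`."""
    totals = {}
    for rec in records:
        totals[rec[key]] = totals.get(rec[key], 0.0) + rec["area"]
    grand_total = sum(totals.values())
    return {name: 100.0 * area / grand_total for name, area in totals.items()}


# Hypothetical cross-linked species for a single cross-linker.
records = [
    {"interaction": "calmodulin-calmodulin", "orientation": None, "area": 5.0},
    {"interaction": "calmodulin-melittin", "orientation": "antiparallel", "area": 76.0},
    {"interaction": "calmodulin-melittin", "orientation": "parallel", "area": 19.0},
]

# Share of each interaction type among all identified cross-linked species.
by_interaction = shares(records, "interaction")

# Share of each binding orientation among calmodulin-melittin cross-links only.
cam_mel = [r for r in records if r["interaction"] == "calmodulin-melittin"]
by_orientation = shares(cam_mel, "orientation")
```

Changing the denominator for the orientation shares (all cross-linked species versus calmodulin-melittin species only) changes the percentages but not the ranking of the two orientations.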
Table 5.9: Relative abundance of calmodulin-melittin interpeptide cross-links supporting the parallel and antiparallel binding orientation

Calmodulin-Melittin        Cross-Linker
Orientation Supported      EDC     PFA     SulfoDST    BS3     SulfoEGS
Antiparallel               98%     3%      88%         72%     64%
Parallel                   2%      97%     12%         26%     36%

For EDC, sulfoDST, BS3, and sulfoEGS, calmodulin-melittin cross-links (95, 100, 94, and 49% abundance, respectively) were greater than or almost equal in abundance to calmodulin-calmodulin cross-links, and most of them captured the antiparallel orientation (98, 88, 72, and 64% abundance, respectively). However, PFA cross-linked species supported the parallel orientation over the antiparallel orientation (97 and 3% abundance, respectively). The PFA calmodulin-calmodulin cross-links, in which one major reaction product is expected, were observed to have a higher relative abundance (94%) than calmodulin-melittin cross-links (6%), in contrast to other cross-linkers. A higher relative abundance reflects the formation of more of the same product, i.e. more calmodulin-melittin complex molecules containing the same calmodulin-calmodulin cross-links. This supports the expected uniformity in calmodulin's conformation and the reliability of PFA cross-linking to capture this uniformity. Interestingly, sulfoEGS demonstrated the least uniformity in capturing calmodulin-calmodulin versus calmodulin-melittin cross-linking and in the orientation of calmodulin-melittin binding, which can be explained by the reduced specificity of its long cross-linker bridge.

Previous studies using Ca2+-saturated calmodulin-melittin have supported that calmodulin and melittin bind in both orientations [99, 102], and more recent reports have shown that parallel binding is the major binding mode, which supports the observed PFA cross-linked structures [100, 101, 103]. However, the specific geometry of binding is unknown.
It is also possible that calmodulin binds to melittin in only one orientation that fits the distance constraints of both antiparallel and parallel binding. This theory is based on the ambiguous definition of antiparallel and parallel binding, i.e. based on the alignment of domains of calmodulin and melittin, which span a range of about 75 and 13 amino acids, respectively. For example, the difference between an "antiparallel" or "parallel" classification could potentially be between two adjacent residues between the N-terminal and C-terminal domain or opposite ends of the domains. Therefore, a more precise evaluation of distance constraints imposed by each cross-linker on an amino acid level may be required.

5.4 Crystal Structure Distance Constraints

5.4.1 Measuring and Applying Cross-linking Distances on Crystal Structures

The cross-linking distances were measured on the known calmodulin crystal structure. Cross-linking distances were also used to derive possible orientations of calmodulin and melittin. On each protein component, it was assumed that side chains can freely rotate about the alpha carbon with a radius equal to the length of the side chain. Side chains that can come into contact with the cross-linker within this radius were considered cross-linked to account for side chain flexibility. To account for backbone flexibility, an additional 6 Å was added to the maximum cross-linking distance. This calculation was based on recent molecular dynamics simulations [129]. Table 2.1 lists the maximum cross-linking distances for each cross-linker for every possible combination of reactive sites, which consider the backbone and amino acid side chain flexibility of each site. Cross-linking sites are listed from top to bottom and modification sites are listed left to right. For example, the maximum cross-linking distance between residues D and K for the EDC cross-linker is 16.1 Å.

5.4.2 Calmodulin Structure: Correlating Identified Cross-links to Known Crystal Structure

Mapping cross-links identified within calmodulin on its known crystal structure can validate cross-linkers for their application to the ambiguous crystal structure of the calmodulin-melittin complex. Since calmodulin adopts a comparable structure when binding melittin regardless of the presence of Ca2+, the Ca2+-saturated, binding crystal structure of calmodulin was used for this purpose [95, 125]. Distances between the identified calmodulin-calmodulin cross-linked residues were measured using PyMOL.

The calmodulin PFA cross-link was evaluated using the bound-state crystal structure of calmodulin. The MS/MS evidence discussed in section 4.3 supported cross-linking of K77 to Q3. As Table 2.1 indicates, PFA can cross-link K to Q up to 18.6 Å apart. The Cα-Cα distance between K77 and Q3 was measured to be 11.8 Å, falling within the maximum cross-linking distance of PFA. Since this cross-link formed within the distance constraints of the bound crystal structure of calmodulin, it is likely that this cross-link represents intramolecular cross-linking within one calmodulin molecule. This is also conveyed by the position of its SDS-PAGE band (14-19 kDa) in Figure 3.1.

For EDC cross-linked species 91VFDKDGNGYISAAELR106^22DGDGTITTK30, the intrapeptide cross-link could have formed between K30 and either D22 or D24. Inter-residue Cα-Cα distances measured on the bound calmodulin crystal structure of 11.5 and 13.6 Å, respectively, support cross-link formation between either of these sites. The interpeptide cross-link in this structure between K94 and either D22 or D24 had Cα-Cα distances of 26.3 and 30 Å, respectively, which were greater than the maximum cross-linking distance of EDC cross-links between K and D (16.1 Å).
The Cα-Cα distances between interpeptide cross-linking sites E6 to K94 and E14 to K94 were significantly larger (28.9 and 20.4 Å) than the maximum cross-linking distance of EDC (17.3 Å). These interpeptide cross-links suggest that cross-linking occurred between two different calmodulin molecules. This contradicts the molecular weight of ~16-20 kDa measured from the location of the SDS-PAGE gel band in which these cross-linked species originated (see Figure 3.1). However, the displaced gel bands for EDC cross-linked species, appearing significantly lower than expected, suggest that the structure of calmodulin may have been modified upon cross-linking. This change in protein surface area would affect the migration of the band. Two scenarios are possible: either EDC formed cross-links between two separate calmodulin molecules and the cross-linking caused the protein to migrate further than expected in SDS-PAGE, or EDC cross-linking occurred within a calmodulin molecule, altering it to a structure with inter-residue distances that do not match established calmodulin structures. The first scenario is supported by the study observing faster migration of proteins stabilized by disulfide bonds in comparison to their reduced form [146], which was also explained as a common phenomenon across all cross-linkers in this present study in section 3.2. The second scenario is supported by the fact that EDC is known to affect protein conformation [33]. Carboxylic groups are very abundant in calmodulin and are also only reactive with EDC, setting it apart from the other cross-linker chemistries utilized in this present experiment. EDC forms cross-links with negatively charged carboxyl groups and positively charged amino groups, replacing them with neutral peptide bonds.
This decrease in hydrophilic groups and increase in hydrophobic sites could induce a conformational change that decreases the protein's surface area, thus making distances measured on the native calmodulin structures irrelevant. Residues may be closer in distance than represented in these structures, allowing for EDC cross-linking to occur. Previous reports of more compact protein structures forming upon the neutralization of charges support this theory [169, 170]. It is hypothesized that this is due to the decrease in electrostatic repulsion from adjacent negatively charged groups, promoting hydrogen bonding and forcing a more compact conformation.

The Cα-Cα distance between BS3 cross-linked residues K77 and K94 of 26.0 Å and between sulfoEGS cross-linked residues K21 and K94 of 22.8 Å were within the maximum cross-linking distances of 30.2 and 34.9 Å, respectively. This supports cross-link formation within one calmodulin molecule, which is consistent with the molecular weight (14-19 kDa) measured by the SDS-PAGE analysis (Figure 3.1). Cross-linked species between identical peptides, which were only observed with BS3 and sulfoEGS, indicate that cross-linking captured the dimeric interaction between two different calmodulin molecules, which was previously observed in the literature via FTICR-MS [171].

All Cα-Cα distances were measured on the crystal structure of bound calmodulin (Figure 5.1b). However, one possibility to be considered is that calmodulin existed in its unbound conformation (Figure 5.1a), thus making the identified calmodulin-melittin cross-linked species a result of random contact between uncomplexed melittin and calmodulin molecules in solution. Therefore, distances between verified cross-linking sites for each cross-linker were measured on the crystal structure of unbound, Ca2+-free calmodulin. Intramolecular cross-links were considered, i.e. cross-linking of Q3 to K77 (PFA), K77 to K94 (BS3) and K21 to K94 (sulfoEGS) were compared to the maximum cross-linking distances of each cross-linker. The distances between K77 and K94 and between Q3 and K77 in the unbound and bound state of calmodulin were both within the maximum cross-linking distance of BS3 and PFA. However, the distance between K21 and K94 (~52 Å) was significantly greater than the maximum length of sulfoEGS in the unbound calmodulin structure. Also, Cα-Cα distances between interpeptide EDC cross-linking sites were even larger in the unbound versus bound calmodulin crystal structure. It has been shown previously that Ca2+-free and Ca2+-loaded calmodulin-melittin share similar structures [95]. However, in the absence of melittin, Ca2+-loaded calmodulin possesses a more compact, dumbbell structure in contrast to Ca2+-free calmodulin. One possibility to also consider is that the calmodulin remained unbound to melittin and had bound to the trace amount of Ca2+ ions in the deionized water. On the unbound, Ca2+-loaded calmodulin structure, both PFA and BS3 calmodulin cross-linking sites were within the maximum cross-linking distances of each cross-linker. However, the distance between calmodulin K21 and K94 (46.3 Å) was larger than the maximum cross-linking distance of sulfoEGS. In addition, the distances between EDC calmodulin cross-linking sites E6 to K94 and E14 to K94 were 42.3 and 39.9 Å, respectively. These were also much larger than the maximum cross-linking distance of EDC and even larger than the distances measured on the melittin-bound calmodulin structure. This supports the absence of unbound, Ca2+-loaded calmodulin.

Overall, this suggests that the calmodulin likely existed in its bound state, supporting the complex formation between melittin and calmodulin.
This is consistent with the percent of expected bound calmodulin-melittin complex (63%) being higher than the percent of unbound calmodulin and melittin (32% and 33%, respectively). Cross-linked species identified with PFA, BS3 and sulfoEGS captured the tertiary structure of calmodulin. BS3 and sulfoEGS also captured the dimeric interaction between two calmodulin molecules. On the other hand, EDC cross-linked species did not coincide with inter-residue distances of either bound or unbound calmodulin structures.

Figure 5.1: Identified cross-links were mapped on the Ca2+-free unbound (a) and bound-state (b) calmodulin conformation. Orange and grey lines represent inter-residue distances that do and do not agree with maximum cross-linker distances, respectively. Cross-linking sites are highlighted in red and calmodulin C and N terminal domains are colored in blue and teal, respectively.

5.4.3 Calmodulin-Melittin Complex Structure: Applying Distance Constraints from Cross-linking to Propose Binding Orientation of Calmodulin-Melittin

In order to examine the unknown binding orientation of calmodulin to melittin, crystal structures of the bound conformation of melittin and of the bound conformation of calmodulin were oriented to fit the distance constraints imposed by the identified cross-linked species using PyMOL. The distances between melittin and calmodulin residues in the identified cross-links were equated to the maximum cross-linking distances for each cross-linker. Table 5.10 summarizes the cross-linking sites and maximum distances used to derive structures. A minimum cross-linking distance constraint was imposed such that the Cα-Cα distance plus the backbone flexibility distance and side chain lengths (to account for flexibility of side chains) must be at least the length of the cross-linker bridge [172].
However, the scenario where inter-residue distances were rejected for being smaller than expected only arose with N-terminus to lysine residues (minimum cross-linking distance of 12.4 Å) cross-linked by the sulfoEGS cross-linker, due to its long bridge. N-terminus to N-terminus (minimum cross-linking distance of 6 Å) cross-links are not possible in this study due to the acetylated N-terminus of calmodulin, and the lysine to lysine minimum cross-linking distance, 18.8 Å, is longer than all cross-linker bridges used in this study.

Table 5.10: The maximum distances between all MS/MS identified cross-linking sites and the respective binding orientation it supports for each cross-linker; parallel orientations are shaded in white and antiparallel orientations are shaded in blue.

Melittin Residue   Calmodulin Residue   Cross-Linker     Maximum Inter-residue Distance   Binding Orientation
G1                 E14                  EDC              10.9 Å                           Parallel
K23                E11                  EDC              16.1 Å                           Anti-Parallel
K23                E54                  EDC              16.1 Å                           Anti-Parallel
R24                R126                 PFA              22.1 Å                           Parallel
G1                 Q8                   PFA              12.2 Å                           Parallel
G1                 Y99                  PFA              12.9 Å                           Anti-Parallel
K23                K30                  sulfoDST         25.2 Å                           Parallel
G1                 K94                  sulfoDST, BS3    18.8 Å                           Anti-Parallel
G1                 K75                  BS3              23.8 Å                           Anti-Parallel
K23                K148                 BS3              30.2 Å                           Parallel
K23                K94                  BS3, sulfoEGS    30.2 Å                           Anti-Parallel
K23                K77                  BS3, sulfoEGS    30.2 Å                           Anti-Parallel
K23                K21                  sulfoEGS         34.9 Å                           Parallel

Structures proposed by cross-links supporting parallel binding (Figure 5.2a), cross-links supporting antiparallel binding (Figure 5.2b), EDC cross-links (Figure 5.3a), and PFA cross-links (Figure 5.3b) were examined. Parallel binding was classified as the binding of calmodulin's C-terminal and N-terminal domains to melittin's C-terminus and N-terminus, respectively.
Antiparallel binding was defined as the binding of calmodulin's N-terminal and C-terminal domains to melittin's C-terminus and N-terminus, respectively.

It is important to note that other structures with slight variations in distances between melittin and calmodulin are always possible, but these variations did not significantly affect the general orientation of the two components. Furthermore, it is impossible to accurately depict the constant fluctuations of molecules in solution with one rigid structure.

Four out of the six NHS ester cross-linked species involved melittin binding to K residues near the flexible linker of calmodulin (K94 and K77), which is known to bind to target peptides [98]. Although this is consistent with literature findings, one may have to be cautious when applying these large cross-linkers to the small calmodulin-melittin complex. The maximum sulfoEGS cross-linking distance is 29-35 Å, which is almost the end-to-end distance of a melittin molecule (35.87 Å [173]). Therefore, whether sulfoEGS captured a parallel or antiparallel orientation is ambiguous. On the melittin structure, the Cα-Cα distance between G1 and K23 is 30.1 Å and between K7 and K23 is 24.3 Å. The maximum BS3 cross-linking distance is 24-30 Å, so it is possible for BS3 to form a cross-link between an N-terminal lysine of calmodulin and a C-terminal lysine of melittin even if these molecules exist in a parallel orientation. In fact, when generating parallel and antiparallel binding structures that only satisfied EDC and PFA distance restraints specific to each orientation, all structures satisfied sulfoDST, BS3 and sulfoEGS distance restraints regardless of the orientation. The only exception was one antiparallel sulfoDST cross-link (the smallest NHS ester bridge of the three) that did not fit the parallel conformation.
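The restraint checks used throughout this section amount to comparing a measured Cα-Cα distance against the maximum distance allowed for a given cross-linker and residue pair. A minimal sketch of that comparison is given here, assuming the maximum distances of Table 5.10 and the unbound-calmodulin distances quoted earlier in this section; the function and variable names are hypothetical, and this is an illustration rather than the PyMol procedure used in this work.

```python
# Maximum Calpha-Calpha distances (in angstroms) taken from Table 5.10.
# Keys are (cross-linker, residue pair type); names are illustrative only.
MAX_DISTANCE = {
    ("EDC", "K-E"): 16.1,
    ("sulfoEGS", "K-K"): 34.9,
}

# Distances quoted in the text for the unbound calmodulin structures.
# The Ca2+-free K21-K94 value is approximate (~52 angstroms).
MEASURED = [
    ("sulfoEGS", "K-K", "K21-K94 (Ca2+-free, unbound)", 52.0),
    ("sulfoEGS", "K-K", "K21-K94 (Ca2+-loaded, unbound)", 46.3),
    ("EDC", "K-E", "E6-K94 (Ca2+-loaded, unbound)", 42.3),
    ("EDC", "K-E", "E14-K94 (Ca2+-loaded, unbound)", 39.9),
]


def satisfies(linker, pair_type, distance):
    """Return True if the distance is within the cross-linker's maximum."""
    return distance <= MAX_DISTANCE[(linker, pair_type)]


def violations(measured):
    """Return the labels of all site pairs that exceed the maximum distance."""
    return [label for linker, pair_type, label, distance in measured
            if not satisfies(linker, pair_type, distance)]


for label in violations(MEASURED):
    print("exceeds maximum cross-linking distance:", label)
```

All four site pairs exceed their maxima on the unbound structures, which mirrors the argument that the cross-linked calmodulin was in its bound state.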
Interestingly, it was not possible to satisfy both EDC and PFA distance restraints simultaneously.

The structure proposed by PFA cross-links agrees with the characteristics of previous NMR and spectroscopic experiments. For example, W19 in melittin is known to be surrounded and blocked by the calmodulin C-terminal domain, which is shown in the structure. Also, Y99, a PFA cross-linking site, is known to be crucial in binding to melittin. Cross-linking between calmodulin Q8 and melittin G1 agrees with the observation that residues 1-36 of calmodulin lie closest to the melittin helix [101]. Furthermore, as recent NMR studies have shown, melittin primarily binds to the C-terminal domain of calmodulin, which is also supported by the structure derived from PFA cross-links [95].

Distance constraints imposed by EDC cross-linking were also used to derive a binding conformation for the calmodulin-melittin complex. Cross-linking supported an antiparallel-like structure with the melittin N-terminus pointed toward the C-terminal calmodulin domain. Other than W19 in melittin being inaccessible upon binding to calmodulin and cross-linking between calmodulin E14 and melittin G1, EDC derived structural attributes were inconsistent with recent NMR and spectroscopy experimental findings, in contrast to those of PFA [95, 101].

Figure 5.2: Calmodulin-melittin binding structures (two views of the same structure) proposed by cross-linking distance constraints that supported (a) parallel (yellow lines) and (b) antiparallel (orange lines) binding. Orange/yellow and grey lines represent inter-residue distances that do and do not agree with maximum cross-linker distances, respectively.
Cross-linking sites are highlighted in red, melittin is shown in purple, calmodulin C and N terminal domains are shown in blue and teal, respectively.

Figure 5.3: Calmodulin-melittin binding structures proposed by EDC (a) and PFA (b) distance constraints; W19 on melittin is highlighted in yellow. Orange and yellow lines represent inter-residue distances that support antiparallel and parallel binding, respectively. Cross-linking sites are highlighted in red, melittin is shown in purple, calmodulin C and N terminal domains are shown in blue and teal, respectively.

Overall, PFA cross-linked species imposed distance constraints that were most consistent with recently established structural attributes of calmodulin-melittin in the literature [95, 101]. Also, this is the first time Ca2+-free calmodulin bound to melittin, i.e. a transient complex, was stabilized by various cross-linkers and low-resolution structural information was deduced using MS/MS confirmed cross-linked species. This also represents the first study of PFA cross-link identification between non-covalently bound protein components, which demonstrated its potential for protein structure mapping. Although more cross-links identified with higher confidence are required to verify structures, this work revealed how each protein complex may require different cross-linkers. For example, small complexes such as calmodulin-melittin may not be accurately defined by large cross-linker bridges such as those of the NHS esters. Also, implementing EDC cross-linking may not be favorable with proteins like calmodulin, which contain a high content of carboxyl group side chains.
Moreover, extensive EDC cross-linking may affect protein structure.

Chapter 6
Arising Limitations of Mass Spectrometric Data Analysis of Formaldehyde Cross-linked Species

6.1 Identification of Limitations Arising in Workflow

Upon cross-linking, SDS-PAGE separation and trypsin digestion of proteins, there are several factors that affect the subsequent MS detection and identification of the resulting cross-linked peptides. First, the sensitivity, accuracy and resolution capabilities of the MS instrumentation to detect cross-linked species are vital. Second, since cross-linked species are identified and matched based on their monoisotopic mass, reliable software is crucial for selecting the correct monoisotopic signal and calculating the monoisotopic mass. Third, in the manual identification of cross-linked species, it is important to examine the number of MS candidate cross-linked species that represent true MS/MS confirmed cross-linked species. This provides insight into the complexity of the reaction mixtures produced from cross-linking as a function of cross-linker type (i.e. EDC, PFA, sulfoDST, BS3 and sulfoEGS) to improve the identification of cross-linked species. Finally, present cross-link identification software programs have been successfully applied to established cross-linkers. However, understanding whether these software programs are comparable to the manual identification of cross-linked species would be useful, especially for PFA cross-linked species, which have yet to be identified by such software. Therefore, these aspects are examined to gain insight into the limiting factors associated with this thesis work.

6.2 Mass Spectrometer Comparison

6.2.1 ABI QStar Versus Bruker Impact II QqTOF LC-MS/MS Analysis

Our lab was equipped with the ABI QStar XL QqTOF (QStar) [174] mass spectrometer for most of this project, and it was used for the LC-MS/MS analysis of various cross-linkers applied to the calmodulin-melittin and RNaseS protein systems. The Bruker Impact II QqTOF (Impact II) [14] became available for use in this study in November 2014. The mass accuracy and resolution of the QStar are < 10 ppm and 15,000, respectively, with a femtomolar detection limit. For the Impact II, the mass accuracy and resolution are < 2 ppm and 40,000, respectively, with an attomolar detection limit and a dynamic range of 5 orders of magnitude. The impact of introducing the new generation Impact II in place of the outdated, 15-year-old QStar for this cross-linking project was examined using the calmodulin-melittin protein model system.

6.2.2 Mass Spectrometric Data Analysis of Cross-linked Samples

Samples were prepared for the QStar analysis in the same manner as the samples prepared for the Impact II analysis, as described in chapter 3. PFA along with various established cross-linkers (EDC/sulfoNHS, BS3 and sulfoEGS) were applied to the Ca2+-free calmodulin-melittin. Reaction mixtures were separated by SDS-PAGE, which provided cross-linking evidence similar to the SDS-PAGE shown in section 3.2. The LC-MS/MS data from the QStar was examined using the Analyst 1.1 QS Software (Analyst) to prepare monoisotopic mass lists and visualize MS and MS/MS spectra. The monoisotopic mass lists were processed and analyzed using the exact same procedure described in section 3.3. Control, peptide and impossible cross-linked signals were filtered from mass lists using Microsoft Excel, and the remaining MS signals were matched to theoretical cross-linked species using Mathematica, all using a ±0.2 Da window.
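The matching step described above can be illustrated with a short sketch. The version here is written in Python rather than the Excel and Mathematica workflow actually used, and the candidate names and mass values are hypothetical placeholders chosen only to show how a ±0.2 Da window and a ppm mass error behave.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_candidates(observed_masses, theoretical_masses, tol_da=0.2):
    """Pair each observed monoisotopic mass with every theoretical
    cross-linked species whose mass lies within +/- tol_da."""
    matches = []
    for obs in observed_masses:
        for name, theo in theoretical_masses.items():
            if abs(obs - theo) <= tol_da:
                matches.append((obs, name, ppm_error(obs, theo)))
    return matches


# Hypothetical theoretical masses (Da); not values from this thesis.
theoretical = {"candidate A": 1791.867, "candidate B": 1995.982}
# Hypothetical observed monoisotopic masses (Da).
observed = [1791.900, 1500.000, 1996.300]

for obs, name, ppm in match_candidates(observed, theoretical):
    print(f"{obs:.3f} Da matches {name} ({ppm:+.1f} ppm)")
```

Only the first observed mass falls inside the window; the third is 0.318 Da away from candidate B and is rejected. A fixed Da window is more permissive in ppm terms for small species than for large ones, which is one reason a ppm figure is reported alongside it.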
The MS candidates were verified by manually examining the presence of their MS signals using Analyst and matching their respective MS/MS spectra to the expected theoretical fragment ions specific for each cross-linked species. The total number of MS candidates for EDC, PFA, BS3, and sulfoEGS cross-linked calmodulin-melittin was five, 18, 15 and seven, respectively, out of which only one BS3 cross-linked candidate possessed a sufficient amount of MS/MS evidence for confirmation. In contrast, the total number of MS candidates identified in the Impact II analysis for EDC, PFA, sulfoDST, BS3, and sulfoEGS cross-linked calmodulin-melittin was 160, 335, 62, 77, and 158, respectively, out of which seven, seven, two, 11 and five, respectively, were interpeptide cross-links confirmed via MS/MS. This is displayed in Table 6.1.

Table 6.1: The number of calmodulin-melittin cross-linked candidates identified via MS and confirmed via MS/MS using the QStar and Impact II

Cross-Linker   MS Candidate Cross-links    MS/MS Confirmed Cross-links
               QStar      Impact II        QStar      Impact II
EDC            5          160              0          7
PFA            18         335              0          7
sulfoDST       N/A        62               N/A        2
BS3            15         77               1          11
sulfoEGS       7          158              0          5

The MS spectrum corresponding to a BS3 cross-linked structure, 76MKDTDSEEEIR90^23KR24 (m/z = 896.93, z = 2), acquired on the QStar (Figure 6.1) and Impact II (Figure 6.2) is shown. The mass accuracy of this cross-linked species using the QStar and Impact II was 18 ppm and 2 ppm, respectively. The intensity of the signal detected by the Impact II was approximately three orders of magnitude higher than the signal detected by the QStar. Also, the narrow, defined peaks produced by the Impact II, in contrast to the overlapping, broader peaks generated by the QStar analysis, illustrate the superior resolving capabilities of the Impact II. Figure 6.1 shows the MS/MS spectrum of the confirmed BS3 cross-linked species 76MKDTDSEEEIR90^23KR24 from the QStar acquisition.
A series of unmodified y ions (Iy1 to Iy9) for peptide I confirmed its sequence and two type 2 ions localized the cross-link between calmodulin K77 and melittin K23. Since peptide II is only two amino acids in length, only one backbone ion, IIy1, corresponding to a terminal R residue, was present. However, since both peptide I and peptide II contain a terminal R residue, it is not clear whether the IIy1 ion was produced from the fragmentation of peptide I or II. No type 1 ions were present to further confirm the cross-link structure. The same cross-linking structure was identified by the Bruker Impact II, as seen in section 4.2.3 and in Figure 6.2, as a doubly charged species at m/z 896.93. The MS/MS signals acquired with the Impact II were almost 2 orders of magnitude higher in intensity. A series of unmodified y ions (Iy1 to Iy9) confirmed the sequence of peptide I and an unmodified type 1 ion for the melittin peptide confirmed its presence. Seven type 2 b ions (Ib2+II to Ib5+II and Ib8+II to Ib10+II) localized the cross-link to calmodulin K77 and melittin K23. Overall, with higher intensity signals and more extensive fragment ion sequence coverage, the quality of the MS/MS spectrum acquired by the Impact II was shown to be enhanced in comparison to the QStar acquired MS/MS spectrum for this species.

Figure 6.1: QStar acquired MS (middle) and MS/MS (bottom) spectrum of interpeptide BS3 calmodulin-melittin cross-link m/z 896.93 (z = 2)

Figure 6.2: Impact II acquired MS (middle) and MS/MS (bottom) spectrum of interpeptide BS3 calmodulin-melittin cross-link m/z 896.93 (z = 2); proposed structure with the sequence fragment ion evidence indicated on the backbone of the peptide that corresponds to type 2 ions only (top).

Out of the 18 MS candidates for PFA cross-linked species, the QStar acquired an MS/MS spectrum (Figure 6.3) for only one signal, a triply charged species at m/z 603.24. This signal corresponded to the mass of the proposed structure 95DGNGYISAAELR106^22RKR24 with a total additional mass modification of +72 Da (two methylol and one Schiff base modifications). As shown in Figure 6.3, the MS signal appeared at almost the noise level (intensity ~100) of the spectrum. In the MS/MS spectrum, only unmodified type 3 y-ions corresponding to peptide I (Iy1 to Iy6) were present, and without any b-ions it is ambiguous whether this represents a cross-linked species or a modified missed-cleavage peptide with the same terminal sequence. Only one type 3 y-ion and a modified type 1 ion (II+42) for peptide II were present. Again, since both peptide I and peptide II contain a terminal R residue, it is not clear whether the IIy1 ion was produced from the fragmentation of peptide I or II. Finally, several signals that did not correspond to the proposed PFA cross-linked structure or any modified/unmodified calmodulin or melittin peptide were present, suggesting a mixture with a contaminant.
Overall, the MS/MS evidence was not sufficient to confirm the cross-linked species. In contrast, seven interpeptide PFA cross-linked species were confirmed via the MS/MS acquired with the Impact II, as shown in chapter 4.

Figure 6.3: QStar acquired MS (middle) and MS/MS (bottom) of interpeptide PFA calmodulin cross-link m/z 603.24 (z = 3); proposed structure with the sequence fragment ion evidence indicated on the backbone of the peptide that corresponds to type 3 ions only (top).

Interestingly, the MS signal of the triply charged species at m/z 603.24 was detected in the control sample by the Impact II and was thus eliminated as a potential cross-linked species. Therefore, Figure 6.4 shows an example of the MS and MS/MS spectrum acquired by the Impact II for a different PFA cross-linked species, a triply charged species at m/z 666.33, to illustrate the evidence required to confirm a PFA cross-linked species. This signal corresponded to the mass of calmodulin peptides 75KM(ox)K77 and 1A(ac)DQLTEEQIAEFK13 plus the 12 Da bridge. A much more extensive and selective MS/MS spectrum was produced, allowing for the confirmation of this cross-linked species. Type 3 (Iy1 to Iy10 and Ib2 to Ib6) and type 1 (I-NH3) ions confirmed 1A(ac)DQLTEEQIAEFK13. Type 3 (IIb2(ox), IIy1, IIy2(ox)), and type 1 (II(ox)+12 and ) ions verified 75KM(ox)K77.
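The mass bookkeeping behind the m/z 666.33 assignment can be reproduced with a few lines of arithmetic. The sketch below assumes standard monoisotopic residue masses and modification masses (values not stated in this thesis) together with the 12 Da PFA bridge named above; the helper names are hypothetical.

```python
# Standard monoisotopic residue masses (Da); only residues needed here.
RESIDUE = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "F": 147.06841,
    "I": 113.08406, "K": 128.09496, "L": 113.08406, "M": 131.04049,
    "Q": 128.05858, "T": 101.04768,
}
WATER = 18.010565      # added once per peptide
PROTON = 1.007276      # mass of the charge carrier
ACETYL = 42.010565     # N-terminal acetylation
OXIDATION = 15.994915  # methionine oxidation
PFA_BRIDGE = 12.0      # methylene bridge: net addition of one carbon


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def mz(neutral_mass, charge):
    """m/z of a species carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


peptide_1 = peptide_mass("ADQLTEEQIAEFK") + ACETYL  # 1A(ac)DQLTEEQIAEFK13
peptide_2 = peptide_mass("KMK") + OXIDATION         # 75KM(ox)K77
cross_link = peptide_1 + peptide_2 + PFA_BRIDGE

print(f"peptide I  : {peptide_1:.2f} Da")
print(f"cross-link : {cross_link:.2f} Da, m/z {mz(cross_link, 3):.2f} at z = 3")
```

Under these assumptions the acetylated peptide I comes to about 1562.75 Da, in line with the Mascot Mr(calc) value in Table 6.3, and the cross-linked species lands at about m/z 666.33 for z = 3.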
The intensities of the MS and MS/MS signals from the Impact II for a confirmed PFA cross-linked species are 5 and 4 orders of magnitude higher than the MS and MS/MS signals from the QStar, respectively, due to its enhanced ion extraction and detection. The clearly defined peaks comprising the MS signal of the cross-linked species exemplify the higher resolution capabilities of the Impact II versus the QStar. Also, the majority of the signals appearing in the MS/MS spectrum from the Impact II were assigned to the cross-linking structure, illustrating the increased selectivity of the Impact II acquisition and/or HPLC separation.

Figure 6.4: Impact II acquired MS (middle) and MS/MS (bottom) of interpeptide PFA calmodulin cross-link m/z 666.33 (z = 3); proposed structure with the sequence fragment ion evidence indicated on the backbone of the peptide that corresponds to type 3 ions only (top).

As Table 6.1 shows, the cross-link identification at the MS and MS/MS level was drastically improved by the use of the Impact II. Furthermore, the quality of MS/MS spectra required to confirm cross-linking in the calmodulin-melittin system exceeds the capacity of the QStar and was possible only with the new generation Impact II.
Overall, this revealed the crucial role of a sensitive mass spectrometer in this study of cross-linked species.

6.2.3 Mass Spectrometric Analysis of Unmodified Calmodulin Peptides

To verify that the instrument performance itself limited the quality of LC-MS/MS data produced, the Impact II and QStar performance was evaluated using an unmodified calmodulin sample. The MS/MS data produced by the QStar and Impact II for the calmodulin sample was analyzed using the Mascot MS/MS peptide search [118]. The MS/MS Ion Search was performed using a peptide mass tolerance and fragment mass tolerance of 0.2 Da, a significance threshold of p < 0.05 and a score cut-off of 20. Variable modifications were set as follows: Trimethyl (K), Oxidation (M), Acetyl (N-term), Deamidated (NQ) and Acetyl (Protein N-term).

Mascot identified 59 unique calmodulin peptides in the Impact II data with a sequence coverage and average mass accuracy of 100% and 2.2 ppm, respectively. In contrast, Mascot only identified 11 unique calmodulin peptides in the QStar data with a sequence coverage and average mass accuracy of 67% and 26.7 ppm, respectively. Tables 6.2 and 6.3 list the highest scoring match for each peptide identified in the Impact II and QStar data, respectively. This illustrates that even in a simple, unmodified protein sample, the QStar was unable to produce data of sufficient quality and further supported the limitations of using the QStar for this project.

Table 6.2: Mascot MS/MS search results for the Impact II analyzed calmodulin sample with the highest scoring match for each peptide identified listed. The sequence position (starting and ending residue), observed m/z, experimental monoisotopic mass, theoretical monoisotopic mass, mass accuracy, number of missed cleavages, Mascot score, and sequence (trypsin cleavage site displayed at the beginning and end of the sequence as "R." or "K.") are listed left to right.

Table 6.3: Mascot MS/MS search results for the QStar analyzed calmodulin sample with the highest scoring match for each peptide identified listed. The sequence position (starting and ending residue), observed m/z, experimental monoisotopic mass, theoretical monoisotopic mass, mass accuracy, number of missed cleavages, Mascot score, and sequence (trypsin cleavage site displayed at the beginning and end of the sequence as "R." or "K.") are listed left to right.

Start – End   Observed   Mr(expt)   Mr(calc)   Mass Accuracy (ppm)   Missed Cleavage   Score   Peptide
1 – 13        782.37     1562.72    1562.75    18.4                  0                 55      K.ADQLTEEQIAEFK.E + Acetyl (N-term)
1 – 13        783.39     1564.77    1564.71    34.2                  0                 45      K.ADQLTEEQIAEFK.E + Acetyl (N-term); 2 Deamidated (NQ)
31 – 37       403.21     804.41     804.42     12.9                  0                 32      K.ELGTVMR.S
75 – 86       451.56     1351.65    1351.59    39.4                  1                 38      K.MKDTDSEEEIR.E
75 – 90       496.76     1983.02    1982.94    41.3                  3                 40      R.KMKDTDSEEEIREAFR.V
75 – 96       500.75     1998.97    1998.93    19.6                  3                 45      R.KMKDTDSEEEIREAFR.V + Oxidation (M)
78 – 96       532.92     1595.72    1595.71    10.5                  1                 24      K.DTDSEEEIREAFR.V
95 – 106      633.31     1264.61    1264.60    4.3                   0                 49      K.DGNGYISAAELR.H
107 – 126     605.32     2417.23    2417.15    36.8                  1                 23      R.HVMTNLGEKLTDEEVDE
127 – 148     1246.09    2490.16    2490.06    42.4                  0                 37      R.EADIDGDGQVNYEEFVQMMTAK.- + Deamidated (NQ)
127 – 148     1255.05    2508.08    2508.02    23.4                  0                 28      R.EADIDGDGQVNYEEFVQMMTAK.- + Oxidation (M); 3 Deamidated (NQ)

6.3 Assignment of Monoisotopic Masses

Deriving possible cross-linked masses is dependent on obtaining an exclusive list of accurate monoisotopic masses from the raw MS spectra. The Bruker Daltonics Compass Data Analysis 4.2 software offers three peak picking algorithms: Apex, SNAP and SumPeak. Apex picks peaks by calculating the derivatives of the MS signal intensities such that the peak maximum has a first derivative equal to zero and a negative second derivative. This algorithm works well for isotope-resolved peaks.
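The derivative criterion behind Apex-style peak picking can be sketched on a sampled intensity trace. The following is only an illustration of the idea using finite differences on a made-up signal; it is not the Bruker implementation, whose internals are not described here, and the function name is hypothetical.

```python
def apex_peaks(intensities):
    """Return indices of local maxima in a sampled signal.

    A point counts as a peak when the discrete first derivative changes
    sign from positive to negative across it, which also makes the
    discrete second derivative negative at that point.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        rising = intensities[i] - intensities[i - 1]    # slope before i
        falling = intensities[i + 1] - intensities[i]   # slope after i
        if rising > 0 and falling < 0:
            peaks.append(i)
    return peaks


# Synthetic trace with two resolved peaks (illustrative values only).
signal = [0, 2, 10, 3, 1, 4, 8, 5, 0]
print(apex_peaks(signal))
```

Because every resolved isotopic peak is a local maximum, a criterion of this kind reports each isotopic peak separately unless a further grouping step is applied. Flat-topped peaks are not handled by this simple version.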
SNAP takes into account molecular features to calculate the isotopic distribution for determining the monoisotopic mass, which is favorable for polymers such as proteins/peptides. SumPeak uses a "pseudo slope" instead of calculating the derivative as with the Apex algorithm [175]. There is also other peak picking software available, such as DeconMSn [176], which calculates monoisotopic peaks by determining the isotopic distribution through overlapping theoretical isotopic patterns with observed patterns. The three Bruker peak picking algorithms and DeconMSn were tested using a S/N cut-off of 2 on a PFA cross-linked calmodulin-melittin sample, the most complex sample in the calmodulin-melittin MS data set. Figure 6.5a illustrates the accuracy of each peak picking method using a species with a charge of 5 at m/z 539.07. SNAP and Apex accurately determined the monoisotopic signal, SumPeak selected all isotopic signals, and DeconMSn selected the highest intensity peak as the monoisotopic signal instead of the first peak. As shown in Figure 6.5b, DeconMSn produced a list of 151840 m/z values, treating every isotopic peak as a separate signal and calculating its respective mass. Manually distinguishing between the isotopic and monoisotopic peaks in this large data set would be highly tedious and therefore DeconMSn was not used. The Bruker peak picking algorithms Apex, SumPeak and SNAP generated lists of 1631, 2418 and 1701 m/z values, respectively. Although all three Bruker algorithms calculated the monoisotopic peak correctly, Apex and SumPeak outputted isotopic peaks in addition to the monoisotopic peak for each mass. Apex also missed 133 monoisotopic signals that SumPeak had detected. Finally, Apex and SumPeak were not able to calculate the monoisotopic mass of 480 and 569 signals, respectively, since both algorithms failed to interpret the charge of these signals.
SNAP, on the other hand, was able to determine the charge of every signal and calculated all monoisotopic masses. To avoid the tedious task of manually interpreting the charges of signals left unassigned by the Apex and SumPeak algorithms, SNAP was chosen as the peak picking algorithm for this study.

Peak Picking Method   Number of MS Signals   Missing Charge / Failed to Deconvolute   Accurate Assignment of Peak
DeconMSn              151840                 0                                        Assigned highest intensity and treated every isotopic peak after as a separate peak
Apex                  1631                   480                                      Calculated correct mass but outputted all isotopic m/z signals
SumPeak               2418                   569                                      Calculated correct mass but outputted all isotopic m/z signals
SNAP                  1701                   0                                        Calculated correct monoisotopic mass and only outputted m/z of monoisotopic peak

Figure 6.5: (a) Monoisotopic peak assigned for MS signal m/z 539.07 (z = 5) by DeconMSn, SNAP, SumPeak and Apex peak picking methods as indicated by the blue, red, purple and teal arrows, respectively; isotopic peaks appear at m/z 539.07, 539.27, 539.47 and 539.67. (b) Summary of peak picking methods for a PFA treated calmodulin-melittin sample (tabulated above).

However, in highly complex cross-linked peptide mixtures, even SNAP did not always pick the correct monoisotopic peak, and the deconvoluted final mass lists contained values that were ±1 Da off the actual mass. Assigning the wrong monoisotopic peak in proteomic MS spectra is a commonly observed issue. Furthermore, cross-linked peptides tend to be highly charged, larger species.
The larger the species, the more isotopic forms exist, producing more isotopic peaks. Therefore, the relative intensity of the monoisotopic peak decreases and often appears at a lower intensity than the adjacent isotopic peaks, and the software tends to pick the most intense peak [177, 178]. The list of incorrectly assigned m/z for each calmodulin-melittin cross-linked sample can be found in Appendix A.4, which supports this trend since most actual monoisotopic masses were ~1 Da lower than the assigned monoisotopic masses. Table 6.4 displays the percent of the total MS candidate cross-linked species that had incorrectly assigned monoisotopic masses for each cross-linker. The percent of incorrectly assigned monoisotopic masses is similar for EDC, PFA and sulfoEGS (31%, 27% and 31%, respectively). EDC and PFA produced the largest number of reaction products due to the larger number of their reactive sites in calmodulin-melittin. Thus, with more species in the reaction mixture, the signal intensity is dispersed over more products and the monoisotopic peak becomes more difficult to distinguish. For the NHS ester cross-linkers, which have the same cross-linking site specificity, the percent of incorrectly assigned monoisotopic masses increased with larger cross-linker bridge lengths. As the length of the cross-linker bridge increases, the chance of two reactive residues existing within the distance of the cross-linker bridge increases, thus increasing the number of cross-linked products. An increase in the number of reaction products would decrease the relative monoisotopic signal intensity, making it more difficult for the software to pick the correct monoisotopic peak.
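The ~1 Da offset follows directly from the isotopic peak spacing. As a small worked sketch, the isotopic peaks shown in Figure 6.5a for the z = 5 species can be used to recover the charge from the spacing and to show what happens when the second isotopic peak is mistaken for the monoisotopic one. The constants are standard physical values assumed here, and the function names are hypothetical.

```python
PROTON = 1.007276           # mass of a proton (Da)
ISOTOPE_SPACING = 1.003355  # mass difference between 13C and 12C (Da)


def charge_from_spacing(mz_peaks):
    """Estimate the charge state from the mean spacing of isotopic peaks."""
    spacing = (mz_peaks[-1] - mz_peaks[0]) / (len(mz_peaks) - 1)
    return round(ISOTOPE_SPACING / spacing)


def neutral_mass(mz, charge):
    """Neutral mass of a species observed at `mz` with `charge` protons."""
    return (mz - PROTON) * charge


# Isotopic peaks of the signal shown in Figure 6.5a.
peaks = [539.07, 539.27, 539.47, 539.67]

z = charge_from_spacing(peaks)
correct = neutral_mass(peaks[0], z)    # first peak is monoisotopic
mistaken = neutral_mass(peaks[1], z)   # second peak picked by mistake

print(f"charge state         : {z}")
print(f"monoisotopic mass    : {correct:.2f} Da")
print(f"mass if peak 2 picked: {mistaken:.2f} Da (offset {mistaken - correct:+.2f} Da)")
```

Picking the neighbouring isotopic peak shifts the deconvoluted mass by about one dalton regardless of charge, which is larger than the ±0.2 Da matching window and is therefore enough to create or hide a candidate.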
Table 6.4: The percent of the total number of MS candidate cross-linked species with incorrectly assigned monoisotopic peaks by the software for each cross-linker

Cross-Linker   Percent of Incorrectly Assigned m/z
EDC            31%
PFA            27%
sulfoDST       10%
BS3            21%
sulfoEGS       31%

6.4 Complexity of Cross-linked Candidates Confirmed by Mass Spectrometry

The total number of MS candidates for EDC, PFA, sulfoDST, BS3, and sulfoEGS cross-linked calmodulin-melittin was 160, 335, 62, 77, and 158, respectively (m/z lists are shown in Appendix A.4). Upon the manual inspection of each candidate's MS/MS spectrum, the distribution of MS candidates was determined and is shown in Table 6.5.

Table 6.5: The percent of the total number of MS confirmed candidate cross-linked masses that correspond to modified peptides, undetermined species, species with insufficient MS/MS and cross-linked species for each cross-linker.

Candidate Classification     EDC    PFA    sulfoDST   BS3    sulfoEGS
Peptides                     38%    47%    34%        29%    39%
Undetermined Species         8%     18%    11%        19%    8%
Insufficient MSMS Spectra    49%    33%    52%        38%    50%
Cross-Link                   4%     2%     2%         15%    4%

MS candidates were classified as "modified peptides" if their MS/MS signals matched the sequence of only one peptide component and the mass of the candidate was equal to the mass of the matching peptide plus a modification (i.e. either a cross-linker or protein-specific modification). The relative occurrence of these peptides that share the same mass as a possible cross-linked species was 38%, 47%, 34%, 29%, and 39% of the total number of MS candidates for EDC, PFA, sulfoDST, BS3 and sulfoEGS, respectively. "Undetermined Species" were MS candidates with MS/MS signals that do not match the sequence of any calmodulin or melittin peptide and were 8%, 18%, 11%, 19%, and 8% of the total number of MS candidates for EDC, PFA, sulfoDST, BS3 and sulfoEGS, respectively.
MS candi-dates with “Insufficient MSMS Spectra” are those with precursor signals in whichan MS/MS spectrum was not generated or contained only a few signals above thenoise level of the spectrum. These species were 49%, 33%, 52%, 38%, and 50% ofthe total number of MS candidates for EDC, PFA, sulfoDST, BS3 and sulfoEGS,respectively. Out of the total number of species that possessed sufficient MS/MSspectra, 46%, 64%, 45%, 48%, and 47% were identified as non-cross linked (pep-tides or undetermined) species for for EDC, PFA, sulfoDST, BS3 and sulfoEGS,respectively.The substantially larger number of MS candidates for PFA is due its reactiv-ity with several amino acids and numerous possible modifications and cross-links.Similar to PFA, EDC forms close proximity cross-links due to its zero-lengthbridge and has several reactive sites present in this model system, supporting itsrelatively higher number of MS candidates. For the NHS ester cross-linkers, thenumber of MS candidates increased as a function of the cross-linker length. Thisillustrates the higher probability of two cross-linking sites existing within the dis-tance of a cross-linker’s bridge for longer cross-linkers. This may reduce its speci-ficity for only capturing cross-linking sites that are close enough to interact orare structurally relevant [32]. Regardless, the high specificity of NHS ester cross-linkers comes from its reactivity with only N-terminal and K residues, which issupported by the production of relatively fewer reaction products in comparisonto PFA and EDC observed in this study. For PFA and sulfoDST, only 2% of thetotal candidates were interpeptide cross-linked species. Out of the total EDC andsulfoEGS MS candidates, only 4% were interpeptide cross-linked species. Finally,BS3 produced the highest percent of cross-linked species with 15% of its MS can-didates being actual interpeptide cross-links. In general, this revealed that MS/MS2066.4. 
confirmed interpeptide cross-linked species comprise a very small subset of the MS candidates for all cross-linkers. PFA produced the highest percentage of peptides and other species with masses that were equal to possible cross-linked species, supporting the complexity of its reaction products. This illustrates the crucial role of MS/MS in distinguishing between actual cross-linked species and such reaction products, especially with PFA. The acquisition of MS/MS spectra in this experiment was dependent on a sufficient precursor ion signal intensity. Forming a few reaction products with a high yield would increase the signal intensity of each precursor ion and increase the chance of a better quality MS/MS spectrum. Interestingly, although PFA produced the largest number of different reaction products, it exhibited a relatively lower percent of insufficient MS/MS spectra for these products. This suggests that the yield of these products was high enough to compensate for the dispersion of precursor ion signal among several products. However, out of the PFA reaction products that possessed a sufficient MS/MS spectrum, the majority corresponded to non-cross-linked species, supporting that non-cross-linked species matching the mass of candidate cross-linked species are abundant in the PFA reaction products. For all other cross-linkers, the percent of non-cross-linked species that matched the mass of their MS candidate cross-linked species was lower, illustrating their reduced complexity and more straightforward identification.

In conclusion, NHS esters have high specificity that comes with a higher chance of false positive identification. Although smaller and less specific cross-linkers such as EDC and PFA do not introduce additional degrees of freedom from a linker bridge, they produce more reaction products.
Nevertheless, a combination of all these types of cross-linkers can build a more accurate MS/MS analysis of protein complexes. With the MS/MS-verified cross-linked species representing a small pool of the MS candidates, many of which corresponded to unknown structures or modified peptides, this study also emphasized that MS/MS is even more crucial in the case of PFA to confirm the presence of cross-links.

6.5 Manual versus Software Identification of Calmodulin-Melittin Cross-links

6.5.1 Cross-linking Software Search Parameters and Evaluation Criteria

A variety of software programs have successfully been applied to identify EDC, sulfoDST, BS3 and sulfoEGS cross-links; however, these automated methods have yet to be applied to PFA cross-links [32, 34, 39–41]. Therefore, the reliability of software to find PFA calmodulin-melittin cross-linked species at the MS/MS level was tested. Since no cross-linking software specifically designed for PFA exists, StavroX [120] and pLink [121] were selected because users can define custom cross-linkers with multiple cross-linking sites, protein databases, and multiple cross-linker-specific variable modifications. These software programs were also tested on EDC, sulfoDST, BS3 and sulfoEGS calmodulin-melittin cross-linked sample data for comparison. Both pLink and StavroX have previously been utilized to identify EDC cross-linked species [34] and NHS ester cross-linked species [120, 121]. In both StavroX and pLink, the cross-linker bridge is assumed not to be cleavable. A variation of StavroX, MeroX [119], assumes the cross-linker bridge fully fragments. Although PFA cross-links can theoretically contain both intact and fragmented cross-link bridges under CID [50], the majority of the PFA cross-linked species examined in chapter 4 did not possess an intact cross-linker bridge. Regardless, both StavroX and MeroX were tested on PFA cross-linked sample data.
StavroX was used for the established cross-linked samples since NHS ester and EDC cross-link bridges are not expected to significantly fragment under CID [51, 52]. In all software program searches, the possibility of trypsin cleaving after a cross-linked residue was not excluded since cleavage after PFA cross-linked residues was observed manually in chapter 4. The general criteria for verifying cross-linked species utilized in section 4.5 were used to filter cross-links identified by the software. These criteria are similar to the scoring methods utilized in StavroX/MeroX and pLink, which are based on the percent of MS/MS fragment ion evidence (i.e. the number of expected y and b ions divided by the total length of the peptide) [120, 165, 166]. For StavroX and MeroX, the annotated spectra provided by these software programs were inspected. In these spectra, the smaller and larger peptide components are referred to as “β” and “α” (i.e. peptide I and II). The peptide N-terminus and C-terminus are denoted as “[” and “]”, respectively, and the protein N-terminus and C-terminus are represented by “{” and “}”, respectively. However, for consistency, the notation used in chapter 4 is used in the text (i.e. peptide I and II for “β” and “α”). For pLink, a corresponding annotated MS/MS spectrum for identified cross-linked species was not generated, and thus these were verified manually using the Bruker Daltonics Compass Data Analysis 4.2 software.

6.5.2 Established Cross-linkers

6.5.2.1 EDC

Figure 6.6 lists the identified cross-linked species and illustrates the overlap between the StavroX, pLink and manual cross-link identification methods. StavroX identified a total of 19 unique EDC cross-linked species, out of which 15 had insufficient MS/MS evidence to confirm their presence, two corresponded to single peptides with missed cleavage sites, and two were actual cross-linked species.
There was no overlap between the manual and StavroX cross-linked species identification. Figure 6.7 shows an example of an annotated MS/MS spectrum of an EDC cross-linked species identified by StavroX. Six out of the 13 (46%) and two out of the five (40%) expected backbone fragment ions for the melittin and calmodulin peptides, respectively, were present. With sufficient y and b fragment ions for both component peptides with the cross-linker bridge intact, this species was accepted as a cross-link. pLink identified only three unique EDC cross-linked species, out of which one corresponded to a single peptide with a missed cleavage site and two were actual cross-linked species. All of these species were also identified with StavroX, and there was also no overlap between the manual and pLink cross-linked species identification.

Out of the two cross-linked species identified by the software, the doubly charged species at m/z 580.84 corresponded to a signal that also appeared in the EDC control sample, and the triply charged species at m/z 577.96 was not selected by the peak picking software because its isotopic peaks overlapped with a signal of a species with a similar m/z and elution time. Therefore, these species were not identified manually.

As shown in Figure 6.6b (residues highlighted in red), the StavroX and pLink identified cross-linked species demonstrated cross-linking between melittin G1 and calmodulin E87 and E84. Although melittin G1 was revealed as a cross-linking site manually, calmodulin E84 and E87 were not found to be cross-linking sites via manual identification.
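The m/z values and charge states quoted for the software-identified species can be related to a neutral monoisotopic mass and to a mass accuracy in ppm. The following is a minimal sketch, assuming positive-mode ions of the form [M + zH]^z+ and the usual relative ppm definition; the function names are illustrative and are not taken from StavroX, pLink or the Bruker software. Because the m/z values in the text are rounded to two decimals, the computed masses only approximate the tabulated [M]exp values, and the reported ppm figures cannot be reproduced from the rounded numbers.

```python
PROTON_MASS = 1.007276  # Da; assumed charge carrier for positive-mode ions


def neutral_mass(mz: float, z: int) -> float:
    """Neutral monoisotopic mass [M] of an [M + zH]^z+ ion."""
    if z <= 0:
        raise ValueError("charge state must be a positive integer")
    return z * (mz - PROTON_MASS)


def ppm_error(m_exp: float, m_calc: float) -> float:
    """Signed mass accuracy in parts per million."""
    return (m_exp - m_calc) / m_calc * 1e6


# Rounded m/z and charge of the two EDC species discussed in the text,
# paired with the calculated masses listed for them in Figure 6.6b.
for mz, z, m_calc in [(580.84, 2, 1159.68), (577.96, 3, 1730.88)]:
    m_exp = neutral_mass(mz, z)
    print(f"m/z {mz} (z = {z}): [M] ~ {m_exp:.2f} Da, calculated {m_calc} Da")
```

With unrounded instrument values, `ppm_error(m_exp, m_calc)` would give the mass accuracy directly.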
Figure 6.6: (a) Venn diagram of MS/MS verified EDC cross-linked species identified by each software (StavroX and pLink) and manual method, two methods (two region overlap), all methods (center region overlap); (b) The calculated monoisotopic mass, cross-link peptide sequence (cross-linked residues highlighted in red), m/z, experimental monoisotopic mass, and mass accuracy for species identified by StavroX and pLink.

Panel (b) columns: [M]calc (Da); cross-link structure (Peptide 1, Peptide 2); StavroX m/z, z, [M]exp (Da), mass accuracy (ppm); pLink m/z, z, [M]exp (Da), mass accuracy (ppm)

Calmodulin-Melittin Cross-Linked Species
1159.68    1GIGAVLK7    87EAFR90         580.84  2  1159.68  1.7    580.84  2  1159.68  8.3
1730.88    1GIGAVLK7    78DTDSEEEIR86    577.96  3  1730.88  1.2    577.96  3  1730.88  9.8

Figure 6.7: An example of an EDC cross-linked species (m/z = 580.84, z = 2) and its annotated MS/MS spectrum from StavroX. Legend: charge state of y/b ions, b-ions, y-ions, cross-link bridge; relative intensity of ions from 0% to 100%.

6.5.2.2 SulfoDST

Figure 6.8 lists the identified cross-linked species and illustrates the overlap between the StavroX, pLink and manual cross-link identification methods. StavroX identified a total of six unique sulfoDST cross-linked species, out of which three did not meet the acceptance criteria due to insufficient MS/MS evidence and three were actual cross-linked species. Out of the three StavroX identified and confirmed cross-linked species, one was also identified manually. pLink identified four unique sulfoDST cross-linked species, out of which two could not be confirmed due to insufficient MS/MS evidence and two were actual cross-linked species.
Out of the two verified pLink identified cross-linked species, one overlapped with the manual identification and one was the unoxidized form of the cross-link identified by StavroX. Figure 6.9 shows the annotated MS/MS spectrum provided by StavroX for the cross-linked species with a charge of four at m/z 567.53 that was identified by both StavroX and pLink. Four out of the 13 (31%) and eight out of the 21 (38%) expected backbone fragment ions for the melittin and calmodulin peptides, respectively, were present. With sufficient y and b fragment ions for both component peptides with the cross-linker bridge intact, this species was accepted as a cross-link.

The cross-linked species identified only by pLink, at m/z 563.54, corresponded to a signal that also appeared in the control sample. The cross-linked species identified only by StavroX possessed two m/z values that were missed by the peak picking software, one of which, a species with a charge of four at m/z 693.35, was mixed with signals of species with a similar m/z and elution time. Thus, these were not identified manually.
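The fragment ion evidence quoted for each spectrum is a simple ratio of observed to expected backbone fragment ions, evaluated separately for each component peptide. A minimal sketch of that arithmetic follows; the function names and the `min_percent` cutoff are hypothetical, since the acceptance criteria of section 4.5 are not restated here, and the scoring actually used by StavroX, MeroX and pLink is more elaborate.

```python
def fragment_coverage(observed: int, expected: int) -> float:
    """Percent of expected backbone (b/y) fragment ions that were observed."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if not 0 <= observed <= expected:
        raise ValueError("observed must lie between 0 and expected")
    return 100.0 * observed / expected


def both_peptides_supported(coverages, min_percent: float) -> bool:
    """True if every component peptide reaches the (hypothetical) cutoff."""
    return all(c >= min_percent for c in coverages)


# Counts quoted above for the sulfoDST species at m/z 567.53 (z = 4).
melittin = fragment_coverage(4, 13)
calmodulin = fragment_coverage(8, 21)
print(f"melittin: {melittin:.0f}%, calmodulin: {calmodulin:.0f}%")
```

Requiring evidence for both peptides, rather than a pooled percentage, reflects the distinction drawn in section 6.4 between true cross-links and modified single peptides of the same mass.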
Figure 6.8: (a) Venn diagram of MS/MS verified sulfoDST cross-linked species identified by each software (StavroX and pLink) and manual method, two methods (two region overlap), all methods (center region overlap); (b) The calculated monoisotopic mass, cross-link peptide sequence (cross-linked residues highlighted in red), m/z, experimental monoisotopic mass, and mass accuracy for species identified by StavroX and pLink. Cross-links that agreed with the manual detection are highlighted in purple.

Panel (b) columns: [M]calc (Da); cross-link structure (Peptide 1, Peptide 2); then, for each identification listed under StavroX and pLink, m/z, z, [M]exp (Da), mass accuracy (ppm)

Calmodulin-Melittin Cross-Linked Species
2250.11    1GIGAVLK7    75KMKDTDSEEEIR86        563.54  4  2250.15  18.0
2769.36    1GIGAVLK7    75KM(ox)KDTDSEEEIR86    554.88  5  2769.37  4.3    693.35  4  2769.39  10.8
2524.29    1GIGAVLK7    91VFDKDGNGYISAAELR106   842.44  3  2524.30  4.0    842.44  3  2524.31  10.4

Figure 6.9: An example of a sulfoDST cross-linked species (m/z = 567.53, z = 4) and its annotated MS/MS spectrum from StavroX. Legend: charge state of y/b ions, b-ions, y-ions, cross-link bridge; relative intensity of ions from 0% to 100%.

6.5.2.3 BS3

Figure 6.10 lists the identified cross-linked species and illustrates the overlap between the StavroX, pLink and manual cross-link identification methods. A total of 79 unique BS3 cross-linked species were identified with StavroX, out of which 58 did not satisfy the acceptance criteria due to insufficient MS/MS evidence, three had insufficient MS evidence to confirm their presence, one was a peptide with a missed cleavage, and 17 were actual cross-linked species.
Out of the 17 StavroX identified and confirmed cross-linked species, three were also identified manually. A total of nine unique BS3 cross-linked species were identified with pLink, out of which one did not possess sufficient MS/MS evidence for its confirmation, four did not possess corresponding MS signals, and four were actual cross-linked species. Out of the four pLink identified and confirmed cross-linked species, two were also identified manually and all four were also identified by StavroX.

Figure 6.11 shows an example of an annotated MS/MS spectrum of a cross-linked species identified by StavroX. Ten out of the 27 (31%) and six out of the 29 (22%) expected backbone fragment ions for the melittin and calmodulin peptides, respectively, were present. With sufficient y and b fragment ions for both component peptides with the cross-linker bridge intact, this species was accepted as a cross-link.

There were 14 cross-linked species identified by the software that were not found manually, which are listed in Figure 6.10b (shaded in white). Six of these 14 matched signals that appeared in the control sample (m/z = 600.88, z = 2), corresponded to a single peptide (m/z = 569.56, z = 4), or matched an impossible cross-linked species (m/z = 406.25, z = 3; m/z = 716.37, z = 3; m/z = 579.30, z = 4; m/z = 556.50, z = 5). The remaining eight of the 14 cross-linked species were not found manually due to the absence of these signals in the peak lists generated by SNAP. Out of these eight signals, five appeared as incorrectly assigned masses (m/z = 721.70, z = 3; m/z = 573.55, z = 4; m/z = 773.73, z = 3; m/z = 649.83, z = 4; m/z = 890.74, z = 4), and it is unknown why SNAP missed the remaining three signals (m/z = 706.36, z = 4; m/z = 678.53, z = 5; and m/z = 575.30, z = 4).

StavroX failed to find eight of the manually identified cross-linked species, and pLink identified only one manually identified BS3 cross-linked species.
All of the cross-linked species detected by the software supported the same or similar cross-linking sites as manual detection: melittin G1 and K23 to calmodulin K75/K77, and calmodulin K75 to K94 (indicated by residues highlighted in red in Figure 6.10b). Cross-linking between melittin K21 and calmodulin K94 was only detected by the software; however, this is a similar region to the cross-linking between melittin K23 and calmodulin K94 detected manually. Since similar cross-linking sites were identified by both manual and automated methods, the existence of these cross-linking sites was further validated. Furthermore, it suggests that both methods identified true cross-links.

Figure 6.10: (a) Venn diagram of MS/MS verified BS3 cross-linked species identified by each software (StavroX and pLink) and manual method, two methods (two region overlap), all methods (center region overlap); (b) The calculated monoisotopic mass, cross-link peptide sequence (cross-linked residues highlighted in red), m/z, experimental monoisotopic mass, and mass accuracy for species identified by StavroX and pLink. Cross-links that agreed with the manual detection are highlighted in purple.

Panel (b) columns: [M]calc (Da); cross-link structure (Peptide 1, Peptide 2); then, for each identification listed under StavroX and pLink, m/z, z, [M]exp (Da), mass accuracy (ppm)

Calmodulin-Melittin Cross-Linked Species
1199.74    1GIGAVLK7           75KMK77                     600.88  2  1199.75  8.3
1215.74    1GIGAVLK7           75KM(ox)K77                 406.25  3  1215.75  11.5
2076.06    22RKR24             75KMKDTDSEEEIR86            693.03  3  2076.08  7.7
2146.10    1GIGAVLK7           76MKDTDSEEEIR86             716.37  3  2146.11  6.5
2162.09    1GIGAVLK7           76M(ox)KDTDSEEEIR86         721.70  3  2162.11  6.9
2274.20    1GIGAVLK7           75KMKDTDSEEEIR86            569.56  4  2274.22  10.1    759.07  3  2274.21  4.4
2290.19    1GIGAVLK7           75KM(ox)KDTDSEEEIR86        573.55  4  2290.21  7.9
2318.13    1GIGAVLK7           75KM(ox)KDTDSEEEIR86        773.73  3  2318.20  6.9
2548.37    1GIGAVLK7           91VFDKDGNGYISAAELR106       850.46  3  2548.38  4.3     840.46  3  2518.38  11.9
2595.31    22RKR24             75KM(ox)KDTDSEEEIREAFR90    649.83  4  2595.32  6.9
2777.45    1GIGAVLK7           75KMKDTDSEEEIREAFR90        556.50  5  2777.48  9.7     556.49  5  2777.45  0
2793.44    1GIGAVLK7           75KM(ox)KDTDSEEEIREAFR90    559.69  5  2793.47  7.5
2821.44    1GIGAVLK7           75KM(ox)KDTDSEEEIREAFR90    706.36  4  2821.45  4.3
3558.97    8VLTTGLPALISWIKR22  91VFDKDGNGYISAAELR106       890.75  4  3558.99  4.2     890.75  4  3559.00  8.4

Calmodulin-Calmodulin Cross-Linked Species
2297.19    75KMK77             91VFDKDGNGYISAAELR106       575.30  4  2297.21  10.0
2313.18    75KM(ox)K77         91VFDKDGNGYISAAELR106       579.30  4  2313.20  12.5
3387.63    75KM(ox)DTDSEEEIR86 91VFDKDGNGYISAAELR106       678.53  5  3387.66  8.0

Figure 6.11: An example of a BS3 cross-linked species (m/z = 890.75, z = 4) and its annotated MS/MS spectrum from StavroX. In the cross-linked structure shown on top, the “m” in the peptide sequence refers to M(ox), i.e. an oxidized M residue. Legend: charge state of y/b ions, b-ions, y-ions, cross-link bridge; relative intensity of ions from 0% to 100%.

6.5.2.4 SulfoEGS

Figure 6.12 lists the identified cross-linked species and illustrates the overlap between the StavroX, pLink and manual cross-link identification methods. StavroX identified a total of 12 unique sulfoEGS cross-linked species, out of which ten had insufficient MS/MS evidence to confirm their presence and two were actual cross-linked species. pLink identified five unique sulfoEGS cross-linked species, out of which two did not possess sufficient MS/MS evidence for confirmation, two did not produce sufficient MS signals, and one was an actual cross-linked species. The pLink identified sulfoEGS cross-linked species was also identified by StavroX and manually. All StavroX identified sulfoEGS cross-linked species were also identified manually.
Figure 6.13 shows an example of an annotated MS/MS spectrum generated by StavroX for a cross-linked species identified by StavroX, pLink and manually. Six out of the 29 (21%) expected backbone fragment ions were present for each calmodulin peptide. With sufficient y and b fragment ions for both component peptides with the cross-linker bridge intact, this species was accepted as a cross-link.

Figure 6.12: (a) Venn diagram of MS/MS verified sulfoEGS cross-linked species identified by each software (StavroX and pLink) and manual method, two methods (two region overlap), all methods (center region overlap); (b) The calculated monoisotopic mass, cross-link peptide sequence (cross-linked residues highlighted in red), m/z, experimental monoisotopic mass, and mass accuracy for species identified by StavroX and pLink. Cross-links that agreed with the manual detection are highlighted in purple.

Panel (b) columns: [M]calc (Da); cross-link structure (Peptide 1, Peptide 2); then, for each identification listed under StavroX and pLink, m/z, z, [M]exp (Da), mass accuracy (ppm)

Calmodulin-Melittin Cross-Linked Species
2539.21    22RKR24                  76M(ox)KDTDSEEEIREAFR90    635.81  4  2539.23  8.7

Calmodulin-Calmodulin Cross-Linked Species
3733.78    91VFDKDGNGYISAAELR106    91VFDKDGNGYISAAELR106      934.46  4  3733.80  5.4    934.46  4  3733.82  11.2

Figure 6.13: An example of a sulfoEGS cross-linked species (m/z = 936.46, z = 4) and its annotated MS/MS spectrum from StavroX. Legend: charge state of y/b ions, b-ions, y-ions, cross-link bridge; relative intensity of ions from 0% to 100%.
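The method-overlap comparison made for each cross-linker in section 6.5.2 amounts to partitioning the confirmed species into the seven regions of a three-set Venn diagram. The sketch below illustrates this with Python sets, using the two software-confirmed sulfoEGS species as example data. The function name and region labels are illustrative, the assignment of the two species to StavroX and pLink is a reading of the text and Figure 6.12b, and the manual set is deliberately partial because manually identified species that the software missed are not listed in this section.

```python
def venn_regions(stavrox: set, plink: set, manual: set) -> dict:
    """Partition identified species into the seven regions of a 3-set Venn diagram."""
    return {
        "StavroX only": stavrox - plink - manual,
        "pLink only": plink - stavrox - manual,
        "Manual only": manual - stavrox - plink,
        "StavroX & pLink": (stavrox & plink) - manual,
        "StavroX & Manual": (stavrox & manual) - plink,
        "pLink & Manual": (plink & manual) - stavrox,
        "All three": stavrox & plink & manual,
    }


# Species are keyed by their (Peptide 1, Peptide 2) pair.
rkr_link = ("22RKR24", "76M(ox)KDTDSEEEIREAFR90")
homo_link = ("91VFDKDGNGYISAAELR106", "91VFDKDGNGYISAAELR106")

stavrox = {rkr_link, homo_link}
plink = {homo_link}
manual = {rkr_link, homo_link}  # partial: manual-only species are not listed here

regions = venn_regions(stavrox, plink, manual)
for name, members in regions.items():
    print(f"{name}: {len(members)}")
```

Keying each species by its peptide pair rather than by m/z lets different charge states of the same cross-link count as a single identification.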
6.5.3 Formaldehyde

StavroX and pLink were shown to identify cross-linked species from established cross-linkers, as expected since these software programs were designed for such cross-linking chemistry. These software programs were then used to search PFA cross-linked samples by inputting PFA-specific reactive sites, bridge composition and modifications to test whether they can be applied to PFA as well.

Figure 6.14 lists the identified cross-linked species and illustrates the overlap between the MeroX, StavroX, pLink and manual cross-link identification methods. MeroX identified a total of nine unique PFA cross-linked species, out of which seven did not satisfy the acceptance criteria due to insufficient MS/MS evidence and two were true cross-linked species. StavroX identified a total of 11 unique PFA cross-linked species, out of which seven corresponded to single peptides, two were not confirmed due to insufficient MS/MS evidence, and two were true cross-linked species. All MeroX and StavroX identified and confirmed PFA cross-linked species were also identified manually. pLink identified two unique PFA cross-linked species, out of which one had insufficient MS/MS evidence for its verification and one was an actual cross-linked species that was also identified manually and by MeroX.

To illustrate PFA cross-linked species identified by the software that possessed insufficient MS/MS evidence for confirmation, the proposed structure and MS/MS spectrum of the highest scoring example, which was identified by StavroX, are shown in Figures 6.15 and 6.16, respectively.
For the PFA cross-linked species with a charge of six at m/z 701.01 (Score = 101), the following structure was proposed by StavroX: 75KMK77^75KM(ox)K(+12)DTDSEEEIR(+12)EAFR(+12)VFDKDGNGYISAAELR126. For this calmodulin-calmodulin cross-linked species, StavroX identified only eight type 3 ions (Iy2 to Iy7, Ib7 and Ib18) for 75KM(ox)K(+12)DTDSEEEIR(+12)EAFR(+12)VFDKDGNGYISAAELR126 and one type 2 ion (Iy17+II), which is only 9% of the total expected number of fragment ions. In addition, there was no fragment ion evidence for 75KMK77. Therefore, this cross-linked species was classified as having insufficient MS/MS evidence to confirm its presence.

The MS/MS fragment ion evidence assigned by the software was compared with that assigned by manual identification. Figure 6.17 shows an example of an annotated MS/MS spectrum generated by MeroX for a cross-linked species identified by MeroX (m/z = 606.58, z = 4), pLink (m/z = 808.44, z = 3) and manually (m/z = 484.46, z = 5; see Figure 4.32 for the MS/MS spectrum) with the following structure: 91VFDKDGNGYISAAELR106^1GIGAVLK7. In this case, no type 2 ions were identified by either MeroX or manual analysis of this cross-linked species. A series of unmodified type 3 ions for the calmodulin peptide 91VFDKDGNGYISAAELR106 and modified type 3 ions for the melittin peptide 1GIGAVLK7, along with a type 1 ion for the +12 Da modified melittin peptide, were identified by both the MeroX and manual analyses to confirm the cross-linked species. In both analysis methods, cross-linking was localized to melittin G1 and calmodulin Y99. MeroX identified 34% and 55% of the expected fragment ions for 91VFDKDGNGYISAAELR106 and 1GIGAVLK7, respectively, which is sufficient MS/MS fragment ion evidence to confirm this cross-linked species. However, MeroX missed the type 3 ions Iy1, IIy1, IIb2, and IIb5 that were identified manually, and the manual analysis did not identify the type 3 ion Iy6