UBC Theses and Dissertations

Serological immune profiling using mass spectrometry Bundala, Matthew Matus 2017

Full Text

SEROLOGICAL IMMUNE PROFILING USING MASS SPECTROMETRY
by Matthew Matus Bundala
MSc, The University of British Columbia, 2017

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Genome Science and Technology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)
November 2017
© Matthew Matus Bundala, 2017

Abstract

The adaptive immune system is a fantastically complex system composed of specialized cells, organs and molecules. When properly functioning, our adaptive immune system is continually responding to, and protecting us from, a myriad of disease threats ranging from viruses to cancer. Immune dysfunction leaves us vulnerable to pathogens and can result in autoimmune disorders. Despite the adaptive immune system’s central importance in health, efficient immune profiling methods, and in particular approaches capable of functional characterization of adaptive immune responses, are still in their infancy. In this thesis I present a rapid and scalable serological immune profiling protocol based on antibody mediated identification of antigens (AMIDA). Serological antibodies are extracted from serum, covalently bound to magnetic beads, and washed with a panel of antigens. The bound immunoreactive antigens are then eluted, in-gel trypsin digested, and identified by tandem mass spectrometry. I demonstrate the application of this protocol for profiling the immune responses of nine patients to a set of bacterial pathogens including Escherichia coli, Pseudomonas aeruginosa, Klebsiella pneumoniae, and Salmonella typhimurium. The identified list of antigens includes outer membrane proteins known to provoke an immune reaction, as well as novel immunogenic proteins. The data allow for characterization of differences in global antigen reactivity between patients and identify individuals having pronounced pathogen recognition. In a small study I show that K. pneumoniae and P. aeruginosa are generally more reactive than the other pathogens tested, and that S. typhimurium shows the weakest reactivity. The improved AMIDA-based protocol allows for efficient identification of immunoreactive antigens and the profiling of patient reactivity. When coupled with complementary technologies such as immune repertoire sequencing, this approach can be applied to high-value applications including vaccine development, biomarker identification, and therapeutic antibody discovery.

Lay Summary

There is currently no physiologically accurate, high-throughput, quantitative immuno-profiling technique. The goal of this work is to fill this gap with a novel method for measuring the immune response of a patient. To achieve this, I developed qAMIDA, a method which uses a patient’s blood-borne immune molecules (antibodies) to pull down immunoreactive pathogenic molecules (antigens) from a pathogenic sample; these antigens are then identified and quantified by mass spectrometry. To showcase the efficacy of qAMIDA, the immune responses of nine patients were measured against five different bacterial pathogens. qAMIDA was able to identify established as well as novel antigens, and to quantify the immunoreactivity of each antigen across the patients. These results position qAMIDA as a viable method for the discovery of new biomarkers, the identification of novel antigens, and the development of vaccines. In clinical settings, qAMIDA can provide a snapshot of a patient’s immune system, giving insight into the patient’s disease history.

Preface

Dr. Carl Hansen and I devised the design of my master’s thesis. The antibody purification, binding to magnetic beads, bacterial culture work, and antigen immunoprecipitation were performed by me, with supervision from Dr. Hansen, Dr. Foster, and Dr. Heyries, who is a postdoctoral fellow in the Hansen Lab.
In-gel trypsin digestion, as well as preparation of peptides for mass spectrometry analysis up to and including isotope labeling, was run by me, following a protocol established by Leonard Foster’s group. Jenny Moon and Nikolay Stoynov, PhD, who are a research assistant and a research associate, respectively, in the Foster Lab, ran the mass spectrometry analysis of the peptide samples. I developed and applied the data analysis pipeline. My committee members Dr. Hansen, Dr. Foster, and Dr. Withers, as well as Dr. Heyries, Jenny Moon, and other members of the Hansen and Foster laboratories, advised me at various stages of my project. I have written this thesis in its entirety, with the exception of section 3.1.7, which was written by Nikolay Stoynov, PhD. My supervisor Dr. Hansen advised me on the organizational structure of my thesis. The thesis was proofread for grammar by Mike VanInsberghe, who is a PhD student in the Hansen Lab, as well as by Dr. Hansen, before being submitted to my committee for approval.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Chapter 1: Introduction
  1.1 Overview of the Immune System
    1.1.1 Antibody structure
    1.1.2 V(D)J recombination
    1.1.3 Affinity maturation and somatic hypermutation
  1.2 Why Profile the Adaptive Immune System?
    1.2.1 Infectious diseases and vaccine development
    1.2.2 Autoimmune diseases
    1.2.3 Immuno-oncology
  1.3 Immune Profiling Techniques
    1.3.1 Ig repertoire studies: next generation sequencing
    1.3.2 Immunoblotting and ELISA
    1.3.3 Phage display
    1.3.4 Ig reactivity: protein microarrays
    1.3.5 Mass spectrometry as a tool for immunology
    1.3.6 Overview of MS types
    1.3.7 Comparative and quantitative mass spectrometry
    1.3.8 Overview of MaxQuant: protein search and quantification software
    1.3.9 Antibody mediated identification of antigens: AMIDA
  1.4 Research Statement
Chapter 2: Protocol Development
  2.1 Antibody Extraction and Linkage to Dynabeads
  2.2 Bacterial Cell Lysis
  2.3 Antigen Incubation and Elution
    2.3.1 Bioanalyzer: antigen elution conditions
    2.3.2 MS: antigen elution conditions
    2.3.3 MS: antigen elution by SDS and glycine
  2.4 MS: Wash Conditions to Remove Contaminants
  2.5 MS: Lysate Buffer Exchange
  2.6 MS: Lowering IgG Contamination through Antigen Elution Modification
Chapter 3: Serological Immune Profiling of Nine Patients on Five Bacterial Lysates
  3.1 Methods
    3.1.1 Preparation of antigen panel
    3.1.2 Preparation of antibody-bead complexes
    3.1.3 Antigen binding and elution
    3.1.4 In-gel trypsin digest
    3.1.5 Peptide cleaning
    3.1.6 Triple peptide formaldehyde labeling
    3.1.7 Mass spectrometry sample analysis
    3.1.8 Data analysis pipeline
      3.1.8.1 STRING: protein network analysis
  3.2 Results
    3.2.1 qAMIDA identifies serologically reactive antigens
    3.2.2 Isotopically labeled proteins are relatively quantified by MS
    3.2.3 STRING finds significant enrichment in GO Cellular Component domain
  3.3 Discussion
    3.3.1 Potential antigens
    3.3.2 Heterogeneity of the data
    3.3.3 Need for fractionation
    3.3.4 Technical replicates show medium correlation
    3.3.5 High variability of correlations between MS runs of the same sample
    3.3.6 MaxQuant
Chapter 4: Conclusions and Recommendations for Future Steps
  4.1 Suggestions for Further Improvements
    4.1.1 Need for cross validation
    4.1.2 Fractionation of antigen panel
    4.1.3 Need for technical replicates
  4.2 Comparison to Recently Published MS-AMIDA Paper
Bibliography
Appendices
  Appendix A Additional Attempts at Lowering Ig Contamination
    A.1 Antigen elution by urea
    A.2 Cutting out IgG bands
  Appendix B Overview of Bacterial Pathogens Used as an Antigen Source
    B.1 Escherichia coli K12
    B.2 Klebsiella pneumoniae wild type and ΔcpsB mutant
    B.3 Salmonella typhimurium
    B.4 Pseudomonas aeruginosa

List of Tables

Table 2.1 The elution and wash conditions for antibody-bound antigens.
Table 2.2 The Pseudomonas proteins with highest intensity found in the SDS elution.
Table 3.1 The primary aims to assess the performance of qAMIDA protocol.
Table 3.2 The modified settings for MaxQuant 1.5.7.4 data processing.
Table 3.3 The list of proteins in the top 1% quantile of all intensities, per bacterium.
Table 3.4 Enriched GO Cellular Component terms.
Table 3.5 The total number of identified proteins per lysate across all patients.
Table 4.1 The answers for primary experimental goals.
Table 4.2 The major differences between qAMIDA and MS-AMIDA experiments.

List of Figures

Figure 1.1 Immunoglobulin G structure and sequence composition.
Figure 1.2 The schematic of a mass spectrometer used for this project.
Figure 2.1 The conceptual overview of the protocol workflow.
Figure 2.2 Beads coated with IgG incubated with hemagglutinin 3 influenza A antigens (H3 antigen) fluorescent markers.
Figure 2.3 Fluorescent marker intensity on IgG-coated Dynabeads.
Figure 2.4 E. coli lysate.
Figure 2.5 The proteins before in-gel digestion by trypsin.
Figure 2.6 The natural logarithm of intensities between conditions.
Figure 2.7 Comparison of densities.
Figure 2.8 The change of intensity distribution as a result of lysis buffer exchange.
Figure 2.9 The effect of NEM and boiling on IgG contamination.
Figure 3.1 The light, medium, and heavy labeling reactions.
Figure 3.2 Heatmap of all intensity values per protein identified in the MS spectrum.
Figure 3.3 The binary heatmap of protein ratio availability.
Figure 3.4 The log of intensity ratio before and after normalization. The normalization employed was a “normalization by z-score”, which brings the mean to zero and standard deviation to one.
Figure 3.5 The quantitative comparison of antigens across patients.
Figure 3.6 The STRING interaction network of E. coli proteins.
Figure 3.7 The comparisons between two pairs of identical runs.
Figure 3.8 The heatmap of the control sample in all MS runs (heavy label control).
Figure 3.9 The Spearman correlation matrix between protein intensities of the “heavy” normalization channel.
Appendix Figure A.1

List of Abbreviations

AD: Anderson-Darling test
AID: Activation-Induced cytidine Deaminase
AMIDA: Antibody Mediated Identification of Antigens
ATP: Adenosine Triphosphate
BCR: B-Cell antigen Receptor
BSA: Bovine Serum Albumin
DPBS: Dulbecco’s Phosphate Buffered Saline
DTT: Dithiothreitol
ELISA: Enzyme-Linked ImmunoSorbent Assay
ESI: Electrospray Ionization
Fab region: Fragment antigen binding region
Fc region: Fragment crystallisable region
FDR: False Discovery Rate
GC: Germinal Centre
Ig: Immunoglobulin
Ig-seq: Immunoglobulin sequencing
iTRAQ: isobaric Tag for Relative and Absolute Quantification
KW: Kruskal-Wallis test
LB: Lysogeny Broth
MALDI: Matrix Assisted Laser Desorption/Ionization
MQ: MaxQuant
MS: Mass Spectrometry
NEM: N-Ethylmaleimide
NGS: Next Generation Sequencing
OMP: Outer Membrane Protein
PAGE: PolyAcrylamide Gel Electrophoresis
PEP: Posterior Error Probability
qAMIDA: quantitative Antibody Mediated Identification of Antigens
RSS: Recombination Signal Sequence
RT-PCR: Reverse Transcriptase Polymerase Chain Reaction
SDS: Sodium Dodecyl Sulfate
SILAC: Stable Isotope Labeling with Amino acids in Cell culture
STAGE: Stop-and-go Extraction
TFA: Trifluoroacetic acid
V(D)J: Variable (Diversity) Joining

Acknowledgements

I would like to thank my supervisor Dr. Hansen for his support and encouragement throughout the duration of my Master’s thesis research and writing. He was always willing to help and advise me whenever I ran into difficulties. I would also like to thank my co-supervisor Dr. Foster, who shared his laboratory’s resources and his experience, both of which were crucial to finishing this project.
I am grateful to Dr. Heyries for dedicating his time to teaching and training me in best laboratory practices. His comments and advice have made me a better scientist. Jenny Moon from the Foster Lab shared her skills and laboratory techniques in mass spectrometry sample preparation, for which I am thankful. I would also like to thank Dr. Duncan, Dr. Brown, Mike VanInsberghe, and all other members of the Hansen and Foster laboratories for contributing their experience to my project. Finally, I am thankful to my family, my girlfriend Danya, and my roommates Ben and Margaux for supporting my years of study and for keeping me on track.

Chapter 1: Introduction

The adaptive immune system has evolved over more than 300 million years and is capable of generating lasting and exquisitely specific responses to nearly any foreign antigen. The two arms of the adaptive immune response are humoral immunity, which generates antibody responses against extracellular antigens, and cellular immunity, which consists of T cell responses against intracellular antigens presented as peptide fragments on the major histocompatibility complex (MHC). Both systems are capable of mounting highly specific responses during infection, which can be “remembered” for a more rapid response to future challenge. In addition to these common features, the humoral response has the unique ability to adapt and optimize its antigen recognition capabilities through a process of somatic hypermutation and selection, the end result of which is the production of complex mixtures of highly specific, high-affinity antibodies. Because antibody responses can persist over long periods, the collection of antigens recognized by the polyclonal mixture of antibodies in serum is determined by the history of immunological challenge.
The ability to identify the targets of serological antibodies is of practical use for understanding immune responses, for identifying disease-specific biomarkers, for aiding vaccine development, and for use in the immunization of B-lymphocytes. In this thesis I describe the development and testing of a serological profiling method that identifies antigens reactive to humoral immunoglobulin G antibodies, termed quantitative antibody mediated identification of antigens (qAMIDA). My implementation of qAMIDA builds on work by Gires et al. (1) and seeks to greatly improve antigen detection capability using a combination of state-of-the-art mass spectrometry (MS) and label-based quantitative MS for relative protein abundance comparisons. Briefly, in qAMIDA antibodies are extracted from serum samples, immobilized on a solid support, and exposed to a pool of antigens to allow for polyclonal capture by specific antibody-antigen interactions. The antigens are then eluted from the solid support and digested, and the resulting fragments are analyzed on a state-of-the-art mass spectrometer to identify protein targets.

1.1 Overview of the Immune System

The immune system performs its function through a complex network of subsystems that can be categorized as either innate or adaptive. The innate response is highly conserved across organisms (2) and consists of physical and chemical barriers, as well as immune cells that react to the common chemical properties of pathogens. This response represents the first line of defense and is both non-specific and rapid (3). The adaptive response is restricted to higher organisms, generally being present only in jawed vertebrates, and uses a combination of diversity and selection to “learn” to recognize pathogen-specific antigens. The primary effectors of adaptive immunity against extracellular pathogens are immunoglobulins (or antibodies), a highly diverse protein family produced by B-cells.
Antibodies are produced by mature B cells and are first expressed as membrane-bound proteins called B-cell antigen receptors (BCRs). Each unique BCR is generated during B-cell maturation from hematopoietic stem cells in the bone marrow. BCRs are composed of a complex of two identical heterodimers, each having a heavy and a light chain. During maturation the heavy and light chains are created through combinatorial shuffling of different gene segments, including segments V, D, and J for the heavy chain and V and J for the light chain, in a process referred to as V(D)J recombination (section 1.1.2) (4). The joining of V, (D), and J segments is imperfect, adding or in some cases removing nucleotides or short palindromic sequences at the junctions. These processes are nearly random, allowing for an enormous BCR diversity, estimated to be on the order of 10^11 possible naïve BCRs (5). After the B-cell matures, it is considered naïve, since its BCRs have not yet encountered foreign antigens. When an antigen binds to the B-cell, the BCR starts a signal transduction pathway and/or the antigen is internalized and its peptides are presented to helper T-cells, starting an immune response. This response involves further expansion of BCR diversity by a process called somatic hypermutation (section 1.1.3).

1.1.1 Antibody structure

Antibodies are large molecules composed of immunoglobulin (Ig) units. Each Ig unit is a Y-shaped molecule which consists of two identical light chains (~25 kDa each) and two identical heavy chains (~50 kDa each), connected by disulphide bonds. The five antibody classes (IgM, IgA, IgD, IgG, and IgE) are categorized based on the type of heavy chain, and the heavy chains are sometimes referred to as the μ, α, δ, γ, and ε chains, respectively. Each of these heavy chain classes elicits a different immune response. In humans there are two types of light chains, κ and λ; however, these do not significantly change the function of the antibody (6).
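As a rough check on the sizes quoted above, the mass of a single Ig unit follows from its chain composition. This is only a back-of-the-envelope sum using the approximate chain masses given in the text; the symbols are introduced here for illustration and any other contributions to the mass are ignored.

```latex
\[
M_{\text{Ig unit}} \approx 2\,M_{\text{light}} + 2\,M_{\text{heavy}}
\approx 2\,(25\ \text{kDa}) + 2\,(50\ \text{kDa}) = 150\ \text{kDa}
\]
```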
Both the light and heavy chains contain variable and constant regions. The constant region is conserved within each Ig class and is responsible for interacting with effector cells and other molecules. The antigen-binding variable region is unique to each B-cell and its clones. These regions are enormously diverse, thanks to V(D)J recombination (section 1.1.2), and enable the antibody to bind to virtually any antigen. The variable region on each chain contains complementarity-determining regions (CDRs) 1, 2, and 3. CDR1 and CDR2 lie within the V region, while CDR3 is created by the junctions of the V, D (in heavy chains), and J regions. Given the combinatorial nature of V(D)J recombination, CDR3 is hypervariable and is generally responsible for a substantial component of antibody specificity, though exceptions to this rule exist (7). The arms of the antibody, which contain the whole light chain and the corresponding part of the heavy chain, are often referred to as Fab regions (fragment antigen binding regions). The rest of the antibody (the heavy chain below the hinge region) is referred to as the Fc region (fragment crystallisable region). Figure 1.1 shows the structure of the most common antibody class, IgG.

Figure 1.1 Immunoglobulin G structure and sequence composition. All antibodies have IgG as a structural unit. The variable region is comprised of three hypervariable complementarity-determining regions (CDRs), shown in red. These regions are placed between nine framework regions, shown in grey, which form β sheets. The CDRs reside inside the outer folds connecting the β sheets, forming the paratope.

All naïve B-cells have BCRs of class IgM or IgD. While IgD is a simple monomer, IgM exists as a pentamer consisting of five “Y” shaped immunoglobulins.
Their functions differ as well: while IgD is primarily involved in B-cell activation, IgM is the initial antibody released into the bloodstream in response to a pathogen. IgG, IgM, IgA, and IgE are all found in plasma. IgG is the most common plasma antibody and is responsible for both primary and secondary immune responses, IgE is often involved in inflammation and allergic responses, and IgA commonly protects mucosal surfaces.

1.1.2 V(D)J recombination

The human genome cannot hold genetic information for all possible BCRs, modeled to number on the order of 10^18 (8). Instead, BCR genes are somatically recombined from V (variable), D (diversity), and J (joining) segments for heavy chains, and from V and J segments for light chains, by random exon selection. This process is mediated by V(D)J recombinase, which is composed of two proteins, RAG1 and RAG2. V(D)J recombinase recognizes recombination signal sequences (RSSs) adjacent to each V, D, and J coding segment. RSSs contain conserved sequences separated by a 12 or 23 base pair spacer (around one or two double-helix turns). The actual RSS sequence is variable, with different sequences having different recombination efficiencies, although the spacer length is critical for successful recombination (9). During recombination, two V(D)J recombinases first bind to a random pair of RSSs flanking the coding sequences that are to be joined. The DNA is bent into a U shape, and the V(D)J recombinases form a complex in which the four DNA strands are nicked. The two coding strands form hairpins that are later cleaved by the Artemis endonuclease into sticky ends and joined together (10,11).
During this process, additional nucleotides may be added onto the sticky ends by terminal deoxynucleotidyl transferase, greatly increasing the already large number of possible BCR sequences. The two signal ends are then cleaved into blunt ends, which are later joined into a circular piece of DNA that plays no further role in the recombination process. It is important to note that successful recombination on the maternal or paternal chromosome silences recombination on the other chromosome. The V(D)J rearrangement first happens on a heavy chain, then on a κ light chain, and, if the κ rearrangement is not productive, on a λ light chain. If the heavy chain rearrangement, or every light chain rearrangement, is non-productive, the B-cell dies. The silencing of the second allele is called allelic exclusion, and it ensures that each B-cell produces only one type of antibody (12). Notably, allelic exclusion has recently been shown to be imperfect, with an observed 0.4% of mature B cells expressing two functional light chains (13). The productive rearrangements are then used to express IgMs on the cell surface as BCRs. If the BCRs are reactive to self antigens, the B-cell undergoes apoptosis or replacement of its light chains by receptor editing. If the replaced light chains are still self-reactive, the immature B-cell is removed.

1.1.3 Affinity maturation and somatic hypermutation

BCRs on naïve B-cells have low affinity and are often polyreactive, as they can bind multiple antigens (14). After a naïve B-cell travels from the bone marrow into a lymph node, it is presented with an antigen in one of several ways (as an antibody-antigen complex, as a soluble antigen, on a dendritic cell, or by other B cells), and in the appropriate context of T cell help, it becomes activated and starts to divide. Activation causes B cells to divide rapidly in germinal centers (GCs) and undergo somatic hypermutation of the variable regions of their antibodies. 
Somatic hypermutation is driven by an enzyme called activation-induced cytidine deaminase (AID). AID generates mismatches in the variable regions of Ig genes, which are then repaired by various mechanisms, causing on average one point mutation every 10^3 to 10^6 base pairs per B-cell division. If the mutated BCR genes lose affinity for the antigen, the cells are no longer activated and undergo apoptosis, while neutral mutations are preserved and accumulate through successive generations. Rarely, mutations increase the affinity of the BCR, resulting in stronger activation and a selective advantage over other clones. In this way B-cells undergo a rapid evolution in which the highest affinity clones are expanded and further optimized. This process, called affinity maturation, allows for the generation of highly specific antibodies with equilibrium dissociation constants (K_D) in the pM to nM range (15). In parallel with affinity maturation, AID can generate double-stranded breaks around the exons coding for the Ig constant region. These breaks are repaired by non-homologous end joining, which changes the antibody class (16). Ultimately, the B-cells in GCs follow two distinct fates. One subset differentiates into long-lived plasma cells that reside in the bone marrow, can persist for the life of the organism, and serve primarily to secrete antibodies into the serum. Plasma cells are amongst the most prolific protein-producing cells in the body, capable of secreting between 10^3 and 10^4 antibodies per second. Alternatively, some B-cells form memory B cells, which continue to express the affinity-matured surface-bound antibody (BCR). Memory B cells form the immunological memory and circulate through the organism to patrol for a future encounter with the target antigen, at which point they become activated and divide once more. This allows for a much more rapid and high-affinity response to secondary exposure to an antigen. 
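To give a feel for what a dissociation constant in the pM to nM range means in practice, the following minimal sketch computes the equilibrium fraction of antibody binding sites that are occupied at a given free antigen concentration. It assumes simple 1:1 binding at equilibrium with antigen in excess, which is an idealization rather than a model used in this thesis; the function name and the example concentrations are hypothetical.

```python
def fraction_bound(antigen_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of binding sites occupied for 1:1 binding.

    Uses the standard relation  f = [Ag] / ([Ag] + K_D),
    where [Ag] is the free antigen concentration (assumed to be in excess).
    """
    if antigen_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and K_D must be > 0")
    return antigen_conc_molar / (antigen_conc_molar + kd_molar)


if __name__ == "__main__":
    antigen = 1e-9  # hypothetical free antigen concentration: 1 nM
    # Hypothetical K_D values: a weak binder versus affinity-matured antibodies.
    for label, kd in [("1 uM", 1e-6), ("1 nM", 1e-9), ("1 pM", 1e-12)]:
        print(f"K_D = {label}: fraction bound = {fraction_bound(antigen, kd):.4f}")
```

Under these assumptions, lowering K_D from the micromolar to the picomolar range moves a binding site from almost entirely unoccupied to almost entirely occupied at the same antigen concentration, which is the practical consequence of affinity maturation.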
1.2 Why Profile the Adaptive Immune System?

The immune system is a fine-tuned system of organs, cells and molecules that protects us from bacterial and viral pathogens, parasites, and rogue cancerous cells. Too little immune activity leaves us vulnerable to disease. Too much activity results in autoimmune diseases. It is important to understand how the immune system works, and to develop fast and efficient methods for studying the adaptive immune response.

1.2.1 Infectious diseases and vaccine development

Our immune system can respond to virtually any infection. However, due to the stochastic nature of finding the correct antibody, the response is often too slow and the pathogen causes irreparable damage to the body. An effective solution to this problem is vaccination: the immune system is exposed to weakened or dead pathogens, or their antigens, initiating the development of antibodies and long-lived memory B-cells. As a result, upon exposure to an unattenuated pathogen, the immune system can respond more rapidly to prevent infection. Early vaccines were based on live or attenuated viruses, a method that is still employed successfully today. A successful example of this approach is the MMR vaccine, which targets measles, mumps, and rubella. An alternative vaccination strategy is based on recombinant antigens designed to elicit adaptive immune responses to specific proteins and/or epitopes on a pathogen. One example is the seasonal flu vaccine, which is composed of variants of influenza hemagglutinin and neuraminidase. A major challenge for flu vaccination is the rapid evolution of the virus and the need to predict and produce new antigen variants each year (17,18). In some cases, such as that of HIV, the development of an effective vaccine has proven elusive (19). This is true despite the fact that HIV patients mount strong neutralizing responses to HIV, though they do not spontaneously clear the disease. 
This and other cases have spurred a field of rational vaccine design in which scientists attempt to analyze natural immune responses to infer the design of antigens and vaccine strategies that will be more effective. Yet another recent vaccine development is the use of DNA for vaccination, in which a DNA plasmid encoding a known antigen is delivered to patients (20). These strategies are being actively pursued both for infectious disease and for application in oncology. In general, the field of vaccine research is faced with a myriad of different problems related to the complexity and variability of immune responses. New analytical tools capable of dissecting human immune responses would be of high utility in advancing vaccine development.

1.2.2 Autoimmune diseases

Autoimmune disease is a condition in which healthy constituents of the body are attacked by the immune system due to failure of the self-tolerance systems. For instance, multiple sclerosis (MS), a demyelinating disease of the central nervous system, is thought to be caused by T-cells reactive against myelin epitopes and is an example of a common autoimmune disease (21); notably, the effectiveness of anti-B cell therapies (e.g. rituximab) in multiple sclerosis suggests that a humoral response may also be in play. Other common examples include lupus, Crohn's disease, psoriasis, myasthenia gravis, and arthritis. The prevalence of autoimmune diseases such as multiple sclerosis has increased in Western countries over the last 60 years (22), a phenomenon that may be due to increased hygiene at an early age and the resulting effect on immune development. In general, autoimmune diseases are considered incurable, with most treatments aimed at temporary relief of symptoms using antibody- or small molecule-based immune suppression (e.g., cyclosporine, Humira, IV Ig), but not addressing the underlying cause. 
The increased prevalence and general lack of a cure underline the importance of better understanding autoimmune diseases, in which the immune system fails to identify normal body cells as self. Understanding which body-cell molecules (self-antigens) are responsible for immune activation could aid autoimmune diagnosis and treatment (23).

1.2.3 Immuno-oncology

It is now recognized that one of the most important defenses against cancer is the recognition and removal of rogue cells by our immune system. When this system fails, there is a higher chance of malignancy even in an otherwise healthy individual. Understanding this mechanism is crucial for cancer immunotherapy, a field which has flourished in recent years with breakthrough new therapies that work by promoting anti-cancer immune responses (24). When malignancy occurs, T-cells can be activated by recognizing mutated proteins displayed by the MHC. To prevent T-cells from attacking healthy cells, there are pathways that serve as a check on T-cell activation. It was shown that malignancies often hijack these pathways to hide themselves from the immune system (25). Therapeutic antibodies that block or stimulate these pathways have recently been approved; they release these checks so that T-cells can act against the malignancy. Modulation of two of the pathways, cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) and programmed death 1 (PD-1), has emerged as a powerful approach for the treatment of cancer (26). Based on this success, a myriad of new immune activation strategies is now being advanced, including oncolytic viruses, DNA vaccines, alternative checkpoint blockades, and combination therapies. Further understanding of the antibody-antigen reactivity that is generated during patient responses to these treatments is of central interest to the growing field of immuno-oncology research. 
1.3 Immune Profiling Techniques

Given the central importance of the immune system in health, there has been a large effort to develop means of profiling the immune system. Despite these efforts, there are still gaps in our ability to measure antigen-antibody interactions, which can be filled with high-throughput immune profiling techniques.

1.3.1 Ig repertoire studies: next generation sequencing

Next generation sequencing (NGS) is one of the most powerful tools being used to study the dynamics of human immune repertoires. Initial methods for Ig-seq were based on B-cell immortalization (27) and classic Sanger sequencing (28). Although they enabled an initial understanding of V(D)J recombination and somatic hypermutation, their inherently low throughput prevented the analysis of any significantly large part of the full Ig repertoire. The relatively recent development of high-throughput sequencing has transformed this field, allowing for comprehensive antibody repertoire sequencing from bulk populations (29). However, this approach still faces challenges. First, conventional Ig-seq methods start from pooled populations of B cells and therefore lose information on light and heavy chain pairing. Recently, attempts have been made to develop library preparation methods that address this limitation. One approach relies on placing single B-cells into unique microwells, and performing RT-PCR on their mRNAs with barcoded primers such that each well has a unique barcode pair for its heavy and light chains (30). Another method fuses the heavy and light chain mRNAs into a single paired sequence, so that sequencing assembly can reconstruct the fused paired sequence (31). In yet another method, the enormous diversity of the heavy and light chains is ingeniously used to track the separated pairs across many different sequencing subsets. By taking note of the pairs that always appear together in different subsets, it is possible to reconstruct the pairing information (32). 
Finally, a commercial system by 10x Genomics has recently been marketed that uses a combination of droplet-based single cell amplification and barcoded amplicon libraries to enable paired-chain Ig-seq with throughputs of approximately 50,000 cells per run. While NGS technologies have been used extensively for Ig repertoire profiling research in recent years, these studies are fundamentally limited to describing the diversity of antibody sequences that exist in samples. They cannot inform on the functional binding properties of antibodies to connect sequence with antigen recognition. There is thus a need to couple Ig-seq to a complementary method for assessing antibody reactivity.

1.3.2 Immunoblotting and ELISA

Immunoblotting is an older technique in which antigens are separated by gel electrophoresis and then transferred onto a sheet of nitrocellulose that is probed with antibodies (33). This creates a map showing which bands on the gel correspond to the antigen of interest. The method is being constantly improved, and can now identify even post-translational modifications of proteins (34). While immunoblotting is a cost-effective and accessible technique, it is not high-throughput and does not provide any information about the identity of an unknown antigen. Enzyme-Linked ImmunoSorbent Assay (ELISA) is one of the most popular immune profiling methods available due to its simplicity and low cost (35). Direct ELISA involves binding the antigen to a substrate, washing it over with antibodies that are reactive to the antigen, and finally washing it over with an anti-IgG antibody that is linked to a marker, which is often fluorescent. The resulting signal intensity estimates the antibody binding properties. In cases where the antigen is part of a complex mixture, the antibodies can be bound to the substrate instead. 
The corresponding antigen binds to its antibody, after which the antibody-antigen complexes are washed over with antibodies again, creating an antibody-antigen-antibody sandwich. Finally, fluorescently labeled anti-antibody Igs report successful binding (36). ELISA has similar upsides and downsides to immunoblotting: it is cost-effective and fast, and it can be easily parallelized. However, as with immunoblotting, the identity of the binding antigen is often unknown.

1.3.3 Phage display

One of the challenges in understanding antibody-antigen interactions is the difficulty of identifying the antigen. An ingenious solution to this problem is phage display (37,38). Bacteriophages are viruses that infect bacteria by entering the cytoplasm and hijacking the cell’s machinery to reproduce in large numbers. In phage display technology, a phage library is created by fusing the DNA sequences of potential antigens with the DNA sequence of the phage’s outer coat protein. The resulting phages express the fused antigens on their surface, and at the same time carry the DNA sequence of the expressed antigen inside their genomes. Antibodies are then fixed to a matrix, exposed to the phage library, and washed. Only the phages that remain after washing are then re-infected into bacteria, where they replicate. To achieve higher specificity, the process can be repeated until only the antigens with high affinity are present in the library. These antigens are easily identified by NGS, since their DNA is present within the phage’s genome. The process can also be reversed, with the antigens bound to a matrix and the antibodies expressed by the phage (39). The phage can also be modified to increase the random mutation rate, resulting in a process that mimics somatic hypermutation in B-cells. After several rounds of selection, the majority of the phages contain the DNA sequence of a highly reactive antibody (40). 
A downside of phage display is that the fused protein may have improper tertiary and quaternary structure; for this reason, antigen presentation is often restricted to small proteins or peptides.

1.3.4 Ig reactivity: protein microarrays

Protein microarrays offer a high-throughput and parallel platform for studying protein-protein interactions (41). Analytical microarrays work on the same concept as ELISA: they identify antibody-antigen reactivity either by directly binding antibodies to immobilized antigens or by creating antibody-antigen-antibody complexes. The latter method has higher sensitivity, at the cost of a lower signal-to-noise ratio due to antibody cross-reactivity (42). Reverse-phase protein microarrays work similarly to a direct ELISA. Many different samples are bound to a microarray and washed over with copies of an antibody, and the samples with reactivity are then detected. Microarrays can also be generated by synthesizing short peptides. By synthesizing peptides that span the length of the proteome of a pathogen, immunogenic loci can be identified (43). This technique, while powerful, has limitations, since the peptides lack the secondary and tertiary protein structure that is often essential for successful antibody binding. In addition, the creation of the microarray can be expensive, which is one of the limiting factors in the application of protein microarrays to routine diagnostics (44).

1.3.5 Mass spectrometry as a tool for immunology

Tools like ELISA and immunoblotting are inherently low-throughput methods that are poorly suited to protein identification. A much higher-throughput method for protein analysis and identification is mass spectrometry. Mass spectrometry (MS) is most commonly used for identification of the proteins present in the sample, as well as for relative or absolute quantitation (45,46), detection of protein modifications (e.g. methylation (47)), and identification of interaction partners (48). 
MS is able to achieve this by measuring the mass-to-charge ratio of ionized molecules, such as peptides, with very high accuracy. The mass-to-charge spectrum and counts are then programmatically compared with a whole-proteome database to select the proteins that are most likely present in the sample. It is this ability to identify and quantify proteins in a sample that makes MS an attractive tool for immunology. Mass spectrometry is based on the Lorentz force, which is the combination of electric and magnetic forces on a charged particle flying through an electromagnetic field: $\vec{F} = z\vec{E} + z\,\vec{v} \times \vec{B}$, where $\vec{F}$ is the force, $z$ is the charge, $\vec{v}$ is the velocity, and $\vec{E}$ and $\vec{B}$ are the electric and magnetic fields, respectively. The mass spectrometer ionizes the particle to be measured, accelerates the particle through an electromagnetic field, and, based on the flight properties, determines its mass-to-charge ratio (m/z). In the case of proteomics, the ion source is a complex mixture of peptides. The mass spectrometer counts how many particles of a certain m/z are detected. This creates a spectrum of intensities (counts) of peptides at each m/z, which can then be compared to an in-silico spectrum of peptides from the whole proteome of the studied organism. Although de novo sequencing of proteins is possible (49,50), most MS applications require prior knowledge of the protein sequences that might be present in the sample.

1.3.6 Overview of MS types

One of the differentiators between the types of MS is the source of ions. Electrospray Ionization (ESI) creates ions from a solution and is often coupled to a liquid chromatography separation instrument, making it suitable for highly complex samples. Matrix Assisted Laser Desorption/Ionization (MALDI) uses laser pulses to ionize peptides from a crystalline matrix, and is better suited to simpler peptide mixtures. After the ionization step, the ions are accelerated through an electric field in a long tube. 
Only ions of a specific m/z, determined by the field strength, will travel through the whole tube without a collision, which filters the ions reaching the detector at the end of the tube. Alternatively, instead of a simple electric field, the ions can be accelerated through a quadrupole, which consists of four rods with alternating voltage potentials. This alternating electric field causes the charged particle to move in a complex spiral pattern. The magnitude of these voltage potentials can select a very specific m/z range, increasing the resolution of the mass spectrometer. In tandem mass spectrometry (Figure 1.2), two quadrupoles are separated by a collision cell that fragments the ionized particles. The second quadrupole then selects a specific m/z range of the fragments, further increasing the resolution of the mass spectrometer. For more information see the review by Aebersold and Mann (51). The instrument used in this thesis was an ultrahigh performance liquid chromatography electrospray ionization quadrupole-quadrupole time-of-flight tandem mass spectrometer by Bruker, an overview of which is shown in Figure 1.2.

Figure 1.2 Schematic of the mass spectrometer used for this project. The peptide solution is first separated based on hydrophobicity by the chromatography column, ionized by the electrospray, and filtered in the first MS quadrupole. The ionized molecule is then fragmented, and the fragments are filtered by the MS/MS quadrupole. Finally, a time-of-flight detector measures the particle’s m/z ratio.

Liquid chromatography tandem MS is used in both top-down and bottom-up proteomics (52). Top-down measurement works with whole proteins, and is limited to studying clean samples with little protein heterogeneity. The proteins also need to be fairly small to fall within the optimal mass range of most MS machines. 
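The time-of-flight measurement at the end of the instrument in Figure 1.2 can be illustrated with a short calculation. The sketch below assumes an idealized linear, field-free drift tube in which an ion of charge z is accelerated from rest through a potential V, so that zeV = mv²/2 and the drift time over a length L is t = L·sqrt(m / (2zeV)). Real Q-TOF instruments add features such as orthogonal acceleration and reflectrons that are not modeled here, and the voltage and tube length used in the example are hypothetical rather than the parameters of the Bruker instrument.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per elementary charge
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time_s(mz_da: float, accel_voltage_v: float, tube_length_m: float) -> float:
    """Drift time of an ion with mass-to-charge ratio mz_da (Da per charge).

    Energy conservation: z*e*V = 0.5*m*v^2, so v = sqrt(2*e*V / (m/z)).
    """
    speed = math.sqrt(2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v / (mz_da * DALTON_KG))
    return tube_length_m / speed


def mz_from_flight_time(time_s: float, accel_voltage_v: float, tube_length_m: float) -> float:
    """Invert the relation above: m/z = 2*e*V*t^2 / L^2, returned in Da per charge."""
    mz_kg = 2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v * time_s ** 2 / tube_length_m ** 2
    return mz_kg / DALTON_KG


if __name__ == "__main__":
    voltage, length = 20_000.0, 1.0  # hypothetical: 20 kV acceleration, 1 m tube
    for mz in (500.0, 1000.0, 2000.0):
        t = flight_time_s(mz, voltage, length)
        print(f"m/z {mz:7.1f} -> flight time {t * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of m/z, a fourfold increase in m/z only doubles the flight time, which is one reason why accurate timing matters for resolving ions of similar m/z.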
The more common bottom-up approach involves breaking the protein into a pool of smaller peptides, identifying the peptides present in this pool, and then using bioinformatic tools to infer which original proteins were present. Even though it is less trivial in execution, the bottom-up approach is superior in most cases because it places few limits on protein size. In addition, recent advances in automation and sensitivity of mass spectrum analysis allow the identification of proteins in relatively complicated mixtures (53). The bottom-up approach can be further divided into shotgun proteomics, targeted proteomics, and data independent proteomics. Shotgun proteomics is able to identify thousands of proteins from a complex sample, and is often used in high-throughput discovery experiments. Targeted proteomics is able to reproducibly quantify specific targets with very high accuracy, for which it was named the Nature Methods Method of the Year 2012 (54). It is often used for validation of a shotgun proteomic discovery. Unlike targeted and shotgun proteomics, data independent proteomics does not limit its search to a narrow mass range selected based on the precursor ions. Since it collects data from a much larger m/z range, a high detector resolution is very important. The current state of the art is still not fast enough to distinguish different ions with very similar m/z values, making data independent proteomic approaches difficult (55). The shotgun proteomics approach is the most suitable for this thesis, since the goal is to discover and quantify novel antigens.

1.3.7 Comparative and quantitative mass spectrometry

Fundamentally, MS is a qualitative method. Since a large part of proteomics is interested in the relative and absolute abundances of proteins in different samples, substantial MS research is focused on developing quantitative techniques. 
There are two prominent streams of quantitative MS (qMS): label-based qMS, which uses isotopic labels to distinguish between samples analyzed during a single qMS run, and label-free qMS, which relies on ion intensity and spectrum counting to determine the peptide abundances in samples (56). Label-based qMS tags different samples with isotopes of different mass, analyzes all the samples within a single MS run, and differentiates between the samples based on specific mass shifts. This allows conceptually simple comparative mass measurements between samples, without any biases stemming from inconsistencies between separate MS runs. The techniques for label-based qMS differ in the method used to incorporate the mass shift marker. Metabolic labeling methods such as stable isotope labeling of amino acids in cell culture (SILAC) use an isotopically defined medium to grow a cellular population of interest. The labeled amino acids are incorporated into all proteins present in the cell, resulting in perfect label coverage of the proteome (57). One of the largest drawbacks is the need to culture the cell system of interest. To get around this limitation, the isotopic labels can be incorporated chemically, directly into purified proteins. ICAT uses an iodoacetyl reactive group to label cysteine residues only, but it introduces a retention time shift that complicates the analysis (58,59). Dimethyl labeling uses formaldehyde and cyanoborohydride to add heavy, medium and light isotopes to lysine residues and to every N-terminus of the trypsin-digested sample peptides. Dimethyl labeling uses relatively inexpensive reagents, and its performance is similar to that of the SILAC method (60,61). Isobaric tag for absolute and relative quantitation (iTRAQ) chemically labels all samples with tags of equal mass. The tags then dissociate in tandem MS into ions of variable mass (62,63). 
The advantages of iTRAQ are the ability to run up to eight samples within a single qMS run and the high accuracy of the reported mass (64); the disadvantage is primarily the high reagent cost. Label-free proteomics is significantly cheaper and has simpler workflows than label-based qMS. The samples are analyzed in separate MS runs, and the number of spectra per protein is correlated to protein abundance (65). An alternative method correlates protein abundance to the area under the curve, calculated by integrating the precursor ion intensity over retention time. This technique requires little co-elution of peptides and is therefore limited to simple samples, but it offers high mass accuracy (66).

1.3.8 Overview of MaxQuant: protein search and quantification software

The Bruker mass spectrometer outputs a specially designed folder that contains all the information collected by the instrument, including MS and MS/MS scans. The primary carrier of the raw data is a binary “.BAF” file, from which peptide masses and intensities can be extracted. The MS/MS fragmentation spectra are then compared against a proteome database to identify the peptides and subsequently the proteins. To quantify the proteins, the intensities of mass peaks are used to compute relative mass ratios. MaxQuant (MQ) (67) is a collection of automated algorithms that extracts the mass peaks from the binary data format, performs the peptide/protein search, and computes the peak intensity as well as the mass ratio between labeled pairs. Briefly, in the initial step, the MS mass peaks are combined into 3D structures whose three axes are retention time, m/z ratio, and peak intensity (detection count). These 3D mass peaks are identified by Andromeda (53), a probabilistic search engine developed specifically for use in MQ. It evaluates the probability that the matches between the observed and computed mass peaks happened by chance, and uses this probability to score and assign peptide matches. 
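The idea that a peak's intensity integrated over retention time reflects abundance, which underlies both the area-under-the-curve label-free method of Section 1.3.7 and the 3D peak volumes described above, can be sketched with a simple trapezoidal integration. This is only an illustration of the principle and not MaxQuant's actual algorithm; the function name and the example chromatograms are hypothetical.

```python
from typing import Sequence


def peak_area(retention_times: Sequence[float], intensities: Sequence[float]) -> float:
    """Trapezoidal area under an intensity-versus-retention-time trace.

    retention_times must be sorted in ascending order and have the same
    length as intensities.
    """
    if len(retention_times) != len(intensities):
        raise ValueError("retention_times and intensities must have equal length")
    area = 0.0
    for i in range(1, len(retention_times)):
        dt = retention_times[i] - retention_times[i - 1]
        if dt < 0:
            raise ValueError("retention_times must be sorted in ascending order")
        # Area of the trapezoid between two consecutive scans.
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area


if __name__ == "__main__":
    # Hypothetical elution profiles of the same peptide in two samples.
    times = [10.0, 10.1, 10.2, 10.3, 10.4]          # minutes
    sample_a = [0.0, 4000.0, 9000.0, 4000.0, 0.0]   # ion counts
    sample_b = [0.0, 2000.0, 4500.0, 2000.0, 0.0]
    ratio = peak_area(times, sample_a) / peak_area(times, sample_b)
    print(f"estimated abundance ratio A/B: {ratio:.2f}")
```

The sketch also shows why co-elution is a problem for this kind of quantification: if a second peptide with a similar m/z elutes at the same time, its intensity is added to the trace and inflates the integrated area.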
MQ is also able to process labeled quantitative MS data by searching for labeled isotopes and evaluating the fold changes between the peak intensities of each labeled pair. The output is a large table that includes rich information about each identified protein, including the number of peptides detected in the sample and their protein sequence coverage, a posterior error probability (PEP) score (68), normalized ratios between the labeled isotope pairs, an estimated protein intensity derived from the 3D MS peak volume, and more. A PEP value represents the probability that the peptide was not present in the mass spectrometer during the measurement. The PEP is analogous to the false discovery rate (FDR), but unlike FDR, it is not dependent on other proteins: FDR is better suited to working with a group of proteins (e.g., “this network of proteins appeared for FDR < 0.01”), while PEP is preferred when working with a single protein (e.g., “this protein is a potential biomarker with PEP < 0.01”). The isotope ratios are normalized so that their median log-ratio is zero, which is based on the assumption that most proteins will show no change between the samples.

1.3.9 Antibody mediated identification of antigens: AMIDA

The idea of using antibodies to capture and identify immunoreactive antigens was first proposed by Gires et al. (1), who termed the method Antibody Mediated Identification of Antigens (AMIDA). AMIDA uses antibodies bound to protein A beads to immunoprecipitate antigens, which are then identified on a mass spectrometer. Gires et al. demonstrated AMIDA by using serological antibodies to bind antigens from lysed carcinoma cells, which were then eluted, cleaned on a 2D electrophoresis gel, digested with trypsin, and loaded onto a matrix assisted laser desorption ionization time-of-flight MS. The MS peaks were identified with the Matrix Science (Mascot) search engine. 
Finally, one of the identified antigens (human protein CK8) was quantified by binding to fluorescent antibody-coated beads.

1.4 Research Statement

The goal of this thesis is to advance and improve the basic AMIDA technique with modern technologies and improved protocols. This is intended to enable practical use of this method in high-throughput profiling of serum antibodies to determine the complement of recognized antigens. The goals of this research project are:
1. Update the AMIDA protocol with recent proteomics advances
2. Extend AMIDA by including an MS-based quantification of the discovered antigens
3. Validate the updated and extended protocol by identifying antigens from bacterial lysates
4. Identify new potential antigens and quantify the immunoreactivity of patients to these antigens

Chapter 2: Protocol Development

This project’s aim is to develop a simple, high-throughput protocol for serological immune profiling. While the conceptual protocol framework was chosen a priori based on equipment availability and the work by Gires et al. (1), most of the steps required an exploration of multiple conditions to determine the optimal methods and reagents. The protocol has several conceptual stages (Figure 2.1). Initially, the IgGs are extracted from the serum and bound onto magnetic beads. The IgG-bead complexes are then exposed to a mixture of antigens, provided by bacterial lysates. After a short incubation to allow antigens to bind to IgG paratopes, the bead-IgG-antigen complexes are thoroughly washed, and the antigens are eluted. To prepare the eluted antigens for MS analysis, in-gel trypsin digestion breaks the larger proteins into lighter peptides to bring them within the ideal MS measurement range. The peptides are then isotopically labeled as light, medium, and heavy; the labeled peptides are pooled and analyzed by MS. The resulting mass spectra are then interpreted using MaxQuant. 
A custom data analysis pipeline is used to interpret and visualize the MaxQuant output.

Figure 2.1 The conceptual overview of the protocol workflow.

The protocol can be broken down into several crucial steps that needed separate optimization and verification. These steps include antibody extraction and binding to the magnetic beads, bacterial cell lysis, the lysis buffer exchange, the incubation of antibody-bead complexes with antigens and their subsequent wash, and the antigen elution conditions.

2.1 Antibody Extraction and Linkage to Dynabeads

To obtain serum from a patient, blood samples first need to be collected, coagulated to remove fibrinogens, and spun down at high speeds to separate out blood cells such as leukocytes and erythrocytes. What remains is serum, which contains blood proteins including antibodies, antigens, and hormones, as well as remnants of other cells. Conventionally, IgG antibodies are isolated by binding to protein A or G beads, separated from the serum by centrifugation, and released by lowering the pH (69,70). These harsh acidic conditions, however, often leave IgG molecules denatured. To maximize the IgG extraction efficiency, the IgGs were isolated with a Melon Gel IgG Purification Kit (71), which contains a chromatography column with a proprietary resin that binds most known serum proteins excluding IgGs. The resulting IgG extraction efficiency is close to 100%, according to the Melon Gel manufacturer. Here we followed the manufacturer’s protocol, including the optional serum buffer exchange with a Zeba Spin desalting column. The concentration of the antibody in the final solution was measured with the NanoDrop 1000 to be between 2.0 and 2.4 mg/mL across all serum samples. After extraction, the IgGs were immediately bound to magnetic beads. 
The Tosylactivated M280 Dynabeads (72) covalently bind to either the primary amine or sulphydryl groups of antibodies, which in theory is a strong enough bond to prevent antibody leakage in subsequent elution steps. The binding procedure closely followed the manufacturer’s protocol. To test for successful antibody binding, the IgG-coated beads were exposed to fluorescently marked hemagglutinin 3 proteins from influenza A (H3). Figure 2.2 shows the brightness difference between antibody-coated beads incubated with H3 and a control consisting of just the IgG-coated beads.

Figure 2.2 Beads coated with IgG incubated with fluorescently marked hemagglutinin 3 influenza A antigens (H3 antigen). The H3 antigen shows the presence of IgGs on the beads. The control shows no fluorescence of the IgG-coated beads by themselves. Non-specific binding of H3 to the beads is prevented by a coat of BSA.

To reduce non-specific binding, after IgG coating the beads were blocked with BSA to minimize the amount of residual exposed reactive bead surface. To test this blocking, serum IgG-coated beads were exposed for one hour to varying amounts of fluorescently marked antibodies, and the fluorescence intensity was measured under a microscope. Fluorescent anti-IgA and anti-IgM antibodies showed low signal, while the fluorescent anti-IgG antibody showed signal proportional to the amount of the marker, since it bound to the serum IgGs coated onto the beads, as shown in Figure 2.3. This difference suggests that the IgG-coated and BSA-inactivated beads have little residual non-specific binding activity.

Figure 2.3 Fluorescent marker intensity on IgG-coated Dynabeads. The anti-IgA/IgG/IgM markers were measured after laser excitation at 542 nm, 475 nm, and 633 nm, respectively. The fluorescence intensity was measured using InfranView brightness measure tool, and corrected by subtracting the background brightness level.
The weak signal increase from anti-IgA and anti-IgM fluorescent markers implies little non-specific binding to the bead surface.

2.2 Bacterial Cell Lysis

We developed and tested this protocol in the context of measuring antibody reactivity against bacterial pathogens. There are many gram-negative opportunistic pathogens that are easy to grow in laboratory conditions and to which humans are frequently exposed. The widely used Escherichia coli is a gram-negative bacterium responsible for illnesses such as septic shock (73) and food poisoning (74). Because E. coli is so common in human environments, it is likely that antibodies against it are present in almost every individual (75). E. coli laboratory strain EMG2, also known as K-12 (76), was grown overnight at 37°C in 5 mL of LB broth (77), then plated onto an LB-agar plate and grown overnight at 37°C; a single colony was then selected and again grown overnight at 37°C in LB broth. The overnight bacteria were then lysed by sonication while submerged in a lysis buffer based on the general lysis protocol from the European Molecular Biology Laboratory (78). The buffer contained 50 mM Tris-HCl pH 7.5, 100 mM NaCl for ionic strength normalization, 1 % Triton X-100 and 0.1 % sodium deoxycholate (SDC), which served as surfactants, and 5 % glycerol, which stabilized proteins. The cells were sonicated on ice in on/off cycles of 5 and 59 seconds, respectively, at 35 % power for a total of five minutes of sonication. SDS-PAGE of the lysate stained with Coomassie Blue showed successful lysis, as shown in Figure 2.4.

Figure 2.4 E. coli lysate. The lysis was performed by sonication in a lysis buffer with Tris-HCl, SDC, and Triton X-100. The lysates are diluted 2x between lanes from left to right.

The bacterial strains were later expanded to include Pseudomonas aeruginosa, Salmonella typhimurium, and wild-type as well as capsule-lacking mutant Klebsiella pneumoniae.
2.3 Antigen Incubation and Elution

The exposure of IgG-coated Dynabeads to antigens is a crucial step. The exposure needs to be long enough to allow the antibodies to come into contact with antigens, while also short enough to avoid unnecessary antibody dissociation from the beads. The Dynabeads manufacturer’s recommendations suggest that 1 mg of IgG-coated Dynabeads binds 1-10 µg of target proteins. As the concentration of target antigen was unknown, due to lack of prior knowledge of the antigen’s identity, a 1:1 volume of beads to cell lysate was used. 80 µL of Dynabeads and 80 µL of cell lysate were mixed together with 300 µL of Dulbecco's phosphate-buffered saline (DPBS) and 0.1% w/v BSA, and incubated at 37°C for 30 minutes (72). The beads were then washed with DPBS, and immediately exposed to the antigen elution buffer.

2.3.1 Bioanalyzer: antigen elution conditions

Antigen elution from antibodies is another critical step in any immunoprecipitation experiment. While many methods are available for dissociating antigens from antibodies, there is no single method that works for all applications, and careful optimization is required for every new protocol (79). The elution method needs to be mild enough not to disturb the covalent bond between the IgGs and Dynabeads, or the disulphide bonds between the IgG chains. At the same time, the eluent needs to disrupt the interaction between the antibody and its antigens. There are several antigen elution methods which are compatible with Dynabeads as recommended by the manufacturer. Since the eluted antigens were directly loaded onto a gel for cleanup and in-gel trypsin digest, the elution buffer had to be compatible with SDS-PAGE as well. The tested methods were based on lowering the pH to ~3, on chaotropic activity, and on heating. The pH change was tested by incubation in 50 µL 1 M glycine pH 2.5 as well as 50 µL 0.1 M citric acid pH 2.5, both at 56°C for 15 min.
End-over-end rotation in 50 µL 7 M urea represented the chaotropic activity, and heat elution was tested by boiling in DPBS solution at 95°C for 5 min. Successful elution was to be visualized on the Bioanalyzer High Sensitivity Protein Analysis Kit (80). The results were inconclusive, possibly due to the Bioanalyzer’s sensitivity limit of 1 ng/µL. This limit was reported by the manufacturer based on labeled BSA in water; however, the antigens were dissolved in DPBS and elution buffers. All tested elution buffers were listed in the manufacturer’s manual as compatible with the Bioanalyzer, with the note that sensitivity might worsen. Due to this, the experiment was repeated on a mass spectrometer, which has higher sensitivity.

2.3.2 MS: antigen elution conditions

Following the lack of satisfactory results with the Bioanalyzer, we decided to repeat the experiment on a mass spectrometer. MS is a highly sensitive technique, with detection limits as low as 5 ng of target protein loaded on a polyacrylamide gel (81). A different serum sample, referred to as F01, was used as the IgG source. The serum had a known reactivity to Pseudomonas aeruginosa, and so the cellular lysate was changed from E. coli to P. aeruginosa. Pseudomonas is a gram-negative bacterium commonly associated with hospital-acquired infections (82), and it was lysed and processed following the same protocol as E. coli. The tested elution conditions did not change from the previous Bioanalyzer experiment, except for the addition of boiling in 1 % sodium dodecyl sulfate (SDS) with 50 mM NH4HCO3. SDS is a strong anionic surfactant, and ammonium bicarbonate serves as a buffer that stabilizes the eluted proteins. A negative control was introduced as well, which consisted of BSA-blocked magnetic beads lacking any IgGs. The tested conditions are summarized in Table 2.1.
Sample | IgG | Elution condition
F01 #1 | No | PBS, 95°C, 5 min
F01 #2 | Yes | PBS, 95°C, 5 min
F01 #3 | No | 1 M glycine, 56°C, 15 min
F01 #4 | Yes | 1 M glycine, 56°C, 15 min
F01 #5 | No | 0.1 M citric acid, 56°C, 15 min
F01 #6 | Yes | 0.1 M citric acid, 56°C, 15 min
F01 #7 | No | 7 M urea, room temp., 15 min
F01 #8 | Yes | 7 M urea, room temp., 15 min
F01 #9 | No | 1% w/v SDS in 50 mM NH4HCO3, 95°C, 5 min
F01 #10 | Yes | 1% w/v SDS in 50 mM NH4HCO3, 95°C, 5 min

Table 2.1 The elution and wash conditions for antibody-bound antigens.

The first attempted MS run was unsuccessful due to a high content of unknown salt crystals in the samples (most likely ammonium chloride, which is used in peptide stage-tipping; see section 3.1.6). Fortunately, the in-gel digestion step (see section 3.1.4 for methods overview) showed visible eluted proteins for glycine and SDS only, which narrowed down the search for an ideal elution condition. The Coomassie-blue stained protein gel is shown in Figure 2.5. Ideally, the lanes with BSA-only beads should be empty, and the lanes with IgG-coated beads should contain numerous bands corresponding to various eluted antigens. The SDS and glycine elution conditions show faint but visible differences between the experiment and the negative control, and so the experiment was repeated with these conditions only.

Figure 2.5 The proteins before in-gel digestion by trypsin. Although the MS analysis was not completed due to salt contaminants in the sample, the pre-digestion gel shows glycine and SDS boiling as the only viable elution methods. [Lane labels: PBS, glycine, citric acid, urea, and SDS elutions, with and without IgG]

2.3.3 MS: antigen elution by SDS and glycine

The second MS run focused on glycine and SDS elution only, since these eluents showed visible antigen bands on the stained polyacrylamide gel. The MS run was successful, producing MS spectrum data processed by MaxQuant (MQ) (see section 3.1.8). Figure 2.6 shows that the glycine elution was not nearly as effective as elution by SDS boiling.

Figure 2.6 The natural logarithm of intensities between conditions.
The histogram represents the number of different proteins at each intensity value. Intensity value is correlated with the protein amount in the sample. Glycine eluted 10 human proteins and 3 Pseudomonas proteins, while SDS eluted 81 and 74 proteins, respectively. [Axes: log(Intensity) vs. count; series: glycine and SDS elution]

Overall, the glycine elution produced three Pseudomonas proteins and ten human proteins. The boiling SDS eluted 81 human proteins and 74 P. aeruginosa proteins. While contamination from human proteins is a strong factor in both cases, the SDS elution is clearly superior and so it was used in subsequent experiments. Table 2.2 summarizes the P. aeruginosa proteins with the highest intensity. These proteins seem to be largely porins and outer membrane proteins, which is in line with the expected enrichment of immune reactivity to exposed surface antigens (83).

Protein name | Intensity
Outer membrane porin F (OprF) | 64555000
Immunodominant outer membrane protein L; Peptidoglycan-associated lipoprotein | 13491000
60 kDa chaperonin (GroEL protein) (Protein Cpn60); Heat shock protein (Fragment) | 4909500
Outer membrane protein H1 (PhoP/Q and low Mg2+ inducible OmpH1) | 4897400
Translation elongation factor Tu | 4327200
Outer membrane lipoprotein I | 4008000
Outer membrane protein W | 3742000

Table 2.2 The Pseudomonas proteins with highest intensity found in the SDS elution. The intensity corresponds to the abundance of the peptides corresponding to the protein.

Overall, 198 proteins were identified from 683 different peptides. Although the proteins with highest intensity (i.e., abundance) were likely true antigens, it is improbable that all of the 198 identified proteins were immunoreactive. For example, many proteins were ribosomal proteins, which are highly abundant and possibly contaminants. While the negative and positive controls did not produce satisfactory results, the list of proteins found still carries some information. Table 2.2 lists the P. aeruginosa proteins in the top 20% of all identified proteins, ranked by intensity. The prevalence of porins and outer membrane proteins among them seems reasonable, as these are often visible to antibodies (83).

2.4 MS: Wash Conditions to Remove Contaminants

After the Dynabeads are exposed to antigens, significant non-specific interactions between cell lysate proteins and the beads can persist. These “sticky” proteins can then skew the results and provide background noise that hides the real antigens. To minimize this contamination, we washed the Dynabeads with DPBS immediately after incubation in cell lysate. We observed that an overly vigorous wash seems to damage the beads, leaving residue on the walls of the microcentrifuge tube. To find the ideal wash conditions, three different levels were tested: a quick wash, three consecutive quick washes, and three 5-minute-long washes. Two new serum samples, referred to here as A and B, were used as the antibody source. The steps were unchanged from previous MS runs, with the only changes being the wash conditions and the antibody source serum. As before, the tested antigen pool was a P. aeruginosa lysate.

To assess the effect of the different wash regimes, the identified proteins’ intensities were plotted on a density plot shown in Figure 2.7. The intensity value represents the size (strength) of the peptide’s mass spectrum peak, and it is a measure of the peptide abundance. Protein intensity is a weighted mean of all of its peptide intensities. Ideally the immunoglobulin intensities would be minimized, since the wash should remove any weakly bound IgGs that would otherwise be eluted during the antigen elution step. The intensities of most of the Pseudomonas proteins should be minimized as well, since we expect only a small number of antigens, with the bulk of Pseudomonas proteins being contaminants. The number of identified immunoglobulins was the smallest in the 3× wash, with around 25 identified Ig proteins.
The 1× and 3× 5 min washes identified 34 and 33 Ig proteins. The number of Pseudomonas proteins was 43, 30, and 57 for the 1×, 3×, and 3× 5 min wash, respectively. It is difficult to interpret why the 3× 5 min wash did not reduce the number of identified proteins below the 3× wash. It is possible that the lower numbers of proteins in the 3× wash were due to chance, and that the intensity values of the proteins, shown in the density plot (Figure 2.7), are more important than the identification counts.

Figure 2.7 Comparison of densities. The 1× and 3×5min wash conditions had the largest difference in intensity means. However, none of these differences were significant. [Panels: Immunoglobulin, serum A; Immunoglobulin, serum B; Pseudomonas, serum A; Pseudomonas, serum B. Axes: log(Intensity) vs. density; series: 1×, 3×, and 3× 5 min washes]

To assess the differences between wash conditions, an Anderson-Darling (AD) test and a Kruskal-Wallis (KW) ANOVA test were performed; both are non-parametric and therefore make no assumption about the shape of the intensity distributions. The AD test is sensitive to the shape of the distributions, and returns a p-value for whether the experiments are samples of the same distribution. This yielded p-values for immunoglobulin serum A, immunoglobulin serum B, Pseudomonas serum A, and Pseudomonas serum B (Figure 2.7) of 0.97, 0.99, 0.25, and 0.58, respectively. The Kruskal-Wallis test is more sensitive to shifts in location between the distributions, and its p-values are 0.73, 0.86, 0.34, and 0.38, respectively.
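As a rough illustration of the Kruskal-Wallis comparison described above, the following Python sketch computes the H statistic for exactly three groups, as with the three wash conditions. The thesis does not state which software ran the tests, so this is not the author's implementation; the function names are hypothetical and the log-intensity values are made-up placeholders, not thesis data. With three groups the chi-square approximation has two degrees of freedom, so the p-value reduces to exp(-H/2).

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three samples; p uses the chi-square approximation, df = 2."""
    groups = [a, b, c]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    rank_term = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        rank_term += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Standard correction for tied values.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    tie_factor = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if tie_factor == 0:  # every value identical: no evidence of any difference
        return 0.0, 1.0
    h /= tie_factor

    p = math.exp(-h / 2)  # chi-square survival function for 2 degrees of freedom
    return h, p


# Placeholder log-intensities for a 1x, 3x, and 3x 5 min wash (illustrative only).
wash_1x = [12.1, 13.4, 15.0, 16.2]
wash_3x = [11.8, 13.0, 14.6, 15.9]
wash_3x5 = [11.9, 13.1, 14.9, 16.0]
print(kruskal_wallis_three_groups(wash_1x, wash_3x, wash_3x5))
```

The chi-square approximation is only a guide for small groups; an exact or permutation-based p-value would be more appropriate when each condition contains just a handful of proteins.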
Since no significant difference between protocols was observed, the 3× quick wash was chosen based on qualitative features in the data: it minimizes the high-intensity peak present in the 1× washes, while having intensity means comparable to the 3×5min washes.

2.5 MS: Lysate Buffer Exchange

A notable feature of every previous MS run was a high contamination of the output with human proteins. Most of these proteins were immunoglobulin-related, meaning that antibodies were either weakly linked to the Dynabeads, or a step in the protocol was dissociating the covalent bonds between the antibodies and tosyl groups on the bead surface. Since the bacterial lysate contains non-ionic and anionic surfactants (Triton X-100 and SDC, respectively), we hypothesized it might be the source of increased IgG leakage. To test this hypothesis, the cell lysate had its lysis buffer exchanged for DPBS. The buffer exchange was performed following the manufacturer’s instructions in Amicon Ultra filters with 10K and 3K cut-offs. For the 3K Amicon filter the lysate was diluted 10x in DPBS before the buffer exchange to prevent premature filter clogging. The buffer-exchanged lysates, as well as the unfiltered lysate, were then exposed to Dynabeads coated with IgGs from serum A. The usual protocol of antigen elution, trypsin digest, and MS was then continued.

Figure 2.8 The change of intensity distribution as a result of lysis buffer exchange. There is no significant change between buffer exchange through Amicon 3K, 10K, or no buffer exchange. When looking at particular proteins, the A10K buffer exchange had the largest decrease in ribosomal proteins, which we consider contaminants.

The overall distribution shape did not significantly change for the data, as confirmed by the AD and KW tests (p-values 0.99 and 0.8 for immunoglobulins, and 0.32 and 0.24 for Pseudomonas proteins).
Despite the lack of significance, the 10K Amicon buffer exchange had the strongest decrease in Pseudomonas ribosomal proteins compared to the no buffer exchange and 3K Amicon buffer exchange, and so it was chosen for the future experiments. Unfortunately, the issue of high IgG concentrations was still present.

[Figure 2.8 panels: Immunoglobulin, serum A; Pseudomonas, serum A. Axes: Log(Intensity) vs. density; series: A10K, A3K, NF]

2.6 MS: Lowering IgG Contamination through Antigen Elution Modification

To lower the IgG contamination in the MS output, the antigen elution step was revisited. After the IgG-coated Dynabeads are exposed to the cellular lysate, the antigens are eluted with boiling (95°C) SDS. This choice was made due to SDS having the best elution efficiency (see section 2.3.2); however, this efficiency might come at a cost of high IgG contamination in the output. In an attempt to remedy this situation, we added N-ethylmaleimide (NEM) to the eluent to block free sulfhydryl groups and prevent breaking disulfide bonds in the IgGs (84), and also tested elution at room temperature instead of 95°C. An MS run with serum A was performed, in which elution with SDS at 95°C, elution with SDS and NEM at 95°C, and elution with SDS and NEM at room temperature were compared to assess NEM’s effect on IgG contamination, as well as its effect on Pseudomonas antigen elution efficiency (Figure 2.9).

Figure 2.9 The effect of NEM and boiling on IgG contamination. The density plots show the natural logarithm of intensity of all the immunoglobulin and Pseudomonas proteins. On the left density plot, the room temperature elution with NEM has an order of magnitude lower immunoglobulin intensity. This removal of immunoglobulin proteins correlates with slightly lower antigen elution as well; however, the effect is smaller (right density plot) and does not result in any reduction in the number of identified Pseudomonas proteins.
[Figure 2.9 panels: Immunoglobulin, serum A; Pseudomonas, serum A. Axes: LogIntensity vs. density; series: SDS.95C, SDS.95C.NEM, SDS.NEM]

The AD and KW tests both showed that these differences are unlikely to have arisen by chance (p-values of 3.3×10⁻⁵ and 2.2×10⁻⁵ for immunoglobulin proteins, and 3.2×10⁻⁴ and 2.0×10⁻⁴ for Pseudomonas proteins). The combination of room temperature and NEM addition results in less IgG contamination, at a small expense of weaker Pseudomonas antigen elution. Room temperature SDS with NEM was chosen for the final protocol. At this point the protocol was finalized and did not change during subsequent testing experiments. Further improvements, namely elution by urea and physical removal of gel bands corresponding to common heavy and light chain masses, were tested as well; however, both procedures produced suboptimal results and did not change the protocol (see Appendix A for details).

Chapter 3: Serological Immune Profiling of Nine Patients on Five Bacterial Lysates

After the protocol optimizations described in the previous chapter, it was important to assess the protocol’s performance in a large-scale profiling experiment. To do this I applied qAMIDA to obtain an MS data set of the immunoreactivity of nine patients’ serological antibodies against six bacterial lysate solutions. I developed a bioinformatics pipeline to interpret the raw MS data, producing a list of potential antigens as well as their relative abundances. The goals of this large-scale profiling study are summarized in Table 3.1. A short description of the bacteria used as antigen sources is in Appendix B.
Experimental question and goals, each followed by how it was answered:

• Assess qAMIDA’s performance in a larger-scale experiment
• Identify any bottlenecks of the qAMIDA protocol in terms of experiment time and organizational complexity
• Develop and optimize the bioinformatics pipeline for MS data analysis
  How it was answered: The design involved testing 9 patients’ sera against 5 bacterial antigen sources: K. pneumoniae, S. typhimurium, E. coli, P. aeruginosa, and a mutant K. pneumoniae that lacks the polysaccharide capsule. The relative antigen abundance was compared using a protein standard that was present in every MS run.
• Determine the technical reproducibility of the protocol
  How it was answered: Both K. pneumoniae and mutant K. pneumoniae were tested against the same patient twice.
• Test qAMIDA on a more complex mixture of potential antigens
  How it was answered: The serum samples were tested against a mixture of all bacterial lysates.
• Determine whether the outer K. pneumoniae capsule has any effect on the qAMIDA output
  How it was answered: The antigens and their intensities were directly compared.

Table 3.1 The primary aims to assess the performance of qAMIDA protocol.

The first section of this chapter is dedicated to methods, where all experimental steps are described in detail. In the second section, “Results”, we present the qualitative list of identified potential antigens. The relative quantitative data is then used to compare antigen reactivity across patients. Lastly, the protein list is analyzed by STRING, a network analysis tool. In the discussion, we explain our choice of MQ as the MS engine and describe some potential antigens in detail. The possible issues with the qAMIDA protocol (high heterogeneity of the data, low consistency across different MS runs, failure of MQ to properly quantify all proteins) are addressed as well. Lastly, we suggest possible future improvements, namely a need for sample fractionation.
3.1 Methods

The immunoreactivities of nine patients’ serological antibodies were tested against six bacterial lysate solutions, resulting in 54 samples. Additionally, two samples had technical replicates (Patient1 serum against both mutant and w.t. K. pneumoniae lysate), and raw bacterial lysate solutions were analyzed by themselves as well. In total, 62 MS samples were analyzed. Experiments were performed according to the protocols described below.

3.1.1 Preparation of antigen panel

The antigen panel consisted of bacterial lysates from five different strains: Escherichia coli strain EMG2 (76), Salmonella typhimurium LT2, Pseudomonas aeruginosa PAO1 (ATCC 15692) (85), Klebsiella pneumoniae (ATCC 43816), and a custom K. pneumoniae mutant which is missing the sugar capsule gene cpsB. The cells were grown overnight on a streaked agar plate in an incubator (37°C); a single colony was then picked and grown in 5 mL of super broth medium (tryptone 35 g/L, yeast extract 20 g/L, NaCl 5 g/L, 1 N NaOH 5 mL/L) in the same incubator. After the overnight growth, the cells were spun at 18,000 g at 4°C for 5 min. The supernatant was discarded, and the cell pellet was resuspended on ice in 6 mL of lysis solution (50 mM Tris-HCl, NaCl 5.8 g/L, 1 % w/w Triton X-100, 5 % w/w Glycerol, 0.1 % sodium deoxycholate, and protease inhibitor tablet (86), brought up to 12 mL with Dulbecco’s phosphate-buffered saline (DPBS)). The resuspended cells were then sonicated with 30 pulses of 5 seconds on, 59 seconds off. After sonication, the lysed cells were spun at 21,000 g for 20 min at 4°C, and the supernatant was aliquoted into smaller volumes and stored in a -20°C freezer for up to 7 days before use.

3.1.2 Preparation of antibody-bead complexes

The antibodies were obtained from 125-150 µL serum samples. Serological antibodies were extracted with Pierce Biotechnology’s Melon Gel IgG Spin Purification Kit, together with a Zeba Desalt Spin Column (as suggested by the manufacturer).
The Melon Gel and Zeba Column manuals were closely followed; in short, the serum was buffer-exchanged with Melon Gel extraction buffer in the Zeba Column, and the solution was then mixed with the Melon Gel resin. The proprietary Melon Gel resin binds most common contaminants from the serum, leaving behind undisturbed IgGs. The solution was then pushed through a column which retained the contaminants bound to the resin. The flow-through IgGs were collected in a microcentrifuge tube.

The purified IgGs were bound to Dynabeads M-280 Tosylactivated. These magnetic beads bind the antibodies through a covalent linkage by primary amine (NH2) or sulphydryl (SH) groups. The binding protocol followed the manufacturer’s manual; the antibodies and beads were mixed in 1.5 mL tubes, and left on a rotator overnight at 37°C. After binding, the antibody-bead complexes were exposed to bovine serum albumin (BSA) to inactivate any remaining tosyl groups. If not used immediately, the beads were stored in a fridge for up to one week.

3.1.3 Antigen binding and elution

The antigen panel (lysate) solution and bead-bound antibodies were mixed together in 1:1 ratio at 80 µL each in 300 µL of 0.1 % BSA in DPBS, and incubated on a rotator at 37°C for 1 hour. After incubation, the beads were washed 3 times on a magnet with 0.1 % BSA in DPBS, and finally the bound antigens were eluted by addition of 100 µL elution buffer (1 % sodium dodecyl sulfate, 50 mM NH4HCO3, 20 mM N-Ethylmaleimide in water). The beads were removed on magnet, and the eluate was dried by vacufuge.

3.1.4 In-gel trypsin digest

To clean up the eluate from remaining cellular debris and other pollutants, the samples were run through SDS polyacrylamide gel electrophoresis (SDS-PAGE). Each antigen sample was resuspended in 13 µL of SDS loading buffer (312.5 mM Tris pH 6.8, 10 % SDS, 50 % glycerol, 0.1 % bromophenol blue), and reduced by adding 1 µL of 1 M dithiothreitol (DTT).
Addition of DTT reduces disulfide protein bonds, which stops the proteins from clumping together and impeding movement through the gel matrix. The reduced samples were then loaded onto an in-house polyacrylamide gel with a 10 % loading section and a 12 % resolving section, and run at 200 V for approximately 25 min. As soon as all of the sample entered the resolving portion of the gel, the electrophoresis was stopped. The gels were then washed two times with water, and fixed by soaking in 40 % ethanol, 10 % acetic acid for 30 minutes on a rotator.

The trypsin digestion was performed directly on the proteins inside the gel (in-gel digest), closely following the method described by Shevchenko et al. (81). Although polyacrylamide gels are highly porous, the diffusion of larger particles like trypsin by Brownian motion is very slow. To speed up the entry of trypsin into the gel matrix, the gel was chopped into small 1x1x1 mm pieces, which tripled the available entry surface area. To visually aid the cutting, the in-gel proteins were stained with Coomassie blue. The chopped pieces were collected in 1.5 mL microcentrifuge tubes, and the Coomassie blue stain was removed by numerous washes in a 1:1 mixture of ethanol and digestion buffer (50 mM NH4HCO3), finished by dehydration in absolute ethanol. The in-gel proteins were then reduced with 10 mM DTT for 45 minutes at 56°C, and sulphydryl groups were blocked by soaking the reduced gel pieces in 55 mM iodoacetamide at room temperature for 30 minutes. This step was performed in the dark due to the light sensitivity of iodoacetamide. Afterwards, the remnants of iodoacetamide and DTT were removed by three rounds of washing with digestion buffer and dehydrating with pure ethanol.

After the last dehydration, any remaining solution was evaporated in a vacufuge, and trypsin solution (12.5 ng/µL, in digestion buffer) was added into the tube. The volume of trypsin solution was just enough to cover all the gel pieces.
The trypsin digestion was allowed to run for 18 hours at 37°C. To stop any remaining trypsin activity, 20 µL of 1 % trifluoroacetic acid (TFA) was added to the samples. To extract the digested proteins (peptides) from the gel, the pieces were washed twice with a 30 % acetonitrile and 0.5 % acetic acid solution, and once with 100 % acetonitrile. After each wash/extraction, the liquid which contained extracted peptides was transferred into a clean 1.5 mL tube. Finally, the wash/extraction liquid was evaporated in a vacufuge, leaving behind the trypsin-digested peptides.

3.1.5 Peptide cleaning

The peptides were purified and cleaned before analysis by MS, following a protocol described by Rappsilber et al. (87) in which peptides are bound to a reverse-phase material, washed, and then eluted by a change of pH. Columns were constructed using standard pipette tips fitted with a disk made of Teflon mesh and embedded reverse phase beads. These tips allow for an easy STop-And-Go-Extraction, and are often referred to as STAGE tips. To make these STAGE tips, a small disk of Empore filter (88) is pushed into the pipette tip, followed by C18-bonded silica resin. Each tip can hold about 20 µg of peptides. In order to push liquid through the tip, either pressure can be applied to one end of the STAGE tip, or the tip can be held in a 1.5 mL micro centrifuge tube and spun on a benchtop centrifuge. Centrifugal force as low as 800 g is enough to push the liquid phase through the column. Before the peptides were introduced into the column, the reverse-phase resin was conditioned with 50 µL of methanol, and then washed with 0.5 % acetic acid. The peptide samples were acidified to below pH 2.5 with 1 % TFA, and centrifuged through the STAGE tip. At this point, the peptides were retained within the hydrophobic matrix. The columns were then washed once more with 50 µL of 0.5 % acetic acid, after which they were stored overnight at 4°C.
To elute the peptides, the hydrophobic attraction to the resin was disturbed by pushing through 50 µL of 80 % acetonitrile in 0.5 % acetic acid twice. The eluent was collected in a micro centrifuge tube, and dried down in a vacufuge.

3.1.6 Triple peptide formaldehyde labeling

One of the objectives of this project is to measure the differential immune response across patients, quantified by relative antigen content after immunoprecipitation. Since there are numerous patients and conditions, the 62 conditions were multiplexed to reduce the time spent on the mass spectrometer and to minimize technical variability between runs, and samples were analyzed in randomized pairs. Each pair was combined with a control sample, which consisted of a mixture of all experimental samples, and analyzed in a single MS run. The control sample was used as a standard for comparative quantitation across different MS runs. To differentiate between the two experimental samples as well as the control sample within each MS run, dimethyl chemical labeling was used to offset the masses of the samples’ peptides. The control always used a heavy label, while the two experimental conditions were labeled with light and medium isotopes. Here, sodium cyanoborohydride, sodium cyanoborodeuteride, and formaldehyde isotopes (CH2O, CD2O, and 13CD2O) were used to add dimethyl labels at the ε-amino group of lysine and at every N-terminus (α-amino group). These reactions were allowed to run to completion, and the resulting mass offsets between different labels (Figure 3.1) were at least 4 Da.

Figure 3.1 The light, medium, and heavy labeling reactions. The reactions happen after trypsin digestion on the N-terminus or the ε-amino group of lysine. The mass increase from light to medium or from medium to heavy is approximately 4(1+n) Da, where n is the number of lysine amino acids present in the peptide.
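To make the label offsets concrete, the short Python sketch below estimates the mass added to a peptide by each dimethyl label. It assumes that every peptide carries one label on its N-terminus plus one on each lysine side chain, and it uses the commonly quoted monoisotopic additions per labeled amine (about 28.0313, 32.0564, and 36.0757 Da for light, medium, and heavy); these constants, the function names, and the example sequences are illustrative assumptions and do not come from the thesis.

```python
# Monoisotopic mass added per labeled amine, in Da (commonly quoted values for
# dimethyl labeling; assumed here, not taken from the thesis).
LABEL_SHIFT_DA = {"light": 28.0313, "medium": 32.0564, "heavy": 36.0757}


def labeled_sites(peptide):
    """Count labeled amines: the N-terminus plus every lysine (K) side chain."""
    return 1 + peptide.upper().count("K")


def label_mass_shift(peptide, label):
    """Total mass added to the peptide by the given label, in Da."""
    return labeled_sites(peptide) * LABEL_SHIFT_DA[label]


def offset_between(peptide, lower, higher):
    """Mass difference between two differently labeled copies of one peptide."""
    return label_mass_shift(peptide, higher) - label_mass_shift(peptide, lower)


# Hypothetical tryptic peptides: one ending in arginine, one ending in lysine.
for sequence in ("AAR", "AAK"):
    print(sequence,
          labeled_sites(sequence),
          round(offset_between(sequence, "light", "medium"), 4),
          round(offset_between(sequence, "medium", "heavy"), 4))
```

A peptide without lysine therefore separates by roughly 4 Da between adjacent labels, and each additional lysine adds roughly another 4 Da to that spacing.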
Specifically, the dried samples from the previous step were resuspended in 20 µL of 100 mM triethylammonium bicarbonate and labeled with formaldehyde and cyanoborohydride: 20 µL of 200 mM CH2O, CD2O, or 13CD2O (light, medium, and heavy labels, respectively) was added to the samples together with 2 µL of 1 M cyanoborohydride (light or medium label) or cyanoborodeuteride (heavy label). The samples were allowed to react in the dark for 90 minutes, after which the reaction was terminated by the addition of 20 µL of 3 M NH4Cl. After 10 minutes in the dark, the samples were acidified to below pH 2.5 with 1 % TFA in preparation for STAGE-tipping; the light, medium, and heavy samples were mixed together, and finally the mixture was cleaned using reverse-phase STAGE tips as described in section 3.1.5. After cleaning, the samples were submitted to the Foster Lab to be analyzed on a mass spectrometer.

[Figure 3.1 reaction schemes. Light: R-NH2 + CH2O + NaBH3CN → R-N(CH3)2. Medium: R-NH2 + CD2O + NaBH3CN → R-N(CHD2)2. Heavy: R-NH2 + 13CD2O + NaBD3CN → R-N(13CD3)2.]

3.1.7 Mass spectrometry sample analysis

The STAGE-tipped samples were analyzed on a quadrupole–time of flight mass spectrometer (Impact II; Bruker Daltonics) coupled to an Easy nano LC 1000 HPLC (ThermoFisher Scientific). The analytical column was 40–50 cm long, made of 75 μm inner diameter fused silica with an integrated spray tip pulled with a P2000 laser puller (Sutter Instruments), packed with 1.9 μm diameter Reprosil-Pur C18AQ beads (Maisch, www.Dr-Maisch.com), and operated at 50°C with an in-house built column heater. Buffer A consisted of 0.1 % aqueous formic acid, and buffer B consisted of 0.1 % formic acid and 80 % (vol/vol) acetonitrile in water. A standard 90 min peptide separation was performed, and the column was washed with 100 % buffer B before re-equilibration with buffer A.
The Impact II was set to acquire in a data-dependent auto-MS/MS mode with inactive focus, fragmenting the 20 most abundant ions (one at a time, at an 18 Hz rate) after each full-range scan from m/z 200 to m/z 2000 at a 5 Hz rate. The isolation window for MS/MS was 2–3 depending on the parent ion mass-to-charge ratio, and the collision energy ranged from 23 to 65 eV depending on ion mass and charge. Parent ions were then excluded from MS/MS for the next 0.4 min and reconsidered if their intensity increased more than fivefold. Singly charged ions were excluded from fragmentation.

3.1.8 Data analysis pipeline

The raw mass spectrum data produced by the mass spectrometer was processed by MaxQuant (MQ) (67). The MQ software package uses Andromeda (53) for peptide identification, and outputs a table of all detected and identified peptides with a false discovery rate below 0.01. MaxQuant 1.5.7.4 was used with most settings left at their default values. The modified settings are summarized in Table 3.2.

Tab | Page | Setting
Group-specific parameters | Type | Multiplicity: 3
Group-specific parameters | Type | Label: DimethNter0 and DimethLys0; DimethNter4 and DimethLys4; DimethNter8 and DimethLys8
Group-specific parameters | Modifications | Variable modifications: Carbamidomethyl (C)
Group-specific parameters | Instrument | Instrument type: Bruker Q-TOF (autodetected)
Group-specific parameters | Misc. | Re-quantify
Global parameters | Sequences | Fasta files: E. coli (83333), H. sapiens (9606), K. pneumoniae (72407), P. aeruginosa (208964), S. typhimurium (99287)
Global parameters | Adv. identification | Match between runs
Global parameters | Protein quantification | Label min. ratio count: 1
Global parameters | Protein quantification | Use only unmodified peptides and…: Oxidation (M), Acetyl (Protein N-ter)
Configuration | Sequence databases | Whole proteomes obtained from Uniprot (89), latest as of March 08th 2017, for organisms 83333, 9606, 72407, 208964, and 99287.

Table 3.2 The modified settings for MaxQuant 1.5.7.4 data processing.

The MaxQuant peptide and protein outputs were then processed in a custom R script.
This script removes any peptides/proteins that are common contaminants (e.g., bovine serum albumin or keratin) and reverse-sequence proteins (likely identification errors, as marked by MQ), and divides each light or medium protein intensity value by the intensity of the corresponding heavy label. This normalization removes bias between MS runs that is caused by the mass spectrometer itself (90).

3.1.8.1 STRING: protein network analysis

The output proteins from MQ were analyzed using the STRING protein-protein interaction network (91). STRING probes numerous protein networks for enrichment, including well-known GO terms and KEGG pathways. Of particular interest is the GO Cellular Component protein network, since enrichment of one of its terms can suggest the cellular location from which the antigens originate. To analyze the protein output from MQ, potential antigens belonging to the same bacterium, regardless of patient serum, were merged to form five data sets, one for each of E. coli, K. pneumoniae w.t., K. pneumoniae mutant, P. aeruginosa, and S. typhimurium. These five data sets were compared to background data sets consisting of all proteins for the given bacterium: potential antigens as well as proteins from the lysate solutions. In cases where MQ merged multiple highly similar proteins into a single hit, a single protein identifier was randomly selected and used for the analysis. The STRING analysis settings were left at their default values, except for “minimum required interaction score”, which was set to “high confidence (0.700)”.

3.2 Results

In total, 31 MS runs were performed. MQ’s Andromeda mass spectrum search identified 3082 unique peptides corresponding to at least 779 and at most 1534 proteins. Of all discovered proteins, at least 50 and up to 297 were human contaminants. The uncertainty in the protein count results from ambiguity in peptide-to-protein assignment, since multiple proteins can share the same peptides.
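The filtering and heavy-channel normalization described in section 3.1.8 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the custom R script itself: the record layout (keys `protein`, `contaminant`, `reverse`, `L`, `M`, `H`) is hypothetical, the intensities are made up, and treating a zero intensity as "not quantified" is an assumption.

```python
def heavy_normalized(rows):
    """Drop flagged rows, then divide light/medium intensities by the heavy intensity.

    `rows` is a list of dicts with hypothetical keys: 'protein',
    'contaminant' (bool), 'reverse' (bool), and intensities 'L', 'M', 'H'.
    """
    out = []
    for row in rows:
        if row["contaminant"] or row["reverse"]:
            continue  # common contaminant or reverse-sequence (decoy) hit
        heavy = row["H"]
        ratios = {}
        for channel in ("L", "M"):
            value = row[channel]
            # Assumption: a ratio is only defined when both channels were quantified.
            ratios[channel] = value / heavy if heavy > 0 and value > 0 else None
        out.append({"protein": row["protein"], **ratios})
    return out


# Made-up intensities, for illustration only.
example = [
    {"protein": "ompA", "contaminant": False, "reverse": False, "L": 200.0, "M": 50.0, "H": 100.0},
    {"protein": "keratin", "contaminant": True, "reverse": False, "L": 900.0, "M": 900.0, "H": 900.0},
    {"protein": "REV_x", "contaminant": False, "reverse": True, "L": 10.0, "M": 10.0, "H": 10.0},
    {"protein": "tnaA", "contaminant": False, "reverse": False, "L": 0.0, "M": 30.0, "H": 60.0},
]
for row in heavy_normalized(example):
    print(row)
```

Returning `None` rather than zero for an unquantified channel keeps missing ratios distinguishable from genuinely low ones in later steps.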
The MS runs were also found to be highly heterogeneous. On average, each condition covered 18 % of the total identified protein pool, and coverage varied widely across conditions. The minimum total protein presence was 6.3 %, from the serum of Patient1 exposed to E. coli lysate. The maximum for a patient sample was 50.6 %, in the Patient3 and mutant K. pneumoniae run, which was close to the highest protein coverage of 58.5 %, found in the lysate solution consisting of a mixture of all five lysates.

3.2.1 qAMIDA identifies serologically reactive antigens

The improved qAMIDA protocol was used to make a qualitative list of potential antigens. Each combination of serum sample and bacterial lysate produced a list of candidate antigen proteins, shown as a heatmap in Figure 3.2. Each row represents a protein and each column one of the 54 tested conditions. The intensity of the colour in the heatmap corresponds to the logarithm of the peak intensity, as reported by MQ. Despite MQ’s attempt to quantify each protein in each condition, the data was highly heterogeneous, with most proteins found in only a few conditions. Such sparseness is typical of untargeted mass spectrometry analysis, but it complicates the comparison of samples. Specifically, I observed only a few proteins that were present in every sample exposed to the same lysate source. Despite this heterogeneity, a list of the most likely antigen candidates was generated based on protein intensity (Table 3.3). To make this list, all proteins were pooled together, grouped by antigen source, and filtered for the top 1 % of protein intensities. The “Patients” column specifies how many patients’ antibodies pulled down the particular protein. The greyed-out rows belong to proteins with very strong intensity that nevertheless appeared in the pure lysate solution only.
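The pooling and top 1 % selection described above can be illustrated with a short sketch. It assumes that "top 1 %" means keeping the highest-intensity 1 % of proteins within each antigen source, rounded up to at least one protein; the thesis does not state the exact percentile rule, so this rank-based cut-off and the function name are assumptions, and the records are synthetic.

```python
from collections import defaultdict


def top_percent_by_group(records, percent=1):
    """Keep the top `percent` % of proteins by intensity within each antigen source.

    `records` is an iterable of (source, protein, intensity) tuples.
    Assumption: the count kept is ceil(percent/100 * group size), minimum one.
    """
    groups = defaultdict(list)
    for source, protein, intensity in records:
        groups[source].append((intensity, protein))
    selected = {}
    for source, items in groups.items():
        keep = max(1, (len(items) * percent + 99) // 100)  # integer ceiling
        items.sort(reverse=True)  # highest intensity first
        selected[source] = [protein for _, protein in items[:keep]]
    return selected


# Synthetic data: 200 proteins for one source, 50 for another.
records = [("E", f"E_{i}", float(i)) for i in range(200)]
records += [("S", f"S_{i}", float(i)) for i in range(50)]
print(top_percent_by_group(records))
```

A threshold taken from an interpolated 99th percentile would be an equally plausible reading and can give a slightly different list for small groups.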
To focus on relevant antigens, the list of detected proteins was prepared after removal of both human proteins and ribosomal proteins; human proteins are assumed to be contaminants from the serum sample, and ribosomal proteins are highly abundant intracellular proteins that were not of interest in our analysis of antibody responses.

Figure 3.2 Heatmap of all intensity values per protein identified in the MS spectrum. The different conditions are clustered by vector distance, as shown by the dendrogram. The clustering employed a complete-linkage method, grouping together similar conditions. The colour intensity corresponds to the logarithm of the protein intensity, which was calculated by MaxQuant from the mass spectrum of peptides belonging to the protein.

[Figure 3.2 image: heatmap; x-axis: mass spectrometry runs; y-axis: identified potential antigens; colour scale: log2 of protein intensity; column annotation by lysate: E. coli, K. pneumoniae w.t., K. pneumoniae m.t., Mix, P. aeruginosa, S. typhimurium.]

Organism | Protein name (Antigen) | Patients (out of 9) | In Lysate
K. pneumoniae | Outer membrane protein 3a | 9 | ABSENT
K. pneumoniae | Glyceraldehyde-3-phosphate dehydrogenase | 0 | PRESENT
K. pneumoniae | Murein lipoprotein | 1 | ABSENT
K. pneumoniae | Outer membrane pore protein 1a, 1b, N, E | 9 | ABSENT
K. pneumoniae | Putative glycerol dehydrogenase | 0 | PRESENT
K. pneumoniae | Elongation factor Tu | 7 | PRESENT
K. pneumoniae | 60 kDa chaperonin (GroEL protein) | 1 | ABSENT
K. pneumoniae, ΔcpsB | Outer membrane protein 3a (II*Gd) | 5 | ABSENT
K. pneumoniae, ΔcpsB | Glyceraldehyde-3-phosphate dehydrogenase | 0 | PRESENT
K. pneumoniae, ΔcpsB | Outer membrane pore protein 1a, 1b, N, E | 3 | ABSENT
K. pneumoniae, ΔcpsB | Putative glycerol dehydrogenase | 0 | PRESENT
K. pneumoniae, ΔcpsB | Elongation factor Tu | 9 | PRESENT
K. pneumoniae, ΔcpsB | ATP synthase subunit beta | 5 | ABSENT
K. pneumoniae, ΔcpsB | DNA-directed RNA polymerase subunit beta' | 3 | ABSENT
K. pneumoniae, ΔcpsB | 60 kDa chaperonin (GroEL protein) | 1 | ABSENT
E. coli | Putative glyceraldehyde-3-phosphate dehydrogenase C (GAPDH-C) | 0 | PRESENT
E. coli | ATP-dependent RNA helicase DeaD | 1 | ABSENT
E. coli | Outer membrane protein C, 1b | 5 | ABSENT
E. coli | Glycerol kinase | 2 | ABSENT
E. coli | Tryptophanase | 9 | PRESENT
E. coli | Outer membrane protein A | 4 | PRESENT
E. coli | Thioredoxin 1 | 3 | ABSENT
E. coli | Cytochrome bd-I ubiquinol oxidase subunit 1 | 2 | ABSENT
E. coli | Chromosome partition protein MukF | 1 | ABSENT
P. aeruginosa | Elongation factor Tu (EF-Tu) | 0 | PRESENT
P. aeruginosa | Major outer membrane lipoprotein I | 4 | ABSENT
P. aeruginosa | Outer membrane porin F | 9 | PRESENT
P. aeruginosa | 60 kDa chaperonin (GroEL protein) | 4 | PRESENT
P. aeruginosa | Probable outer membrane protein | 9 | ABSENT
P. aeruginosa | Peptidoglycan-associated lipoprotein | 1 | ABSENT
S. typhimurium | Major outer membrane lipoprotein 1, 2 | 4 | ABSENT
S. typhimurium | Outer membrane porin protein D, F, N, C | 3 | ABSENT
S. typhimurium | ATP-dependent RNA helicase DeaD | 5 | ABSENT
S. typhimurium | Outer membrane protein A | 0 | PRESENT
S. typhimurium | Elongation factor Tu (EF-Tu) | 0 | PRESENT
S. typhimurium | Thioredoxin 1 (Trx-1) | 7 | ABSENT
S. typhimurium | Phase 2 flagellin | 0 | PRESENT
S. typhimurium | MFS family galactose:proton symporter | 1 | ABSENT
S. typhimurium | Putative periplasmic protein | 1 | ABSENT
S. typhimurium | Pyruvate dehydrogenase E1 component | 2 | ABSENT

Table 3.3 The list of proteins in the top 1 % quantile of all intensities, per bacterium.
The “Patients” column gives the number of patients whose antibodies pulled down the protein as a potential antigen. The “In Lysate” column specifies whether the protein was identified in the pure lysate MS runs. The grey rows are proteins that appeared in lysate samples but were not pulled down by any of the patients’ antibodies.

3.2.2 Isotopically labeled proteins are relatively quantified by MS

While the list of detected antigens is in itself a highly informative result, we also assessed immunoreactivity across patients by quantitative proteomics. Direct comparison of protein intensities between different MS runs by label-free quantification is possible, but can introduce error due to differences in MS performance between runs (92). To remove this bias, the ratios of protein intensities from the condition (light or medium label) to the standard (heavy label) were used for inter-run comparisons. Unfortunately, despite best efforts, not all the identified proteins (shown in Figure 3.2) had their intensities reported in both the condition and control channels. A heatmap of all the proteins for which this ratio is available for further processing is shown in Figure 3.3. This heatmap shows protein ratio presence after contaminants (human proteins), potential contaminants (ribosomal proteins), and nonsense matches (proteins from species not present in the run) were removed. It is these nonsense matches that most likely caused MaxQuant’s internal normalization to report many ratios as “NA” (no answer), despite the intensity values for these proteins still being present. In addition, the ratios reported by MQ are supposed to be z-score normalized, which does not appear to be the case (Figure 3.4). I computed the missing ratios manually by dividing the intensity value of the protein in the sample by its intensity value in the control, and included them in the results (dark blue in Figure 3.3).

Figure 3.3 The binary heatmap of protein ratio availability.
Light blue marks all the ratios reported by MQ between the experimental (light or medium) and control (heavy) lanes. Due to MaxQuant’s difficulty in normalizing this data, all the ratios were recomputed and their logarithms z-score normalized manually. The protein ratios that MQ was unable to compute, and that were added manually, are shown in dark blue.

[Figure 3.3 image: binary heatmap; x-axis: mass spectrometry samples; y-axis: protein ratios; legend: MQ computed, manually computed.]

The ratios were converted to a log2 scale. All the log-ratios belonging to a single MS run and label (e.g., Patient 3 and E. coli lysate) were z-score normalized to have a mean of 0 and a standard deviation of 1. It is this normalized log-ratio that is used to compare antigen reactivity across patients. This normalization assumes that the protein intensity stays relatively constant across patients, which might not hold when a patient’s immune system was recently exposed to a tested pathogen. In that case, the raw ratios should be used instead. To visualize the effect of normalization, a boxplot of the ratios for each bacterial lysate before and after normalization is shown in Figure 3.4.

Figure 3.4 The log of intensity ratio before and after normalization. The normalization employed was a “normalization by z-score”, which brings the mean to zero and the standard deviation to one.

Following normalization, the patients’ immunoreactivities were directly compared. Figure 3.5 shows the comparison of antigens with the largest distance from the overall antigen mean.
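The log2 transformation and z-score normalization described in section 3.2.2 can be sketched for a single MS run and label as follows. This is an illustration only: the use of the sample standard deviation (n − 1 denominator) and the pass-through of missing ratios as `None` are assumptions, since the thesis does not specify either, and the function name is hypothetical.

```python
import math
import statistics


def zscore_log2(ratios):
    """Log2-transform positive ratios and scale them to mean 0, standard deviation 1.

    Missing ratios (None) are ignored when computing the mean and SD and are
    returned unchanged. Requires at least two non-missing, non-identical ratios.
    """
    logs = [math.log2(r) for r in ratios if r is not None]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # assumption: sample SD (n - 1 denominator)
    return [None if r is None else (math.log2(r) - mean) / sd for r in ratios]


# Ratios 1, 2 and 4 become log2 values 0, 1 and 2 before scaling.
print(zscore_log2([1.0, None, 2.0, 4.0]))
```

Because each run and label is scaled separately, a run in which all antigens are uniformly more intense is indistinguishable after normalization from one in which they are uniformly weaker, which is the caveat noted above for recently exposed patients.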
[Figure 3.4 image: two boxplot panels, “Relative Intensity Ratios” and “Normalized Relative Intensity Ratios”; x-axis: Cell (E, K, Km, Mix, P, S); y-axis: Log2 (ratio); legend: Patient1 to Patient9.]

Due to large differences in protein coverage between patients, some antigen sources, such as S. typhimurium, have no antigens shared by all patients. In this case the most common antigen is shown instead.

Figure 3.5 The quantitative comparison of antigens across patients. The antigens with the highest normalized ratios were plotted across all the patients. These ratios are derived from the protein intensities of the sample relative to the control, which is present in each MS run, allowing for interpatient comparisons without technical MS biases. The antigen performance does not appear to be consistent between patients, suggesting that patients do not mount a response against the same antigen.

3.2.3 STRING finds significant enrichment in GO Cellular Component domain

The STRING analysis tool uses numerous known protein networks and databases to discover network overrepresentation (91). Of particular interest was overrepresentation of Gene Ontology terms (93). Of the five tested protein data sets, as detailed in section 3.1.8.1, the GO Cellular Component domain was significantly enriched with E. coli proteins, with a p-value of 0.01. The background data set was the set of all identified E. coli proteins (both antigens and lysate proteins). The protein interactions are shown in Figure 3.6, and the enriched GO terms with their FDR-adjusted p-values are shown in Table 3.4. GO Biological Process, GO Molecular Function, KEGG Pathways (94–96), PFAM Protein Domains (97), and INTERPRO Protein Domains and Features (98), which are all part of STRING’s analysis, were not significantly overrepresented.
Enriched GO term | Matching proteins | FDR-adjusted p-value
cell outer membrane | lpp, ompN, pal, par, slyB, tut | 0.0136
integral component of membrane | acrF, cydA, ftsH, lpp, ompN, pal, par, tut, yajC | 0.0386
external encapsulating structure | lpp, ompN, pal, par, slyB, tut | 0.0386
cell envelope | lpp, ompN, pal, par, slyB, tut | 0.0386

Table 3.4 Enriched GO Cellular Component terms. Only E. coli potential antigens showed significant enrichment when compared to a background consisting of all identified proteins (including pure lysate proteins). Listed are the GO terms that show significant overrepresentation as analysed by the STRING database.

Figure 3.6 The STRING interaction network of E. coli proteins. The STRING protein interaction network analysis identified the E. coli potential antigens as having significantly different interactions than the background E. coli proteins, which consisted of all identified proteins including proteins from the cell lysate.

3.3 Discussion

3.3.1 Potential antigens

Examination of the antigen candidates in Table 3.3 shows that for both wild type K. pneumoniae and the ΔcpsB mutant, which lacks the outer capsule, the top hits are outer membrane proteins (OMPs). OMPs are probable antigens for antibody responses since they are readily accessible to antibodies and B cell receptors (99). Of note, the capsule-lacking mutant and the wild type K. pneumoniae share many of the likely antigens, suggesting that the lysis protocol was sufficient to release proteins from beneath the capsule. The candidate E. coli antigens come from a wider variety of cellular compartments. Six of the proteins are cytosolic, with the rest being part of the outer membrane. One suspected E. coli antigen that was reactive to every patient is tryptophanase, which is involved in tryptophan as well as nitrogen metabolism (100). The list of putative Pseudomonas antigens contains almost exclusively outer membrane proteins, with the exception of the GroEL protein.
This molecular chaperone is essential for protein folding in many organisms (101) and may have been detected either through direct interaction with an antibody or through association with other proteins that specifically bind to antibodies; we note that GroEL is expressed, but was not detected, in E. coli lysates, and thus is unlikely to have a strong non-specific interaction with the beads. Among the top candidate Salmonella typhimurium antigens are two proteins of interest: thioredoxin and the DEAD box RNA helicase. Thioredoxin is indirectly linked to Salmonella pathogenicity because it is required for proper function of Salmonella pathogenicity island 2, which is known to contribute to virulence (102); as with GroEL, thioredoxin is also present in E. coli but was not identified in the samples. The DEAD box RNA helicase is an essential, highly conserved protein that was also detected in Klebsiella and E. coli samples, although its mass peaks were not among the top potential antigens in those cells.

3.3.2 Heterogeneity of the data

As noted above, an aspect of the data that makes analysis more challenging is sparsity: the lack of protein overlap between different measurements. This sparsity is best seen in Figure 3.2, where it is clear that most of the proteins are found in only a few patients. There are three possible reasons for the high variability in protein coverage:

I. The mass spectrometer was overloaded, and was not able to distinguish between all of the peptides hitting the detector. The identified peptides are only a fraction of what was present in the sample.
II. The Andromeda search engine (part of MQ) was not able to distinguish all the individual peptides due to ambiguity in mass assignment.
III. The immunoreactive antigens are truly heterogeneous, and each patient reacts to a very different set of antigens.

The first and second causes can be addressed by lowering the complexity of the samples: fractionating the samples, simplifying the “heavy” control samples, removing one of the isotope labels (e.g., using “light” and “heavy” only), or immunoprecipitating out the proteins that are not of interest. The third, if correct, would suggest that z-score normalization of mass spectrometry runs should not be performed, and that the intensity ratios should be used for direct comparisons in their raw form. Overloading of the MS and/or MQ is likely the primary reason for this heterogeneity, as shown in sections 3.3.3 to 3.3.5.

3.3.3 Need for fractionation

A primary motivation for this work was to test how the combination of polyclonal antibody capture and MS/MQ would perform on more complex samples. Accordingly, patients’ serological antibodies were exposed to a mixture of all 5 cell lysates at once. The “Mix” condition did not identify as many candidate antigens as one might expect from an additive model in which the total antigen coverage of each single-species lysate condition is preserved. Table 3.5 shows that the mixed lysate identifies about half as many proteins as would be expected in a perfect situation, where every potential antigen is present in the “Mix” samples.

Filtering | E. coli | P. aeruginosa | K. pneumoniae (w.t.) | K. pneumoniae (mutant) | S. typhimurium | Total | Mix
NO | 320 | 373 | 538 | 387 | 150 | 1768 | 523
YES | 181 | 262 | 295 | 285 | 50 | 1073 |

Table 3.5. The total number of identified proteins per lysate across all patients. The first row represents all identified proteins, regardless of the lysate species. For example, a K. pneumoniae protein can be identified in E. coli lysate. This is a consequence of MQ attempting to identify each protein in each condition. The second row filters out these impossible matches. All the data presented in this thesis is filtered.
Even with filtering, the mixed lysate shows surprisingly few proteins. Ideally, it would contain 1073 proteins (the sum from all single-bacterium runs). The number of identified proteins is probably lower because MS and MQ reach their resolution limits in a highly heterogeneous sample.

There could be several reasons for this. First, the result could be due in part to the bioinformatic analysis of highly homologous proteins. If numerous proteins match an identified peptide pool, MaxQuant groups these proteins together as a single hit; thus, it is possible that some proteins were grouped together in the “Mix” analysis, thereby reducing the observed diversity of antigens. Analysis of the identity of detected proteins across samples suggests that this effect can account for only a small portion of the reduction. Second, the mixed lysate sample was prepared such that it contained a lower concentration of each individual lysate. A lower antigen concentration could lead to less efficient capture due to the finite affinity of antibody-antigen interactions, reducing the fraction of each antigen captured. Third, shotgun MS, while highly sensitive, often needs numerous runs or fractionations to achieve full proteome coverage (103). The more complicated lysate mixture may therefore have exceeded the resolving power of the instrument, resulting in a smaller number of identified proteins than expected. Based on the sparsity of data in all samples, including the pure lysates, MS and/or MQ overloading is likely the most significant factor in the protein count discrepancy. In this case the lysates should be fractionated (e.g., by cellular localization (cytosolic vs. membrane proteins) or by protein size) prior to capture on beads.

3.3.4 Technical replicates show medium correlation

Patient1 antibodies were exposed to both the mutant and wild type Klebsiella pneumoniae lysates twice, in different MS runs.
The correlations between the intensity ratios of antigens found in both runs are shown in Figure 3.7. The wild type and mutant K. pneumoniae have between-run correlations of 0.56 and 0.82, respectively. Since these isotope ratios in theory correct for the technical variability of the MS instrument, the lack of perfect correlation is a consequence of variable peak detection within the heterogeneous samples.

Figure 3.7. The comparisons between two pairs of identical runs. The left side shows the intensity of antigens detected from Patient1 serum mixed with wild type K. pneumoniae; the right side shows the same Patient1 serum against mutant K. pneumoniae.

[Figure 3.7 image: two scatter plots, K. pneumoniae w.t. and mutant K. pneumoniae, each annotated with its Pearson correlation; x-axis: logged first run intensity; y-axis: logged second run intensity.]

3.3.5 High variability of correlations between MS runs of the same sample

The “heavy” label runs were used for analysis of a control mixture consisting of all samples. Thus, variability of raw intensity across runs in this channel can be used to assess technical variability. Inspection of Figure 3.8 shows a lack of consistency between these runs, which is also represented as a correlation matrix in Figure 3.9. Again, this variability is attributable to overloading of the MS and to MQ’s identification method. When a protein is discovered in an MS run in one of the isotopic channels, MQ attempts to quantify the same protein in the other isotopic channels as well. The strength and richness of the first four columns in Figure 3.8, where either the light or medium isotopes were diverse lysate solutions, support this explanation. The exception is the sixth column from the right, which is not rich in protein coverage despite the medium label containing S. typhimurium lysate. However, the low lysis efficiency for the Salmonella lysates may explain this observation.
Despite the variability in coverage, the heavy channel can still serve as a means to normalize peak intensity for a fraction of the antigens that are detected in multiple channels.

Figure 3.8. The heatmap of the control sample in all MS runs (heavy label control). The same sample, consisting of a mixture of all other MS runs, was labeled with the heavy isotope and run in every MS run, serving as a control to minimize bias between MS runs.

[Figure 3.8 image: heatmap; x-axis: MS runs; y-axis: antigens.]

Figure 3.9. The Spearman correlation matrix between protein intensities of the “heavy” normalization channel. This matrix shows correlation between identical samples. The samples were highly complex, consisting of a mixture of all lysates and antigens. This variability can be attributed to overloaded MS, MQ and its Andromeda search engine, as well as variability in the MS between separate runs.
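A correlation of the kind summarized in Figure 3.9 can be sketched for a single pair of runs as follows. This is a minimal illustration, assuming that only proteins quantified in both runs are compared (pairwise-complete observations) and that tied intensities receive their average rank; the thesis does not state how missing values or ties were handled, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if a variance is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def spearman(run_a, run_b):
    """Spearman correlation over proteins quantified (not None) in both runs."""
    pairs = [(a, b) for a, b in zip(run_a, run_b) if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan")
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up heavy-channel intensities for the same five proteins in two runs.
print(spearman([1.0e6, None, 3.0e5, 4.0e4, 2.0e5], [8.0e5, 1.0e5, 2.5e5, None, 3.0e5]))
```

With sparse data, each pairwise correlation may rest on a different and sometimes small set of shared proteins, which is one reason the entries of such a matrix can vary widely.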
[Figure 3.9 image: lower-triangular correlation matrix of heavy-channel protein intensities; rows and columns are the MS runs, labelled Intensity.H.<run>; colour scale from −1.0 to 1.0, legend labelled “Pearson Correlation”.]

3.3.6 MaxQuant

There are many software packages available for MS data analysis. Mascot (104) and SEQUEST (105) are older software packages that performed well in the past, but do not keep up as well with high-resolution MS data (106). Newer algorithms such as Andromeda (53), Morpheus (107), and MS Amanda (108) are optimized for high-throughput, high-resolution MS data analysis, with each package claiming superiority of its particular search algorithm. Here we used Andromeda, a probability-based search algorithm that is similar to Mascot and is integrated into MaxQuant (MQ). One of Andromeda’s advantages is that it employs a parallel search algorithm in which a mass peak identified in any condition is quantified across all other conditions as well; this is true even when the peak is missed and regarded as background noise in some samples.
The upside of this approach is that protein intensity can be compared even when the intensity levels are very low, as was the case in our experiments. The downside is that a parallel search assumes the samples are mostly homogeneous and that each peptide is present in each MS run, with the only difference being the peptide’s peak height. This assumption of homogeneity may hold between experiments using the same lysate, but is not valid for conditions using different lysate solutions. To correct for this, the MQ output was manually filtered to remove any nonsensical peptide hits (e.g., an E. coli peptide measured in K. pneumoniae lysate).

Chapter 4: Conclusions and Recommendations for Future Steps

In this work I have developed a serological immune profiling protocol which can be used for antigen discovery from a variety of antibody and antigen sources. We demonstrate the utility of this protocol by successfully identifying antigens from a bacterial lysate antigen panel, and profiling antibody reactivity across patients. Furthermore, by using an internal control that minimizes technical error between different MS runs, we were able to quantitatively compare antigens detected multiple times across patients. These results could be of use to groups studying opportunistic bacterial pathogens and/or heterogeneity of immune reactions across patients. Of note is the large difference between patient immunoreactivity to common antigens (Figure 3.5), a feature that is not well described and warrants further investigation using orthogonal methods.

The presented protocol offers a complementary method for any research group attempting full characterisation of antibodies. Specifically, our method enables untargeted and highly multiplexed analysis of antibody reactivity that is not possible using alternative approaches.
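The manual filtering of MQ output described in Section 3.3.6 (removing, for example, an E. coli peptide reported in a K. pneumoniae lysate run) could in principle be automated along the following lines. This is a minimal sketch under stated assumptions, not the procedure used in this work: the lysate codes, the dictionary keys (protein, organism, lysate) and the function name filter_hits are hypothetical, and real MaxQuant output would first have to be parsed into this form.

```python
# Hypothetical sketch: drop protein hits whose source organism cannot be
# present in the lysate used for that run. The lysate codes and record
# keys below are illustrative assumptions, not MaxQuant's output schema.

# Assumed mapping: lysate code -> organisms physically present in it.
LYSATE_ORGANISMS = {
    "E": {"Escherichia coli"},
    "K": {"Klebsiella pneumoniae"},
    "P": {"Pseudomonas aeruginosa"},
    "S": {"Salmonella typhimurium"},
}
# A mixed lysate is assumed to contain every organism listed above.
LYSATE_ORGANISMS["Mix"] = set().union(*LYSATE_ORGANISMS.values())


def filter_hits(hits):
    """Return only the hits whose organism is present in the run's lysate."""
    kept = []
    for hit in hits:
        allowed = LYSATE_ORGANISMS.get(hit["lysate"])
        if allowed is None:
            # Fail loudly rather than silently keeping or dropping the hit.
            raise ValueError(f"unknown lysate code: {hit['lysate']!r}")
        if hit["organism"] in allowed:
            kept.append(hit)
    return kept


example_hits = [
    {"protein": "OmpA", "organism": "Escherichia coli", "lysate": "E"},
    # Implausible: an E. coli protein reported in a K. pneumoniae lysate run.
    {"protein": "OmpA", "organism": "Escherichia coli", "lysate": "K"},
    {"protein": "OmpK36", "organism": "Klebsiella pneumoniae", "lysate": "Mix"},
]
filtered = filter_hits(example_hits)
```

In this sketch, human proteins such as Ig contaminants would also be discarded because they are not listed for any lysate; whether that is desirable depends on the analysis and would need to be decided separately.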
While ELISA is cheap and fast, it can only detect the aggregate reactivity to a complex sample, and provides no information on the identity of reactive antigens within a mixture. In contrast, phage display can be used to obtain antigen information, but is seriously limited in the size and folding of antigens that can be presented, and also does not allow for detection of interactions that require posttranslational modifications or complex tertiary and quaternary protein structure. Previously reported protocols for AMIDA have attempted to address these issues by identification of antigens using MS directly from a cellular lysate, thereby allowing for testing against complex mixtures of proteins that present a native conformational structure. However, this earlier work relied heavily on gel separation of proteins and did not take advantage of modern MS analysis to obtain higher sensitivity and quantitative measurements.

My work has contributed to establishing an optimized and quantitative version of this method, dubbed qAMIDA. qAMIDA improves upon the original method by Gires et al. (1) in several areas: it replaces 2D protein gel separation followed by MALDI-TOF MS with high-resolution ESI LC-MS/MS, which greatly increases identification power; it incorporates modern MS instrumentation and analysis software to improve sensitivity and antigen identification; and it employs isotope labeling to enable quantitative comparison of different samples across a large number of patients. I have demonstrated the application of this method in the profiling of human serum samples for reactivity against bacterial pathogens, providing a means to assess patient-specific responses that are indicative of previous exposure to pathogens.

4.1 Suggestions for Further Improvements

At the initiation of this work we defined a set of research questions to direct the development of the qAMIDA method. These are summarized in Table 4.1 below.
Experimental Questions | Answers
Is the protocol able to identify potential antigens in a larger scale experiment? | Yes. Despite the sample complexity, the output provides a list of potential antigens and their relative intensities.
What are the bottlenecks in the experiment in terms of time and organizational complexity? | The in-gel digest was the slowest step. For 62 samples, laboratory work took about 12 working days. The MQ analysis, although fully automated, took about 240 hours on a desktop with an Intel i7-4790 CPU and 16 GB RAM.
Does the bioinformatic analysis get more complicated, and could it be optimized? | MQ is designed primarily for samples with little difference between runs. It is possible that different software could be more appropriate.
Is there variation between technical replicates, and if yes, is there a value in technical replicates? | The variation is highly significant. It would be interesting to run the same sample numerous times to see how the intensities vary between isotopes on the same run, as well as between runs.
Are there issues with a more complex antigen mixture? | Yes. We suspect that the resolving capacity of the MS and/or MQ was reached in lysate mixtures.

Table 4.1. The answers for primary experimental goals.

4.1.1 Need for cross validation

The potential antigens, i.e. proteins with the highest intensity, consist mainly of outer membrane proteins and ribosomal proteins. While outer membrane proteins are great antigen candidates, ribosomal proteins could simply be contaminants since they are the most abundant proteins in bacteria (109). Even though this abundance makes them possible antigen candidates (110), it is impossible to tell whether their presence is due to a true antibody-antigen response or simply due to contamination from the lysate. Cross-validation is therefore essential to establish the false positive rate of the analysis. Kamhieh-Milz et al.
(111) used a dot blot assay where the potential antigens were purified, washed over with serum antibodies, and detected with fluorescent anti-IgG antibodies. A similar validation method could be used for this dataset as well.

4.1.2 Fractionation of antigen panel

The low correlation values for technical replicates, as shown in Figure 3.7 and Figure 3.9, as well as the lack of any increase in data density coming from mixed lysates, suggest that the resolving capacity of the MS and MQ has been exceeded. To lower the sample complexity, the cell lysates could first be fractionated into membrane proteins, cytoplasmic proteins, and, if possible, extracellular proteins, which would provide the benefit of including toxins in the protein pool. Each fraction could use an optimized protein solubilisation method, further increasing the performance of qAMIDA.

4.1.3 Need for technical replicates

Technical replicates would allow for more robust statistical processing, including the application of pipelines developed for genomic data analysis such as limma (112), which rely on numerous measurements per sample. They would also provide a means to determine whether the sample complexity is overloading the system. Even without fractionation, combining results from identical samples would improve the resolving power of the MS instrument and MQ’s Andromeda search engine, since each run would discover different proteins (113).

4.2 Comparison to Recently Published MS-AMIDA Paper

Kamhieh-Milz et al. (111) published an improved version of the original AMIDA protocol, which they termed Mass Spectrometry-based Antibody-Mediated Identification of Autoantigens (MS-AMIDA). Like the qAMIDA method described here, MS-AMIDA uses quantitative MS to discover antigens, in its case self-antigens. Serum antibodies were collected on a protein G substrate and covalently bound to protein G beads, which were then used to capture self-antigens from immune thrombocytopenic purpura patients.
The output was a list of antigens with intensity ratios assigned to them. The most likely candidates were then validated on a dot blot. The major difference between qAMIDA and MS-AMIDA is that the former uses isotope triple labeling to minimize potential bias between different MS runs and to allow more accurate comparisons across all MS runs. The major technical differences are listed in Table 4.2.

qAMIDA | MS-AMIDA
Analyzes foreign antigen sources (bacterial lysates) | Analyzes self-antigens
Uses Melon Gel for IgG extraction | Uses protein G for IgG extraction
Binds IgGs to tosylactivated Dynabeads | Binds IgGs to protein G beads
Uses isotopic labeling | Does not label samples (employs label-free quantification)
Does not validate | Validates results on dot blot

Table 4.2. The major differences between qAMIDA and MS-AMIDA experiments.

MS-AMIDA identified 332 proteins on average, of which 185 were contaminants that appeared on IgG-coated beads without any exposure to the antigens. Kamhieh-Milz et al. noted that most of these proteins were IgG fragments. The reproducibility of this previous work was also very low; out of 181 potential autoantigens (AAGs), only 34 were found in a second MS-AMIDA run. Although the authors stated there were “minor” differences between the two runs and concluded that the MS-AMIDA method was reproducible, further evidence would be needed to fully substantiate this claim. Here the testing of reproducibility was substantially more thorough and shows that the optimized method achieves greater correlation between runs.

Overall, qAMIDA is a promising tool for serological immune profiling. It provides a relatively fast and straightforward way of identifying potential antigens, and although it would benefit from further improvements such as a cross-validation step (as demonstrated by MS-AMIDA (111)) and fractionation of the bacterial lysate, it provides a scalable high-throughput method to profile the serological immunoreactivity of patients.

Bibliography

1.
Gires O, Münz M, Schaffrik M, Kieu C, Rauch J, Ahlemann M, et al. Profile identification of disease-associated humoral antigens using AMIDA, a novel proteomics-based technology. Cellular and molecular life sciences: CMLS. [Online] 2004;61(10): 1198–1207. Available from: doi:10.1007/s00018-004-4045-8
2. Kang D, Liu G, Lundström A, Gelius E, Steiner H. A peptidoglycan recognition protein in innate immunity conserved from insects to humans. Proceedings of the National Academy of Sciences. [Online] 1998;95(17): 10078–10082. Available from: doi:10.1073/pnas.95.17.10078
3. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. Innate Immunity. 2002; Available from: https://www.ncbi.nlm.nih.gov/books/NBK26846/ [Accessed: 15th January 2017]
4. Alt FW, Oltz EM, Young F, Gorman J, Taccioli G, Chen J. VDJ recombination. Immunology Today. [Online] 1992;13(8): 306–314. Available from: doi:10.1016/0167-5699(92)90043-7
5. Murphy K, Weaver C. Janeway’s Immunobiology. 9th edition. New York, NY: Garland Science/Taylor & Francis Group, LLC; 2016. 904 p.
6. Davies DR, Chacko S. Antibody structure. Accounts of Chemical Research. [Online] 1993;26(8): 421–427. Available from: doi:10.1021/ar00032a005
7. Xu JL, Davis MM. Diversity in the CDR3 region of V(H) is sufficient for most antibody specificities. Immunity. 2000;13(1): 37–45.
8. Elhanati Y, Sethna Z, Marcou Q, Callan CG, Mora T, Walczak AM. Inferring processes underlying B-cell repertoire diversity. Philosophical Transactions of the Royal Society B: Biological Sciences. [Online] 2015;370(1676): 20140243. Available from: doi:10.1098/rstb.2014.0243
9. Sawchuk DJ, Weis-Garcia F, Malik S, Besmer E, Bustin M, Nussenzweig MC, et al. V(D)J Recombination: Modulation of RAG1 and RAG2 Cleavage Activity on 12/23 Substrates by Whole Cell Extract and DNA-bending Proteins. Journal of Experimental Medicine. [Online] 1997;185(11): 2025–2032. Available from: doi:10.1084/jem.185.11.2025
10. Lu H, Schwarz K, Lieber MR.
Extent to which hairpin opening by the Artemis:DNA-PKcs complex can contribute to junctional diversity in V(D)J recombination. Nucleic Acids Research. [Online] 2007;35(20): 6917–6923. Available from: doi:10.1093/nar/gkm823
11. Bassing CH, Alt FW. The cellular response to general and programmed DNA double strand breaks. DNA Repair. [Online] 2004;3(8–9): 781–796. Available from: doi:10.1016/j.dnarep.2004.06.001
12. Brady BL, Steinel NC, Bassing CH. Antigen Receptor Allelic Exclusion: An Update and Reappraisal. The Journal of Immunology. [Online] 2010;185(7): 3801–3808. Available from: doi:10.4049/jimmunol.1001158
13. DeKosky BJ, Kojima T, Rodin A, Charab W, Ippolito GC, Ellington AD, et al. In-depth determination and analysis of the human paired heavy- and light-chain antibody repertoire. Nature Medicine. [Online] 2014;21(1): 86–91. Available from: doi:10.1038/nm.3743
14. Depoil D, Fleire S, Treanor BL, Weber M, Harwood NE, Marchbank KL, et al. CD19 is essential for B cell activation by promoting B cell receptor-antigen microcluster formation in response to membrane-bound ligand. Nature Immunology. [Online] 2008;9(1): 63–72. Available from: doi:10.1038/ni1547
15. Landry JP, Fei Y, Zhu X. Simultaneous Measurement of 10,000 Protein-Ligand Affinity Constants Using Microarray-Based Kinetic Constant Assays. Assay and Drug Development Technologies. [Online] 2012;10(3): 250–259. Available from: doi:10.1089/adt.2011.0406
16. Eisen HN. Affinity Enhancement of Antibodies: How Low-Affinity Antibodies Produced Early in Immune Responses Are Followed by High-Affinity Antibodies Later and in Memory B-Cell Responses. Cancer Immunology Research. [Online] 2014;2(5): 381–392. Available from: doi:10.1158/2326-6066.CIR-14-0029
17. Poon LLM, Song T, Rosenfeld R, Lin X, Rogers MB, Zhou B, et al. Quantifying influenza virus diversity and transmission in humans. Nature genetics. [Online] 2016;48(2): 195–200. Available from: doi:10.1038/ng.3479
18.
Vaccines against influenza WHO position paper – November 2012. Releve Epidemiologique Hebdomadaire. 2012;87(47): 461–476.
19. Burton DR, Ahmed R, Barouch DH, Butera ST, Crotty S, Godzik A, et al. A Blueprint for HIV Vaccine Discovery. Cell Host & Microbe. [Online] 2012;12(4): 396–407. Available from: doi:10.1016/j.chom.2012.09.008
20. Wang R. Induction of Antigen-Specific Cytotoxic T Lymphocytes in Humans by a Malaria DNA Vaccine. Science. [Online] 1998;282(5388): 476–480. Available from: doi:10.1126/science.282.5388.476
21. Nakahara J, Maeda M, Aiso S, Suzuki N. Current Concepts in Multiple Sclerosis: Autoimmunity Versus Oligodendrogliopathy. Clinical Reviews in Allergy & Immunology. [Online] 2012;42(1): 26–34. Available from: doi:10.1007/s12016-011-8287-6
22. Bach J-F. The Effect of Infections on Susceptibility to Autoimmune and Allergic Diseases. New England Journal of Medicine. [Online] 2002;347(12): 911–920. Available from: doi:10.1056/NEJMra020100
23. Abel L, Kutschki S, Turewicz M, Eisenacher M, Stoutjesdijk J, Meyer HE, et al. Autoimmune profiling with protein microarrays in clinical applications. Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics. [Online] 2014;1844(5): 977–987. Available from: doi:10.1016/j.bbapap.2014.02.023
24. Clinical Cancer Advances 2016. [Online] ASCO. Available from: http://www.asco.org/research-progress/reports-studies/clinical-cancer-advances [Accessed: 31st January 2017]
25. Dunn GP, Old LJ, Schreiber RD. The Immunobiology of Cancer Immunosurveillance and Immunoediting. Immunity. [Online] 2004;21(2): 137–148. Available from: doi:10.1016/j.immuni.2004.07.017
26. Buchbinder EI, Desai A. CTLA-4 and PD-1 Pathways: Similarities, Differences, and Implications of Their Inhibition. American Journal of Clinical Oncology. [Online] 2016;39(1): 98–106. Available from: doi:10.1097/COC.0000000000000239
27. Hammerschmidt W, Sugden B.
Genetic analysis of immortalizing functions of Epstein–Barr virus in human B lymphocytes. Nature. [Online] 1989;340(6232): 393–397. Available from: doi:10.1038/340393a0
28. Ehlich A, Martin V, Müller W, Rajewsky K. Analysis of the B-cell progenitor compartment at the level of single cells. Current Biology. [Online] 1994;4(7): 573–583. Available from: doi:10.1016/S0960-9822(00)00129-9
29. Robinson WH. Sequencing the functional antibody repertoire—diagnostic and therapeutic discovery. Nature reviews. Rheumatology. [Online] 2015;11(3): 171–182. Available from: doi:10.1038/nrrheum.2014.220
30. Busse CE, Czogiel I, Braun P, Arndt PF, Wardemann H. Single-cell based high-throughput sequencing of full-length immunoglobulin heavy and light chain genes: New technology. European Journal of Immunology. [Online] 2014;44(2): 597–603. Available from: doi:10.1002/eji.201343917
31. DeKosky BJ, Ippolito GC, Deschner RP, Lavinder JJ, Wine Y, Rawlings BM, et al. High-throughput sequencing of the paired human immunoglobulin heavy and light chain repertoire. Nature Biotechnology. [Online] 2013;31(2): 166–169. Available from: doi:10.1038/nbt.2492
32. Howie B, Sherwood AM, Berkebile AD, Berka J, Emerson RO, Williamson DW, et al. High-throughput pairing of T cell receptor α and β sequences. Science Translational Medicine. [Online] 2015;7(301): 301ra131-301ra131. Available from: doi:10.1126/scitranslmed.aac5624
33. Towbin H, Gordon J. Immunoblotting and dot immunobinding — Current status and outlook. Journal of Immunological Methods. [Online] 1984;72(2): 313–340. Available from: doi:10.1016/0022-1759(84)90001-2
34. Kusch K, Uecker M, Liepold T, Möbius W, Hoffmann C, Neumann H, et al. Partial Immunoblotting of 2D-Gels: A Novel Method to Identify Post-Translationally Modified Proteins Exemplified for the Myelin Acetylome. Proteomes. [Online] 2017;5(1): 3. Available from: doi:10.3390/proteomes5010003
35. Ermann J, Rao DA, Teslovich NC, Brenner MB, Raychaudhuri S.
Immune cell profiling to guide therapeutic decisions in rheumatic diseases. Nature Reviews Rheumatology. [Online] 2015;11(9): 541–551. Available from: doi:10.1038/nrrheum.2015.71
36. Zhu L, He J, Cao X, Huang K, Luo Y, Xu W. Development of a double-antibody sandwich ELISA for rapid detection of Bacillus Cereus in food. Scientific Reports. [Online] 2016;6: 16092. Available from: doi:10.1038/srep16092
37. Smith GP, Petrenko VA. Phage Display. Chemical Reviews. [Online] 1997;97(2): 391–410. Available from: doi:10.1021/cr960065d
38. Smith GP. Filamentous Fusion Phage: Novel Expression Vectors that Display Cloned Antigens on the Virion Surface. Science. [Online] 1985;228(4705): 1315–1317. Available from: doi:10.2307/1694587
39. Abi-Ghanem D, Waghela SD, Caldwell DJ, Danforth HD, Berghman LR. Phage display selection and characterization of single-chain recombinant antibodies against Eimeria tenella sporozoites. Veterinary Immunology and Immunopathology. [Online] 2008;121(1–2): 58–67. Available from: doi:10.1016/j.vetimm.2007.08.005
40. Making Antibodies by Phage Display Technology. Annual Review of Immunology. [Online] 1994;12(1): 433–455. Available from: doi:10.1146/annurev.iy.12.040194.002245
41. Reymond Sutandy F, Qian J, Chen C-S, Zhu H. Overview of Protein Microarrays. Current protocols in protein science / editorial board, John E. Coligan ... [et al.]. [Online] 2013;0 27: Unit-27.1. Available from: doi:10.1002/0471140864.ps2701s72
42. Poetz O, Schwenk JM, Kramer S, Stoll D, Templin MF, Joos TO. Protein microarrays: catching the proteome. Mechanisms of Ageing and Development. [Online] 2005;126(1): 161–170. Available from: doi:10.1016/j.mad.2004.09.030
43. Frank R. The SPOT-synthesis technique: Synthetic peptide arrays on membrane supports—principles and applications. Journal of Immunological Methods. [Online] 2002;267(1): 13–26. Available from: doi:10.1016/S0022-1759(02)00137-0
44. Cretich M, Damin F, Chiari M.
Protein microarray technology: how far off is routine diagnostics? The Analyst. [Online] 2014;139(3): 528–542. Available from: doi:10.1039/C3AN01619F
45. Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. Quantitative mass spectrometry in proteomics: a critical review. Analytical and Bioanalytical Chemistry. [Online] 2007;389(4): 1017–1031. Available from: doi:10.1007/s00216-007-1486-6
46. Elliott MH, Smith DS, Parker CE, Borchers C. Current trends in quantitative proteomics. Journal of Mass Spectrometry. [Online] 2009; n/a-n/a. Available from: doi:10.1002/jms.1692
47. Björndal H, Hellerqvist CG, Lindberg B, Svensson S. Gas-Liquid Chromatography and Mass Spectrometry in Methylation Analysis of Polysaccharides. Angewandte Chemie International Edition in English. [Online] 1970;9(8): 610–619. Available from: doi:10.1002/anie.197006101
48. Gingras A-C, Gstaiger M, Raught B, Aebersold R. Analysis of protein complexes using mass spectrometry. Nature Reviews Molecular Cell Biology. [Online] 2007;8(8): 645–654. Available from: doi:10.1038/nrm2208
49. Andreotti S, Klau GW, Reinert K. Antilope—A Lagrangian Relaxation Approach to the De Novo Peptide Sequencing Problem. IEEE/ACM Trans. Comput. Biol. Bioinformatics. [Online] 2012;9(2): 385–394. Available from: doi:10.1109/TCBB.2011.59
50. Zhang J, Xin L, Shan B, Chen W, Xie M, Yuen D, et al. PEAKS DB: De Novo Sequencing Assisted Database Search for Sensitive and Accurate Peptide Identification. Molecular & Cellular Proteomics: MCP. [Online] 2012;11(4). Available from: doi:10.1074/mcp.M111.010587 [Accessed: 13th March 2017]
51. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature. [Online] 2003;422(6928): 198–207. Available from: doi:10.1038/nature01511
52. Gillet LC, Leitner A, Aebersold R. Mass Spectrometry Applied to Bottom-Up Proteomics: Entering the High-Throughput Era for Hypothesis Testing. Annual Review of Analytical Chemistry. [Online] 2016;9(1): 449–472.
Available from: doi:10.1146/annurev-anchem-071015-041535
53. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M. Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment. Journal of Proteome Research. [Online] 2011;10(4): 1794–1805. Available from: doi:10.1021/pr101065j
54. Marx V. Targeted proteomics. Nature Methods. [Online] 2013;10(1): 19–22. Available from: doi:10.1038/nmeth.2285
55. Panchaud A, Jung S, Shaffer SA, Aitchison JD, Goodlett DR. Faster, Quantitative, and Accurate Precursor Acquisition Independent From Ion Count. Analytical chemistry. 2011;83(6): 2250–2257.
56. Anand S, Samuel M, Ang C-S, Keerthikumar S, Mathivanan S. Label-Based and Label-Free Strategies for Protein Quantitation. Proteome Bioinformatics. [Online] Humana Press, New York, NY; 2017. p. 31–43. Available from: doi:10.1007/978-1-4939-6740-7_4
57. Ong S-E, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, et al. Stable Isotope Labeling by Amino Acids in Cell Culture, SILAC, as a Simple and Accurate Approach to Expression Proteomics. Molecular & Cellular Proteomics. [Online] 2002;1(5): 376–386. Available from: doi:10.1074/mcp.M200025-MCP200
58. Han DK, Eng J, Zhou H, Aebersold R. Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nature Biotechnology. [Online] 2001;19(10): 946–951. Available from: doi:10.1038/nbt1001-946
59. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nature Biotechnology. [Online] 1999;17(10): 994–999. Available from: doi:10.1038/13690
60. Hsu J-L, Huang S-Y, Chow N-H, Chen S-H. Stable-isotope dimethyl labeling for quantitative proteomics. Analytical Chemistry. [Online] 2003;75(24): 6843–6852. Available from: doi:10.1021/ac0348625
61. Boersema PJ, Aye TT, van Veen TAB, Heck AJR, Mohammed S.
Triplex protein quantification based on stable isotope labeling by peptide dimethylation applied to cell and tissue lysates. PROTEOMICS. [Online] 2008;8(22): 4624–4632. Available from: doi:10.1002/pmic.200800297
62. Choe L, D’Ascenzo M, Relkin NR, Pappin D, Ross P, Williamson B, et al. 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer’s disease. Proteomics. [Online] 2007;7(20): 3651–3660. Available from: doi:10.1002/pmic.200700316
63. Wiese S, Reidegeld KA, Meyer HE, Warscheid B. Protein labeling by iTRAQ: A new tool for quantitative mass spectrometry in proteome research. PROTEOMICS. [Online] 2007;7(3): 340–350. Available from: doi:10.1002/pmic.200600422
64. Iliuk A, Galan J, Tao WA. Playing tag with quantitative proteomics. Analytical and Bioanalytical Chemistry. [Online] 2009;393(2): 503–513. Available from: doi:10.1007/s00216-008-2386-0
65. Neilson KA, Ali NA, Muralidharan S, Mirzaei M, Mariani M, Assadourian G, et al. Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics. [Online] 2011;11(4): 535–553. Available from: doi:10.1002/pmic.201000553
66. Wang M, You J, Bemis KG, Tegeler TJ, Brown DPG. Label-free mass spectrometry-based protein quantification technologies in proteomic analysis. Briefings in Functional Genomics & Proteomics. [Online] 2008;7(5): 329–339. Available from: doi:10.1093/bfgp/eln031
67. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology. [Online] 2008;26(12): 1367–1372. Available from: doi:10.1038/nbt.1511
68. Käll L, Storey JD, MacCoss MJ, Noble WS. Posterior Error Probabilities and False Discovery Rates: Two Sides of the Same Coin. Journal of Proteome Research. [Online] 2008;7(1): 40–44. Available from: doi:10.1021/pr700739d
69. Ey PL, Prowse SJ, Jenkin CR.
Isolation of pure IgG1, IgG2a and IgG2b immunoglobulins from mouse serum using protein A-Sepharose. Immunochemistry. [Online] 1978;15(7): 429–436. Available from: doi:10.1016/0161-5890(78)90070-6
70. Schneider C, Newman RA, Sutherland DR, Asser U, Greaves MF. A one-step purification of membrane proteins using a high efficiency immunomatrix. The Journal of Biological Chemistry. 1982;257(18): 10766–10769.
71. Melon Gel IgG Spin Purification Kit - Thermo Fisher Scientific. [Online] Available from: https://www.thermofisher.com/order/catalog/product/45206 [Accessed: 5th February 2017]
72. Dynabeads M-280 Tosylactivated - Thermo Fisher Scientific. [Online] Available from: https://www.thermofisher.com/order/catalog/product/14203 [Accessed: 5th February 2017]
73. Taylor FB, Chang A, Ruf W, Morrissey JH, Hinshaw L, Catlett R, et al. Lethal E. coli septic shock is prevented by blocking tissue factor with monoclonal antibody. Circulatory Shock. 1991;33(3): 127–134.
74. LeClerc JE, Li B, Payne WL, Cebula TA. High Mutation Frequencies Among Escherichia coli and Salmonella Pathogens. Science. 1996;274(5290): 1208–1211.
75. Hayashi F, Smith KD, Ozinsky A, Hawn TR, Yi EC, Goodlett DR, et al. The innate immune response to bacterial flagellin is mediated by Toll-like receptor 5. Nature. [Online] 2001;410(6832): 1099–1103. Available from: doi:10.1038/35074106
76. Bachmann BJ. Pedigrees of some mutant strains of Escherichia coli K-12. Bacteriological Reviews. 1972;36(4): 525–557.
77. Bertani G. STUDIES ON LYSOGENESIS I. Journal of Bacteriology. 1951;62(3): 293–300.
78. Protein Purification - Extraction and Clarification - Choice of lysis buffer and additives - EMBL. [Online] Available from: https://www.embl.de/pepcore/pepcore_services/protein_purification/extraction_clarification/lysis_buffer_additives/ [Accessed: 8th February 2017]
79. Firer M. Efficient elution of functional proteins in affinity chromatography. Journal of Biochemical and Biophysical Methods.
[Online] 2001;49(1–3): 433–442. Available from: doi:10.1016/S0165-022X(01)00211-1
80. Protein Analysis Kits. [Online] Available from: http://www.genomics.agilent.com/en/Bioanalyzer-Protein-Kits/Protein-Analysis-Kits/?cid=AG-PT-104&tabId=AG-PR-1154 [Accessed: 10th February 2017]
81. Shevchenko A, Wilm M, Vorm O, Mann M. Mass Spectrometric Sequencing of Proteins from Silver-Stained Polyacrylamide Gels. Analytical Chemistry. [Online] 1996;68(5): 850–858. Available from: doi:10.1021/ac950914h
82. Peleg AY, Hooper DC. Hospital-Acquired Infections Due to Gram-Negative Bacteria. New England Journal of Medicine. [Online] 2010;362(19): 1804–1813. Available from: doi:10.1056/NEJMra0904124
83. Alcántar-Curiel MD, García-Latorre E, Santos JI. Klebsiella pneumoniae 35 and 36 kDa Porins Are Common Antigens in Different Serotypes and Induce Opsonizing Antibodies. Archives of Medical Research. [Online] 2000;31(1): 28–36. Available from: doi:10.1016/S0188-4409(99)00083-1
84. Zhu ZC, Chen Y, Ackerman MS, Wang B, Wu W, Li B, et al. Investigation of monoclonal antibody fragmentation artifacts in non-reducing SDS-PAGE. Journal of Pharmaceutical and Biomedical Analysis. [Online] 2013;83: 89–95. Available from: doi:10.1016/j.jpba.2013.04.030
85. Stover CK, Pham XQ, Erwin AL, Mizoguchi SD, Warrener P, Hickey MJ, et al. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature. [Online] 2000;406(6799): 959–964. Available from: doi:10.1038/35023079
86. cOmplete™ Protease Inhibitor Cocktail CORO. [Online] Sigma-Aldrich. Available from: http://www.sigmaaldrich.com/catalog/product/roche/coro [Accessed: 7th May 2017]
87. Rappsilber J, Mann M, Ishihama Y. Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nature Protocols. [Online] 2007;2(8): 1896–1906. Available from: doi:10.1038/nprot.2007.261
88. 3M Empore™ SPE Extraction Disks. [Online] Sigma-Aldrich.
Available from: http://www.sigmaaldrich.com/analytical-chromatography/sample-preparation/spe/3m-empore/extraction-disks.html [Accessed: 9th May 2017]
89. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Research. [Online] 2017;45(D1): D158–D169. Available from: doi:10.1093/nar/gkw1099
90. Tabb DL, Vega-Montoto L, Rudnick PA, Variyath AM, Ham A-JL, Bunk DM, et al. Repeatability and Reproducibility in Proteomic Identifications by Liquid Chromatography−Tandem Mass Spectrometry. Journal of Proteome Research. [Online] 2010;9(2): 761–776. Available from: doi:10.1021/pr9006365
91. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Research. [Online] 2015;43(Database issue): D447-452. Available from: doi:10.1093/nar/gku1003
92. Kramer G, Woolerton Y, Straalen JP van, Vissers JPC, Dekker N, Langridge JI, et al. Accuracy and Reproducibility in Quantification of Plasma Protein Concentrations by Mass Spectrometry without the Use of Isotopic Standards. PLOS ONE. [Online] 2015;10(10): e0140097. Available from: doi:10.1371/journal.pone.0140097
93. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene Ontology: tool for the unification of biology. Nature genetics. [Online] 2000;25(1): 25–29. Available from: doi:10.1038/75556
94. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Research. 2000;28(1): 27–30.
95. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Research. [Online] 2017;45(D1): D353–D361. Available from: doi:10.1093/nar/gkw1092
96. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Research. [Online] 2016;44(D1): D457-462. Available from: doi:10.1093/nar/gkv1070
97.
Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Research. [Online] 2016;44(D1): D279–D285. Available from: doi:10.1093/nar/gkv1344
98. Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, et al. InterPro in 2017—beyond protein family and domain annotations. Nucleic Acids Research. [Online] 2017;45(D1): D190–D199. Available from: doi:10.1093/nar/gkw1107
99. Sippel JE, Mamay HK, Weiss E, Joseph SW, Beasley WJ. Outer membrane protein antigens in an enzyme-linked immunosorbent assay for Salmonella enteric fever and meningococcal meningitis. Journal of Clinical Microbiology. 1978;7(4): 372–378.
100. Botsford JL, Demoss RD. Escherichia coli Tryptophanase in the Enteric Environment. Journal of Bacteriology. 1972;109(1): 74–80.
101. Laminet AA, Ziegelhoffer T, Georgopoulos C, Plückthun A. The Escherichia coli heat shock proteins GroEL and GroES modulate the folding of the beta-lactamase precursor. The EMBO Journal. 1990;9(7): 2315–2319.
102. Negrea A, Bjur E, Puiac S, Ygberg SE, Åslund F, Rhen M. Thioredoxin 1 Participates in the Activity of the Salmonella enterica Serovar Typhimurium Pathogenicity Island 2 Type III Secretion System. Journal of Bacteriology. [Online] 2009;191(22): 6918–6927. Available from: doi:10.1128/JB.00532-09
103. Wang H, Chang-Wong T, Tang H-Y, Speicher DW. Comparison of Extensive Protein Fractionation and Repetitive LC-MS/MS Analyses on Depth of Analysis for Complex Proteomes. Journal of Proteome Research. [Online] 2010;9(2): 1032–1040. Available from: doi:10.1021/pr900927y
104. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis. [Online] 1999;20(18): 3551–3567. Available from: doi:10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
105. Eng JK, McCormack AL, Yates JR.
An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of the American Society for Mass Spectrometry. [Online] 1994;5(11): 976–989. Available from: doi:10.1016/1044-0305(94)80016-2 106.  Kim S, Pevzner PA. Universal database search tool for proteomics. Nature communications. [Online] 2014;5: 5277. Available from: doi:10.1038/ncomms6277 107.  Wenger CD, Coon JJ. A proteomics search algorithm specifically designed for high-resolution tandem mass spectra. Journal of Proteome Research. [Online] 2013;12(3): 1377–1386. Available from: doi:10.1021/pr301024c 108.  Dorfer V, Pichler P, Stranzl T, Stadlmann J, Taus T, Winkler S, et al. MS Amanda, a universal identification algorithm optimized for high accuracy tandem mass spectra. Journal of Proteome Research. [Online] 2014;13(8): 3679–3684. Available from: doi:10.1021/pr500202e 109.  Ishihama Y, Schmidt T, Rappsilber J, Mann M, Hartl FU, Kerner MJ, et al. Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics. [Online] 2008;9(1): 102. Available from: doi:10.1186/1471-2164-9-102 110.  Syu WJ, Kahan B, Kahan L. Epitopes of Escherichia coli ribosomal protein S13. Journal of Protein Chemistry. 1989;8(6): 701–717.  111.  Kamhieh-Milz J, Sterzer V, Celik H, Khorramshahi O, Fadl Hassan Moftah R, Salama A. Identification of novel autoantigens via mass spectroscopy-based antibody-mediated identification of autoantigens (MS-AMIDA) using immune thrombocytopenic purpura (ITP) as a model disease. Journal of Proteomics. [Online] 2017;157: 59–70. Available from: doi:10.1016/j.jprot.2017.01.012 112.  Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. [Online] 2015;43(7): e47–e47. Available from: doi:10.1093/nar/gkv007 113.  Liu H, Sadygov RG, Yates JR. 
A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Analytical Chemistry. [Online] 2004;76(14): 4193–4201. Available from: doi:10.1021/ac0498563 92  114.  Wilson J, Elgohari S, Livermore DM, Cookson B, Johnson A, Lamagni T, et al. Trends among pathogens reported as causing bacteraemia in England, 2004-2008. Clinical Microbiology and Infection: The Official Publication of the European Society of Clinical Microbiology and Infectious Diseases. [Online] 2011;17(3): 451–458. Available from: doi:10.1111/j.1469-0691.2010.03262.x 115.  Greene SL, Su WP, Muller SA. Pseudomonas aeruginosa infections of the skin. American Family Physician. 1984;29(1): 193–200.  116.  Petkovšek Ž, Eleršič K, Gubina M, Žgur-Bertok D, Starčič Erjavec M. Virulence Potential of Escherichia coli Isolates from Skin and Soft Tissue Infections. Journal of Clinical Microbiology. [Online] 2009;47(6): 1811–1817. Available from: doi:10.1128/JCM.01421-08 117.  Ohara T, Itoh K. Significance of Pseudomonas aeruginosa colonization of the gastrointestinal tract. Internal Medicine (Tokyo, Japan). 2003;42(11): 1072–1076.  118.  Cortés G, Borrell N, de Astorza B, Gómez C, Sauleda J, Albertí S. Molecular analysis of the contribution of the capsular polysaccharide and the lipopolysaccharide O side chain to the virulence of Klebsiella pneumoniae in a murine model of pneumonia. Infection and Immunity. 2002;70(5): 2583–2590.  119.  Lawlor MS, Handley SA, Miller VL. Comparison of the Host Responses to Wild-Type and cpsB Mutant Klebsiella pneumoniae Infections. Infection and Immunity. [Online] 2006;74(9): 5402–5407. Available from: doi:10.1128/IAI.00244-06  93  Appendices Appendix A Additional Attempts at Lowering Ig Contamination A high human protein presence, with Ig proteins often having the strongest mass peak intensity, prompted us to attempt various Ig removal techniques. Here we describe techniques that did not provide results worth pursuing. 
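As a reading aid for the Log2Ratio values reported in Appendix Figure A.1, the following minimal Python sketch shows how such a value relates to a fold change. The function name and the intensity values are hypothetical and chosen only for illustration; they are not measurements from this work, and this is not the analysis code used to produce the figure.

```python
import math


def log2_ratio(intensity_test: float, intensity_reference: float) -> float:
    """Return the base-2 logarithm of the test/reference intensity ratio."""
    if intensity_test <= 0 or intensity_reference <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(intensity_test / intensity_reference)


# Hypothetical intensities in arbitrary units (illustration only).
reference = 1.0e7                           # e.g. intensity after the reference elution
half = log2_ratio(5.0e6, reference)         # test elution at half the reference intensity
eightfold = log2_ratio(1.25e6, reference)   # test elution at one eighth of the reference

print(half, eightfold)
```

Under this convention a Log2Ratio of -1 corresponds to half the reference intensity, and an 8× reduction corresponds to a Log2Ratio of -3.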
A.1 Antigen elution by urea

The elution by 1 % SDS in 50 mM NH4HCO3 with NEM was compared against elution by 7 M urea. Since urea is non-volatile and present at high concentration (7 M), it does not evaporate and needs to be removed from the eluate before the sample can be loaded onto an SDS-PAGE gel. Two methods were tested for urea removal: the eluate was buffer-exchanged through a 3K Amicon filter, and urea was precipitated in an ice-cold ethanol bath. All the other steps of the qAMIDA protocol were kept the same. The source of antibodies was serum F01, and the antigens were represented by P. aeruginosa lysate. Appendix Figure A.1 shows the results: while elution by urea does lower the intensity of human protein contaminants, it also lowers the number of P. aeruginosa proteins by almost an order of magnitude. Using SDS with NEM thus remained the elution method of choice.

[Appendix Figure A.1: histograms with panels "P. aeruginosa Proteins" and "Human Proteins"; x-axis: Log2Ratio; y-axis: Protein count; legend "Experiment": Amicon, EtOH.]

Appendix Figure A.1. The histograms show the base-2 logarithm of protein intensity ratios of urea versus SDS elution (e.g., a protein with a Log2Ratio of -1 was eluted by urea at half the intensity obtained with SDS elution). The urea elution reduced the human protein intensity by 8× on average; however, the P. aeruginosa protein intensities were reduced by the same amount. It should be noted that the Amicon buffer exchange and the ethanol precipitation had significantly different intensity values (p-value of 0.02 for both KW and AD tests), although both underperformed when compared to the SDS with NEM elution.

A.2 Cutting out IgG bands

A more direct approach to removing IgG contamination was to physically cut out the SDS-PAGE bands corresponding to the most common light and heavy chain masses, 25 kDa and 50 kDa, respectively. The immunoglobulin contamination decreased from 14 Ig proteins to 12 Ig proteins; however, at the same time, the overall human protein contamination increased from 28 to 36. We suspect that this was due to much longer handling of the gel, which might have increased the chances of contamination from the environment. The number of identified P. aeruginosa proteins decreased after band removal as well, from 236 to 194. This decrease of about 18 % is substantial, and led us to abandon the idea of cutting out contaminant IgG bands.

Appendix B Overview of Bacterial Pathogens Used as an Antigen Source

The development and optimization of the qAMIDA protocol necessitates a model system on which the protocol’s efficacy can be continuously tested. We had little prior information about the antibody repertoire of the patients’ serum samples, and so it was important to choose antigen sources that have a high chance of eliciting a response in any patient. Gram-negative bacteria are opportunistic pathogens which account for about half of blood infections (114), and strains such as Pseudomonas aeruginosa and Escherichia coli are commonly present on our skin and inside our digestive tract (115–117). In addition to a high chance of reactivity between serological antibodies and Gram-negative antigens, Gram-negative strains are easy to obtain and grow in laboratory conditions. The following five strains were used as an antigen source.

B.1 Escherichia coli K12

E. coli is a rod-shaped bacterium commonly present inside the lower intestine. Some strains are a common cause of food poisoning, which increases the chances of patients having memory B cells producing antibodies against E. coli proteins. We used the EMG2 (K12) strain, which is a common, easily obtainable laboratory strain. We expect most patients to have immunoreactivity to E. coli.

B.2 Klebsiella pneumoniae wild type and ΔcpsB mutant

Klebsiella pneumoniae is an encapsulated rod-shaped bacterium that is one of the major known causes of bacterial pneumonia.
A major reason for K. pneumoniae’s high virulence is its polysaccharide capsule, which prevents the immune system from accessing the cell (118). To test for any differences in serological response due to the polysaccharide capsule, we obtained both the wild type K. pneumoniae strain ATCC 700721 and a mutant K. pneumoniae which has a deletion of the cpsB gene (119). The mutant was developed by Dr. Finlay’s laboratory.

B.3 Salmonella typhimurium

S. typhimurium is a pathogenic strain that is present in the intestinal lumen. It has a lipopolysaccharide (LPS) outer coating which provides protection and increases toxicity. The LPS is made up of lipid A, which connects the LPS to the outer membrane, a polysaccharide core, and the O-antigen. The O-antigen is commonly recognized by the host’s immune system. The strain used is SL1344.

B.4 Pseudomonas aeruginosa

P. aeruginosa is an opportunistic pathogen often found in patients with cystic fibrosis or with severe burns. Due to the bacterium’s natural resistance to antibiotics, P. aeruginosa is a common cause of hospital-acquired infections and is of particular danger to immunocompromised patients. As the bacterium is part of the natural skin flora, most patients’ immune systems should have previously encountered its antigens.
