@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en ; edm:dataProvider "DSpace"@en ; ns0:degreeCampus "UBCV"@en ; dcterms:creator "Sharma, Govinda"@en ; dcterms:issued "2018-12-18T17:02:34Z"@*, "2018"@en ; vivo:relatedDegree "Doctor of Philosophy - PhD"@en ; ns0:degreeGrantor "University of British Columbia"@en ; dcterms:description """Though it is understood that T-cells are a critical component of the immune system’s ability to destroy foreign invaders and attack cancerous cells, very little is known regarding the specific epitopes that are recognized by T-cells to carry out these functions. Generally, the epitopes mediating this immunity are short, contiguous peptides derived from antigenic proteins presented on major histocompatibility complex (MHC) molecules for inspection by T-cells. The ability to rapidly and deeply search peptide space to determine specific peptide epitopes that are naturally processed, presented, and capable of eliciting functional T-cell responses is a critical unmet need in the study of adaptive immunity. Here, I describe a novel method for deep T-cell epitope profiling that enables simultaneous in vitro interrogation of target cell populations encoding high-diversity minigene libraries with T-cell populations-of-interest. Targets eliciting T-cell reactivity are selectively isolated by fluorescence-activated cell sorting (FACS) and identified by deep amplicon sequencing. The approach was extensively validated using known murine T-cell receptor (TCR)/peptide-MHC pairs and it was shown that this method can unambiguously identify canonical minigenes from libraries of vastly more candidate antigens in parallel than would be feasibly tractable using conventional methods. The capability of this strategy was extended by applying a synthetic biology approach. 
Using pairs of immortalized natural killer (NK)-like effector cell lines and naturally tolerated target cell lines, I showed that fully reconstituting the TCR/CD3 complex in effectors and expressing relevant MHC-/minigene-coding sequences in targets is sufficient to re-direct the cytotoxicity of NK-like cell lines towards antigenic targets of recombinant TCR. These results provide indication that it should be possible to use an entirely synthetic framework for functionally screening recombinant TCR-of-interest against minigene libraries without the requirement to first isolate and expand primary T-cell clones or donor-derived antigen-presenting cells. The high-throughput T-cell antigen profiling methods described and validated in this thesis could allow investigators to generate TCR epitope data broader in scope than previously possible to better understand basic T-cell biology, develop better predictive models of T-cell reactivity, and rationally design T-cell-based immunotherapeutics for the treatment of cancer, infectious disease, and autoimmunity."""@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/68080?expand=metadata"@en ; skos:note " NOVEL IN VITRO METHODS FOR THE DISCOVERY OF FUNCTIONAL T-CELL RECEPTOR EPITOPES FROM LARGE PEPTIDE-CODING LIBRARIES
by Govinda Sharma
B.Sc., Simon Fraser University, 2010
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Genome Science and Technology)
THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)
December 2018
© Govinda Sharma, 2018
The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: NOVEL IN VITRO METHODS FOR THE DISCOVERY OF FUNCTIONAL T-CELL RECEPTOR EPITOPES FROM LARGE PEPTIDE-CODING LIBRARIES submitted by Govinda Sharma in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Genome Science and Technology
Examining Committee:
Robert Holt, Medical Genetics (Supervisor)
Brad Nelson, Medical Genetics (Supervisory Committee Member)
Leonard Foster, Biochemistry and Molecular Biology (Supervisory Committee Member)
Ryan Morin, Molecular Biology and Biochemistry, Simon Fraser University (Supervisory Committee Member)
Carl Hansen, Physics and Astronomy (University Examiner)
Kenneth Harder, Microbiology and Immunology (University Examiner)
Sai Reddy, Biosystems Science and Engineering, ETH Zürich (External Examiner)
Abstract
Though it is understood that T-cells are a critical component of the immune system’s ability to destroy foreign invaders and attack cancerous cells, very little is known regarding the specific epitopes that are recognized by T-cells to carry out these functions. Generally, the epitopes mediating this immunity are short, contiguous peptides derived from antigenic proteins presented on major histocompatibility complex (MHC) molecules for inspection by T-cells. The ability to rapidly and deeply search peptide space to determine specific peptide epitopes that are naturally processed, presented, and capable of eliciting functional T-cell responses is a critical unmet need in the study of adaptive immunity. Here, I describe a novel method for deep T-cell epitope profiling that enables simultaneous in vitro interrogation of target cell populations encoding high-diversity minigene libraries with T-cell populations-of-interest. Targets eliciting T-cell reactivity are selectively isolated by fluorescence-activated cell sorting (FACS) and identified by deep amplicon sequencing. The approach was extensively validated using known murine T-cell receptor (TCR)/peptide-MHC pairs and it was shown that this method can unambiguously identify canonical minigenes from libraries of vastly more candidate antigens in parallel than would be feasibly tractable using conventional methods.
The capability of this strategy was extended by applying a synthetic biology approach. Using pairs of immortalized natural killer (NK)-like effector cell lines and naturally tolerated target cell lines, I showed that fully reconstituting the TCR/CD3 complex in effectors and expressing relevant MHC-/minigene-coding sequences in targets is sufficient to re-direct the cytotoxicity of NK-like cell lines towards antigenic targets of recombinant TCR. These results provide indication that it should be possible to use an entirely synthetic framework for functionally screening recombinant TCR-of-interest against minigene libraries without the requirement to first isolate and expand primary T-cell clones or donor-derived antigen-presenting cells. The high-throughput T-cell antigen profiling methods described and validated in this thesis could allow investigators to generate TCR epitope data broader in scope than previously possible to better understand basic T-cell biology, develop better predictive models of T-cell reactivity, and rationally design T-cell-based immunotherapeutics for the treatment of cancer, infectious disease, and autoimmunity.
Lay Summary
T-cells are a critical component of the body’s defensive arsenal. They are a diverse population consisting of individual T-cells which each bear their own T-cell receptor (TCR) variant. Millions of distinct TCR variants exist in the body, all recognizing their own set of specific antigens: protein fragments, from normal protein turnover that occurs in all cells of the body, sampled and displayed at cell surfaces. Presented antigens are inspected by T-cells via their TCR and, if any are recognized by a patrolling T-cell, these cells are considered infected or aberrant and a response aimed at destroying them is initiated. Developing therapies that manipulate natural T-cell responses to treat disease requires knowledge of the T-cell antigens at the heart of these responses.
I have developed a new methodology that enables T-cell populations-of-interest to be tested against sets of possible T-cell antigens far larger than can be practically investigated with previous methods.
Preface
The overall project was designed and conducted in collaboration with my Ph.D. supervisor, Dr. Robert Holt. The work was carried out at the Michael Smith Genome Sciences Centre and Terry Fox Laboratories, both departments of the British Columbia Cancer Agency, and was covered by UBC Ethics Certificate number H07-00463. Considerations for specific chapters are as below.
Section 1.5 is an updated version of the following review article, which was primarily written by me with input and editing from Dr. Holt.
Sharma, G. and Holt, R.A. (2014). T-cell epitope discovery technologies. Hum. Immunol. 75, 514-519.
Chapters 2 and 3 have been, collectively, submitted for publication as a single manuscript which is currently undergoing revision. I, with Dr. Holt, conceptualized and designed the experiments presented in these chapters. I also prepared all experimental materials, carried out all experimental investigations, and conducted all formal analyses. I wrote the original draft of the manuscript, with input and editing from Dr. Holt.
Chapter 4 is unpublished. I conceptualized and designed the experiments presented in this chapter. I also prepared all experimental materials with assistance from Zahra Ali, who was responsible for some of the molecular biology work. Preliminary experiments involving CD19-CAR not included in this thesis but informative to the experiments that were ultimately included were done by Chris May. I conducted all other experimental investigations and formal analyses in Chapter 4.
Table of Contents
Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Chapter 1: Introduction
1.1 Opening remarks
1.2 Generation of peptide-MHC antigens
1.3 T-cell receptor generation, function, and profiling
1.3.1 Assembly and selection of T-cell receptors
1.3.2 TCR-mediated T-cell activation
1.3.3 TCR repertoire profiling
1.4 T-cell immunotherapy
1.4.1 Cancer
1.4.2 Autoimmunity
1.4.3 Infectious disease
1.5 T-cell antigen discovery methods
1.5.1 Early genomic/cDNA library screening
1.5.2 Proteomic approaches
1.5.3 MHC multimers
1.5.3.1 Flow cytometry/mass cytometry
1.5.3.2 Arrays
1.5.4 pMHC display systems
1.5.5 Cell-based functional screening
1.6 Thesis overview
Chapter 2: Implementing a novel reporter system to detect functional activity of cytotoxic T cells
2.1 Introduction
2.2 Results
2.2.1 Infection and verification of low-copy viral integration events in APC lines
2.2.2 Expression of epitope-encoding minigene results in efficient detection of targeted cells
2.2.3 GzmB delivery to targeted cells in mixed APC populations is highly specific
2.2.4 Peak expression of GzmB-induced signal sufficiently precedes apoptosis of target cells
2.2.5 Factors affecting the %FRET-shift signal and magnitude of shift
2.3 Discussion
2.4 Methods
2.4.1 Cell culture
2.4.2 CTL activation and expansion
2.4.3 Construction of plasmid vectors
2.4.4 Virus production
2.4.5 Viral transduction
2.4.6 CTL/APC co-cultures
2.4.7 Flow cytometry/FACS
2.4.8 Quantitative PCR
2.4.9 Data analysis
Chapter 3: Validating a FRET-shift-based library screening approach
3.1 Introduction
3.2 Results
3.2.1 Antigenic minigenes are detectable with high sensitivity from complex minigene libraries
3.2.2 Application of bioinformatic filtering to non-canonical minigenes refines putative hits
3.2.3 Antigenic minigenes are detected by polyclonal populations of input T cells containing diluted model T cells
3.2.4 Antigen identification from co-cultures of complex minigene libraries and polyclonal T-cell populations
3.3 Discussion
3.4 Methods
3.4.1 Cell culture
3.4.2 CTL activation and expansion
3.4.3 Construction of minigene library plasmid
3.4.4 Virus production
3.4.5 Viral transduction
3.4.6 Quantitative PCR
3.4.7 CTL/APC co-cultures
3.4.8 Flow cytometry/FACS
3.4.9 Sequencing
3.4.10 Data analysis
Chapter 4: Adapting FRET-shift T-cell antigen identification assay to human systems
4.1 Introduction
4.2 Results
4.2.1 Primary B cells as antigen presenting cells
4.2.2 Minimal signaling domains are required to initiate YT-Indy GzmB/PFN release
4.2.3 Construction artificial antigen-presenting cell (aAPC) and reconstituted cytotoxic T cell (rCTL) lines
4.2.4 TCR/pMHC mediated FRET-shift signal is detectable in synthetic screening platform
4.2.5 Addition of HLA genes to target cells does not suppress YT-Indy cytotoxicity
4.2.6 Time-dependent decay in YT-Indy potency may be a result of constitutive granzyme/perforin degranulation
4.3 Discussion
4.4 Methods
4.4.1 Cell culture
4.4.2 B-cell stimulation and expansion
4.4.3 TCR-seq and TCR cloning
4.4.4 Virus production
4.4.5 Viral transduction
4.4.6 Effector/target co-cultures
4.4.7 Flow cytometry/FACS
4.4.8 Data analysis
Chapter 5: Conclusions and future directions
5.1 Future directions in applying high-throughput T-cell epitope screening
5.1.1 Whole microbial proteome screening of infectious pathogen-reactive T cells
5.1.2 Whole tumor exome screening of tumor-associated T cells
5.1.3 Exploring T-cell receptor cross-reactivity in libraries of random minigenes
5.1.4 Assessing safety of engineered TCR therapeutics
5.1.5 Characterizing the T-cell receptor reactivity of T-cell lymphomas
5.2 Future directions in technology development for FRET-shift/amplicon sequencing methodology
5.2.1 Validation of reconstituted CTL platform in library screening contexts
5.2.2 Overcoming library bottlenecks
5.2.3 Class II peptide-MHC antigens
5.2.4 Tools to facilitate construction of immune networks
5.3 Final Remarks
Bibliography
Appendices
Appendix A - Enhanced protocol for the production of high-diversity, high-quality minigene plasmid libraries
A.1 Materials
A.2 Procedure
A.3 Results
A.4 Supplementary information
Appendix B - Nucleic acid sequences used
B.1 Annotated plasmid backbones
B.1.1. pMND-silent-FRET
B.1.2. pMND-libDest-FRET2
B.1.3. pMND-Multi
B.1.4. pCMV-ΔR8.91
B.1.5. pCMV-VSV-G
B.2 Nucleic acid sequences inserted into plasmid backbones
B.3 Relevant oligonucleotide sequences used
Appendix C - Code
C.1 R Code for functional lentivirus titering
C.1.1. Lentivirus titering example output
C.2 R Code for FRET-shift assay sequencing data processing & analysis pipeline
C.1.2. Example key to be input to sequencing data analysis pipeline
Appendix D - FACS gating schema
D.1 Representative gating strategy for FRET purity sorting in EL4 cells
D.2 Representative gating strategy for FRET-shift assay sorting in EL4 cells
D.3 Representative gating strategy for FRET2 purity sorting in K562 cells
Appendix E - Cell and read count summary from FRET-shift FACS/amplicon sequencing experiments
List of Tables
Table 3-1. Construction of random minigene library expressed in ID8 cells
Table 3-2. Construction of random minigene library expressed in EL4 cells
Table 3-3. Database searching of random minigene hits from 1:10,000 spiked samples
Table 4-1. Monitoring NIH-3T3-CD40L feeder cells after radiation exposure.
List of Figures
Figure 2-1. Summary of novel T-cell antigen identification method.
Figure 2-2. Validation of the in-house integrations/cell qPCR assay.
Figure 2-3. FRET-shift assay testing.
Figure 2-4. Assessing specificity of FRET-shift reporter system.
Figure 2-5. OT-I time course.
Figure 2-6. The size and magnitude of FRET-shift is influenced by the identity of the target cell.
Figure 2-7. ID8 and EL4 target cell lines vary significantly in MHC-I expression.
Figure 3-1. Overview of T-cell antigen library screening approach.
Figure 3-2. OT-I spiked library screening.
Figure 3-3. pmel-1 TCR screening against 1:10,000 hgp100 minigene-spiked random library.
Figure 3-4. Performance of diluted model T-cells in FRET-shift assays.
Figure 3-5. FRET-shift/deep amplicon sequencing approach in co-cultures of mixed CTL populations + mixed target cell populations.
Figure 3-6. Placement of the validated FRET-shift/deep amplicon sequencing strategy in the ecosystem of existing methods in use for T-cell antigen discovery.
Figure 4-1. Lentiviral transduction of CD40L-stimulated primary B-cells with alternatively pseudotyped virus.
Figure 4-2. Feasibility of antigen-specific redirection of YT-Indy cytotoxicity.
Figure 4-3. Reconstitution of peptide-MHC complex in K562 cell line.
Figure 4-4. Initial (unsuccessful) attempt to recombinantly express TCRαβ at the surface of YT-Indy cells.
Figure 4-5. Confirmation of successful TCRαβ, CD3ε, and CD8α surface expression on transduced YT-Indy cells.
Figure 4-6. FRET-shift assay of HSDL1 mutant-reactive rCTL and wild-type or mutant HSDL1 minigene-expressing aAPC.
Figure 4-7. Summary of replicate YT-Indy/721.221.FRET co-culture experiments.
Figure 4-8. Resting CD107a surface levels of several effector and target cell lines.
Appendix Figure A-1. Graphical overview of the enhanced plasmid library construction strategy for accurately and efficiently inserting high-diversity minigene libraries in lentiviral transfer plasmid backbones.
Appendix Figure A-2. Demonstration of minigene library excision from 4% agarose gel.
Appendix Figure A-3. Quantitation of DNA after PlasmidSafe DNase treatment.
Appendix Figure A-4. Quantitation and quality assessment of library generated using enhanced library construction protocol.
Appendix Figure A-5. Adaptation of Illumina Y-shaped adapter strategy to inserting sheared DNA fragments into plasmid vectors.
Appendix Figure C-1. Example output of lentivirus functional titering script.
Appendix Figure D-1. Representative gating strategy for FRET purity sorting in EL4 cells.
Appendix Figure D-2. Representative gating strategy for FRET-shift assay sorting in EL4 cells.
Appendix Figure D-3. Representative gating strategy for FRET2 purity sorting in K562 cells.
List of Abbreviations
5’-RACE Rapid amplification of 5’ complementary DNA ends
aAPC Artificial antigen-presenting cell
ACT Adoptive cell therapy
APC Antigen-presenting cell
BCR B-cell receptor
BLASTP Protein-protein basic local alignment search tool
CAR Chimeric antigen receptor
cDNA Complementary DNA
CDR3 Complementarity determining region 3
CFP Cyan fluorescent protein
CFU Colony-forming unit
CPL Combinatorial peptide library
Cr51 Chromium-51
CTL Cytotoxic T cell
EBV Epstein-Barr Virus
ELISPOT Enzyme-linked immunospot assay
FACS Fluorescence-activated cell sorting
FRET Förster Resonance Energy Transfer
FSC-A Forward scatter pulse area
FSC-H Forward scatter pulse height
FSC-W Forward scatter pulse width
GALV Gibbon ape leukemia virus
gDNA Genomic DNA
GzmB Granzyme B
HIV Human immunodeficiency virus
HLA Human leukocyte antigen
IFNγ Interferon-gamma
ITAM Immunoreceptor tyrosine-based activation motif
kanR Kanamycin resistance gene
KIR Killer immunoglobulin-like receptor
MeV Measles virus
MHC Major histocompatibility complex
MRTC Myelin-reactive T cells
NFAT Nuclear factor of activated T cells
NK Natural killer
ORF Open reading frame
pAPC Professional antigen-presenting cell
PBMC Peripheral blood mononuclear cells
PCR Polymerase chain reaction
PFN Perforin
PI Propidium iodide
pMHC Peptide/major histocompatibility complex
PTM Post-translational modification
qPCR Quantitative polymerase chain reaction
rCTL Reconstituted cytotoxic T cell
RFP Red fluorescent protein
scFv Single-chain variable fragment
scMHC Single-chain peptide/major histocompatibility complex
SSC-A Side scatter pulse area
SSC-H Side scatter pulse height
SSC-W Side scatter pulse width
TAA Tumor-associated antigen
TCR T-cell receptor
TIL Tumor-infiltrating lymphocytes
TNFα Tumor necrosis factor alpha
Tregs Regulatory T cells
TU Transducing unit
VSV Vesicular stomatitis virus
YFP Yellow fluorescent protein
β2M Beta-2-microglobulin
Acknowledgements
When I was about 7 years old, I heard someone mention the phrase “Ph.D.” and I asked my father what that meant. He told me that it’s a thing where you go off and study for 5 years and then write a 200-page book at the end of it. Horrified at the thought of this, I vividly remember vowing to myself that I would never, ever do a Ph.D.
I have many, many people to acknowledge and thank for helping me to arrive at this point. I want to thank my supervisor, Rob Holt, for his vision, guidance, and patience. We set the bar high with this project, but Rob was there to help me rise to the occasion and do the best work I possibly could. I will always aspire to follow his example with my own mentees in the future.
My thesis committee, Brad Nelson, Ryan Morin, and Leonard Foster, have also been a key part of my training. Not many thesis committees are composed of members based across four different sites, and they have been wonderfully cooperative and flexible in terms of holding committee meetings. Much more than that, they have been an excellent resource and I am thankful for their insights, feedback, and encouragement along the way.
I have been truly fortunate to be able to conduct my work at the BC Cancer Research Centre.
We have all the toys that I could have possibly wanted to use for my experiments, and a community of scientists willing to share them – I know that not everyone in this field is so lucky, so for this I am grateful. Moreover, the support staff in place here is incredible, from lab operations to project management to IT support and everything in between. In this regard, I would especially like to acknowledge and thank Maiya Moore and Payal Sipahimalani. I was able to do the research presented here thanks to the endless kindness of our financial donors and the non-profit organizations, like the BC Cancer Foundation, that play a huge part in keeping the BC Cancer Research Centre running. My research was also funded by NSERC, Genome Canada, Genome BC, the US DoD, and the US NIH – in other words, by the public. So, my heartfelt thanks to those of you who pay their taxes. Thank you to my labmates in the Holt group and the Genome Sciences Centre at-large, as well as our close collaborators at the Deeley Research Centre in Victoria. I’ve learned a lot from all of you. You’ve freely given me so much great advice over the years, and I appreciate this very much. I’d especially like to acknowledge and thank Daniel Woodsworth and Scott Brown for their technical, intellectual, emotional, and moral support. We’ve been through a lot together during our respective Ph.D. programs; I’ll always look back fondly on “the lost boys” era. Also, thank you to Scott for being so patient with me whenever I got mad at computers. To all my other labmates along the way who have been so generous with their time and energy in helping me accomplish this work – Eric Yung, Lisa Dreolini, Mauro Castellarin, Ewan Gibb, Chris May, Kyla Cochrane, Sophie Sneddon, and Craig Rive – thank you, I couldn’t have asked to be part of a better group. 
I’ve also been fortunate to have had the opportunity to directly mentor two extremely bright undergraduate students, Zahra Ali and Sarania Sivasothy, who have been a huge help to me in carrying out some of the work presented in this thesis. However, it wasn’t just the contributions of those in our group that need to be acknowledged. I must thank Spencer Martin and Julian Smazynski for being a continual source of brilliant ideas in the lab and a continual source of bad ones in the pub. Thank you to Dave Kroeger, John Webb, Victoria Hodgson, Nicole Little, and Julie Nielsen for all of your feedback, advice, and help as I carried out this work. A very special thank-you also goes to Alessia Gagliardi, Davide Pellacani, and Jerry Tien for being a tremendous source of moral support throughout my Ph.D. program. This task would have been absolutely impossible if not for the huge amount of love, support, and patience from my wonderful friends and family. It would be impossible to name every single one of you who have been important to me while I’ve been on this journey, but I think I can confidently say: you know who you are. Every day I spent doing my research and writing this thesis, I was reminded of the qualities of mine that were directly instilled in me by my parents. I have never known anyone with the same relentless, superhuman work-ethic of my mother, Barkha, or anyone as incredibly incisive and intuitive as my father, Ashok. I have always admired and aspired to these qualities. More than that, however, I thank my parents for their unconditional and undying love and support. They have given so much in order to ensure that the next generation will flourish. For what they have given me in terms of finance, time, and freedom, I owe them an immeasurable debt. Finally, I thank my future wife, Loveneet. This achievement is, without question, as much yours as it is mine. You have been there every step of the way.
You were always excited about the little victories along the way and celebrated them with me. You were always there at the low points to help pull me out of the mud. I could not have done this without you. Chapter 1: Introduction 1.1 Opening remarks The adaptive immune system is an extraordinary apparatus that serves to protect the body from a universe of potential microbial invaders or aberrant self-cells gone rogue. The basic mechanism by which the adaptive immune system achieves this protection is to produce a repertoire of millions of unique T- and B-cell clonotypes, each of which is armed with a distinct surface-expressed antigen receptor which, in turn, is able to recognize its own repertoire of distinct antigen ligands. Recognition of antigen by T-cell receptors (TCR) or B-cell receptors (BCR) expressed on peripheral T cells and B cells, respectively, is central to the execution of a robust immune response that leads to the clearance of pathogenic agents. In this thesis, I will focus specifically on T cell-mediated immunity. In particular, I will be discussing αβ T cells, which express highly variable αβ TCR responsible for engaging with short peptide-based ligands. This is to the exclusion of some T cell subsets – natural killer T cells and mucosal associated invariant T cells, which each express semi-invariant αβ TCR and engage non-peptidic ligands, as well as γδ T cells, which express an alternative form of the TCR encoded at the TCRγ and TCRδ genomic loci. Though, from this point on, I will use the term ‘T cell’ to refer specifically to classical αβ T cells and ‘TCR’ to specifically refer to variant αβ TCR, the non-classical T-cell subsets mentioned above should be acknowledged as relevant elements of the overall T-cell compartment. In the subsequent sections of Chapter 1, I will explore the complexity inherent in the generation of TCR and TCR antigens and the complexity of their interactions.
I will also examine the means by which T-cell receptor identity and reactivity are currently investigated, and the implications that successful T-cell immunoprofiling has for application in medicine. The intricacy of T cell biology described below coupled with the tremendous potential that exists to treat disease with rationally designed T cell-based therapeutics underscores an urgent need for new tools capable of probing T-cell immunity. Excitingly, the advent of techniques, instrumentation, and computing capability responsible for ushering in the genomics era has been mobilized towards the study of T-cell repertoires. Despite this, there are still shortcomings in the field of T-cell epitope discovery which should benefit greatly from application of high-throughput approaches. 1.2 Generation of peptide-MHC antigens T-cell antigens consist of short peptides, derived from intracellular turnover of antigenic proteins (Townsend et al., 1986), displayed at the surface of antigen-presenting cells (APC) by membrane-bound major histocompatibility complex (MHC) proteins. Together, these peptide-MHC (pMHC) complexes are inspected by T cells (Zinkernagel and Doherty, 1975), and pMHC containing immunogenic epitopes can induce a T-cell response. The MHC proteins are categorized as either class I or II and are encoded, in humans, by a cluster of genes known as the human leukocyte antigen (HLA) system located on chromosome 6. Class I proteins are heterodimers consisting of an α-chain encoded at the HLA-A, HLA-B, or HLA-C loci and the invariant β2 microglobulin protein encoded by the B2M gene. Class II heterodimeric proteins consist of α and β chains, which are encoded adjacently at the HLA-DP, HLA-DQ, and HLA-DR loci (The MHC sequencing consortium, 1999). Notably, the HLA genes are the most highly polymorphic region of the human genome, with approximately 12,000 known human alleles (Robinson et al., 2015).
MHC class I molecules are expressed on the surface of nearly every cell of the body and present a sampling of short (8-14 amino acids in length) peptides (Ekeruche-Makinde et al., 2013) derived from proteolytic turnover of proteins of both endogenous and exogenous (Moore et al., 1988; Morrison et al., 1986) origin. To accomplish this, nascent MHC-I proteins are held in a semi-folded conformation in complex with the endoplasmic reticulum (ER)-resident chaperone, calnexin, until successful formation of the MHC-I α/β2M heterodimer triggers the release of the newly assembled complex from calnexin (Degen et al., 1992; Ortmann et al., 1994). Released MHC-I α/β2M dimers then form a complex with calreticulin, ERp57, and tapasin (Bouvier, 2003; Ortmann et al., 1997) where they are held in a semi-stable state receptive to peptide binding. Peptides generated by normal degradation of cytosolic proteins are translocated to the ER by a heterodimer composed of the proteins TAP-1 and TAP-2 (Kleijmeer et al., 1992; Shepherd et al., 1993). Translocated peptides can bind chaperoned MHC-I α/β2M dimers to form a complete pMHC-I molecule. At this point, peptides with partial binding to MHC-I but longer than the optimal peptide size range can be trimmed by the aminopeptidase, ERAAP, to produce a final pMHC-I complex with optimized peptide-MHC fit (Kanaseki et al., 2006). Upon completion of the pMHC-I molecule, the entire structure is moved via anterograde transport to the cell membrane. MHC class II primarily exists on the surface of B cells, monocytes and dendritic cells (Boegel et al., 2018) – also known as professional antigen-presenting cells (pAPC) – and is responsible for priming CD4+ helper T-cell responses in secondary lymphoid tissue and activating these responses in the periphery. The peptide-MHC-II complex is not assembled in the endoplasmic reticulum, like MHC-I, but within the vesicular system of pAPC.
Newly synthesized MHC-II molecules are complexed with an invariant chain (Anderson and Miller, 1992), CD74, which blocks any peptide binding until the complex has trafficked through the Golgi network and reached the cell surface (Germain and Hendrix, 1991), where it can subsequently enter the vesicular network in early endosomes. Fusion of endocytic lysosomes or autophagosomes with these early endosomes results in a structure referred to as the MIIC endosome (Peters et al., 1991), where endocytosed proteins are degraded by vesicle-resident reductases and proteases (mainly cathepsins S and L) and CD74 chaperone molecules binding precursor MHC-II molecules are removed and replaced with high-affinity peptides from these degraded extracellular proteins, a process catalyzed by HLA-DM in humans (Denzin and Cresswell, 1995). Peptide-MHC-II complexes generated in the MIIC endosome are then moved to the cell surface where they can be inspected by TCR. Notably, the structure of MHC class II molecules allows for the presentation of peptides not subject to strict length constraints like their MHC class I counterparts. 1.3 T-cell receptor generation, function, and profiling 1.3.1 Assembly and selection of T-cell receptors Although MHC restriction had been elucidated nearly a decade earlier (Zinkernagel and Doherty, 1975), the T cell-specific receptor responsible for inspecting and recognizing pMHC complexes on antigen-presenting cells was not isolated until the early 1980s (Allison et al., 1982; Haskins et al., 1983; Hedrick et al., 1984; Meuer et al., 1983; Yanagi et al., 1984). From these early efforts, it was learned that pMHC interactions are mediated by the αβ TCR, a heterodimeric integral T cell membrane protein composed of an α and a β subunit. In humans, the TCRα locus is located on chromosome 14 and is composed of 54 V-segments and 61 J-segments.
The TCRβ locus is composed of 77 V-segments, 2 D-segments, and 14 J-segments (Folch and Lefranc, 2000a, 2000b, Scaviner and Lefranc, 2000a, 2000b). During thymic development of T cells, each of these germline loci undergoes somatic rearrangement in which one segment of each class (V, D in the case of the β locus, and J) is recombined through the stepwise actions of the V(D)J recombinase complex (Bassing et al., 2002). The diversity of unique receptors generated by this process is extended beyond the number of possible V(D)J combinations by random, template-independent addition or deletion of nucleotides at each joint to produce a hypervariable region of each TCR gene which codes for complementarity determining region 3 (CDR3). These CDR3 motifs are the principal point of contact with the peptide-containing portion of any pMHC complex being inspected by a T cell (Davis and Bjorkman, 1988). Post-recombination, nascent T cells undergo positive and negative selection in the thymus. Here, pAPC re-circulating from the periphery as well as thymic epithelial cells present a spectrum of self-antigens against which newly recombined TCR are tested (Derbinski et al., 2001). Any receptors with an inability to bind pMHC at all (Bevan, 1977) or receptors that bind self pMHC too strongly (Kappler et al., 1987) are deleted to yield a diverse repertoire optimized for tolerance of self-antigens but poised to recognize foreign or aberrant antigen encountered in peripheral tissue. 1.3.2 TCR-mediated T-cell activation The exact mechanism by which TCR recognition initiates a downstream signaling cascade is still somewhat unclear. Unlike many other receptor-ligand interactions, such as BCR (antibody)/antigen interactions, the TCR/pMHC interaction is selected in the thymus to be a relatively low-affinity interaction. 
Despite this, the TCR repertoire is able to marshal its considerable structural diversity into accurately and sensitively discriminating self from non-self, suggesting that there must be a conserved set of rules that govern the activation of a T cell in response to an epitope. It has previously been observed that steady-state affinity of TCR/pMHC interactions is not the primary predictor of T-cell activation and that the kinetic on-/off-rates of these interactions strongly influence T-cell activation. This is why it has been often noted that TCR with similar affinity for different pMHC molecules can deliver variably agonistic or antagonistic responses (Degano et al., 2000; Huang et al., 2010). Numerous models have been hypothesized and experimentally verified in vitro to explain these observations, suggesting that, most likely, a confluence of mechanisms is responsible for initiating TCR-driven T-cell responses in vivo. For example, studies indicate that a size-restrictive immunological synapse is formed upon TCR/pMHC engagement that is able to exclude signal-dampening phosphatases and kinases, notably CD45 and CSK, respectively (James and Vale, 2012), while promoting TCR clustering that leads to sufficient phosphorylation for signal transduction (discussed further below) to proceed (Taylor et al., 2017). Other studies show that physical forces inherent in cell-cell contact cause allosteric conformational changes in the TCR molecule that result in a “catch-bond” interaction that holds the TCR/pMHC complex in a compact state with a lifetime proportional to TCR potency (Das et al., 2015). These theories of a size restrictive immunological synapse and the formation of a compact catch-bond interaction seem to be readily reconciled: a strong TCR/pMHC interaction results in a shortened intercellular distance that enhances the segregation of phosphatases and kinases at the immune synapse (Sibener et al., 2018).
To explain the means by which TCR are able to use these spatial mechanisms of activation to react to relevant antigens with such exquisite selectivity, kinetic proofreading and serial triggering have been hypothesized. These theories state that a TCR must engage and disengage with cognate pMHC multiple times to achieve a cumulative threshold level of activation and continue downstream signaling (Kersh et al., 1998a). The contention of kinetic proofreading theory is that this occurs in order to ensure that spurious interactions do not elicit reactivity, resulting in high TCR accuracy; while serial triggering theory posits that rapid engagement/disengagement is the property that enables a single pMHC complex to activate hundreds of TCR molecules (Valitutti et al., 1995), thus conferring high TCR sensitivity. These models are satisfying as they are able to account for the ability of the TCR/pMHC to initiate responses with great selectivity and potency in the context of such a low-affinity interaction, but also because they are potentially compatible with the phenomenon of cross-reactivity (which will be discussed further in Section 5.1.3). Cross-reactivity is the property of individual TCR to productively interact with many possible peptide epitopes. Given that the TCR repertoire is of a fixed and limited size, and that the possible number of peptides that could exist is immense, this characteristic is a necessary feature of T cells (Sewell, 2012). The kinetic and spatial mechanisms utilized by the T cell at the immune synapse afford TCR the flexibility to iteratively scan pMHC molecules and initiate a response if a sufficient fit is achieved. This is suggestive of a TCR binding plasticity (Bridgeman et al., 2012) that still requires further investigation in order to decode TCR cross-reactivity.
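As a purely illustrative aside, the intuition behind kinetic proofreading described above can be put into numbers with the generic textbook form of the model. The sketch below is not a model used or fitted in this thesis. It assumes that a bound TCR must complete a fixed number of sequential modification steps, each at a hypothetical rate k_p, while the pMHC may dissociate at rate k_off at any step; all rate values and function names are made up for illustration.

```python
# Illustrative kinetic proofreading calculation (not a model from this thesis).
# Assumption: a bound TCR must complete n_steps sequential modification steps,
# each at rate k_p, and the pMHC can dissociate at rate k_off at any step.

def proofreading_success(k_p, k_off, n_steps):
    # Probability of finishing all steps before the ligand dissociates.
    if k_p <= 0 or k_off < 0 or n_steps < 0:
        raise ValueError('k_p must be positive; k_off and n_steps non-negative')
    per_step = k_p / (k_p + k_off)
    return per_step ** n_steps

# Hypothetical rates (arbitrary units): the agonist stays bound 10x longer.
K_P = 1.0
AGONIST_K_OFF = 0.1
WEAK_K_OFF = 1.0

def discrimination(n_steps):
    # Ratio of signaling probabilities, agonist versus weak ligand.
    return (proofreading_success(K_P, AGONIST_K_OFF, n_steps)
            / proofreading_success(K_P, WEAK_K_OFF, n_steps))

if __name__ == '__main__':
    for n in (1, 3, 5):
        print(n, round(discrimination(n), 2))
```

Under these assumptions the ratio between agonist and weak-ligand signaling probabilities grows geometrically with the number of required steps, which is one way to see how a modest difference in off-rate could translate into a large difference in T-cell response.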
Though initiating events in TCR signaling have not been fully elucidated, it is known that ligation of a TCR with its cognate antigen induces the phosphorylation of immunoreceptor tyrosine-based activation motifs (ITAMs) (Reth, 1989) present on the cytoplasmic tails of the CD3 γ, δ, ε, and ζ domains, which are membrane-bound proteins that associate with the T-cell receptor αβ chains to form the TCR signaling complex (Clevers et al., 1988). This phosphorylation is primarily carried out by the cytoplasmic kinase, Lck, which associates with the cytoplasmic domains of CD8 or CD4 co-receptors which are able to localize to the site of TCR/pMHC interaction by forming their own contacts with MHC-I or -II, respectively (Veillette et al., 1988). Phosphorylated tyrosines on ITAMs of CD3 form binding sites for another kinase, ZAP-70 (Chan et al., 1992), resulting in its recruitment to the TCR signaling complex and its activation. Activated ZAP-70 drives downstream responses by phosphorylating the scaffolding molecules SLP-76 and LAT (Zhang et al., 1998). Critically, phosphorylated SLP-76 and LAT recruit phospholipase C-γ (PLCγ) to the TCR signaling complex (Yablonski et al., 2001) where it is activated by the Tec-family tyrosine kinase, Itk (Bogin et al., 2007). Activation of PLCγ results in cleavage of phosphatidylinositol-(4,5)-bisphosphate (PIP2) into diacylglycerol (DAG) and inositol trisphosphate (IP3), which then function as second messengers to trigger calcium release from the endoplasmic reticulum (Imboden and Stobo, 1985; Imboden et al., 1985) and initiate the MAP kinase cascade. These downstream events result in the promotion of T-cell proliferation and differentiation during activation in the secondary lymphoid tissue or the initiation of effector functions in activated circulating T cells. CD8+ cytotoxic T cells (CTL) are able to carry out their effector functions by directly killing target cells and secreting cytokines.
The primary cytokine response from CTL activated by an appropriate antigen is to release interferon-γ (IFNγ); a response that can be synergistically enhanced by their additional secretion of tumor necrosis factor α (TNFα). Functionally, this IFN-γ-led cytokine response serves to enable recruitment of more immune cells to the site of infection or tumor, enhance antigen processing and presentation in nearby professional APC, induce the production and secretion of reactive oxygen species in macrophages, inhibit the proliferation of viral pathogens and their host cells, and initiate apoptotic pathways in host cells (Slifka and Whitton, 2000). However, despite the array of functions carried out by the CTL cytokine response, the most crucial effector function of CTL is the direct delivery of granzymes, predominantly granzyme B (GzmB), to antigen-specific target cells. In a resting CTL, GzmB is stored in lysosomal granules along with perforin (PFN), a key mediator of granzyme trafficking, until calcium flux from activation of the TCR results in polarized degranulation of the cell directed to occur specifically at the immunological synapse (Lyubchenko et al., 2001; Yanelly et al., 1986). At this point, perforin is released into the extracellular space and travels across the synapse to transiently introduce pores into the surface of the target cell. At the same time, GzmB is also liberated from lytic granules and is able to permeate perforin pores (Lopez et al., 2013). Once inside the target cell, GzmB, a serine protease, is able to cleave procaspases-8 (an apical caspase), -3, -7 (executioner caspases), and BID (a pro-apoptotic member of the Bcl-2 family). Thus, GzmB mediated cytotoxicity can initiate apoptosis from multiple distinct points in intracellular apoptosis pathways (Pinkoski et al., 2001). 1.3.3 TCR repertoire profiling Immune repertoire sequencing has become a powerful tool in the interrogation of the size and clonality of T-cell and B-cell subsets. 
Key to this approach is the targeted cDNA synthesis of TCR or BCR chains by 5’-RACE (Freeman et al., 2009) or multiplexed variable gene-specific primers (Robins et al., 2009) coupled with constant region-specific primers proximal to the 3’ end of the J-gene. This is followed by massively parallel short-read sequencing to reveal the diversity of immune receptors in the repertoire at-large or in specific contexts. In contrast to conventional RNA-seq, which seeks to analyze the breadth of the transcriptome, the objective of TCR-seq and BCR-seq experiments is to assess a very narrow target region of the transcriptome at extraordinary depth in order to fully characterize the TCR and BCR expressed in immune cell populations. Meaningful analysis of TCR repertoires by sequencing TCRα and/or β transcript cDNA is made possible by the property of these re-arranged receptor chains to concentrate their diversity in the CDR3 hypervariable region. Thus, it is not necessary to sequence the entire transcript to measure the clonality of analyzed T-cell populations. Rather, paired-end short reads covering the CDR3 and J-gene region on the reverse read and the 5’ end of the V-gene on the forward read are sufficient to infer TCRα or β clonotypes. This implies that each paired-end read corresponds to a single α or β transcript, a feature which readily facilitates the measurement of clonal frequency as well as identity. Clonotype reconstruction is done using bioinformatic tools (Bolotin et al., 2013, 2015; Thomas et al., 2013; Yu et al., 2015b) that extract CDR3 regions and match partial V- and J-gene sequences from sequencing reads to reference gene segments contained in the International Immunogenetics Information System (IMGT) reference database (Lefranc et al., 2015).
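To make the link between reads and clonal frequency concrete, the following minimal sketch tallies clonotypes from reads that are assumed to have already been annotated by one of the tools cited above. The input format (one tuple of V gene, CDR3 amino acid sequence, and J gene per read pair), the function name, and the example reads are all hypothetical; the gene names merely follow IMGT-style nomenclature and the CDR3 strings are made up. It is not the pipeline used in this thesis.

```python
from collections import Counter

# Hypothetical annotated reads: one (V gene, CDR3 amino acid sequence, J gene)
# tuple per read pair. The CDR3 strings are made up for illustration.
EXAMPLE_READS = [
    ('TRBV7-9', 'CASSLGQAYEQYF', 'TRBJ2-7'),
    ('TRBV7-9', 'CASSLGQAYEQYF', 'TRBJ2-7'),
    ('TRBV20-1', 'CSARDRTGNGYTF', 'TRBJ1-2'),
    ('TRBV5-1', 'CASSLEGDTQYF', 'TRBJ2-3'),
]

def clonotype_frequencies(annotated_reads):
    # Count identical (V, CDR3, J) tuples and convert counts to fractions,
    # sorted from the most to the least abundant clonotype.
    counts = Counter(annotated_reads)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(clonotype, n / total) for clonotype, n in counts.most_common()]

if __name__ == '__main__':
    for (v_gene, cdr3, j_gene), freq in clonotype_frequencies(EXAMPLE_READS):
        print(v_gene, cdr3, j_gene, freq)
```

Because each read pair is treated as one transcript, the fraction of reads carrying a given tuple serves as a simple estimate of that clonotype's frequency; real analyses would additionally need to handle sequencing errors and amplification bias, which this sketch ignores.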
Initial TCR-seq efforts focused on deep characterization of either TCRα or TCRβ since no tractable high-throughput approaches existed at the time to do paired TCR-seq for identification of both chains by bulk sequencing while also determining linkage information regarding which α-chains naturally paired with which β-chains in the sample. Original studies generally focused on the β-chain, since it tends to be a more diverse repertoire than α and also because T cells only carry one productive TCRβ, while it is possible that a given T cell could be expressing α-chains from both alleles, one of which is non-productive. Since this first generation of TCR-seq studies, however, a number of innovations have been implemented that now enable paired αβ TCR-seq using a variety of barcoding-, bioinformatic-, or droplet-based methods (Han et al., 2014; Howie et al., 2015; Turchaninova et al., 2013), albeit with a lower throughput than single-chain TCR-seq. Currently, TCR sequencing approaches are routinely applied to understanding the properties of a healthy repertoire and characterizing the clonal biases that occur in disease contexts. However, despite the insights gleaned from these experiments, they are all still limited to only inferring the roles of significant clonotypes. T-cell repertoire profiling still has yet to incorporate the other side of the immune synapse; that is, the major limitation faced by deep TCR sequencing is that it does not provide any information regarding the antigens that are recognized by the clonotypes revealed in these studies. 1.4 T-cell immunotherapy 1.4.1 Cancer It was initially proposed in the 1950s that the immune system could potentially prevent neoplastic disease by recognizing tumor-associated antigens (TAA) stemming from altered self that is characteristic of the disease (Burnet, 1957).
Following its formulation, this cancer immunosurveillance hypothesis was tested by multiple groups who compared the progression of spontaneous or induced tumors in immunocompromised, thymectomized, or athymic mice to tumor progression in immunocompetent mice (Dunn et al., 2002). These experimental approaches were largely inconclusive and no consensus was reached until several key studies demonstrated that neutralizing endogenous IFNγ with anti-IFNγ antibodies (Dighe et al., 1994), knocking out the PFN gene (Broek et al., 1996), or knocking out the RAG 1/2 genes (Shankaran et al., 2001) accelerated spontaneous carcinogenesis in mice. These data confirmed that the immune response, particularly the adaptive immune response, is able to naturally control malignant tumors. Numerous studies that have since characterized this anti-tumor lymphocyte response in cancer patients have concluded that larger tumor-infiltrating lymphocyte (TIL) responses, particularly those comprised largely of CD8+ cytotoxic T cells, contribute to enhanced survival outcomes in many solid tumor types (Brown et al., 2014; Galon et al., 2006; Sato et al., 2005). Attempts to restart and/or boost this naturally-occurring anti-cancer immune response led to initial clinical attempts to use cancer vaccines and adoptive cell therapy (ACT) as strategies to treat cancer. Adoptive cell therapy approaches, which involve the ex vivo expansion of TIL populations and subsequent re-infusion of these cells autologously back into donors, were first conducted in the late 1980s (Rosenberg et al., 1988) and were shown to mediate objective clinical regressions in melanoma patients. Since these initial efforts, subsequent ACT trials have been initiated, most of which have also found success in melanoma (Rosenberg and Restifo, 2015) – thought to be due to the highly mutated genomic landscape of this cancer type (Alexandrov et al., 2013; Lawrence et al., 2013).
Non-synonymous coding mutations, when incorporated into MHC-presented peptides, give rise to mutational neoantigens: novel targets that can be recognized as distinct from self by the TCR repertoire and, therefore, eliminated. It has been observed that, indeed, this is a major mechanism by which anti-tumor T-cell responses are mounted (Lennerz et al., 2005). As a consequence of this, tumors with higher mutational loads tend to be more immunogenic and respond better to immunotherapy than tumors with low overall mutation counts (Rizvi et al., 2016; Snyder et al., 2014). The introduction of a novel class of immune modulating therapies called checkpoint blockade inhibitors has proven to be a revolutionary development and has resulted in a resurgence of promising new treatment modalities in indications that were previously very difficult to treat with immunotherapy. These inhibitors block the negative signals that control the progression of T-cell differentiation or activation, most prominently CTLA-4 (Leach et al., 1996) and PD-1/PD-L1 (Freeman et al., 2000). The result is rejuvenation of tumor-reactive T cells found in the tumor infiltrating populations that can lead to positive clinical outcomes, including complete remissions in some cases. In spite of this, many patients receiving checkpoint blockade therapies do not respond to the treatment. The reasons for this are not clear; however, key predictors of response include mutational load, as discussed above, and the presence of a large and highly polyclonal TIL repertoire prior to initiation of checkpoint blockade therapy (Tumeh et al., 2014).
In light of the development of immune checkpoint blockade and elucidation of the role of mutational neoantigens in anti-tumor immunity, opportunities now exist in the field of cancer vaccinology to deliver tumor-specific mutant antigens in combination with checkpoint inhibitor drugs to achieve higher rates of remission than previously achieved by vaccination alone (Gubin et al., 2014). Coupled with the observation that polyclonal TIL populations result in stronger and more durable responses in checkpoint inhibitor therapy, novel vaccine strategies have been designed in which multiple tumor-mutant neoantigens are delivered simultaneously (Ott et al., 2017; Sahin and Türeci, 2018). This trend towards administering a multiplicity of tumor antigens at once implies that most of these antigens will be personal as the majority of tumor mutations are not conserved across individuals. Therefore, going forward, rapid high-throughput methods for T-cell antigen discovery from tumor exomes will be a critically necessary tool for the design and manufacture of TCR-based cancer immunotherapies. 1.4.2 Autoimmunity Destruction of self-tissue by overactive immune cells forms the basis of many common disorders, referred to as autoimmune disorders. In some cases, this destruction is mediated by autoreactive T cells that were not deleted in the thymus during their development and are said to have escaped central tolerance. Typically, self-destructive tendencies from autoreactive T cells are held in check by numerous regulatory mechanisms collectively referred to as peripheral tolerance. An important mediator of peripheral tolerance is the activity of a specialized subset of T cells referred to as regulatory T cells (Tregs), which operate by promoting immune suppression instead of cytotoxicity upon engagement of their rearranged TCR with cognate antigen (Legoux et al., 2015).
In the event that the suppressive functions of peripheral tolerance are reduced, absent or otherwise circumvented, T cells with self-reactivity are free to exert autoimmune effects in tissues expressing target self-antigens. Therapeutic strategies have previously aimed to contend with T cell-driven autoimmunity by strengthening antigen-specific immunosuppression via vaccination or by adoptive transfer of expanded autologous Treg populations. Following are some specific examples of these therapeutic strategies applied to two very prevalent autoimmune disorders. Type-1 diabetes is a metabolic disorder characterized by a loss of insulin production that results in patients becoming dependent on parenterally provided insulin. Loss of insulin production is caused by the autoimmune destruction of pancreatic islet cells, which are responsible for its biosynthesis. Adoptive cell therapy using Tregs expanded by non-antigen-specific stimulus ex vivo from bulk PBMC has been demonstrated to induce remission of diabetes in children (Marek-Trzonkowska et al., 2012). However, similar to adoptive T-cell therapies for cancer treatment, it would be expected that potency of adoptive T-cell transfers would be improved by enriching Treg populations to be infused with clonotypes specific for tolerizing diabetes-related pancreatic epitopes. In this vein, work in murine models of diabetes has explored the concept of Tregitopes (Cousens et al., 2014). This class of epitopes consists of highly conserved endogenous epitopes, mainly found in IgG chains. Peptide-MHC molecules containing these determinants are capable of activating thymus-derived natural Tregs (nTregs) to convert peripheral T cells into induced Tregs (iTregs). In nature, Tregitopes are hypothesized to tolerize immune response against idiotypic antigens developed in the hypervariable regions of lymphocyte receptors.
Administration of a panel of Tregitopes in conjunction with immunogenic pancreatic peptides was shown to completely prevent the onset of diabetes in vivo in a cohort of NOD mice as well as suppress reactivity to the human diabetes-related antigen, GAD65, in in vitro experiments using patient PBMC (Cousens et al., 2013). Epstein-Barr virus (EBV) infection of autoreactive B cells in the central nervous system has been hypothesized to be a root cause of multiple sclerosis. Autoreactive B cells comprise a sizable proportion of the normal naïve B-cell population; up to 20% of these cells can express an autoreactive BCR (Wardemann, 2003). EBV infection of naïve B cells causes rapid proliferation and differentiation into mature memory B cells by co-opting normal B-cell activation and differentiation pathways. Thus, EBV enables B cells to bypass immune checkpoints intended to prevent autoreactive B cells from entering the periphery (Hochberg et al., 2004). In healthy individuals, anti-EBV T-cell responses have been observed to control EBV-infected B-cell populations; in multiple sclerosis, however, defective T-cell control of EBV enables these B cells to accumulate in the central nervous system (Pender et al., 2017). Once there, it is hypothesized that myelin-reactive EBV-infected B cells encounter self-antigen, causing them to provide survival and activation signals to autoreactive T cells which are naturally present but are suppressed by regulatory mechanisms in healthy individuals (Pender and Burrows, 2014). It is these activated T cells, in turn, that are directly responsible for the autoimmune attack on the central nervous system that is characteristic of this disease. In view of this, both adoptive cell therapy and vaccine-based strategies have been used as a means of treating multiple sclerosis. In the former, adoptive transfer of ex vivo-expanded EBV-reactive CTL has been performed to attempt to remove the EBV-infected B cells responsible for driving autoreactivity.
While at the proof-of-concept stage at this point, the results of this approach to date have not shown any severe side effects and have demonstrated clinical improvement in subjects receiving therapy (Pender et al., 2014). The latter, vaccine-based strategy was conducted by selecting myelin-reactive T cells (MRTC) out of patient peripheral blood mononuclear cells (PBMC) by expansion in response to candidate myelin peptides added to bulk cultures. Expanded MRTC are attenuated by irradiation ex vivo and administered autologously as a vaccine. The rationale is that TCR expressed by MRTC are recognized as idiotypic antigen, resulting in an adaptive response intended to reduce the count of autoreactive T cells. Initial clinical results showed a statistically significant decrease in the rate of relapse in patients receiving therapy (Loftus et al., 2009). The different approaches to immunotherapy in the field of autoimmunity, as highlighted in the stories of type-1 diabetes and multiple sclerosis, demonstrate the need for further knowledge regarding common tolerizing antigens and immunogenic antigens from self-tissue/pathogens that trigger the onset of autoimmunity to begin with. In all of the immunotherapeutic strategies described above, preliminary studies have been limited to using extremely restricted panels of known epitopes presented on common HLA alleles to target or select reactive T cells. In order to translate the positive findings in these initial studies to the patient population at large, advanced strategies for rapidly, functionally, and broadly detecting T-cell epitopes are needed.

1.4.3 Infectious disease

Reverse vaccinology (Rappuoli, 2000) represents another avenue of directing T-cell responses to treat and prevent disease, in particular infectious disease.
An alternative approach to the traditional vaccinology route in which pathogens-of-interest are cultured and biochemically deconstructed to find immunogenic components, reverse vaccinology involves identifying putative antigens starting from genomic data. While often employed successfully in developing bacterial vaccines, this approach is usually overlooked with regard to pathogens with larger genomes, high strain variability, or in cases where existing whole live attenuated vaccines elicit undesirable side effects (Bruno et al., 2015). In a typical reverse vaccinology workflow, candidate T-cell epitopes are identified for their potential use as vaccines by filtering in silico generated overlapping peptide sequence libraries from putative open reading frames for their predicted ability to bind host MHC molecules. Hits are then evaluated in vitro or in animal models to validate their T-cell reactivity. Validation of epitopes represents the main bottleneck in the development of peptide vaccines. It is often observed that in silico analyses of pathogenic genomes yield >1000 predicted hits per MHC allele investigated; however, <5% of these potential epitopes are typically carried forward for validation (Moise et al., 2009). To expand on the principle of reverse vaccinology, synthetic candidate epitopes based on naturally occurring analogues can also be developed. The potential in these synthetic analogues is enormous, as they can allow for the development of peptide vaccines that elicit a stronger response than their natural counterparts or that are designed to have a better safety profile (Pentier et al., 2013). However, a requirement for screening optimized epitope variants or discovering novel epitopes with optimal properties is the ability to perform deep searching of unbiased epitope space. To this end, combinatorial peptide library (CPL) screening methods have been developed (Borràs et al., 2002; La Rosa et al., 2001).
In this approach, libraries of peptides are designed such that, at each position, 19 pools are generated in which each of the 19 amino acids (cysteine is generally excluded) is held constant while all other positions are completely degenerate. For example, to construct a CPL screen of decamers, 190 (10 x 19) different peptide pools are made. Each pool is then added to co-cultures of APC and T cells-of-interest and assessed for T-cell reactivity. At each position, the well with the highest activity is determined and the particular amino acid responsible for this activity is said to be the optimal residue at this position; synthetic epitopes can then be constructed by assembling the optimal residues from each position. Though powerful, these CPL screening approaches do not incorporate natural antigen processing/presentation and may detect reactivity from peptides present at non-physiological concentrations. Therefore, these types of screening experiments need to be coupled to complementary, physiologically relevant methods of detecting TCR reactivity. There is great potential in scanning microbial proteomes to find the most well-suited peptide epitopes for use as vaccines able to elicit optimal anti-pathogen immune responses. There is also opportunity to design optimized synthetic peptides that could conceivably generate stronger immunity than that generated by naturally occurring microbial antigens. To take advantage of these possibilities, additional tools that allow investigators to explore pathogen genomes more completely with respect to the T-cell epitopes that they encode, and to explore unbiased epitope space in a physiologically relevant manner, would be essential to future peptide vaccine strategies for infectious disease.
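The positional-scanning layout of a CPL screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy activity values, and the made-up decamer are hypothetical and are not taken from the cited studies. The sketch enumerates the 10 x 19 pools for a decamer screen and then assembles a synthetic peptide from the highest-scoring residue at each position.

```python
from itertools import product

# The 19 amino acids used at each fixed position (cysteine excluded, as in the text).
AMINO_ACIDS = 'ADEFGHIKLMNPQRSTVWY'


def cpl_pools(peptide_length):
    # One pool per (position, fixed residue); every other position is degenerate.
    return [(pos, aa) for pos, aa in product(range(peptide_length), AMINO_ACIDS)]


def optimal_peptide(activity, peptide_length):
    # activity maps (position, residue) to the T-cell response measured for that pool.
    best = []
    for pos in range(peptide_length):
        best.append(max(AMINO_ACIDS, key=lambda aa: activity.get((pos, aa), 0.0)))
    return ''.join(best)


pools = cpl_pools(10)
print(len(pools))  # 190 pools for a decamer screen

# Toy read-out (assumption): background response of 1.0 in every well, except
# the wells whose fixed residue matches a made-up decamer, which score 5.0.
target = 'ALDKVYPQRS'
activity = {(pos, aa): (5.0 if aa == target[pos] else 1.0) for pos, aa in pools}
print(optimal_peptide(activity, 10))  # ALDKVYPQRS
```

Picking the best residue independently at each position mirrors the assembly step described in the text; it does not model interactions between positions, natural processing, or presentation.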
At present, most techniques used to perform T-cell antigen discovery are limited in their ability to search for and find T-cell epitopes at a scale that meets the growing demands of the emerging therapeutic approaches that have been discussed throughout Section 1.4.

1.5 T-cell antigen discovery methods

1.5.1 Early genomic/cDNA library screening

Initial TCR antigen discovery efforts were focused on melanoma since these tumor cells are generally more amenable to the creation of stable cell lines (Chen et al., 1997) and were noted to readily generate tumor-reactive T-cell populations in vitro. Pioneering work was done in which cytotoxic T cell (CTL)-sensitive cells isolated from tumor were selected under pressure by continuous incubation with tumor-reactive CTL until stable antigen-loss tumor cell variants were isolated. A cosmid library could then be constructed from the genomic DNA of the original cells, transfected into the antigen-loss variant line, and co-cultured with T-cell clones-of-interest derived from patient peripheral blood. Subsequent chromium-51 (Cr51) release assays (Boyle, 1968; Brunner et al., 1968), a classical read-out of immunocytotoxicity that measures the relative quantity of radioactivity present in supernatants from co-cultures of lymphocytes and Cr51 pre-loaded APC, were then used to measure CTL reactivity. In these experiments, the more radioactivity detected in a supernatant sample, the more target cell lysis had occurred, and this read-out could be used to reveal transfectants in which CTL sensitivity was restored. Additionally, enzyme-linked immunosorbent assays (ELISA) were used in these studies as an orthogonal indicator of CTL reactivity, as this technique could be used to capture and measure IFNγ and/or TNFα released by activated T cells into the supernatants of CTL/APC co-cultures.
Specific cosmid-transfected APC clones eliciting T-cell responses were recovered for characterization and this approach led to the discovery of the now well-known melanoma associated antigens, MAGE-1, -2, and -3 (van der Bruggen et al., 1991). In another study, a similar approach was used whereby melanoma-derived cDNA libraries instead of genomic DNA libraries were transfected into a non-melanoma cell line for screening against patient derived tumor-infiltrating lymphocytes (TIL) (Kawakami et al., 1994). These investigations led to the identification of another classic melanoma antigen, MART-1. A major impediment of the above methodologies was the requirement to create stable target cell lines expressing genes from the tissue under interrogation. Many other important diseases, including many other tumor types, cannot be investigated in this way due to the inherent difficulties in establishing permanent cell-lines from most types of primary tissue and from the potential for high background to be observed when using surrogate cell lines. An alternative cDNA screening method provided a partial solution to these problems by utilizing an indirect route to T-cell antigen discovery; serological analysis of recombinant cDNA expression libraries (SEREX). Fundamentally, SEREX is an antibody-epitope screening technique involving the display of a prokaryotically-expressed cDNA library screened against patient sera (Sahin et al., 1995). Given that humoral immune responses often require the assistance of CD4+ helper T cells, antigens discovered using this approach often provide an indirect indication of T-cell specificity. Indeed, the well-known CTL antigen, NY-ESO-1, was first identified in a SEREX screen (Chen et al., 1997) although further experiments were required to confirm its activity in cell-mediated immunity (Jäger et al., 1998). 
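The Cr51 release read-out used in these early screens is conventionally normalized to percent specific lysis. The formula below is the standard normalization for this type of assay, not something stated in the text above, and the control-well names (spontaneous and maximum release) and the example counts are assumptions made for illustration.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    # experimental: counts released from targets co-cultured with CTL
    # spontaneous:  counts released from targets cultured alone (assumed control)
    # maximum:      counts released from fully lysed targets (assumed control)
    if maximum <= spontaneous:
        raise ValueError('maximum release must exceed spontaneous release')
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical counts per minute for one well and its two controls.
print(percent_specific_lysis(1200.0, 400.0, 2400.0))  # 40.0
```

A transfectant in which CTL sensitivity has been restored would show a clearly higher value than the antigen-loss parental line assayed under the same conditions.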
1.5.2 Proteomic approaches

Acid elution of MHC-bound peptide followed by analysis by bulk Edman sequencing was an early technique used in the characterization of T-cell epitopes. It was this approach that provided strong evidence for the hypothesis that individual MHC alleles bound specific peptide motifs (Falk et al., 1991). Subsequently, tandem mass spectrometry (MS) for peptide sequencing was developed and enabled higher-resolution identification of epitopes isolated by peptide elution, leading to the discovery of another well-known melanoma epitope, pmel-1 or gp100(280-288) (Cox et al., 1994). Using this method, investigators were able to characterize individual peptides present in high-performance liquid chromatography (HPLC) fractions that were capable of eliciting in vitro T-cell responses as measured by Cr51 release. Despite extensive up-front HPLC fractionation in this study, the high complexity of the MHC-presented peptidome still resulted in numerous candidate peptides being detected in each immunogenic fraction due to co-elution of immunogenic and non-immunogenic peptides. Therefore, individual peptides from immunogenic fractions needed to be individually tested in vitro to determine which specific peptides were responsible for eliciting T-cell reactivity. These issues regarding peptide complexity encountered in the Cox et al., 1994 publication echo through to immunoproteomic studies conducted today. Though contemporary mass spectrometry systems have advanced significantly in sensitivity, resolution, mass accuracy, and acquisition speed (Holčapek et al., 2012), the challenge still remains in determining which detected peptides are most appropriate for downstream testing. Unlike other analytical methods such as RNA-seq, quantitative PCR, HPLC, or Western blotting, the peak intensity of a given peptide in mass spectrometry does not correlate with its abundance in the sample.
Biases in the ability of a peptide to become ionized in mass spectrometry contribute considerably to its detection and, thus, there is a risk of false negatives arising from presented peptides that fail to fly. Nonetheless, immunoproteomic techniques maintain an important role in T-cell antigen discovery efforts. The view afforded by mass spectrometry of all MHC-presented peptides at the cell surface allows for analysis of T-cell antigens in situ. Cancer antigens can be modeled with existing cell lines and antigens from infectious agents can be monitored by infecting cells with whole pathogen in vitro (Karunakaran et al., 2008). These strategies are useful with respect to high-resolution epitope mapping, assessing the abundance at which individual peptides are processed and presented, and characterizing peptides carrying endogenous post-translational modifications (PTMs) (Mohammed et al., 2008; Petersen et al., 2009).

1.5.3 MHC multimers

A major development in the progression of epitope screening technology was the advent of MHC tetramers (Altman et al., 1996). The first tetramer system employed engineered molecules consisting of a specific pMHC complex fused to a biotinylation signaling domain that, when conjugated to biotin, could be bound to phycoerythrin-conjugated avidin. The natural tetramerization properties of avidin resulted in the combination of multiple identical pMHC proteins to form a single molecule while the fluorescence characteristics of phycoerythrin enabled detection. Circumventing the problems associated with the weak and transient interactions characteristic of pMHC/TCR binding in cell-free experiments, the higher avidity tetramers provided a means of directly staining cognate TCR with candidate antigen complexes.
Initially, the most daunting challenge facing application of MHC tetramers or higher-order MHC multimers to high-throughput antigen discovery studies was the onerous manufacturing process necessary for the construction of individual pMHC multimer reagents. In large part, this is due to the instability of empty MHC molecules. Both the peptide and MHC components of the complex must be present in the folding reaction to result in viable pMHC protein. This characteristic made the possibility of tractably constructing high-complexity, unbiased pMHC antigen screening libraries very remote until the first descriptions of conditional pMHC tetramers (Bakker et al., 2008; Toebes et al., 2006). These specialized molecules are pre-folded MHC containing a placeholder peptide ligand modified to contain a photocleavable moiety. Upon cleavage of the photolabile analogue, a peptide of interest can then be added to the reaction to bind and rescue MHC molecules from denaturation. This advancement was a key technical improvement that opened the door for a number of multimer-based T-cell antigen discovery strategies, discussed further in the following sub-sections, to be implemented.

1.5.3.1 Flow cytometry/mass cytometry

Since the original description of MHC tetramer staining, tetramers have become an indispensable tool for detecting T cells reactive to known antigens-of-interest. However, for contexts in which known T cells or TCR-of-interest have been isolated and antigen identification is needed, specialized strategies are required to search a meaningful fraction of epitope space for relevant TCR epitopes. To this end, a variety of schema have been developed that employ either combinatorial surface tagging of T cells with pMHC multimers or DNA barcode-conjugated pMHC multimers.
Combinatorial approaches involve producing multimer reagents consisting of many unique pMHC molecules, each of which is conjugated to a unique combination of multiple different fluorophores, thereby giving any labeled T cells a multicolored surface code that can be interpreted to determine precisely which pMHC complex is binding to the TCR-of-interest. This combinatorial staining was first demonstrated by using flow cytometry to successfully enable the detection of 15-25 antigens per tube (Hadrup et al., 2009; Newell et al., 2009) using relatively few (<3) different detection channels. The capability of combinatorial multimer encoding was then extended by leveraging the highly multi-parametric nature of mass cytometry: a technique which combines the front-end fluidics of flow cytometry with a back-end mass spectrometry read-out. Single cells are labeled with antibodies or multimers conjugated to stable heavy metal isotopes rather than organic fluorophores and focused into an inductively coupled plasma torch at the interrogation point, ionizing the entire contents of individual cells. Masses of the isotopes tagging each cell can then be detected and the corresponding parameters can be monitored with virtually no signal spillover between channels. Using the combinatorial approach previously described for fluorescence flow cytometry, 3-dimensional staining using 10 different metal tags was used to test 109 different antigens for TCR binding by mass cytometry (Newell et al., 2013). Theoretically, the dynamic range of mass cytometry would enable researchers to probe cells with approximately 100 different metal tags although, at present, the number of isotope tags commercially available is far fewer. Regardless, this type of multi-parametric monitoring has the potential for very large numbers of antigens to be screened, with the major limitation, however, that TCR cross-reactivity cannot be unambiguously assessed by this method.
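The capacity of such a combinatorial code can be checked with simple counting. The sketch below assumes that each pMHC specificity is assigned a unique unordered set of exactly three tags drawn from a panel of ten; that coding rule, the tag names, and the antigen names are illustrative assumptions rather than details given in the text. Under this assumption, ten tags taken three at a time give 120 distinct codes, which is enough to cover the 109 antigens mentioned above.

```python
from itertools import combinations
from math import comb


def code_capacity(n_labels, labels_per_code):
    # Number of distinct unordered label sets of a fixed size.
    return comb(n_labels, labels_per_code)


def assign_codes(antigens, labels, labels_per_code):
    # Give each antigen its own label combination (hypothetical assignment order).
    codes = list(combinations(labels, labels_per_code))
    if len(antigens) > len(codes):
        raise ValueError('not enough label combinations for this antigen panel')
    return dict(zip(antigens, codes))


def decode(observed_labels, code_book):
    # Map the set of labels seen on one cell back to an antigen, if it matches a code.
    lookup = {frozenset(code): antigen for antigen, code in code_book.items()}
    return lookup.get(frozenset(observed_labels))


print(code_capacity(10, 3))  # 120

tags = ['tag' + str(i) for i in range(10)]            # placeholder tag names
antigens = ['antigen' + str(i) for i in range(109)]   # placeholder antigen names
book = assign_codes(antigens, tags, 3)
print(decode(book['antigen42'], book))  # antigen42
```

A cell that binds two different multimers would carry more than three tags, and `decode` would return `None`; this is one way to see why cross-reactivity is difficult to resolve with this kind of encoding.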
More recently, fluorescent pMHC multimers tagged with biotinylated DNA of known sequence have been described. The result is a pMHC multimer reagent that can stain antigen-binding T cells and, upon collection of stained cells by fluorescence-activated cell sorting (FACS), be subjected to PCR amplification to recover the associated tags. These DNA tags are then read as a barcode by DNA sequencing. This incorporation of sequencing as a back-end read-out of MHC multimer-based flow cytometry lends itself to a very high level of multiplexing as a large number of possible DNA barcodes can be constructed and read in parallel. Indeed, in initial demonstrations of this method, investigators were able to pool >1,000 distinct multimer-DNA barcode reagents and use them to assess T-cell populations (Bentzen et al., 2016). In addition, as is the case in other sequencing methods, the read count of each individual barcode is directly proportional to the abundance of the barcode in the multimer-stained T-cell population. Therefore, the sequencing read-out used in this method is able to both identify and quantitate TCR reactivities present in samples.

1.5.3.2 Arrays

Alternatively, pMHC multimer arrays have been developed and provide an attractive prospect for antigen screening approaches due to the spatial addressability of individual pMHC complexes. The first microarrays for this application consisted of pMHC tetramer complexes immobilized to glass slides coated with polyacrylamide. Labeled T-cell populations-of-interest were washed over these slides and visualized by microscopy (Soen et al., 2003). Subsequent array designs incorporated co-spotting of pMHC molecules with capture antibodies for specific cytokines of interest (Stone et al., 2005) to allow for the added ability to monitor functional activation of T cells interacting with cognate immobilized pMHC.
These initial pMHC microarrays, however, were unable to match the sensitivity of flow cytometric pMHC multimer methods and suffered from poor reproducibility. Despite this, the observed 0.1% limit of detection of these methods for antigen-responsive cells in bulk CD8+ T-cell populations still provides a great deal of utility in biologically or clinically relevant investigations (Brooks et al., 2015), especially as the microarray format is able to accommodate approximately 1,000 to 2,000 unique spots per slide. Nonetheless, the sensitivity of the prototypical pMHC microarray design was subsequently improved, notably by using MHC-Ig dimers instead of biotin-streptavidin tetramers and applying mildly shearing fluid flow rates across the array surface (Deviren et al., 2007). This iteration of the pMHC microarray brought its limit of detection to a similar order of magnitude as single-staining MHC multimer flow cytometry. Reproducibility was also improved by optimizing the chemistry used to affix pMHC molecules to the array surface. In particular, amine-coated glass microarray slides were spotted with DNA oligonucleotides and complementary single-stranded DNA molecules were conjugated to pMHC tetramers, thus immobilizing them via highly specific nucleic acid interactions rather than by non-specific, and often denaturing, surface-protein interactions (Kwong et al., 2009).

1.5.4 pMHC display systems

Expression systems have been described as a means of displaying T-cell antigens on the surface of insect cells (Crawford et al., 2006) or yeast cells (Birnbaum et al., 2014). In these methods, libraries of minimal epitope-coding DNA sequences are inserted into an acceptor plasmid containing the native components of the MHC dimer separated by flexible linker regions. Epitopes of the correct size are ligated in frame with single-chain MHC (scMHC) such that, upon expression of the transgene, a library of distinct scMHC proteins is generated.
For insect cell display, scMHC constructs are delivered via baculoviral vector while for yeast display, scMHC plasmids are transfected into host cells by electroporation. Upon delivery of vector, recombinant scMHC complexes can self-assemble and successfully traffic to the cell surface in the absence of any antigen processing and presentation machinery. The displayed pMHC antigens are then fished for by mixing library-expressing cells with soluble TCR multimer reagents analogous to the pMHC multimers described above. TCR multimers are conjugated to fluorescent markers or magnetic beads by which positively TCR-stained display cells can be isolated by FACS or magnetic separation, respectively. Application of this type of approach has been able to achieve a very high level of scalability: libraries of >10^8 unique randomly generated peptide-coding sequences have been generated and assessed by soluble TCR multimer staining (Gee et al., 2018).

1.5.5 Cell-based functional screening

Since the original genomic and cDNA library screening efforts of the early 1990s, functional mammalian cell-based assays have evolved both in terms of antigen encoding and functional read-out. The primary functional read-out of T-cell-antigen recognition in contemporary antigen discovery studies has shifted away from Cr51 release and ELISA and towards the enzyme-linked immunospot (ELISPOT) assay (Czerkinsky et al., 1983; Taguchi et al., 1990). This assay relies on the detection of cytokines, usually IFNγ, secreted by antigen-stimulated T cells upon recognition of APC. Although similar in principle to ELISA, ELISPOT is considerably more sensitive and provides accurate estimation of the frequency of reactive T cells since the assay enumerates spots formed by individual activated T cells in each antibody-coated test well.
Alternatively, another commonly used class of functional TCR-antigen read-outs involves the use of a reporter T-cell line constructed with a β-galactosidase (Karttunen and Shastri, 1991), luciferase (Anmole et al., 2015), or fluorescent protein (Siewert et al., 2012) gene placed under the control of the NFAT promoter, which drives transcription specifically in the event of T-cell activation. Along with the various methods of detecting functional T-cell read-outs, a number of alternative methods have been developed to exogenously provide target cells in these studies with candidate antigens to be screened against T-cell populations-of-interest. Theoretically, any of the functional read-out methods described above could be mixed-and-matched with any of the antigen-delivery schema described below to produce a custom antigen discovery strategy based on the strengths and limitations of each component technique. The simplest, most direct method by which target cells can be loaded with candidate antigen is by peptide-pulsing. The principle behind peptide-pulsing is to provide enough test peptide to APC culture media such that the extracellular concentration of the peptide is sufficient to out-compete natively presented peptides for MHC binding sites. This approach is useful for determining the affinity of known epitopes to their MHC binding partner or as a verification of peptides previously identified as antigenic by other means. Since peptide-pulsing bypasses the natural antigen processing and presentation pathway, library screening approaches based on pulsing high-diversity peptide pools (such as those described in Section 1.4.3) should be regarded with the caveat that putative hits from these screens may not naturally be intracellularly processed and loaded onto MHC molecules. Another strategy to deliver candidate antigen panels to APC is to provide whole protein in culture media.
Purified proteins can be added to media as-is, in the form of protein-coated microbeads (Turner et al., 2001; Valentino et al., 2011), or contained within E. coli cells expressing recombinant proteins. This approach then depends on the characteristic of professional antigen-presenting cells, most notably dendritic cells, to efficiently uptake material from their surroundings and effectively present them on MHC class-II and, via cross-presentation, MHC class-I. In effect, this style of antigen delivery is analogous to peptide-pulsing – sophisticated transfection or gene delivery methods are not required – but, in this case, antigen-processing and presentation of candidate peptides is accounted for. The drawback with whole protein pulsing, however, is the requirement that highly specialized pAPC are used; dendritic cells are in low abundance in the types of donor tissue that would likely be available to investigators and require in vitro activation and differentiation prior to their use in experiments. Nucleic acid-encoding antigen libraries are also used as a means of loading APC with candidate antigen sequences. Though whole cDNA/open-reading frame (ORF)-based methods, reminiscent of the original antigen discovery efforts carried out in melanoma, are still in use, the construction and screening of minigene libraries has become more widespread. Minigene approaches have previously been employed by using short-peptide-coding sequences synthesized as tandem arrays containing many pooled minigenes together in a single construct and delivered to host APC by RNA (Hondowicz et al., 2012) or DNA transfection (Lu et al., 2014). Alternatively, libraries of individual minigenes have been generated and delivered to target cells (Birnbaum et al., 2014; Siewert et al., 2012). 
Though the appeal of minigene screening approaches is that they also incorporate consideration of natural antigen processing and presentation while, in theory, providing a quicker route to the determination of minimal immunogenic epitopes than do whole cDNA/ORF methods, these minigene-based screening methods still require considerable post hoc elucidation of positive hits. Primarily, this is in the form of panning to remove noise and spurious hits and/or deconvolution of tandem arrays to isolate individual epitopes. It should be noted that, while minigene approaches can be used to construct enormously complex libraries in target cell populations, no assays currently exist that can exhaustively assess these complex populations functionally; the capacity of current mammalian cell-based assays is a bottleneck to effectively being able to survey large minigene libraries. Generally, this has been dealt with by performing biased selection of a subset of candidate antigens based on in silico analysis.

1.6 Thesis overview

Currently, the tools available to do high-throughput T-cell antigen discovery are limited in capability, largely due to the complexity of TCR/pMHC interactions and the enormous genetic diversity inherent in TCR-, MHC-, and peptide-coding space. At the same time, the opportunity to develop immunotherapies in a wide variety of disease types represents a new frontier of biomedical research and, with it, comes the potential to provide enormous clinical benefits to many patient populations. To forge ahead, interrogate this TCR complexity, and aid in creating successful novel therapeutic regimens, new tools for understanding the T-cell response are needed. To this end, the objective of the research presented here was to develop a set of in vitro methods that enhance our capacity to rapidly and broadly characterize the scope of T-cell reactivity.
Each chapter presented here outlines a distinct component of a novel approach to performing high-throughput, functional T-cell antigen discovery which, when taken together, represents an overall strategy that should occupy an important role in the ecosystem of methods currently in use for profiling T-cell reactivity. At the time of this writing, an earlier version of Chapter 1, Section 1.5, has been published and Chapters 2 and 3 have been submitted as a single manuscript currently undergoing revision. In addition, the methods described in Chapters 2 and 3 are the subject of national phase patent applications in multiple jurisdictions, currently under adjudication. The methods and data presented in Chapter 4 have been filed as a provisional patent application and it is anticipated that they will be published in the future along with additional data. Chapter 2 is the description and validation, with mouse model tissues, of a novel fluorescence-based reporter system for detecting reactivity-eliciting APC from cytotoxic T cells under interrogation. The reporter presented in this chapter represents a new class of detection methods to be mobilized towards the detection of T-cell cytotoxicity that efficiently and specifically allows for isolation of antigen-bearing targets out of a pooled bulk population of APC. Chapter 3 is the characterization of this reporter system when combined with deep sequencing techniques as it applies to screening large libraries of candidate peptide-coding sequences. In this chapter, I explore the sensitivity with which bona fide T-cell antigens can be extracted from a background library of potential antigens and the robustness of the approach to detecting T-cell reactivity from polyclonal mixtures of input T cells. Chapter 4 is a synthetic biology approach to applying the reporter system to human TCR without the need to isolate and enrich for primary T-cell clones or APC.
This chapter seeks to demonstrate that the fluorescence assay and deep sequencing approaches described in Chapters 2 and 3 can be performed even when only DNA sequences of TCR-of-interest are available. In Chapter 5, I discuss the future of this technology as it relates to the need for continued validation and benchmarking, the potential applications for which it is well-suited, and the opportunities that exist for enhancing the capabilities of the approach. Finally, included in this thesis is a set of appendices providing supporting procedures and materials used to carry out the experimental work presented in Chapters 2, 3, and 4. Appendix A outlines a novel cloning scheme that I developed for inserting high-quality, high-complexity minigene libraries into lentiviral transfer plasmids. Though the library generated in Appendix A was not actually the one used in the main text of the thesis, I have included this protocol as an additional tool to be used in future efforts to apply the methods described in the main chapters. Appendix B contains relevant nucleic acid sequences used throughout this thesis including plasmid backbone, insert, and oligonucleotide sequences. Appendix C contains the R code that was used to perform analyses of viral titering and amplicon sequencing experiments. Appendix D includes representative FACS screenshots demonstrating the gating strategies used to prepare cell lines for co-culture experiments and to isolate cell populations for sequencing analysis. Appendix E contains a summary of the cell and read counts obtained from FACS and sequencing experiments, respectively, that were conducted in this thesis.

Chapter 2: Implementing a novel reporter system to detect functional activity of cytotoxic T cells

2.1 Introduction

Identifying T-cell epitopes is a difficult task due to the complexity of antigen-specific T-cell activation. Contributing to this complexity are several factors.
First, the number of unique short peptides that could possibly exist makes for an enormous T-cell epitope space to be searched. Second, peptide-presenting MHC molecules are encoded in humans by HLA genes that are polygenic and highly polyallelic, with different HLA alleles encoding MHC variants having distinct peptide-binding and TCR-binding preferences. Third, variation in the intracellular expression level of antigenic proteins and biases in proteolytic processing also influence pMHC immunogenicity (Bassani-Sternberg et al., 2015). Finally, TCR/pMHC interactions are transient (Das et al., 2015), promiscuous (Wooldridge et al., 2012), and low-affinity (Stone and Kranz, 2013).

The various methods of antigen screening described in Section 1.5 are generally classed into either function- or affinity-based approaches. In the functional class of methods, candidate antigens are presented on target cell surfaces and tested for their ability to generate functional T-cell responses. These responses are then identified either by using a T cell-based read-out, such as cytokine release (Taguchi et al., 1990) or activation of an NFAT-linked reporter (Anmole et al., 2015; Karttunen and Shastri, 1991; Siewert et al., 2012), or, alternatively, by monitoring destruction of APC to measure functional T-cell activation (Boyle, 1968; Brunner et al., 1968). These configurations all have in common the requirement to load target cell populations with individual candidate antigens and test each of them for T-cell recognition “one-by-one” in separate reaction wells, in isolation from all other candidates. Pooling strategies can increase the search space, but the pools would subsequently need to be laboriously deconvoluted (Chevalier et al., 2015; Hondowicz et al., 2012; Lu et al., 2014). Thus, functional cellular assays have yet to be scaled in a manner that could conceivably enable exhaustive screening of large sets of potential epitopes, such as those spanning an entire proteome.
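To give a rough sense of scale for the first factor listed in this section, the following sketch counts the distinct linear peptides of a given length that can be assembled from the 20 canonical amino acids. The 8 to 11 residue range is an assumption chosen here because it is typical of MHC class I ligands; it is not a figure taken from this chapter.

```python
# Illustrative sketch: size of the linear peptide space for short epitopes.
# Assumptions (not from the thesis): 20 canonical amino acids, lengths 8-11.
N_AMINO_ACIDS = 20


def peptide_space(length: int) -> int:
    '''Number of distinct peptides of the given length.'''
    return N_AMINO_ACIDS ** length


sizes = {n: peptide_space(n) for n in range(8, 12)}
for n, size in sizes.items():
    print(f'{n}-mers: {size:.3e} possible sequences')
```

Even the shortest length in this range yields tens of billions of candidates, which is why one-by-one testing in separate wells cannot cover the space.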
In contrast to function-based screening assays, affinity-based methods such as single-chain MHC display (Birnbaum et al., 2014; Crawford et al., 2006) or combinatorial/barcoded pMHC-multimer surface staining (Bentzen et al., 2016; Newell et al., 2013) have been developed that seek to circumvent many of the limitations mentioned above. Although scalable, these methods bypass natural antigen processing, presentation, and T-cell activation, and rely solely on TCR/pMHC affinity as a proxy for T-cell recognition. Important biophysical parameters, including kinetic on/off rates, allosteric effects of the TCR on downstream signaling, pMHC half-life, and the action of co-receptors, are all known to be critical to the activation of T cells (Bridgeman et al., 2012; Harndahl et al., 2012; James and Vale, 2012; Kersh et al., 1998b; Reinherz, 2015; Stone et al., 2009) but are not taken into account in these techniques. Therefore, affinity-based methods may in some cases yield high affinity epitopes that are physiologically irrelevant, while missing other epitopes that are of low affinity but still physiologically important (Martinez et al., 2016).

Here, I describe a novel function-based parallel screening approach for the discovery and characterization of T-cell epitopes. Fundamental to this design are the expression of candidate epitopes, encoded as minigenes in APC, and a reporter system that is co-expressed in APC instead of in the T cell. This allows for targeted APC to be selectively recovered from irrelevant bystanders and their epitopes to be subsequently identified by sequencing the minigene-coding region contained in isolated targets. Configuring the reporter read-out in this way allows for populations of target cells containing high-complexity, unbiased epitope libraries to be pooled and simultaneously screened against T-cell populations-of-interest. This new approach leverages the exquisite specificity of the granzyme/perforin pathway.
In vivo, a CTL recognizing an immunogenic APC employs this pathway to deliver GzmB into the target cell without entry into bystander cells (Woodsworth et al., 2015). Once inside a target cell, GzmB initiates an apoptotic cascade that leads to cell death. I constructed a reporter fusion protein consisting of cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP) moieties separated by a peptide linker that also acts as a cleavage substrate for GzmB (Packard et al., 2007). When fused, the CFP-YFP reporter protein produces Förster resonance energy transfer (FRET) signal while partially quenching CFP emission upon excitation with violet light. Cleavage of the fusion protein by GzmB causes a loss of FRET signal and concomitant rescue of free CFP signal. The resulting shift of cells carrying cleaved FRET-reporter is easily distinguishable in FACS and, thus, allows for targeted cells to be isolated and recovered for further characterization (Figure 2-1).

Figure 2-1. Summary of novel T-cell antigen identification method. In the absence of GzmB, the reporter protein emits a resting FRET signature (i) when excited with violet light. Upon entry of GzmB to the cell, the reporter is cleaved. The loss of resting FRET signal combined with the rescue of free CFP signal results in a FRET-shift (ii) that can be monitored in FACS to isolate cells undergoing T-cell targeting.

Minigene sequences are encoded in tandem with the FRET-reporter, with both components separated by a 2A ribosomal-skipping signal peptide (Kim et al., 2011) that allows for independent expression of both sequences. Advantageously, the 2A system provides a means to ensure that only viral cassettes with productive in-frame minigenes will fluoresce, because 2A-containing transcripts are translated bicistronically by a single ribosome.
The 2A system also ensures that each component is expressed with a 1:1 stoichiometry, raising the possibility that higher or lower antigen-expressers could be selected for study on the basis of their fluorescence intensity.

A key design principle in developing a minigene-reporter system that can be encoded in bulk APC cell lines is the ability to encode a single unique minigene in each constituent cell of the transduced APC population. The presence of multiple minigenes per cell would be expected to introduce ambiguity into downstream analysis as it would not be possible to determine which of the minigenes detected in FRET-shifted cells is responsible for eliciting T-cell reactivity. This problem is particularly pronounced when considering that common methods of transfection such as electroporation or reagent-based nucleic acid delivery typically result in the delivery of hundreds to thousands of plasmid copies per cell nucleus (Cohen et al., 2009; Glover et al., 2010; Hornstein et al., 2016), only one or very few of which would actually trigger a T-cell response.

To accomplish the goal of producing low copy-number minigene-expressing target cells, lentiviral vector systems were used to deliver minigene-FRET transgenes. These lentiviral vectors are derived from engineered versions of viruses of the Lentivirus genus of the Retroviridae family of Group VI viruses. Specifically, they are composed of genes taken from the Human immunodeficiency virus-1 (HIV-1) genome and organized to yield a self-assembling vector system capable of entering target mammalian cells, reverse transcribing their RNA genome encoding the user-defined gene-of-interest, and stably integrating the transgene into the host genome. Notably, in the interests of biosafety, much of the native HIV-1 genome has been either removed or mutationally inactivated.
For further biosafety and also to facilitate modularity, the remaining essential genetic elements for viral assembly and transduction are encoded across multiple distinct plasmids. In the case of 2nd generation vector systems (used throughout this thesis), they are distributed across 3 plasmids: a packaging plasmid encoding functional gag, pol, tat, and rev genes, an envelope plasmid encoding env protein typically derived from the broad-tropism Vesicular stomatitis virus (VSV), and a transfer plasmid encoding the gene-of-interest along with the desired promoter, all flanked by lentiviral long-terminal repeats (Zufferey et al., 1997).

2.2 Results

2.2.1 Infection and verification of low-copy viral integration events in APC lines

With the assumption that viral integrations occur independently, randomly, and with equal probability across all cells of a population, the genome integration of lentiviral cassettes can be thought of as following a Poisson distribution in which individual cells of a bulk APC population represent distinct non-overlapping intervals into which some number, k, of viral integrations can occur. In other words, the number of transgene copies received by a single cell is governed by the probability mass function:

$$P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!} \qquad \text{(2-1)}$$

where λ represents the ratio of transducing units to cell number (which can also be regarded as the multiplicity of infection or average expected number of integrations per cell) and P(k) is the probability that a given cell contains k viral copies (which can also be viewed as the expected frequency of cells containing k viral copies in the bulk population). A simple way to assess the distribution of integration events in a transduced cell population is to monitor the proportion of cells expressing a fluorescent marker included in the viral vector.
By measuring the proportion of fluorescent cells, and, therefore, the proportion of cells carrying at least one viral integration, it is possible to estimate the average number of transducing units per cell and, by extension, the expected number of cells bearing a given number of transgenes. This is done by evaluating equation 2-1 for k > 0 as shown in eq. 2-2 and solving for λ as shown in eq. 2-3 (where F is the proportion of cells detected to be fluorescing after transduction).

$$F = P(k > 0) = 1 - P(0) = 1 - e^{-\lambda} \qquad \text{(2-2)}$$

$$\lambda = -\ln(1 - F) \qquad \text{(2-3)}$$

For target cells that have undergone sorting to remove untransduced cells, however, this Poisson-based estimation of viral integrations cannot be used. To measure viral integrations per cell on FACS-purified APC populations prior to their use in FRET-shift experiments, I developed a qPCR-based assay consisting of a primer set specific for the MND promoter present in minigene-FRET transgenes and a primer/probe set specific for beta-actin. Five-point plasmid standard curves were constructed from serial dilution of a known quantity of purified plasmids containing the query sequences for each of these primer sets. Comparison of Ct values from experimental samples to standard curves was used to estimate the copy number of each query sequence and calculate the ratio of integrated viral cassettes to cellular content in each sample.

The qPCR assay was validated by directly comparing the measured average viral copy number per cell (λqPCR) of a set of cell populations transduced with increasing volumes of virus to the estimated copy number per cell (λfluor) of the same populations based on their percent fluorescence in flow cytometry. The hypothesis was that if the qPCR assay outlined above is able to accurately measure integrations per cell, then λqPCR should not be statistically different from λfluor.
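The Poisson relationships in eqs. 2-1 to 2-3 can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code of Appendix C; the function names and the 30% fluorescence value are hypothetical, and F denotes the fraction of fluorescent cells.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    '''Eq. 2-1: probability that a cell carries exactly k integrations.'''
    return lam ** k * math.exp(-lam) / math.factorial(k)


def fraction_fluorescent(lam: float) -> float:
    '''Eq. 2-2: fraction of cells with at least one integration.'''
    return 1.0 - math.exp(-lam)


def lambda_from_fraction(frac: float) -> float:
    '''Eq. 2-3: mean integrations per cell from the fluorescent fraction.'''
    if not 0.0 <= frac < 1.0:
        raise ValueError('fluorescent fraction must be in [0, 1)')
    return -math.log(1.0 - frac)


# Hypothetical measurement: 30% of cells fluoresce after transduction.
observed_fraction = 0.30
lam_fluor = lambda_from_fraction(observed_fraction)
print(f'estimated integrations per cell: {lam_fluor:.3f}')
for k in range(4):
    print(f'expected fraction of cells with k={k}: {poisson_pmf(k, lam_fluor):.4f}')
```

The estimate is only meaningful for unsorted populations, since purity-sorting removes the k = 0 cells on which eq. 2-2 depends.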
After test transduction of mock target cells with minigene-FRET encoding virus, cells were measured by flow cytometry and first confirmed to follow a Poisson distribution by fitting λfluor in eq. 2-2 to the measured % fluorescent values by non-linear least squares regression (p < 0.001). The values of λqPCR were then plotted alongside the results of fluorescence-based viral titering (Figure 2-2) and determined not to be statistically significantly different, thereby validating the qPCR-based approach to quantitating viral copy number per cell.

Figure 2-2. Validation of the in-house integrations/cell qPCR assay. To benchmark the assay, a virus titering curve was prepared and the proportion of transduced cells was measured by flow cytometry. Integrations/cell were measured in the genomic DNA of these cell populations by qPCR and used to calculate the expected proportion of cells carrying at least one integrated viral cassette. The resulting data from both methods (three biological replicates per sample, error bars denote SD) were tested with a two-sample Kolmogorov-Smirnov test at 95% confidence and found not to be statistically different.

2.2.2 Expression of epitope-encoding minigene results in efficient detection of targeted cells

To validate the GzmB-cleavable reporter gene as a novel tool for T-cell epitope identification, I used the murine ovarian cancer cell line, ID8 (Roby et al., 2000), as APC for the well-characterized model TCR, OT-I (Hogquist et al., 1994). ID8 cells were virally transduced with minigene constructs coding for a 40 amino acid stretch of the chicken ovalbumin protein (OVAL241-280) with either the intact OT-I minimal epitope (SIINFEKL), or a scrambled version of this epitope (LKNFISEI), at the center of this region. Cytotoxic T cells were expanded by anti-CD3/28 stimulation from splenocytes of the OT-I TCR-transgenic mouse (Clarke et al., 2000), and co-cultured with each target cell line separately.
Flow cytometric analyses of co-cultures indicated that SIINFEKL+ cells underwent significant (p < 0.0001) and substantial (Cohen’s d > 40) cleavage of their encoded reporter protein relative to the scrambled negative control. These data provide evidence that the FRET-shift assay described herein is capable of efficiently detecting target cells harboring the correct antigen (Figure 2-3).

Figure 2-3. FRET-shift assay testing. ID8 cells expressing either the Ova minigene fragment with native epitope intact (OVAL241-280) or the Ova minigene containing a scrambled epitope (OVAL257-264 SIINFEKL → LKNFISEI) were cultured with or without OT-I CD8+ T cells at a 1:1 ratio for 4 hrs. (a) Representative plots of FRET signal (ex405/em525) versus CFP signal (ex405/em450) and (b) proportions of cells shifting into Targeted gate in all replicates (n=3, underlaid bar chart and error bars denote mean ± SD) are shown. Significance was determined using an unpaired, 1-tailed Student’s t test. Effect size was calculated as the difference of standardized means (Cohen’s effect size).

2.2.3 GzmB delivery to targeted cells in mixed APC populations is highly specific

Once the GzmB-sensitive FRET-shift assay was shown to be epitope-specific in homogeneous target cell populations, I tested the specificity of the method in a mixed population. As GzmB is a soluble effector molecule released by CTL on recognition of cognate epitope, I considered the possibility that perforin and GzmB escaping from immunological synapses formed between a CTL and APC could potentially diffuse into irrelevant bystander cells and produce false-positive signal. To evaluate this, Ova minigene-expressing cells and scrambled control minigene-expressing cells were mixed in a 1:1 ratio and co-cultured with OT-I CTL. Upon completion of co-culture, cells were sorted according to the gating scheme shown in Figure 2-4b.
Transduced ID8 cells that FRET-shifted into the Targeted gate under T-cell pressure and cells in the Untouched gate of CTL- controls were both recovered and analyzed by genomic qPCR to determine the relative proportion of Ova minigenes and scrambled minigenes present in either population. After correcting for the average integrations/cell of each transduced ID8 cell line using the qPCR assay validated in Section 2.2.1, I found that >95% of the cells captured in the Targeted gate expressed Ova minigenes (Figure 2-4c, p < 0.0001) while the cells captured in the Untouched gate were found to remain in a roughly 1:1 ratio of Ova minigene-expressers to scrambled control-expressers as expected (Figure 2-4a, p > 0.2, no significant difference).

Figure 2-4. Assessing specificity of FRET-shift reporter system. Ova minigene-expressing targets and scrambled control cells were combined at a 1:1 ratio to form a binary mixed target population. Mixed targets were co-incubated with OT-I CTL for 4h at 1:1 effector:target ratio prior to FACS analysis in triplicate for all conditions (underlaid bar chart and error bars denote mean ± SD). Recovered cells were lysed and genomic DNA was purified and used as template for qPCR using a custom TaqMan assay. Significance was determined using an unpaired, 1-tailed Student’s t test.

2.2.4 Peak expression of GzmB-induced signal sufficiently precedes apoptosis of target cells

Counter-intuitively, the GzmB FRET-shift assay relies on isolating antigen-encoding target cells that have received a granzyme B dose from activated CTL and, consequently, have initiated apoptosis. I sought, therefore, to investigate the kinetics of FRET-shift signal as it relates to the kinetics of apoptosis progression in my in vitro co-culture system. To do this, I again employed the OT-I CTL/Ova minigene model system. Four different effector:target ratios were set up and monitored for both % FRET-shifted and % propidium iodide (PI)-positive at 10 different time points.
FRET-shift signal was detectable as early as 1 hr after initiation of CTL/target cell co-culture and steadily rose to a peak value after 6-8 hours. Signal from PI staining, indicative of loss of cell membrane integrity and cell death, was observed to spike 2-4 hours after peak FRET-shift signal was attained before sharply declining as cells deteriorated due to apoptosis (Figure 2-5). Thus, there is a safe-sorting window during which APC that have received a granzyme hit can be recovered and subjected to DNA extraction for epitope identification.

Figure 2-5. OT-I time course. Ova minigene-expressing target cells were exposed to OT-I T cells for varying lengths of time at different effector:target ratios. The proportions of cells undergoing FRET-shift and entering apoptosis (as measured by PI uptake) were both monitored for each time-point and ratio. All individual data points are shown as values standardized to the mean of the CTL- control at each time point; illustrated lines pass through group means, error bars denote SD.

2.2.5 Factors affecting the %FRET-shift signal and magnitude of shift

To further explore the characteristics of the GzmB-sensitive FRET-shift assay, signal detection was assessed using a second model mouse TCR/pMHC pair as well as an alternative host target cell line. Cytotoxic T cells derived from the pmel-1 TCR transgenic mouse line (Overwijk et al., 2003), which are known to be stimulated by recognition of the hgp100(25-33) peptide, KVPRNQDWL, in the context of mouse H-2Db, were used for additional testing. To this end, an hgp100 minigene sequence was constructed encoding the minimal epitope along with 16 amino acids of endogenous flanking sequence and inserted into the minigene-FRET lentiviral cassette in similar fashion to the Ova minigenes described above. After virus production, both hgp100 minigene and Ova minigene constructs were transduced into the EL4 cell line (Gorer, 1950), a mouse T-cell lymphoma induced in the C57BL/6 strain.
Co-cultures were performed using both OT-I CTL and pmel-1 TCR CTL against Ova minigene and hgp100 minigene-expressing EL4 cells, respectively. From these experiments, I observed both that a higher percentage of target cells were FRET-shifted in Ova minigene-expressing EL4 cells compared to their ID8 counterparts and that the resulting FRET-shift was larger in magnitude (Figure 2-6). Furthermore, these data confirmed that pmel-1 TCR CTL were able to elicit FRET-shift in response to hgp100 antigen, albeit generating a lower %FRET-shift signal than that observed in cultures of OT-I CTL and Ova minigene-expressing targets. Given that the pmel-1 TCR is known to be a less potent TCR than OT-I (Kaluza et al., 2012), it stands to reason that the %FRET-shifted signal observed in this assay could be proportional to the reactivity of TCR clonotypes-of-interest to their cognate antigens.

Figure 2-6. The size and magnitude of FRET-shift is influenced by the identity of the target cell. EL4 cells were transduced with Ova and hgp100 minigene viruses and co-incubated with either expanded OT-I CTL or pmel-1 TCR CTL at 1:1 effector:target ratio for 4h.

Conversely, comparing Figure 2-3a and Figure 2-6, it does not appear that the magnitude of the FRET-shift observed in either case is primarily influenced by the identities of the TCR/pMHC proteins involved but, rather, by the identity of the host target cell line used. One possible explanation for this is that lower levels of MHC-I expression result in lower T-cell triggering in terms of the number of cells degranulating and the size of granzyme B dose delivered, therefore producing a less pronounced FRET-shift signal. By using anti-H-2Kb and anti-H-2Db monoclonal antibody surface staining (Figure 2-7), I observed that the expression levels of both H-2 alleles (H-2 being the mouse homologue of human HLA) expressed by C57BL/6 mice were significantly higher in EL4 cells than in ID8 cells.
Further, the EL4 cells showed surface MHC expression that mirrored that of wild-type C57BL/6 splenocytes, suggesting that the difference between EL4 and ID8 is not a case of EL4 over-expression of MHC-I but rather a down-regulation of MHC-I in ID8 cells, consistent with the well-established tendency of many tumor types to reduce or eliminate MHC expression in order to evade anti-tumor immunity.

Figure 2-7. ID8 and EL4 target cell lines vary significantly in MHC-I expression. EL4 cells express both the known C57BL/6 mouse MHC-I alleles, H-2Kb and H-2Db, at much higher levels than do ID8 cells. K562 cells (a human cell line) were included as a negative control and wild-type C57BL/6 splenocytes were used as a positive control. Expression was measured by surface staining with anti-MHC Class I (H-2Db) (clone 28-14-8, eBioscience) and anti-MHC Class I (H-2Kb) (clone AF6-88.5.5.3, eBioscience) according to manufacturer’s protocols.

2.3 Discussion

In this chapter, I have reduced to practice a novel procedure for detecting the effector action of cytotoxic T cells in response to cognate peptide-MHC complexes. By using a lentiviral vector system and a fluorescent reporter system that functions as a FRET-based proteolytic sensor, I demonstrate a strategy for preparing target cells to be co-cultured with T-cell populations-of-interest. Importantly, several features of the technology described in this chapter are crucial for its application to screening large libraries of APC-encoded minigenes, which will be explored in Chapter 3. The ability of lentiviral vectors to be tightly controlled with respect to the number of minigene insertion events per cell provides a means to circumvent the ambiguity that arises when characterizing APC populations that have been prepared by transfection.
Additionally, a reporter encoded within the APC themselves and sensitive to early events in T-cell cytotoxicity allows for high complexity minigene library-expressing target cells to be pooled and screened simultaneously in a single co-culture mix. This is in contrast to all other existing function-based co-culture assays, which require a separate co-culture for each individual candidate antigen under interrogation. A third important feature of the approach is the extraordinary specificity with which granzyme B is delivered to intended target cells by activated T cells. Selective entry of GzmB molecules at the immune synapse ensures that false positive signal does not confound antigen-identification and, again, enables diverse minigene-expressing APC populations to be pooled and queried in parallel.

During the development of the methodology outlined in this chapter, it was also necessary to incorporate two key steps to be performed prior to using transduced targets in FRET-shift assays. First, in reducing this approach to practice, I found that it is necessary to functionally titer newly-produced virus batches prior to generating experimental APC cell lines. Other approaches, such as p24 ELISA assays, were initially used to titer the yield of virus obtained from production runs but were, ultimately, found to be far too inaccurate for the high-precision transduction needed here. Instead, I instituted a functional virus titering protocol based on the experiment described in Section 2.2.1. Briefly, immediately after virus production, a small aliquot of virus was set aside for infecting known cell numbers with increasing volumes of concentrated supernatant. Each of these infected populations was measured for % fluorescence and functional titering was performed by fitting, by non-linear least squares regression, a value of T to eq.
2-4 (Appendix C.1):

$$F = 1 - e^{-Tv/n} \qquad \text{(2-4)}$$

where F = the proportion of fluorescent cells measured in each titering sample, T = titer in transducing units (TU)/μL, v = volume of virus added in μL, and n = number of cells seeded in each titering sample. Titer values provided by this methodology were used to inform follow-on transductions in order to achieve single-copy number transduction in experimental APC cell lines. Typically, an MOI (TU/cell number) of 0.2 was selected to generate minigene-bearing APC lines since >90% of positively transduced cells would be expected to contain single-copy genome integrations of viral transgenes according to eq. 2-1.

Secondly, purity-sorting transduced APC cell lines (Appendix D) is essential not only to remove untransduced cells from the target population but also to ensure that all YFP+ populations are expressing the expected resting FRET-signature. It was noted that immediately after every transduction of the FRET fusion transgene, many cells expressing YFP signal did not express the characteristic diagonal signature in flow cytometry plots of CFP versus FRET that would be indicative of the stoichiometric expression of both integrated fluorescent proteins. This was viewed as an unavoidable source of attrition in the preparation of APC that, most likely, stemmed from disruption by spontaneous recombination of lentiviral cassettes as they were necessarily passaged through bacterial cells to amplify sufficient quantities of plasmid for virus production. After virus production and transduction, only cells displaying baseline FRET-signature were selected for use in downstream experiments. To mitigate this attrition from non-productive reporter, I found that the choice of bacterial strain used to amplify minigene-FRET plasmid is a critical variable. When using strains harboring a non-functional recA1 gene, it was observed that the percentage of YFP+ cells displaying intact FRET-signature rose from <10% to >40%.
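The titering logic described above can be sketched as follows, assuming eq. 2-4 takes the form F = 1 - exp(-Tv/n). This is not the R code of Appendix C.1: it is an independent Python sketch that swaps in a simple golden-section search for the non-linear least squares fit, and the titration data are synthetic values generated from a hypothetical titer of 5000 TU/μL. All function names are illustrative.

```python
import math


def predicted_fraction(titer: float, volume_ul: float, n_cells: float) -> float:
    '''Eq. 2-4 (assumed form): F = 1 - exp(-T * v / n).'''
    return 1.0 - math.exp(-titer * volume_ul / n_cells)


def fit_titer(volumes_ul, fractions, n_cells, t_max=1e6, tol=1e-9):
    '''Least-squares estimate of T by golden-section search on [0, t_max].'''
    def sse(t):
        return sum((f - predicted_fraction(t, v, n_cells)) ** 2
                   for v, f in zip(volumes_ul, fractions))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, t_max
    while hi - lo > tol * max(1.0, hi):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if sse(a) <= sse(b):
            hi = b  # minimum lies in [lo, b]
        else:
            lo = a  # minimum lies in [a, hi]
    return (lo + hi) / 2.0


def volume_for_moi(moi: float, n_cells: float, titer: float) -> float:
    '''Volume of virus (uL) giving the requested MOI (TU per cell).'''
    return moi * n_cells / titer


def single_copy_fraction(moi: float) -> float:
    '''Fraction of transduced cells (k >= 1) carrying exactly one copy.'''
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


# Synthetic, noise-free titration (hypothetical true titer: 5000 TU/uL).
N_CELLS = 5e4
volumes = [1, 2, 4, 8, 16, 32]
fractions = [predicted_fraction(5000.0, v, N_CELLS) for v in volumes]

titer = fit_titer(volumes, fractions, N_CELLS)
print(f'fitted titer: {titer:.1f} TU/uL')
print(f'volume for MOI 0.2 on 4e6 cells: {volume_for_moi(0.2, 4e6, titer):.1f} uL')
print(f'single-copy fraction at MOI 0.2: {single_copy_fraction(0.2):.3f}')
```

At an MOI of 0.2 the single-copy fraction among transduced cells evaluates to roughly 0.90, consistent with the >90% figure quoted above. With real, noisy titration data a dedicated non-linear regression routine would be the more appropriate choice.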
Since the original description of FRET in the 1940s (Forster, 1948), it has found innumerable applications in the study of biology. This is due in large part to the extreme sensitivity of energy transfer to fluorophore distance and orientation, which allows for sensitive measurement of intermolecular distances and conformational changes. Very high efficiency FRET transfer can be achieved routinely through the use of organic dyes and quantum dot reagents; however, the biophysical complexity of whole proteins has made achieving this level of efficiency between fluorescent proteins elusive. Optimization of fluorescent proteins to exhibit desirable properties for intracellular FRET is an active area of research (Bajar et al., 2016) which should be monitored for opportunities to improve the FRET-shift assay described in this chapter. For example, variants of CFP and YFP, termed CyPet and YPet, have been described that were engineered by directed evolution to produce stronger FRET emission (Nguyen and Daugherty, 2005). The implication of high-efficiency FRET between fluorescent proteins is that improved energy transfer should manifest as a larger separation between unshifted and shifted populations in FRET-shift assays, improving specificity and sensitivity of future cell screening experiments. In Chapter 4, I will introduce and apply a second GzmB-cleavable FRET reporter, called FRET2, composed of the CyPet and YPet proteins mentioned above.

2.4 Methods

2.4.1 Cell culture

All cell cultures were maintained in RPMI-1640 supplemented with 2 mM GlutaMAX, 1 mM sodium pyruvate, 50 μM β-mercaptoethanol, 10 mM HEPES, 100 U/mL penicillin, 100 U/mL streptomycin, and 10% heat-inactivated fetal bovine serum. Culture media and supplements were all sourced from Gibco. Cultures were maintained at 37°C and 5% CO2 atmosphere.

2.4.2 CTL activation and expansion

Fresh-frozen dissociated splenocytes from OT-I mice (a gift of Dr. B.
Nelson, BC Cancer Agency Deeley Research Centre), pmel-1 TCR mice (from intact spleens shipped by The Jackson Laboratory), or wild-type C57BL/6 mice were thawed, washed, and adjusted to a density of 5x10^6 cells/mL in supplemented RPMI + 50 U/mL rhIL-2 (Peprotech). Splenocytes were seeded into a flat-bottom 96-well plate (1x10^6 cells/well) that had been pre-coated overnight with 100 μL solution of low endotoxin, azide-free anti-mouse CD3 (clone 145-2C11) at 5 μg/mL and anti-mouse CD28 (clone 37.51) (Biolegend) at 10 μg/mL. Coated wells were washed 2x with PBS and 1x with complete media immediately prior to the addition of cells. Splenocytes were removed from coated plates after 48 hours, adjusted to 1x10^6 cells/mL by adding fresh media + rhIL-2 to conditioned media, and cultured in U-bottom 96-well plate wells. Expanding T cells were split again on day 5, adjusting density to 1x10^6 cells/mL, and used in experiments on day 7.

2.4.3 Construction of plasmid vectors

The lentiviral transfer plasmid was derived from the pCCL-c-MNDU3-PGK-EGFP backbone (Li et al., 2010). The PGK-EGFP portion of the plasmid was replaced with a custom multiple cloning site via the EcoRI/BamHI sites in the original vector (courtesy of Dr. Eric Yung). Enhanced YFP (EYFP) and mCerulean (ECFP) coding sequences were PCR amplified from the Addgene plasmids #11180 and #15214, respectively, using Phusion High-Fidelity polymerase (New England Biolabs). The CFP forward primer was tailed with P2A coding sequence while the YFP forward primer was tailed with the GzmB cleavable substrate coding sequence. Tailed CFP and YFP amplicons were then sequentially inserted into the lentiviral transfer vector backbone pCCL-c-MNDU3-MCS intermediate using FseI/AgeI and AgeI/AscI restriction cloning, respectively.
A stuffer fragment ~1 kb in length containing multiple in-frame stop codons was placed upstream of the P2A sequence via BamHI/EcoRI restriction cloning to yield a parental minigene acceptor plasmid, referred to as pMND-silent-FRET (Appendix B.1.1), that remains fluorescently silent until the stuffer is swapped for productive minigenes. The Ova-minigene was prepared by amplifying directly from cDNA recovered from the ID8.G7 cell line, which stably expresses full-length ovalbumin protein. Ova scrambled control and hgp100 minigene inserts were prepared by overlap extension PCR of ssDNA oligos (Appendix B.3). Ova, Ova-scrambled, and hgp100 minigenes were cloned into linearized pMND-silent-FRET parental backbone via BamHI/EcoRI restriction cloning (Appendix B.2).

2.4.4 Virus production

To generate Ova, Ova-scrambled, and hgp100 minigene lentiviruses, 40 µg of each transfer plasmid was separately combined with 36 µg of pCMV-ΔR8.91 and 4 µg of pCMV-VSV-G plasmids (courtesy of Dr. Eric Yung). These DNA mixes were incubated with 7.5 mL OptiMEM (Gibco) and 0.5 mL of TransIT-LT1 reagent (Mirus) for 30 minutes at room temperature. To 8 x 10 cm culture plates containing 40% confluent HEK293T (DuBridge et al., 1987) cells (ATCC), 1 mL of transfection mix was added per plate for each minigene. In all cases, media was replaced 18 hours post-transfection and viral supernatants were then collected at 48 and 72 hours post-transfection. To concentrate virus, supernatants were ultracentrifuged (110,000 RCF, 90 minutes, 4°C) and pellets were resuspended in 1 mL OptiMEM (Gibco) by gentle shaking at room temperature for 45 minutes. Titers of viruses were determined by testing (in duplicate) 1, 2, 4, 8, 16, or 32 μL of concentrated virus on 5x10^4 HeLa cells in 24-well format with a final volume of 500 μL of complete culture media and measuring the % of fluorescent cells detected (>5σ above negative control) in flow cytometry 48 hours later.
2.4.5 Viral transduction
For transduction of ID8 cells with Ova or Ova-scrambled minigene viruses, 4x10^6 cells were plated at ~50% confluency in a 1 x 10 cm tissue culture dish per virus. Concentrated viruses (0.4 mL) were each diluted with 2.6 mL of complete media and each added to plates. Plates were incubated in 3 mL volumes for 18 hrs, 37°C, at which point normal culture was resumed in 10 mL of complete media. For transduction of random minigene virus, 1.2x10^7 ID8 cells were plated at ~50% confluency in 3 x 10 cm tissue culture dishes. Concentrated virus (1.2 mL) was diluted with 7.8 mL of complete media and 3 mL of transduction mix was added to each plate. Plates were incubated in 3 mL volumes for 18 hrs, 37°C, at which point normal culture was resumed in 10 mL of complete media. For transduction of EL4 cells with Ova or hgp100 minigene viruses, 20 μL of concentrated virus was added to 1x10^5 cells in 500 μL of complete media in a single well of a 24-well plate. After 24 hours, the cells were diluted to 5 mL in complete media and transferred to a T-25 flask for continuous culture.
2.4.6 CTL/APC co-cultures
Activated and expanded CTL were used in FRET-shift assays on day 7 post-stimulation. Cell populations were enumerated using the Countess automated cell counter (Invitrogen) to determine input cell number for CTL/APC co-cultures. For FRET-shift assays using ID8-derived target cells, co-cultures were performed for 4 hours at a 1:1 effector:target ratio in round-bottom 12x75 mm FACS tubes containing 1x10^5 target cells. For FRET-shift time-course experiments using ID8-derived target cells, co-cultures were performed for specified times and at specified ratios in flat-bottom 96-well plates containing 5x10^4 target cells/well. For FRET-shift assays using EL4-derived target cells, co-cultures were performed for 4 hours at a 1:1 effector:target ratio in U-bottom 96-well plates containing 1x10^5 target cells/well.
Cultures were maintained at 37°C and 5% CO2 atmosphere.
2.4.7 Flow cytometry/FACS
Virally transduced APC lines were purity-sorted to remove untransduced targets and cells carrying non-productive minigene-reporter transgenes. Cells were sorted on either a BD FACSAria Fusion or a BD FACSAria II by gating for single cells (as determined by FSC-A vs. FSC-W and SSC-H vs. SSC-W) that were PI- (ex. 561, em. 610), YFP+ (ex. 488, em. 530) and emitting resting FRET-signature in FRET (ex. 405, em. 525/50 + 505LP) vs. CFP (ex. 405, em. 450/50) plots. For analysis of CTL co-cultures, cells were washed 1x with PBS, resuspended in 500 μL PBS + 5% FBS + 1 μg/mL propidium iodide (PI), filtered through 40 μm nylon mesh, and kept on ice for the duration of analysis. Cytometric analyses were performed on BD LSR II Fortessa and FACS cell isolation was done on BD FACSAria Fusion. Target cells to be analyzed were selected by gating for PI- and YFP+ singletons. Cells undergoing T-cell targeting were sorted on the basis of FRET-shift in FRET (ex. 405, em. 525/50 + 505LP) vs. CFP (ex. 405, em. 450/50) plots. Gated cells were collected in 15 mL conical Falcon tubes containing 3 mL of 55% FBS/45% culture medium. Immediately after collection, cells were pelleted at 300 x g for 5 min and subjected to gDNA isolation using DNAzol reagent (Invitrogen).
2.4.8 Quantitative PCR
For the integrations/cell assay, SYBR Green assay reagent and an MND promoter-specific primer set (Appendix B.3) were used in parallel with a TaqMan qPCR assay specific to the mouse beta actin 3’UTR (ABI part #4352933E; TaqMan Universal PCR Master Mix) for analysis of transduced cells at the genomic level. Five-point standard curves constructed from known amounts of plasmid containing MND and mActB target sequences were analyzed by each primer set to infer the number of lentiviral integrations and total cell content present in each sample.
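The standard-curve step of the integrations/cell assay can be illustrated with a minimal sketch. This is not the analysis performed in the instrument software: the function names, the example Ct values, and the assumption of two beta-actin copies per cell are all hypothetical and serve only to show the arithmetic.

```python
import math


def fit_standard_curve(copies, ct_values):
    # Least-squares fit of Ct = slope * log10(copies) + intercept.
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    # Invert the standard curve to estimate template copies in a sample.
    return 10 ** ((ct - intercept) / slope)


def integrations_per_cell(mnd_copies, actb_copies, actb_copies_per_cell=2):
    # Assumption: two beta-actin copies per cell (may not hold for cell lines).
    return mnd_copies / (actb_copies / actb_copies_per_cell)


# Hypothetical five-point standards; each assay gets its own curve.
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
mnd_curve = fit_standard_curve(standards, [31.4, 28.1, 24.7, 21.4, 18.1])
actb_curve = fit_standard_curve(standards, [32.0, 28.6, 25.3, 22.0, 18.6])
mnd = copies_from_ct(25.0, *mnd_curve)
actb = copies_from_ct(24.5, *actb_curve)
print(round(integrations_per_cell(mnd, actb), 2))
```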
For enumerating Ova minigene and scrambled minigenes in genomic DNA of collected cells from mixed target populations, a custom duplex TaqMan assay (Applied Biosystems; assayID: AHUAOFV; TaqMan Universal PCR Master Mix) able to discriminate between Ova minigene and scrambled minigene was used. A 5-point standard curve of known amounts of plasmid containing target sequences for either minigene was constructed and used to infer relative minigene numbers of each type in each sample. Unmixed Ova minigene-expressing cells and scrambled minigene-expressing cells were included as controls to correct for differences in average insert copy numbers between the two populations. All samples were run in sets of 4 technical replicates.
2.4.9 Data analysis
Gel and CFU analyses were performed using ImageQuant TL v8.1.1. Flow cytometry analyses were performed using FlowJo V10.0.8r1. Quantitative PCR analyses were performed using SDS2.4. Sequence data handling was performed using Geneious v8.1.2 and Bioconductor package seqinr. Other data handling and statistical analyses were performed using R v3.1.2 and RStudio.
Chapter 3: Validating a FRET-shift-based library screening approach
3.1 Introduction
Having demonstrated the ability of the GzmB-sensitive FRET-shift assay to efficiently and specifically identify target cells undergoing cytotoxic attack from T cells, the next step was to assess the utility of this read-out in isolating antigenic minigenes from complex mixtures of minigene-expressing cells. For adaptation of the FRET-shift assay to high-throughput T-cell epitope screening, it would be necessary to clone large libraries of short-peptide coding DNA sequences into a lentiviral transfer plasmid backbone alongside the GzmB-cleavable FRET-reporter, generate lentiviral minigene libraries, and then infect surrogate, sero-matched target cells at a multiplicity of infection that favors a single minigene per APC.
High-complexity transduced target library cell lines could be co-incubated with expanded CTL clones or populations-of-interest and sorted based on their FRET-shift status to isolate cells carrying putatively antigenic minigenes. These minigenes could then be PCR-amplified and sequenced to reveal the landscape of epitopes eliciting reactivity from the screened CTL (Figure 3-1).
Figure 3-1. Overview of T-cell antigen library screening approach. Target cells are ① provided with libraries of lentivirally-delivered short-peptide coding minigene sequences and ② exposed to T cells-of-interest. Any targets recognized by T cells are then subjected to GzmB loading and cleavage of their internally expressed FRET-reporter. Cells carrying putatively antigenic minigenes are then isolated by FACS ③, minigenes encoded within these cells are recovered by PCR ④, and the resultant amplicons are characterized by deep sequencing ⑤.
In this chapter, I demonstrated the feasibility of characterizing isolated cell populations from FRET-shift assays by minigene amplicon sequencing. Moreover, I explored the sensitivity of this approach, with sensitivity defined from two different standpoints. The first definition of sensitivity considered was the limit of detection of a canonical test antigen when the antigen is present in a background minigene population in decreasing abundances. The second definition of sensitivity was the limit of detection of T-cell reactivity, given a particular antigenic stimulus, when the reactive T cell is present in mixed T-cell populations in decreasing abundances. The general principle behind the experiments carried out in this chapter was to analyze the performance of the method with respect to both definitions of sensitivity by carrying out a series of spike-in experiments wherein a known test minigene and/or TCR was spiked into a population of background cells at decreasing abundances until signal was no longer detectable.
Doping random minigene or wild-type T cells with canonical minigene or model clonotypes, respectively, was done at the cellular level by combining transduced or activated cell lines at set ratios. This is as opposed to spiking at the DNA level by combining different plasmid constructs or at the viral level by combining different populations of assembled virions. The rationale for spiking at the cellular level was that doing so would avoid incorporating variability arising from the upstream virus production and transduction processes and, therefore, specifically address the questions of method sensitivity described above. For the work presented in this chapter, I again used the OT-I/Ova and the pmel-1 TCR/hgp100 mouse model systems. Model OT-I or pmel-1 TCR T cells were diluted with wild-type C57BL/6 mouse T cells activated and expanded from splenocytes. I expected that a high-diversity diluent population, such as wild-type T cells, would provide a more relevant testing framework than a single negative control diluent population. Likewise, in contrast to using a single negative control minigene, such as the scrambled Ova minigene introduced in Section 2.2.2, I constructed a diverse library of random minigenes to use as a diluent into which Ova or hgp100 minigenes could be added. This was done to provide a relevant simulation of the challenges that could potentially be encountered during library construction and sequencing analysis of FACS-recovered target cell populations.
3.2 Results
3.2.1 Antigenic minigenes are detectable with high sensitivity from complex minigene libraries
To measure sensitivity as defined by the limit of detection of antigen from mixed populations, I produced a library of random minigenes from degenerate synthetic oligonucleotides containing a stretch of 48 random bases flanked by BamHI/EcoRI restriction sites and conserved primer annealing regions.
These random minigenes were amplified by PCR, digested by BamHI/EcoRI, and ligated into the minigene site of the FRET reporter-containing lentiviral transfer plasmid (Appendix B.1.1). Plasmid ligation products were transformed into electrocompetent bacteria and amplified on solid agar to obtain ≈1.8x10^6 random minigene clones. Plasmid DNA was isolated and used to generate lentivirus, which was subsequently titered and transduced into ID8 cells at an MOI favoring one insertion event per cell to produce a random minigene-expressing target cell population (Table 3-1). One challenge associated with minigene library construction is the presence of stop codons. Stop codons in random DNA sequence are expected to arise at a predictable rate that can be modeled by the equation: (3-1) Where x is the number of stop codons present in a random sequence and n is the amino acid length of the translated DNA sequence. In a random minigene library of 48 base pair length, 54% of minigenes would be expected to be non-productive due to premature stops. As in Chapter 2, the experiments carried out in this chapter were done using target cell populations that were pre-sorted by FACS to obtain a pure resting FRET signature in the target cells. The difference in the case of the library-transduced APC populations generated here, however, is that the fluorescent units (FU; transducing units that produce functional fluorescence) obtained in viral titering (as discussed in Section 2.3) are an underestimate of the real measure of transducing units present in the virus pool. Even though non-productive minigenes would, theoretically, be invisible in FACS, they do still integrate into APC gDNA and could present a potential source of background in downstream amplicon sequencing if they co-infect with a productive minigene and “hitch-hike” along with it through FACS.
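The 54% attrition figure and the co-infection concern can both be checked numerically. The sketch below is an illustration under stated assumptions, not the thesis's own equations: it treats each of the 16 codons in a 48-base insert as an independent draw with a 3/64 chance of being a stop (uniform base composition, in-frame reading), and it models integrations per cell as Poisson-distributed. The function names are hypothetical, and the MOI of 0.36 is the value reported in Table 3-1.

```python
from math import comb, exp

P_STOP = 3 / 64  # 3 of 64 codons are stops, assuming uniform random bases


def p_stop_count(x, n, p=P_STOP):
    # Binomial probability of exactly x stop codons among n random codons.
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def fraction_nonproductive(n, p=P_STOP):
    # Probability that a random n-codon insert has at least one stop codon.
    return 1 - p_stop_count(0, n, p)


def fraction_multiply_transduced(moi):
    # Among transduced cells, the Poisson fraction carrying >1 integration.
    p0 = exp(-moi)
    p1 = moi * exp(-moi)
    return (1 - p0 - p1) / (1 - p0)


n_codons = 48 // 3
print(f'non-productive inserts: {fraction_nonproductive(n_codons):.1%}')
print(f'multiply transduced at MOI 0.36: {fraction_multiply_transduced(0.36):.1%}')
```

Under these assumptions the first value comes out at roughly 54%, matching the figure quoted in the text. The second gives a sense of how many transduced cells could carry a hitch-hiking insert at a given MOI.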
To account for this, the MOI used for transducing library cell populations was set based on the following: (3-2) Where the target fluorescence (the number of fluorescent units added per cell in the APC population) was reduced by a factor proportional to the expected rate of non-productive minigenes to ensure that the overall MOI, including invisible minigenes, is not favorable to the production of multiply transduced target cells.
Table 3-1. Construction of random minigene library expressed in ID8 cells. Random minigene library insert DNA was cloned into pMND-silent-FRET plasmid backbone (Appendix B.1.1) and transformed into bacteria to yield 1.8x10^6 cfu. Random minigene plasmid isolated from transformants was used to make random minigene lentivirus. Virus was functionally titered and delivered to host ID8 cells at an MOI of 0.36 TU/cell. Transduced ID8 were expanded 2.5-fold before FRET purity-sorting. The recovered cells were then expanded 6-fold further before being used in random minigene screening. After FRET-shift/amplicon sequencing, Lincoln-Petersen capture-recapture estimation was done by comparing the unique minigene sequences detected in all pairwise combinations of the six co-culture conditions sequenced in this experiment. The mean ± standard deviation of these estimates is reported below. It was estimated that 1.3x10^6 unique minigenes were represented in the input APC population, indicating some loss in diversity in the final transduced target population relative to the plasmid-inserted minigene population.
Step | Estimated library size
PCR/digestion/purification | 96 ng; 2.7 pmol
Ligation/transformation | 1.8x10^6 CFU
Virus production | 1.9x10^7 infectious units; 54% expected attrition due to stop codons
Transduction | MOI=0.36; expanded 2.5x prior to FACS
Target cell library prep/CTL co-cultures | 1.1x10^7 cells recovered from purity sorting; expanded 6x prior to cryopreservation; 3x10^6 used per co-culture
Sequencing/Capture-Recapture estimation | 1.6x10^6 filtered reads across 6 samples; estimated 1.34±0.03x10^6 unique minigenes expressed by APC population
Five separate cell populations were prepared for screening by combining Ova minigene-expressing ID8 cells with random minigene-expressing ID8 cells in abundances ranging from 1:10 to 1:100,000. Each spiked cell population was then co-cultured with OT-I CTL and sorted by FRET status to obtain Shifted and Unshifted gates for all spike-in levels (Figure 3-2a, inset). Gates were defined by establishing the boundaries of resting FRET signature on CTL- control populations. Genomic DNA was isolated from each collected cell population and used as template for PCR amplification of integrated minigenes using Illumina adapter-tailed primers for direct sequencing on the Illumina MiSeq platform. Cell counts from FACS and sequencing read counts for each gate are summarized in Appendix E. By analyzing the difference in relative abundances of Ova-minigene reads detected in the Shifted and Unshifted gates of each spiked library screen, I determined that at the 1:10, 1:100, and 1:1,000 spike-in levels, the Ova-minigene was the most prevalent sequence detected in the Shifted gates and was considerably enriched relative to Unshifted populations (Figure 3-2a). At 1:10,000, the Ova-minigene was no longer the most dominant enriched sequence; however, it was still among the top 5 most highly enriched minigenes and was detected at >10 standard deviations above background (Figure 3-2b).
These data indicate that cells presenting relevant epitope are easily detectable even when present at a population frequency of 1:10,000.
Figure 3-2. OT-I spiked library screening. Random minigene cell libraries spiked with Ova minigene-expressing cells were co-incubated with OT-I CTL at a 1:1 effector:target ratio for 4h prior to FACS analysis. Recovered cells were lysed and integrated minigenes were amplified from genomic DNA using PCR primers specific for the conserved transgene region flanking the minigene site. Primers with indexed Illumina adapter tails were used for direct sequencing of amplicons on the Illumina MiSeq platform with 2x250 paired-end chemistry. (a) The % of reads detected in each gate that encode the SIINFEKL epitope. Each bar represents one individual measurement. (b) The read count frequency of all unique sequences found in the 1:10,000 spike-in sample expressed as the difference between the relative frequency in the Shifted gate and the Unshifted gate.
I next sought to determine if the sensitivity demonstrated in the OT-I/Ova system was reproducible in the pmel-1 TCR/hgp100 system. At this point, the EL4 line had been carefully assessed and was selected for use as host APC line to maximize the probability of detecting hgp100 minigene since the pmel-1 TCR is less potent than OT-I and EL4 was shown to generate superior FRET-shift signal compared to ID8 cells (Section 2.2.5). To do this, a second batch of random minigene virus library was generated from the same plasmid DNA stock as was used to make the ID8 random minigene cell line described above and used to transduce EL4 cells to yield a random minigene-expressing line (Table 3-2). To this random minigene-expressing EL4 cell line, hgp100 minigene-expressing EL4 were spiked in at an abundance of 1:10,000. The spiked library population was then co-cultured with pmel-1 TCR CTL and sorted based on FRET-shift status.
Both the Shifted and Unshifted populations were isolated by FACS (representative gating strategy shown in Appendix D.2) and used to prepare minigene amplicon libraries for Illumina sequencing. Raw reads were processed and analyzed as done previously with Ova-spiked libraries to reveal that pmel-1 TCR CTL were, indeed, also able to identify the hgp100 antigen out of random minigene background at a level of enrichment >10 standard deviations above background (Figure 3-3) and that the hgp100 minigene was detected in the top 15 most highly enriched minigenes.
Table 3-2. Construction of random minigene library expressed in EL4 cells. Random minigene plasmid produced in Table 3-1 was used to make a 2nd batch of random minigene lentivirus. Virus was functionally titered and delivered to host EL4 cells at an MOI of 0.11 TU/cell. Transduced EL4 were expanded 4-fold before FRET purity-sorting. The recovered cells were then expanded 125-fold further before being used in random minigene screening. After FRET-shift/amplicon sequencing, Lincoln-Petersen capture-recapture estimation was done by comparing the unique minigene sequences detected in all pairwise combinations of the seven co-culture conditions shown in Figure 3-3/Figure 3-5. The mean ± standard deviation of these estimates is reported below. It was estimated that 4.7x10^5 unique minigenes were represented in the input APC population.
Step | Estimated library size
PCR/digestion/purification | 96 ng; 2.7 pmol
Ligation/transformation | 1.8x10^6 CFU
Virus production | 1.1x10^7 infectious units; 54% expected attrition due to stop codons
Transduction | MOI=0.11; expanded 4x prior to FACS
Target cell library prep/CTL co-cultures | 2x10^6 cells recovered from purity sorting; expanded 125x prior to cryopreservation; 3x10^6 used per co-culture
Sequencing/Capture-Recapture estimation | 6.7x10^6 filtered reads across 7 samples; estimated 4.68±0.2x10^5 unique minigenes expressed by APC population
Figure 3-3.
pmel-1 TCR screening against 1:10,000 hgp100 minigene-spiked random library. Random minigene-expressing EL4 cell library spiked with hgp100 minigene-expressing EL4 cells at an abundance of 1:10,000 was co-incubated with pmel-1 TCR CTL at a 1:1 effector:target ratio for 4h prior to FACS analysis. Recovered cells from Shifted and Unshifted gates were lysed and integrated minigenes were amplified from genomic DNA using PCR primers specific for the conserved transgene region flanking the minigene site. Illumina adapters and indexes were added in a 2nd round of PCR and resultant amplicons were sequenced on the Illumina MiSeq platform with 2x250 paired-end chemistry. The differences in relative abundance for all distinct minigene sequences detected in the Shifted gate and the Unshifted gate are shown.
3.2.2 Application of bioinformatic filtering to non-canonical minigenes refines putative hits
One does not have a priori knowledge of T-cell epitopes when performing unbiased epitope discovery. Here, I noted that a small number of randomly encoded minigenes met the 10-sigma cutoff and ranked higher in shifted-gate enrichment than the known cognate minigenes. I, therefore, evaluated whether these sequences could contain bona fide epitopes by analyzing them with NetMHC-3.0 (Lundegaard et al., 2008) to determine if they contained any predicted MHC-binding peptides and by aligning them to the GenBank non-redundant cds translation database by BLASTp (Altschul et al., 1990) to determine if they had any significant sequence similarity to known proteins (Table 3-3).
Table 3-3. Database searching of random minigene hits from 1:10,000 spiked samples
Test TCR | Rank in screen | Encoded minigene | Minimal peptides with predicted IC50 < 500 nM? | Top E-score from BLASTp alignment
OT-I | 1 | NTAKTTGKPSVVQATM | No | 22
OT-I | 2 | HIHTDWPQYPPCLNLI | No | 1.8
OT-I | 3 | MLVLLPDEVSGLEQLESIINFEKLTEWTSSNVMEERKIKV | Yes, predicted IC50 = 44.40 nM | 1.4x10^-18
pmel-1 TCR | 1 | QILLPIHKHITTLVGA | No | 11
pmel-1 TCR | 2 | SNADVQHSTIRPQPHT | No | 5.7
pmel-1 TCR | 3 | SKNKPLLNGVTCIMSK | No | 11
pmel-1 TCR | 4 | STRYSGTVLYLVVQRR | No | 2.0
pmel-1 TCR | 5 | DAHCGNVEPTTKTLNL | No | 8.1
pmel-1 TCR | 6 | EHEVTLDGTIKATLAK | No | 5.7
pmel-1 TCR | 7 | FKFTILKRSTIESIH | No | 14
pmel-1 TCR | 8 | LSCPCQNLLPHVSFPR | No | 2.0
pmel-1 TCR | 9 | QYGRTPKGPFLLTAPK | No | 2.9
pmel-1 TCR | 10 | REKVYMPPILKHGPDN | Yes, predicted IC50 = 409.90 nM | 2.0
pmel-1 TCR | 11 | RQNQHFALHKNESKKD | No | 1.0
pmel-1 TCR | 12 | RTHKNNLNTTDAFPLI | No | 2.0
pmel-1 TCR | 13 | TVKFRPPPPAVWTDPR | No | 0.25
pmel-1 TCR | 14 | NLRNITHAGGRITPH | No | 2.4
pmel-1 TCR | 15 | LLHLAVIGALLAVGATKVPRNQDWLGVSRQLRTKAWNRQLY | Yes, predicted IC50 = 123.00 nM | 3.0x10^-19
None of the minigenes identified in the OT-I/Ova screen other than the canonical Ova minigene contained predicted MHC-binders, nor did they align with any significance to any protein sequences in GenBank, suggesting that they were background noise. In contrast, as expected, evaluation of the recovered Ova minigene by NetMHCpan identified SIINFEKL as a strong MHC binder (IC50 = 44 nM) and BLASTp alignment to the GenBank database yielded OVAL (Gallus gallus) as the top hit (E-score = 1.4x10^-18). In the case of pmel-1 TCR/hgp100 screening, this analysis produced the expected canonical hgp100 epitope, KVPRNQDWL, as a predicted MHC binder (IC50 = 123 nM) and aligned the detected hgp100 minigene to the reference human gp100 sequence (E-score = 3.0x10^-19). However, in addition, I recovered a second minigene sequence encoding a peptide (KVYMPPIL) predicted to bind H-2Kb-encoded MHC with an IC50 of 409.9 nM, possibly representing a novel epitope for this TCR. This putatively novel pmel-1 TCR epitope was not found, by BLASTp alignment, to have significant similarity to any known, naturally occurring proteins.
This is not a completely unexpected finding, nor does it necessarily exclude the peptide as a bona fide epitope. Since the BLAST algorithm tends to be biased against short query sequences and penalizes gapped alignments, this search strategy returns a significant hit only when an essentially perfect match to the candidate epitope sequence exists in the subject database. Though not successful in this instance, it should be possible to map hits from unbiased epitope space to naturally occurring proteins via conserved motifs within immunogenic peptides using more advanced search strategies, such as those that have been implemented elsewhere (Gee et al., 2018). Together, these results demonstrate the ability of this assay to identify T-cell epitopes without a priori knowledge of the antigenicity of any minigenes present in screening libraries. It is unclear, however, what the primary source of background noise in these experiments was. One possibility is technical noise: recovering insufficient numbers of cells in unshifted FACS gates or failing to exhaustively sequence recovered unshifted cells could give rise to a scenario in which some minigenes are detected in shifted gates but, by chance, left undetected in unshifted populations. This effect would manifest as an artificial enrichment of these minigenes, leading to spurious hits. A second possibility is that specific differences in the binding and activation characteristics of individual test TCR lead to variable amounts of background noise in these experiments. For example, the larger number of co-enriched minigenes in the pmel-1 TCR experiment relative to the OT-I experiment suggests that the pmel-1 TCR could be a more promiscuous TCR than OT-I.
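The enrichment criterion used in these screens (difference in relative abundance between the Shifted and Unshifted gates, with hits called at >10σ above the mean) can be sketched as follows. This is a minimal illustration, not the analysis code used in the thesis: the function names and the toy read counts are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def relative_abundance(counts):
    # Convert raw read counts per sequence into fractions of the gate total.
    total = sum(counts.values())
    return {seq: c / total for seq, c in counts.items()}


def call_hits(shifted_counts, unshifted_counts, n_sigma=10.0):
    # Delta = relative abundance in Shifted gate minus Unshifted gate.
    shifted = relative_abundance(shifted_counts)
    unshifted = relative_abundance(unshifted_counts)
    sequences = set(shifted) | set(unshifted)
    delta = {s: shifted.get(s, 0.0) - unshifted.get(s, 0.0) for s in sequences}
    values = list(delta.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    hits = [(s, d) for s, d in delta.items() if d > mu + n_sigma * sigma]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy data: 1,000 background minigenes plus one enriched sequence.
background = {f'bg{i}': 10 for i in range(1000)}
shifted_gate = dict(background, SIINFEKL=500)
unshifted_gate = dict(background, SIINFEKL=1)
print(call_hits(shifted_gate, unshifted_gate))
```

A sequence absent from the Unshifted gate is assigned zero abundance there, which is exactly the situation in which undersampling of unshifted cells could create the spurious enrichment discussed above.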
3.2.3 Antigenic minigenes are detected by polyclonal populations of input T cells containing diluted model T cells
The validation experiments carried out using OT-I and pmel-1 transgenic mouse strains demonstrate feasibility when using large, monoclonal T-cell populations as input into minigene library screening experiments. However, a major practical consideration is the considerable initial effort that is necessary to isolate clonal T-cell populations-of-interest. Therefore, I tested screening polyclonal populations of activated T cells against libraries of random minigenes spiked with known cognate antigenic minigenes. As a first test, OT-I splenocytes, pmel-1 splenocytes, and wild-type C57BL/6 splenocytes were all activated and expanded by anti-CD3/28 stimulation and mixed together at varying ratios of model T cells (either OT-I or pmel-1 CTL) to wild-type C57BL/6 CTL. These mixed OT-I and pmel-1 CTL populations were then co-cultured with pure populations of EL4 cells expressing Ova minigene-FRET or hgp100 minigene-FRET cassettes, respectively, and assessed for their ability to elicit FRET-shift signal. In the case of both the OT-I CTL and the pmel-1 CTL, it was observed, as expected, that as the abundance of minigene-reactive model CTL decreased, the proportion of cells undergoing FRET-shift also decreased. However, in both cases, I found that even at 1:3,000 abundance, the model TCR CTL still produced statistically significant FRET-shift signal relative to wild-type C57BL/6 splenocyte-derived CTL alone (Figure 3-4).
Figure 3-4. Performance of diluted model T cells in FRET-shift assays. Expanded OT-I CTL and pmel-1 TCR CTL were each diluted into expanded wild-type C57BL/6 CTL at abundances ranging from 1:3 down to 1:3,000. OT-I and pmel-1 TCR cell mixtures were then co-cultured with pure, unmixed populations of EL4 cells transduced with Ova minigene and hgp100 minigene, respectively.
Co-cultures of unmixed target cells and unmixed CTL were also performed as controls. Data shown are 3 replicate measurements with underlaid bar chart and error bars denoting mean ± SD. Significance was determined using an unpaired, 1-tailed Student’s t test.
3.2.4 Antigen identification from co-cultures of complex minigene libraries and polyclonal T-cell populations
Next, I tested epitope detection using mixed populations of target cells screened against mixed populations of CTL. Three separate “mixed + mixed” co-culture conditions were prepared for each model TCR/antigen pair and, upon completion of co-culture, were sorted into Shifted and Unshifted gates. As in the previous sequencing experiments, amplicon libraries were constructed from gDNA derived from cells collected in each gate and sequenced using the Illumina MiSeq. Raw reads were processed, and unique sequences detected in each co-culture condition were analyzed by subtracting relative abundance in the Unshifted gate from relative abundance in the matched Shifted gate. Cell counts from FACS and sequencing read counts for each gate are summarized in Appendix E. Canonical minigenes were said to be successfully identified if the Δ relative abundance was found to be >10σ greater than the mean. For both model TCR/antigen pairs, canonical minigenes were readily identified as strongly enriched putative hits out of a 1:10,000 mixture when T cells were diluted to an abundance of 1:30 (Figure 3-5).
Figure 3-5. FRET-shift/deep amplicon sequencing approach in co-cultures of mixed CTL populations + mixed target cell populations. Random minigene cell libraries spiked with either Ova minigene-expressing cells or hgp100 minigene-expressing cells were co-incubated with wild-type C57BL/6 CTL spiked with OT-I or pmel-1 TCR CTL, respectively. The specific mixtures used for testing are described in panel headers.
Target and CTL populations were combined at a 1:1 ratio and co-cultured for 4h before being sorted on FRET-shift status. Recovered cells from Shifted and Unshifted gates from each screening condition were lysed and integrated minigenes were amplified from genomic DNA using PCR primers specific for the conserved transgene region flanking the minigene site. Illumina adapters and indexes were added in a 2nd round of PCR and resultant amplicons were sequenced on the Illumina MiSeq platform with 2x250 paired-end chemistry. The y-axes of all panels indicate the differences in relative abundance for all distinct minigene sequences detected in the Shifted gates and the Unshifted gates of each co-culture condition.
3.3 Discussion
In this chapter, I have demonstrated the power of this approach to simultaneously assess vastly more epitopes for their ability to be processed, presented, recognized, and elicit reactivity than would be tractable with current methods (Almeida et al., 2009). The experiments in this chapter show that, even when present at frequencies as low as 1 in 10,000, relevant antigen can readily be detected from background. Moreover, for scaled-up versions of these studies in which naïve library screening is performed with no spiked-in test antigen, it should be possible to screen 1x10^6 minigene sequences for T-cell reactivity. This estimate is based on the postulate that, given the 1:1 effector:target ratio and the small scale of 3x10^6 target cells per condition used in the experiments in this chapter, a copy number of 300 cells carrying a given antigenic minigene is approximately the minimum threshold number that is needed for detection by FRET-shift FACS and amplicon sequencing (1:10,000 = 300 cells/3,000,000 cells). For naïve library screening with no a priori knowledge of antigens composing the library, it would be necessary to meet this level of clonal redundancy for all distinct minigene-expressing clones in the population.
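The redundancy argument can be worked through numerically. The sketch below uses only figures stated in this chapter (a 300-cell detection threshold per minigene, 1x10^5 target cells per well, 96-well plates); the function and parameter names are hypothetical, and the calculation ignores cell losses during handling and sorting.

```python
def screening_scale(library_size, min_cells_per_minigene=300,
                    cells_per_well=1e5, wells_per_plate=96):
    # Target cells needed so every minigene reaches the detection threshold,
    # assuming perfectly even representation across the library.
    target_cells = library_size * min_cells_per_minigene
    wells = target_cells / cells_per_well
    plates = wells / wells_per_plate
    return target_cells, wells, plates


# Threshold implied by the spike-in experiments: 1:10,000 of 3x10^6 cells.
threshold = 3e6 / 10_000

cells, wells, plates = screening_scale(1e6, min_cells_per_minigene=threshold)
print(f'{cells:.0e} target cells, {wells:.0f} wells, {plates:.1f} plates')
```

Because real libraries are not evenly represented, the true requirement for rare clones would be higher than this idealized estimate.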
To achieve this with 1x10^6 minigene sequences, it then follows that co-culture experiments should be done at a scale of roughly 300 million target cells per condition. This would correspond to approximately 30 x 96-well plates (with 1x10^5 target cells per well) and approximately 12 hours of FACS time, both of which are practically achievable parameters. I have also shown that this strategy is sufficiently robust to detect individual T-cell reactivities out of polyclonal mixtures of T cells when their abundance is ~1-5%. Though other methods currently in use show superior performance in terms of detecting very low abundance T-cell clonotypes in complex T-cell repertoires, my FRET-shift/amplicon sequencing method should still be useful for screening T-cell populations taken from relevant biological contexts. T cells taken directly from disease lesions or the peripheral blood of vaccinated individuals are often observed to contain antigen-specific T cells present at >1% frequency (Ekeruche-Makinde et al., 2012; Klarenbeek et al., 2012). Moreover, sophisticated strategies to rationally enrich for reactive T cells by activating sub-pools of PBMC have been used to generate “mini-lines” in which reactive T cells are known to be present at a minimum threshold frequency (Martin et al., 2017; Theaker et al., 2016). The ecosystem of methods available to perform T-cell antigen screening is illustrated in Figure 3-6. Presented visually, we can see that the FRET-shift/amplicon sequencing approach validated here is conceptually oriented in a direction opposite to most current methods, which were originally developed from the perspective of profiling the reactivity of repertoires of T cells against small panels of antigens-of-interest.
In contrast, I propose my approach as a means to assess T-cell clonotypes which have already been determined to be interesting or relevant and, by exposing them to large libraries of candidate antigens, to understand the repertoire of epitopes that they recognize.
Figure 3-6. Placement of the validated FRET-shift/deep amplicon sequencing strategy in the ecosystem of existing methods in use for T-cell antigen discovery. The methods described in Section 1.5, which currently constitute the suite of approaches available for T-cell antigen screening, and my novel FRET-shift/amplicon sequencing methodology are mapped with respect to their robustness in detecting T-cell antigens from highly mixed T-cell populations (left) and their ability to screen high-diversity libraries of candidate antigens (right). On both scales, higher placement is more desirable; the ideal method would be represented by a horizontal line connecting the tops of both columns.
It should also be noted that the FRET-shift/amplicon sequencing strategy has a number of other features not depicted in Figure 3-6. As already discussed in Chapters 1 and 2, it is a highly physiologically relevant approach that incorporates many aspects of the T cell/APC interaction, from including natural antigen processing and presentation to maintaining biophysically realistic receptor/ligand interactions to monitoring the activation of T-cell effector function. No other T-cell antigen profiling assay currently in common use so thoroughly recapitulates T-cell biology in in vitro screening. Another critically important feature of the FRET-shift/amplicon sequencing method is that it can be performed rapidly in comparison to other methods. For example, to screen 1,000 distinct peptides by FRET-shift assay, target cells would be seeded into a single well of a 96-well plate, co-incubated with T cells, and transferred to a single FACS tube for sorting.
In contrast, ELISPOT, a method also capable of screening 1,000 distinct peptides, would require directed distribution of target cells across more than 10 x 96-well plates. These plates would then be subject to numerous blocking, binding, washing, and detection steps in order to develop signal in each well, which is an onerous pipetting task. As a second example demonstrating the rapid capability of FRET-shift/amplicon sequencing, other T-cell antigen profiling efforts (Birnbaum et al., 2014; Siewert et al., 2012) have encoded libraries of up to 10^8 distinct minigene sequences and screened them with T cells or TCR-based soluble reagents. However, it is important to consider that these methods rely on plasmid transfection of APC. As discussed in Chapter 2, transfection results in the delivery of a very high multiplicity of minigene plasmids to host cells and, therefore, these approaches require a number of rounds of iterative screening (panning) in order to sufficiently enrich the signal for identification. In contrast, the FRET-shift assay, while only querying on the order of 10^6 unique minigenes, required no panning to identify antigenic minigenes. In other words, the reduction taken in library diversity caused by lentiviral transduction comes with the dividend that antigens could be identified immediately. However, a panning strategy could be used, in principle, to increase sensitivity in FRET-shift/amplicon sequencing experiments and, by extension, increase the number of minigenes that could be simultaneously assessed. Panning refers to the practice of subjecting screened populations to multiple subsequent rounds of re-screening in order to enhance signal-to-noise ratio and detect rare events. This could be accomplished in the context of FRET-shift/amplicon sequencing by directing some of the shifted-gate minigene amplicons generated for sequencing towards also producing daughter viral libraries. 
These subsequent libraries would then be used to re-transduce a new host target population which could be re-screened against CTL populations-of-interest. To facilitate rapid re-screening, it may be possible to delay the progression of apoptosis by ectopically expressing anti-apoptotic proteins, such as XIAP and/or Bcl-2 (Lickliter et al., 2007; Sedelies et al., 2008), in target cells or providing apoptosis-inhibiting drugs (Lee et al., 2000) in culture media such that cells could be re-screened without the need to regenerate virus and re-transduce APC.

3.4 Methods

3.4.1 Cell culture

All cell cultures were maintained in RPMI-1640 supplemented with 2 mM GlutaMAX, 1 mM sodium pyruvate, 50 μM β-mercaptoethanol, 10 mM HEPES, 100 U/mL penicillin, 100 U/mL streptomycin, and 10% heat-inactivated fetal bovine serum. Culture media and supplements were all sourced from Gibco. Cultures were maintained at 37°C and 5% CO2 atmosphere.

3.4.2 CTL activation and expansion

Fresh-frozen dissociated splenocytes from OT-I mice (a gift of Dr. B. Nelson, BC Cancer Agency Deeley Research Centre), pmel-1 TCR mice (from intact spleens shipped by The Jackson Laboratory), or wild-type C57BL/6 mice were thawed, washed, and adjusted to a density of 5x10^6 cells/mL in supplemented RPMI + 50 U/mL rhIL-2 (Peprotech). Splenocytes were seeded into a flat-bottom 96-well plate (1x10^6 cells/well) that had been pre-coated overnight with 100 μL solutions of low endotoxin, azide-free anti-mouse CD3 (clone 145-2C11) at 5 μg/mL and anti-mouse CD28 (clone 37.51) (Biolegend) at 10 μg/mL. Coated wells were washed 2x with PBS and 1x with complete media immediately prior to the addition of cells. Splenocytes were removed from coated plates after 48 hours, adjusted to 1x10^6/mL by adding fresh media + rhIL-2 to conditioned media, and cultured in U-bottom 96-well plate wells. Expanding T cells were split again on day 5, adjusting density to 1x10^6/mL, and used in experiments on day 7. 
3.4.3 Construction of minigene library plasmid

Double-stranded random minigenes were generated by performing 30 cycles of PCR on synthesized RM_template ssDNA (20 nM final concentration) with RM_FWD and RM_REV (Suppl. Tab. 2). Random minigene libraries were BamHI/EcoRI digested, purified by 3% agarose gel electrophoresis, recovered by β-agarase digestion/ethanol precipitation and inserted into BamHI/EcoRI linearized pMND-silent-FRET.

3.4.4 Virus production

To generate random minigene lentivirus, 80 µg of pMND-RM-FRET (Appendix B.2) transfer plasmid was combined with 72 µg of pCMV-ΔR8.91 and 8 µg of pCMV-VSV-G plasmids. These DNA mixes were incubated with 15 mL OptiMEM (Gibco) and 1 mL of TransIT-LT1 reagent (Mirus) for 30 minutes at room temperature. To 16 x 10 cm culture plates containing 40% confluent HEK293T cells, 1 mL of transfection mix was added per plate. In all cases, media was replaced 18 hours post-transfection and viral supernatants were then collected at 48- and 72-hours post-transfection. To concentrate virus, supernatants were ultracentrifuged (110,000 RCF, 90 minutes, 4°C) and pellets were resuspended in 1 mL OptiMEM (Gibco). Titers of viruses were determined by testing (in duplicate) 1, 2, 4, 8, 16, or 32 μL of concentrated virus on 5x10^4 HeLa cells in 24-well format with a final volume of 500 μL of complete culture media and measuring the % of fluorescent cells detected (>5σ above negative control) in flow cytometry 48 hours later.

3.4.5 Viral transduction

For transduction of ID8 cells with random minigene virus, 1.2x10^7 ID8 cells were plated at ~50% confluency in 3 x 10 cm tissue culture dishes. Concentrated virus (1.2 mL) was diluted with 7.8 mL of complete media and 3 mL of transduction mix was added to each plate. Plates were incubated in 3 mL volumes for 18 hrs, 37°C, at which point normal culture was resumed in 10 mL of complete media. 
For transduction of EL4 cells with random minigene virus, EL4 cells were adjusted, in complete media, to a density of 7.5x10^5 cells/mL in a final volume of 150 mL. To these cells, 1.6 mL of concentrated viral stock was added, and the resulting cell/virus mix was split across 15 x 10 cm dishes (10 mL per dish).

3.4.6 Quantitative PCR

To measure random minigene integrations/cell, SYBR Green assay reagent and MND promoter-specific primer set (Appendix B.3) were used in parallel with a TaqMan qPCR assay specific to the mouse beta actin 3’UTR (ABI part #4352933E; TaqMan Universal PCR Master Mix) for analysis of transduced cells at the genomic level. Five-point standard curves constructed from known amounts of plasmid containing both MND and mActB target sequences were analyzed by each primer set to infer the number of lentiviral integrations and total cell content present in each sample.

3.4.7 CTL/APC co-cultures

Activated and expanded CTL were used in FRET-shift assays on day 7 post-stimulation. Cell populations were enumerated using the Countess automated cell counter (Invitrogen) to determine input cell number for CTL/APC co-cultures. For FRET-shift assays using ID8-derived target cells, co-cultures were performed for 4 hours at a 1:1 effector:target ratio in round-bottom 12x75 mm FACS tubes containing 1x10^5 target cells. For random minigene-expressing ID8 library screening, co-culture size was scaled up to 8x10^5 target cells/tube and 4 tubes were prepared per screening condition. Upon completion of co-cultures, replicate tubes were pooled prior to FACS analysis. For FRET-shift assays using EL4-derived target cells, co-cultures were performed for 4 hours at a 1:1 effector:target ratio in U-bottom 96-well plates containing 1x10^5 target cells/well. For random minigene-expressing EL4 library screening, 32 wells per screen were prepared and pooled prior to FACS analysis. In all cases, co-cultures were maintained at 37°C, in 5% CO2 atmosphere. 
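As a quick consistency check on the two library-screening formats described in Section 3.4.7, the sketch below totals the target cells pooled per screening condition and the CTL input implied by the 1:1 effector:target ratio. The function and variable names are illustrative assumptions and are not part of the original protocol; only the vessel counts and cell numbers come from the text.

```python
# Illustrative arithmetic only; names are hypothetical, values are from Section 3.4.7.

def coculture_input(vessels, targets_per_vessel, effector_to_target=1.0):
    # Total target cells pooled per screening condition.
    total_targets = vessels * targets_per_vessel
    # CTL required to hold the stated effector:target ratio.
    total_effectors = total_targets * effector_to_target
    return total_targets, total_effectors

# ID8 library screen: 4 FACS tubes x 8x10^5 target cells per tube.
id8_targets, id8_ctl = coculture_input(4, 800_000)
# EL4 library screen: 32 wells x 1x10^5 target cells per well.
el4_targets, el4_ctl = coculture_input(32, 100_000)

print(f'ID8: {id8_targets:.1e} targets, {id8_ctl:.1e} CTL')
print(f'EL4: {el4_targets:.1e} targets, {el4_ctl:.1e} CTL')
```

Under these numbers both formats pool 3.2x10^6 target cells per condition, so the tube-based and plate-based screens are matched in scale.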
3.4.8 Flow cytometry/FACS

Virally transduced APC lines were purity-sorted to remove untransduced targets and cells carrying non-productive minigene-reporter transgenes. Cells were sorted on either BD FACSAria Fusion or BD FACSAria II by gating for single-cells (as determined by FSC-A vs. FSC-W and SSC-H vs. SSC-W) that were PI- (ex. 561, em. 610), YFP+ (ex. 488, em. 530) and emitting resting FRET-signature in FRET (ex. 405, em. 525/50 + 505LP) vs. CFP (ex. 405, em. 450/50) plots. For analysis of CTL co-cultures, cells were washed 1x with PBS, resuspended in 500 μL PBS + 5% FBS + 1 μg/mL propidium iodide (PI), filtered through 40 μm nylon mesh, and kept on ice for the duration of analysis. Cytometric analyses were performed on BD LSR II Fortessa and FACS cell isolation was done on BD FACSAria Fusion. Target cells to be analyzed were selected by gating for PI- and YFP+ singletons. Cells undergoing T-cell targeting were sorted on the basis of FRET-shift in FRET (ex. 405, em. 525/50 + 505LP) vs. CFP (ex. 405, em. 450/50) plots. Gated cells were collected in 15 mL conical Falcon tubes containing 3 mL of 55% FBS/45% culture medium. Immediately after collection, cells were pelleted at 300 x g, 5 min and subjected to gDNA isolation using DNAzol reagent (Invitrogen).

3.4.9 Sequencing

Sequencing libraries were generated by amplifying minigenes from isolated gDNA of gated cells. Primers were designed to anneal to regions flanking the minigene site on the viral transgene (Suppl. Tab. 2) and were either directly tailed with Illumina adapterized primers containing population-specific index sequences during oligo synthesis or tailed with Illumina adapters/indexes during a second round of PCR. For libraries prepared with single-round PCR, 25 cycles were performed with Phusion polymerase (New England Biolabs) and indexed amplicons were pooled and gel-purified. 
For libraries prepared using the 2-round PCR protocol, 21 cycles were used for the first round, gel-purification was performed, and 10 cycles were done in second round PCR (both using Phusion polymerase). Indexed amplicons from the 2-round protocol were pooled and gel-purified again. A description of which samples were subject to the 1-round PCR library procedure or the 2-round library procedure and a description of the pooling strategies used can be found in Appendix E. In all cases, sequencing was performed on the Illumina MiSeq platform using PE250 chemistry. To improve %PF during cluster formation steps, either PhiX control DNA was added to flow cells and/or a staggered primer strategy was employed to ensure sufficiently even base-pair distribution at each cycle. Upon completion of sequencing runs, paired-end reads were assembled to form minigene-contigs using FLASh (Magoč and Salzberg, 2011) with default parameters except for max-overlap = 304 and “allow outies” functionality enabled. Assembled reads were quality filtered (keeping reads with 90% of bases >Q20) and trimmed using FASTX-Toolkit (Hannon, 2010). Reads were then clustered using Starcode (Zorita et al., 2015) to collapse divergent sequences arising from PCR or sequencing error (Appendix C.2). For prediction of MHC binders from detected minigenes, NetMHCpan 3.0 was used with default parameters. For database searching of detected minigenes, BLASTP alignment to all non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects (release 218.0) (Benson et al., 2017) was performed using default parameters (with automatic adjustment for short input sequences).

3.4.10 Data analysis

Gel and CFU analyses were performed using ImageQuant TL v8.1. Flow cytometry analyses were performed using FlowJo V10.0.8r1. Quantitative PCR analyses were performed using SDS2.4. Sequence data handling was performed using Geneious v8.1.2 and Bioconductor package seqinr. 
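To make the read-level quality criterion from Section 3.4.9 concrete (keeping reads in which 90% of bases reach Q20), the following is a minimal sketch of such a filter. It is not a reproduction of FASTX-Toolkit: the function name, the Phred+33 encoding, and the inclusive treatment of the Q20 threshold are all assumptions made for illustration.

```python
# Sketch of a per-read quality filter: keep a read when at least 90% of its bases meet Q20.
# Assumes Phred+33 quality strings and an inclusive threshold; neither detail is stated in the text.

def passes_quality_filter(quality, min_q=20, min_percent=90, offset=33):
    if not quality:
        return False  # an empty read carries no usable bases
    good = sum(1 for ch in quality if ord(ch) - offset >= min_q)
    # Integer comparison avoids floating-point rounding at the cut-off.
    return good * 100 >= min_percent * len(quality)

reads = {
    'read_a': 'IIIIIIIII+',  # 9 of 10 bases at Q40, one at Q10
    'read_b': 'IIIIIIII++',  # 8 of 10 bases at Q40, two at Q10
}
kept = [name for name, qual in reads.items() if passes_quality_filter(qual)]
print(kept)  # prints ['read_a']
```

Here read_a sits exactly on the 90% cut-off and is retained, while read_b falls to 80% and is discarded.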
Other data handling and statistical analyses were performed using R v3.1.2 and RStudio.

Chapter 4: Adapting FRET-shift T-cell antigen identification assay to human systems

4.1 Introduction

While murine models proved to be invaluable tools for the validation of the novel FRET-shift/amplicon sequencing epitope discovery approach, several new challenges arise when considering how to adapt the method to human cells. In mouse, obtaining perfectly MHC-matched host APC lines into which minigenes can be encoded only requires matching the mouse strains from which target cells and T cells were isolated. In contrast, the study of T-cell populations from highly outbred human donors requires more advanced strategies for ensuring compatibility between target and T-cell MHC genotypes. These can include in vitro culture of primary cells from the T cell donor or the construction of artificial APC (aAPC) lines by providing MHC-null cells with relevant HLA gene sequences. Murine systems are also an ideal testing system because the existence of TCR transgenic mouse strains enables the generation of very large numbers of clonal T-cell populations that, in human contexts, would require substantial up-front effort to achieve. In many contexts, it may not even be possible to achieve the level of T-cell expansion or enrichment necessary for high-throughput antigen discovery efforts. For example, chronically activated T cells or T cells subjected to the highly immunosuppressive tumor microenvironment are often permanently attenuated and unable to proliferate or function significantly ex vivo. In other scenarios, T cells can be identified as potentially relevant to health and disease based on TCR-seq data but cannot be used for functional antigen profiling due to unavailability of source material. 
Approaches that enable selected TCR sequences to be functionally screened against high-diversity peptide-encoding minigene libraries without the requirement to first isolate and expand primary clones harboring those TCR would represent a major improvement to T-cell immunoprofiling technology. In this chapter, I explore alternative strategies that could be used to generate sufficient numbers of antigen-presenting cells harboring relevant HLA alleles. I also describe a novel and fairly radical approach for generating functional TCR responses from immortalized cytotoxic cell lines expressing recombinant TCR genes as a way of forgoing the necessity to enrich and expand primary T cells expressing TCR-of-interest.

4.2 Results

4.2.1 Primary B cells as antigen presenting cells

Antigen-presenting cells autologous to the donor from which the T cells-of-interest were derived are an attractive, and perhaps the most obvious, choice for generating a host cell population because they are perfectly MHC-matched to the TCR under interrogation. However, primary mammalian tissue is generally not proliferative in in vitro culture and, in some cases, would require invasive procedures to isolate. Given these parameters, this approach would largely be restricted to cell types found in the blood. I, therefore, investigated the use of B cells derived from donor PBMC as a candidate host cell type for use in FRET-shift library screening. B cells comprise the other major arm of the adaptive immune system opposite T cells and follow a similar course of development. B cells also undergo V(D)J recombination but do so at their B-cell receptor (BCR) loci; their T-cell receptor loci are not rearranged, are not expressed, and, therefore, are not a source of interference in T-cell antigen identification experiments. 
Activation of naïve B cells in vivo occurs in secondary lymphoid tissue when antigen-specific B cells present MHC class II-bound peptides derived from internalized antigenic proteins and are recognized by CD4+ helper T cells. These helper T cells, via their CD40L surface proteins, deliver co-stimulation to B cells by binding CD40. Co-stimulation initializes rapid and sustained proliferation of the activated B cells in processes known as germinal center reactions. In vitro, this proliferative signal is mimicked by stably expressing CD40L in common laboratory cell lines, such as NIH 3T3 or L929. Along with IL-4 stimulation, co-culture of PBMC with irradiated (non-proliferative) CD40L-expressing cells has been shown to yield 50-fold expansion of highly pure activated B cells after 14 days, with viability and proliferative capacity maintained for several weeks (Liebig et al., 2009). As described in Section 2.1, conventional lentiviral vectors incorporate the VSV-G protein as a substitute for the native HIV-1 env protein in a process referred to as pseudotyping. Generation of VSV-pseudotyped lentiviral particles with VSV-G protein embedded in the viral envelope is a strategy that is commonly employed to expand the tropism of these vectors from infecting only CD4+ T cells, as is the case for naturally-occurring HIV-1, to a variety of mammalian cell types. The VSV pseudotype achieves this broad tropism via the interaction of VSV-G with the low-density lipoprotein (LDL) receptor (Finkelshtein et al., 2013), which is an expressed surface protein in numerous tissue types. 
However, primary human B cells are extremely recalcitrant to transduction with conventional lentiviral or γ-retroviral vectors (Serafini et al., 2004) as they have very low levels of endogenous LDL receptor expression (Amirache et al., 2014; Uhlen et al., 2015 [https://www.proteinatlas.org/ENSG00000130164-LDLR/tissue]). To contend with this issue, it has been previously determined that alternative pseudotyping is a feasible route towards enabling entry of lentiviral particles into target B cells. I tested two such approaches: first, a combination of measles virus (MeV) hemagglutinin (Δ24) and fusion (Δ30) envelope protein variants (Frecha et al., 2011) and, second, a chimeric envelope consisting of the extracellular and transmembrane domains of gibbon ape leukemia virus (GALV) envelope protein and the cytoplasmic tail T- and R-regions of murine leukemia virus (Christodoulopoulos and Cannon, 2001; Christodoulopoulos et al., 2010; Mock et al., 2012). To test these configurations, I obtained aliquots of plasmids encoding the above-mentioned envelope protein variants and produced virus using our in-house TransIT-LT1 based plasmid transfection protocol but with the modified plasmid ratios used in Frecha et al. (2011) and Mock et al. (2012) for MeV and GALV, respectively. In both cases, viral supernatants were concentrated by ultracentrifugation and stored at -80°C. Resultant viruses were functionally titered over HEK-293T cells to yield 7.5x10^5 TU/batch of MeV pseudotyped virus and 4.4x10^6 TU/batch of GALV pseudotyped virus. In order to transduce B cells, I intended to follow the procedures outlined by Frecha et al. (2011) and Mock et al. (2012) for MeV and GALV, respectively, but with the one major departure that B cells be removed from CD40L feeders, incubated with lentivirus only for a set period, and then returned to fresh CD40L feeder cells. 
Previous studies of B-cell transduction using the MeV and GALV pseudotypes did not investigate viral infection while interrupting B-cell stimulation. While it is not necessary to remove CD40L-expressing cells from B cells for successful infection, I theorized that much of the productive virus delivered to a mixed culture would likely be absorbed by the feeder cell population. In the context of delivering minigene libraries to B cells, this could potentially represent a large bottleneck in the level of library diversity achievable in the desired APC population. Prior to testing this proposed transduction workflow, I investigated the kinetics of CD40L surface signal after irradiation. Optimal CD40L stimulation is performed by replenishing B-cell cultures with fresh irradiated CD40L feeders every 72 hours, otherwise cells will rapidly diminish in viability (Rush and Hodgkin, 2001). I sought to determine what happens to surface CD40L signal in the days immediately following irradiation in order to inform the timing of transduction. Six wells in 6-well plate format each containing adherent CD40L cells were subjected to 78 Gy of X-ray radiation. Immediately after dosing, one well was stained with anti-CD40L mAb and PI and assayed by flow cytometry. The mean fluorescence intensity of anti-CD40L, the proportion of PI- cells, and the number of cells collected in 60 seconds at constant flow from constant sample density were collected as data points. This procedure was repeated on each of the other irradiated wells using the same instrument every 24 hours for 120 hours total. Table 4-1 summarizes the results.

Table 4-1. Monitoring NIH-3T3-CD40L feeder cells after radiation exposure. 
| Time point post-irradiation | Cell count (#cells/60s) | % Viable cells (PI-) | Mean Fluorescence Intensity (anti-CD40L [FITC MK13A4]) | % CD40L stimulus normalized to T=0 |
|---|---|---|---|---|
| No dose delivered | NA | NA | 420 | NA |
| 0 hours | 198,775 | 96.8 | 297 | 100 |
| 24 hours | 136,585 | 84.5 | 272 | 54.9 |
| 48 hours | 113,457 | 90.6 | 322 | 57.9 |
| 72 hours | 50,474 | 70.8 | 361 | 22.6 |
| 96 hours | 39,220 | 72.5 | 432 | 21.5 |
| 120 hours | ND | ND | ND | negligible |

The data indicate that the level of detectable surface CD40L is immediately knocked down and slowly recovers until reaching normal resting levels at 96 hours post-irradiation. However, in that time, it is evident that the cells undergo severe cell death and, by T=96, constitute a viable cell population <20% of that seen at T=0. Merging the cell count, cell viability, and CD40L mean fluorescence intensity and normalizing to T=0, the cumulative CD40L signal seen by B-cell cultures holds at ~50% of T=0 levels in the subsequent 48 hours but then dramatically decreases by T=72. Based on these data, I concluded that B cells should be transduced shortly after stimulus with CD40L-cells, as opposed to at the end of their typical 72 hour stimulation cycle, and that viral incubation time must be restricted. B cells for transduction were prepared by thawing an aliquot of PBMC and stimulating them with CD40L-cells for a minimum of 3 cycles of CD40L replenishment to obtain pure B-cell cultures. Twenty-four hours after stimulation with fresh CD40L-cells, viral transductions were carried out with MeV- and GALV-pseudotyped viruses at MOIs of 1 and 1.5, respectively. Virus was co-incubated with B cells for 18 hours, at which point B cells were returned to fresh CD40L cells. Resultant B cells were analyzed by flow cytometry to assess transduction efficiency. In both cases, YFP+ cells were obtained. Measles virus pseudotyped vector resulted in 20% transduction efficiency while GALV pseudotyped vector resulted in 13%, despite the higher MOI used in this condition. 
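Returning briefly to Table 4-1, the final column appears to be the product of cell count, viable fraction, and anti-CD40L mean fluorescence intensity, expressed relative to T=0. That combination rule is inferred from the tabulated values and is not stated as a formula in the text; under that assumption, the sketch below (with hypothetical function names) recomputes the normalized CD40L stimulus from the table entries.

```python
# Values transcribed from Table 4-1:
# (hours post-irradiation, cells per 60 s, % PI-negative, anti-CD40L MFI).
TABLE_4_1 = [
    (0, 198775, 96.8, 297),
    (24, 136585, 84.5, 272),
    (48, 113457, 90.6, 322),
    (72, 50474, 70.8, 361),
    (96, 39220, 72.5, 432),
]

def cumulative_signal(count, viable_pct, mfi):
    # Viable cell count multiplied by mean fluorescence intensity
    # (assumed combination rule, inferred from the table).
    return count * (viable_pct / 100.0) * mfi

def normalized_to_t0(rows):
    # Express each time point as a percentage of the T=0 cumulative signal.
    baseline = cumulative_signal(*rows[0][1:])
    return {hours: round(100.0 * cumulative_signal(c, v, m) / baseline, 1)
            for hours, c, v, m in rows}

print(normalized_to_t0(TABLE_4_1))
```

With these inputs the computed percentages are 100.0, 54.9, 57.9, 22.6, and 21.5 for 0 to 96 hours, matching the tabulated column.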
Both transductions showed extremely poor viability with both cultures undergoing a >50% drop in live cell count relative to mock transduction. Furthermore, the mock transductions themselves, in both cases, also yielded very low viability cells as both populations were <40% PI-. Taken together, these results indicate that, while the 10-20% transduction efficiency achieved in both conditions is highly favorable to single-copy viral integration of potential library minigenes, the titers achieved in virus production and the severe attrition observed in viable cell counts imply that the complexity of minigene libraries that could be delivered to B-cell populations is limited in scalability.

Figure 4-1. Lentiviral transduction of CD40L-stimulated primary B cells with alternatively pseudotyped virus. a) The basic schematic for adapting FRET-shift assays to human systems using primary autologous materials. To use expanded B cells as minigene-FRET2-bearing autologous APC, it was first necessary to achieve successful transduction of these cells with lentivirus. b) PBMC-derived B cells stimulated with CD40L-expressing feeders were transduced with measles virus pseudotyped lentivirus containing FRET2 transgene. c) PBMC-derived B cells stimulated with CD40L-expressing feeders were transduced with gibbon ape leukemia virus pseudotyped lentivirus containing FRET2 transgene. YFP+ gates in both cases were drawn on live, single-cell populations and set at 5 standard deviations above YFP background observed in mock transductions.

4.2.2 Minimal signaling domains are required to initiate YT-Indy GzmB/PFN release

Given the inefficiency observed in transducing autologous B cells with lentivirus, I instead chose to focus on an entirely synthetic biology approach to adapting the FRET-shift based antigen identification assay to human cells. 
I hypothesized that, given an immortalized natural killer (NK)-like cell line and a complementary target cell line which is naturally tolerated by the NK line, the pair could be used as an artificial platform on which recombinant TCR sequences and minigene libraries could be functionally tested. If, in the absence of any modification to these starting cell lines, the NK cells do not initiate a cytotoxic response to the target cells, they could then be subject to genetic modification to add elements necessary for inducing and detecting de novo antigen-specific cytotoxicity. Initial testing of this synthetic CTL/APC framework was carried out using the YT-Indy and K562 cell lines. The YT-Indy cell line (Montel et al., 1995) is a cytotoxic NK-like cell line derived from acute lymphoblastic leukemia and is a variant of the YT cell line (Yodoi et al., 1985). YT-Indy, unlike the parental YT, is known to naturally tolerate the K562 cell line (Lozzio and Lozzio, 1975), a widely used chronic myelogenous leukemia cell line. K562 cells do not natively express any class-I or -II HLA alleles but do express beta-2-microglobulin and possess an intact antigen processing and presentation pathway. For these reasons, K562 cells have become the gold-standard chassis on which aAPC are routinely constructed (Butler and Hirano, 2014). I first addressed whether the YT-Indy cell line sufficiently expressed proximal signaling proteins necessary to result in granzyme/perforin release upon introduction of exogenous antigen receptors. To this end, chimeric antigen receptors (CAR) were used as a preliminary test case. These CAR constructs are composed of affinity domains consisting of single-chain antibody variable (scFv) fragments fused to intracellular T-cell signaling domains, the most critical of which for cytotoxic response is CD3ζ (Sadelain et al., 2013). 
The specific CAR construct used in my preliminary test was a CD19-reactive CAR designed and manufactured internally at BC Cancer Agency, which is based on previous FMC63 scFv CAR designs known to elicit cytotoxicity from primary T cells against CD19+ populations (Imai et al., 2004). The CD19 CAR transgene was lentivirally delivered to YT-Indy cells. In parallel, CD19 coding sequence was transduced into K562 previously prepared to express the FRET2 reporter introduced in Section 2.3. In both cases, the viral cassettes did not include a fluorescent protein marker; therefore, surface expression of each component was confirmed by mAb staining and used to sort transduced cells to purity. The resultant cell lines were co-cultured and analyzed by flow cytometry (Figure 4-2). The results show that YT-Indy.CD19CAR cells specifically recognize K562.FRET2.CD19, indicating that the minimal signaling domains present in the intracellular region of the CAR design are sufficient to integrate with the downstream signaling network used by YT-Indy cells to mediate target cytolysis. As an aside, this experiment was also the first GzmB FRET-shift assay performed with the CyPet/YPet FRET2 reporter gene, which, as expected, resulted in a larger separation between shifted and unshifted populations than was observed in previous chapters.

Figure 4-2. Feasibility of antigen-specific redirection of YT-Indy cytotoxicity. YT-Indy cells with or without CD19-reactive CAR were co-cultured with K562 with or without an integrated CD19 transgene. Co-cultures were performed at a 1:1 effector:target ratio for 4h. a) Experiments were performed in quadruplicate and each data point is expressed as the difference in %FRET-shift between effector+ condition and background no-effector controls. b) Representative gating used to obtain %FRET-shift values. 
4.2.3 Construction of artificial antigen-presenting cell (aAPC) and reconstituted cytotoxic T cell (rCTL) lines

I then sought to translate the findings from the CAR pilot experiment to natural TCR/pMHC interactions. This was done using a model TCR previously discovered by our group to recognize a mutational neo-epitope derived from the mutated HSDL-1 protein found in a high-grade serous ovarian cancer patient (Castellarin et al., 2013; Wick et al., 2014). Artificial APC lines were constructed from K562 cells by lentivirally delivering both the known HLA allele responsible for presenting the mutant HSDL-1 epitope, HLA-C*14:03, and either the wild-type HSDL-1 minigene (encoding amino acids 4-43 and containing the wild-type peptide CYMEALAL) or the L25V mutant version of the minigene, containing the mutated minimal epitope, CYMEAVAL. Both minigene and HLA-C expressing plasmids are described in Appendix B.2. Expression of HLA-C at the cell surface was confirmed by antibody staining using a pan HLA-C binding monoclonal antibody (Figure 4-3).

Figure 4-3. Reconstitution of peptide-MHC complex in K562 cell line. a) Overall schematic of the aAPC configuration used in the synthetic screening platform designed here. K562 cells are transduced with two separate lentiviral cassettes: one encoding the relevant HLA allele and the other encoding minigene-FRET2 constructs. b) Flow cytometric analysis confirming the robust expression of the RFP marker included with the HLA-C*14:03 transgene and of HLA-C, via antibody surface staining with PE-conjugated DT9 mAb clone (BD Biosciences), at the cell surface of HSDL1 mutant and wild-type minigene-FRET2 expressing K562 cells. Unstained controls are included to show signal compensation performed to remove RFP carryover into PE detection channel. c) Surface mAb staining and flow cytometric analysis of HLA-C expression in HLA-C*14:03 aAPC in relation to levels detected in healthy donor PBMC. 
In parallel, initial attempts to generate HSDL-1 mutant reactive rCTL were carried out under the hypothesis that providing the TCRαβ chains of this clonotype, along with CD8α and CD3ζ domains, would be minimally sufficient for redirecting YT-Indy cytotoxicity towards HSDL1 mutant epitope-bearing K562. Previous reports have shown that CD8α is able to form homodimers capable of binding MHC-I (Sun and Kavathas, 1997) while YT-Indy.CD19CAR results from Section 4.2.2 suggest that CD3ζ domains contain adequate ITAM sequences for interfacing with Lck/ZAP-70-driven calcium signaling typical of activated T cells and NK cells. A TCRαβ-RFP multicistronic viral cassette was constructed (Wick-TCRαβ, Appendix B.2) and a helper cassette was constructed as a bicistronic CD8α/CD3ζ transgene and both were inserted into pMND-Multi (Appendix B.1). The cassettes were delivered sequentially into YT-Indy cells and assayed for TCR and CD8 expression. These cells robustly expressed surface CD8α and intracellular RFP, but no surface TCRαβ was detected (Figure 4-4).

Figure 4-4. Initial (unsuccessful) attempt to recombinantly express TCRαβ at the surface of YT-Indy cells. a) Overall schematic of the first rCTL configuration used in the synthetic screening platform designed here. YT-Indy cells are transduced with two separate lentiviral cassettes: one encodes an HSDL1 L25V mutant-reactive TCRαβ complex and the other contains a bicistronic CD3ζ/CD8α transgene. b) Flow cytometric analysis confirms the robust expression of an RFP marker included with the TCRαβ transgene but no detection of TCR at the cell surface by antibody surface staining with FITC-conjugated IP26 mAb clone (Abcam). c) Expression of CD8α at the cell surface is confirmed by staining with APC-conjugated HIT8a mAb (Biolegend).

This was a somewhat unexpected finding given that many previous reports had successfully introduced ectopically expressed TCR chains and observed successful expression. 
However, in revisiting these studies, I noted that most studies involving recombinant TCR involve encoding TCRαβ in cells of T cell origin such as the Jurkat cell line, T hybridomas, or primary T cells (Anmole et al., 2015; Cameron et al., 2013; Giannoni et al., 2013; Karttunen and Shastri, 1991; Siewert et al., 2012; Zhou et al., 2016). These cells natively express CD3 whereas NK cells, including YT-Indy cells, do not. Alternative approaches in which TCR chains were successfully recombinantly expressed in non-T-cell lineages (Hall et al., 1991; James and Vale, 2012) did so by providing all CD3 subunits to the host cell line. Though the CD3δγε subunits have been shown to be dispensable in TCR signaling, these proteins do appear to have critical functions in TCR assembly and surface transport (Buferne et al., 1992; Call et al., 2002; Dietrich et al., 1996). For the second attempt to reconstitute TCR in YT-Indy cells, I employed an amended strategy consisting of double-infecting YT-Indy cells already containing the TCRαβ-RFP cassette (from the first attempt described above) with two helper cassettes. The first was a multicistronic transgene encoding all four CD3 subunits separated by distinct 2A signal sequences; the second helper virus contained both the CD8α and CD8β genes expressed as a bicistronic transgene (Appendix B.2). After production of each of these viruses, they were delivered in tandem to YT-Indy.TCRαβ cells. In this case, surface expression of TCR was detected by antibody staining and, after sorting for double-positive CD8+/TCR+ cells, further confirmation of successful reconstitution was done by antibody surface staining for CD3ε and the specific Vβ TCR segment used by the HSDL1-reactive TCR (Figure 4-5).

Figure 4-5. Confirmation of successful TCRαβ, CD3ε, and CD8α surface expression on transduced YT-Indy cells. 
a) YT-Indy cells previously transduced with the mutant HSDL1-reactive TCRαβ-encoding construct were double-infected with two helper cassettes designed to deliver complete CD3 and CD8 complexes. b) Measurement of surface CD8α (using HIT8a mAb clone, from Biolegend) and TCRαβ (using mAb clone IP26, from Abcam) indicated successful co-infection. c-d) The double-positive population was isolated by FACS and characterized by surface expression staining once more. Robust expression, stronger than that of normal human PBMC, was detected for CD8α, CD3ε (using mAb clone UCHT1, from eBiosciences), and TCRβ (which was detected with an orthogonal antibody, from Beckman-Coulter, specific for the known Vβ segment used by the Wick et al. TCR). 4.2.4 TCR/pMHC mediated FRET-shift signal is detectable in synthetic screening platform Upon successful expression of all the necessary components outlined in Figure 2, I tested the ability of the system to detect antigen-specific granzyme-mediated cytotoxicity via FRET-shift assay. Co-culture experiments were performed in which CD8+ rCTL carrying the mutant HSDL-1 reactive TCR (rCTL.CD8.WickTCR) were incubated with either the wild-type or mutant HSDL-1 minigene-expressing K562 aAPC. Recognition of mutant minigene-expressing target cells by the rCTL resulted in a modest but unambiguous FRET-shift signal (Figure 4-6), indicating that de novo antigen-specific cytotoxicity had, indeed, been achieved in a fully reconstituted system. Figure 4-6. FRET-shift assay of HSDL1 mutant-reactive rCTL and wild-type or mutant HSDL1 minigene-expressing aAPC. Each target line was incubated with or without effectors for 4h at a 1:1 effector:target ratio and analyzed by flow cytometry. FRET-status of live, YFP+ cells was monitored and shifted gates were drawn based on resting FRET-signatures observed in no rCTL controls.
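The gating readout described in the Figure 4-6 caption can be reduced to a simple background subtraction, the same difference that the Figure 4-7 caption reports as the difference in %FRET-shift between effector+ and no-effector conditions. The following is a minimal Python sketch of that arithmetic, not the analysis code used in this thesis: the function names are hypothetical, the inputs are assumed to be event counts exported from the live, YFP+ gate, and the numbers are invented for illustration.

```python
def percent_shifted(shifted_events, total_events):
    '''Percentage of live, YFP+ target events inside the FRET-shifted gate.'''
    if total_events <= 0:
        raise ValueError('total_events must be positive')
    return 100.0 * shifted_events / total_events


def delta_fret_shift(with_effector, no_effector):
    '''Background-subtracted FRET-shift for one target line.

    Each argument is a (shifted_events, total_events) pair of gated counts:
    one from the co-culture with effectors, one from the no-effector control.
    '''
    return percent_shifted(*with_effector) - percent_shifted(*no_effector)


# Hypothetical event counts, for illustration only (not data from this thesis).
mutant = delta_fret_shift(with_effector=(1200, 20000), no_effector=(200, 20000))
wild_type = delta_fret_shift(with_effector=(220, 20000), no_effector=(200, 20000))
print(f'mutant: {mutant:.1f}  wild-type: {wild_type:.1f}')
```

Subtracting the matched no-effector control for each target line separately keeps differences in resting FRET signature between lines from being read as cytotoxicity.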
4.2.5 Addition of HLA genes to target cells does not suppress YT-Indy cytotoxicity Since the synthetic antigen-identification platform hypothesis required the introduction of HLA-coding sequences to an otherwise MHC-null K562 cell line, I investigated the possibility that the modest FRET-shift observed in Figure 4-6 was due to a suppressive effect mediated by MHC/KIR interactions in parallel to the expected activating TCR/pMHC interactions occurring between rCTL and aAPC. Natural killer cells carry out their function in vivo by balancing activating and inhibitory signals transduced through NK-specific receptor/ligand interactions. These receptors are categorized as natural cytotoxicity receptors (NCRs), natural killer group 2 (NKG2) receptors, and killer immunoglobulin-like receptor (KIR) families. The KIR family of receptors primarily functions by interacting with MHC molecules to suppress NK cytotoxicity, thereby providing one avenue by which these cells can distinguish healthy self-cells from foreign, aberrant, or infected cells. To test this, I used 721.221 cells, an MHC-null lymphoblastoid cell line (Shimizu et al., 1988) that is known to be targeted for lysis by YT-Indy cells. The hypothesis here was that the introduction of HLA genes into the 721.221 line would result in suppression of the natural cytotoxicity of YT-Indy towards these cells. Coding sequences for HLA-C*14:03 and HLA-A*02:01 were independently transduced into previously prepared 721.221.FRET cells. Co-culture experiments were conducted between YT-Indy cells and each target cell population, 721.221.FRET, 721.221.FRET.C1403, or 721.221.FRET.A0201, in sets of 3 technical replicates. To obtain biological replicates, three repeat experiments were conducted on three different dates. The resulting Δ%FRET-shift measurements are shown in Figure 4-7. With respect to the effect of HLA expression on YT-Indy cytotoxicity, the data show no significant effect.
However, a surprising trend was noted across the three replicate experiments: over the course of conducting these experiments, there was a significant decrease (p < 0.0001) in signal at each time point. Plotting the mean Δ%FRET-shift of 721.221.FRET cells in response to YT-Indy as a function of time, I also found a strong linear correlation (R > 0.995) between the potency of YT-Indy cells and the number of days elapsed in vitro since thaw (Figure 4-7d). Unexpectedly, in searching for some explanation of the small FRET-shift signal observed in rCTL/aAPC co-culture by testing for the presence of KIR/MHC interactions, I inadvertently uncovered an orthogonal, and much stronger, explanation. It appears that, with prolonged time in culture, YT-Indy cells tend to lose their cytotoxic capacity. Until this point, FRET-shift assays used in the development of my synthetic screening platform had been performed using modified YT-Indy cells that had been cultured for >20 days in all cases, largely because of the lag time associated with transducing cells, expanding them, purity sorting by FACS, and re-expansion prior to their direct testing. Therefore, this effect had gone unnoticed prior to these experiments. Figure 4-7. Summary of replicate YT-Indy/721.221.FRET co-culture experiments. YT-Indy effectors were co-cultured with 721.221.FRET targets containing either HLA-A*02:01, HLA-C*14:03, or no HLA allele and analyzed in sets of 2 or 3 technical replicates. Experiments were repeated for a total of 3 replicate experiments. a-c) Data points are expressed as the difference in %FRET-shift between effector+ condition and background no-effector controls. One-way ANOVA comparing the mean FRET-shift of each HLA allele in all 3 experiments indicates that there is no significant difference in FRET-shift between the 3 HLA phenotypes tested.
One-way ANOVA comparing the means of all 3 HLA conditions in each experiment, however, indicates that there is a highly significant difference in FRET-shift across the 3 experiments. d) Plotting the mean FRET-shift of each condition versus the elapsed time (in days) of YT-Indy culture prior to co-culture shows a strong linear association. Linear regression between the means of the no-HLA condition is shown along with its corresponding R2 value. 4.2.6 Time-dependent decay in YT-Indy potency may be a result of constitutive granzyme/perforin degranulation To find a possible mechanism by which this time-dependent loss-of-function may be occurring, I first looked to the granzyme/perforin pathway of YT-Indy cells. I had previously hypothesized that the low potency observed in rCTL was due to YT-Indy cells being a heterogeneous population in which only a small fraction of cells may actually be responsible for mediating cytotoxicity. To this end, I had tried to rationally select a subline of high-potency YT-Indy cells by isolating CD107a+ cells out of rCTL/aAPC co-cultures. Surface expression of CD107a, a marker associated with the interior leaflet of lytic granules and lysosomes, is an often-used indicator of granzyme/perforin degranulation toward a relevant target. In my initial investigations, I found that rCTL had a higher than expected resting CD107a+ signal and I, therefore, was unable to specifically isolate reactive effector cells from this population. The finding from the previous section indicating that YT-Indy cells are subject to a time-dependent decay in cytotoxicity prompted a review of CD107a+ surface expression. To do this, I assessed the baseline surface CD107a levels in unmodified YT-Indy cells that had been thawed 12 days prior. The results of anti-CD107a surface mAb staining and flow cytometry are shown in Figure 4-8.
Comparison of YT-Indy CD107a+ cells to K562, 721.221, and NK-92MI cell lines indicates that YT-Indy cells are, indeed, remarkable for their apparently constitutive degranulation, as they displayed surface CD107a levels >8-fold higher than the next highest comparator cell line used in this experiment. These data are in agreement with published reports of constitutive granzyme B release from YT cells (Prakash et al., 2009). This property of YT-Indy cells and its consequences for their use as an rCTL chassis in the synthetic screening system described here are discussed in further detail below. Figure 4-8. Resting CD107a surface levels of several effector and target cell lines. Unmodified K562, 721.221, NK-92MI, and YT-Indy cells were stained with Alexa Fluor 647-conjugated H4A2 mAb clone (Biolegend) in the absence of any permeabilization or fixation. Analysis was performed on live, single-cell gates by setting a CD107a+ gate at 5 standard deviations above the unstained controls of each line. The data indicate that YT-Indy cells display a much higher level of surface CD107a expression than the comparators. 4.3 Discussion I have described, in this chapter, a novel approach to profiling TCR reactivity by developing a synthetic antigen-identification system in which T-cell reactivity can be interrogated via recombinantly expressed TCR and HLA coding sequences. With the data presented above, I propose that combining this approach with the high-throughput FRET-shift/amplicon sequencing assay described in earlier chapters will provide a means to circumvent the limitations encountered in generating library-transduced primary autologous cells and in the laborious and challenging steps involved with isolating and expanding primary T-cell populations-of-interest.
The focus of this chapter was primarily on using synthetic biology approaches to implement FRET-shift assay based screening, in part precipitated by my assessment that the use of primary B cells is not scalable to the level needed to screen large minigene libraries. I will note, however, that the techniques employed in Section 4.2.1 are not an exhaustive list of possible strategies that could be used to generate high-diversity minigene libraries in primary cells. Soluble CD40L and anti-CD40 mAb have been previously shown to also initiate B-cell proliferation (Mazzei et al., 1995), albeit less strongly than CD40L-expressing feeder cells. Nonetheless, these reagents provide the opportunity to transduce B cells without removing them from the stimulatory signal and bypass concerns about library virus sequestration in feeder cells. Furthermore, autologous T cells could be used as APC as they are another abundant cell type found in peripheral blood that can readily be bulk-expanded. The disadvantage with using T cells as APC in this case, however, would be that endogenously expressed TCR in these populations can present a source of interference in TCR antigen identification experiments. Another option for producing large numbers of donor-derived APC for antigen screening is to immortalize primary B cells. This has previously been achieved by using laboratory strains of EBV to transform primary B cells into B-lymphoblastoid cell lines (B-LCL) (Sun et al., 2000) or by stably expressing the proteins Bcl-6 and Bcl-xL in target B cells (Kwakkenbos et al., 2010; Linnemann et al., 2015) to produce immortalized germinal center-like B-cell lines by preventing terminal B-cell differentiation and inhibiting apoptosis, respectively. Problematically, these strategies are labor-, time-, and materials-intensive.
EBV transformation generally only successfully immortalizes 2-3% of initial B-cell populations (Hui-Yuen et al., 2011), while Bcl-6/Bcl-xL transformation requires retroviral transduction of B cells, which, as discussed above, is highly inefficient. Therefore, several weeks of outgrowth are required to obtain cell numbers suitable for antigen discovery experiments. It should also be noted here that a key distinction between artificial APC and autologous APC is that artificial APC only express the individual HLA allele(s) provided to them transgenically whereas autologous APC will express the full HLA genotype under which the T-cell populations-of-interest developed. There are advantages and disadvantages to both approaches. Libraries screened in the context of APC with the full HLA genotype can be thought of as a top-down approach in which APC do not need to be HLA-typed prior to experiments, nor do multiple individual APC lines need to be constructed in order to test all possible peptide-MHC combinations; relevant minigenes can immediately be isolated by performing CTL/autologous APC co-cultures. Any information regarding which epitope/MHC/TCR combination yielded a positive hit would need to be deconvoluted subsequently. On the other hand, libraries screened in the context of APC containing individual HLA alleles allow for the discovery of T-cell epitopes while also elucidating, bottom-up, the specific MHC variants responsible for mediating all detected reactivities. In terms of assessing T-cell reactivity by expressing recombinant TCR clonotypes-of-interest in host cytotoxic cells, the modified immortalized NK cell line/APC system described in this chapter offers several potential advantages over previously described methods.
Similar to the discussion of B cells above, these advantages include the capacity of rCTL/aAPC to be kept in long-term cell culture, the absence of endogenous TCR expression in NK cells, and the short turnaround time required to generate modified NK/APC cell lines. In contrast, expansion of bulk primary T cells is limited in scalability since these cells are subject to senescence after a limited culture lifetime. Moreover, bulk TCR-transduced primary T-cell populations can be used as effector cells in antigen identification assays but can be confounded by endogenously expressed TCR chains, resulting in irrelevant TCR recognition mediated by native or mispaired TCRαβ dimers. Alternative methods to generate immortalized reconstituted CTL include transforming individual T-cell clones-of-interest using oncogenic agents or hybridizing them with existing immortalized cell lines. However, the drawbacks of these methods are that they are time-intensive, do not always generate a viable clone, do not generally maintain granzyme/perforin cytotoxicity, and require investigators to have previously isolated a T-cell clone expressing the TCR clonotype of interest. The synthetic antigen-identification platform yielded a modest FRET-shift signal in comparison to the much larger FRET-shifts observed previously in co-cultures of OT-I CTL/Ova-expressing APC or pmel-1 TCR CTL/hgp100-expressing APC. Initial investigations into the cause of this have thus far raised the possibility that the native tendency of YT-Indy cells to constitutively secrete GzmB in the absence of antigenic triggering results in a gradual loss of potency with prolonged culture time. It is interesting to note, however, that the very strong FRET-shift signal obtained in co-cultures of YT-Indy cells that had been thawed only 5 days prior suggests that this effect may not necessarily track with the overall passage number of the cell stock.
Rather, cryopreserving and thawing cells may have a resetting effect in which YT cells are able to restore granzyme/perforin levels to those needed to produce a large FRET-shift signal. Further experiments are needed to confirm if this is the case. If it is, a relatively easy solution may exist for solving low FRET-shift in rCTL/aAPC experiments. While constitutive GzmB secretion is a promising explanation, it is still possible that additional factors contribute to the muted FRET-shift signal observed in rCTL/aAPC co-cultures. Further experiments comparing the FRET-shift signal induced in aAPC by primary CTL and reconstituted CTL bearing the same HSDL1 mutant-reactive TCR clonotype are necessary to understand whether this effect is TCR-intrinsic. Additional experiments involving other model human TCR and antigens should also provide an indication as to whether the low FRET-shift signal is attributable specifically to the HSDL1 mutant-reactive TCR. Another possibility is that the cause lies in other cell-intrinsic factors. One area to investigate would be the suitability of other pairs of immortalized NK cell lines and matched APC. The prominent immortalized NK cell line NK-92 (Gong et al., 1994) and an IL-2 independent variant, NK-92MI (Tam et al., 1999), represent alternative rCTL base cell lines that could be used to construct a synthetic screening platform. A number of cell lines have been identified that are resistant to NK-92 cytotoxicity (Romanski et al., 2016) and, thus, candidate aAPC could potentially exist for such an alternative rCTL/aAPC system. Finally, a third arm of troubleshooting is to consider antigen-intrinsic effects.
Having already demonstrated that the transgenically expressed HLA genes are successfully trafficking to the cell surface and ruled out that HLA expression suppresses cytotoxicity via KIR engagement, quantitating the amount of MHC-bound HSDL-1 peptide at the aAPC cell surface would conclusively determine whether the low FRET signal is a property of the model antigen used. This could conceivably be accomplished by employing selected reaction monitoring mass spectrometry techniques on MHC-eluted peptides (Tan et al., 2011) from HSDL1 minigene-bearing aAPC lines. 4.4 Methods 4.4.1 Cell culture All cell cultures were maintained in RPMI-1640 supplemented with 2 mM GlutaMAX, 1 mM sodium pyruvate, 50 μM β-mercaptoethanol, 10 mM HEPES, 100 U/mL penicillin, 100 U/mL streptomycin, and 10% heat-inactivated fetal bovine serum. Culture media and supplements were all sourced from Gibco. Cultures were maintained at 37°C and 5% CO2 atmosphere. 4.4.2 B-cell stimulation and expansion Eighteen hours prior to B-cell stimulation, NIH-3T3-hCD40L cells (a gift of Dr. B. Nelson, BC Cancer Agency Deeley Research Centre) were plated in a 6-well plate at a density of 2x10^5 cells/well. At the time of B-cell activation, adherent CD40L cells were X-ray irradiated at a dose of 78 Gy and washed 2x with fresh media. To irradiated CD40L feeders, thawed and washed PBMC from leukopak (Stemcell Technologies) were added at a density of 4x10^6 cells/well in a final volume of 4 mL complete RPMI + 50 U/mL hIL-4 (Peprotech) + 625 ng/mL cyclosporine A (Sigma). After 3 days, half of the conditioned media was removed and replaced with fresh complete RPMI + hIL-4 + cyclosporine A. On day 6, B-cell clusters were transferred to freshly irradiated, adherent CD40L cells (plated 18h prior at 2x10^5 cells/well) at a density of 4x10^6 cells/well in a final volume of 4 mL cRPMI + 50 U/mL hIL-4 + 625 ng/mL cyclosporine A.
From this point onward, cycles of fresh CD40L-cell/B-cell cultures were prepared every 3 days with complete RPMI + IL-4 only. 4.4.3 TCR-seq and TCR cloning A vial of primary T-cell clone bearing the HSDL-1 mutant reactive TCR from Wick et al., 2014 (a gift of Dr. B. Nelson, BC Cancer Agency Deeley Research Centre) was thawed and processed for RNA isolation using Qiagen RNeasy Mini kit. First strand synthesis was performed using Superscript II reverse transcriptase (Invitrogen) and C6 primer + template switching oligo for α-chain synthesis and TCRA-5 primer + template switching oligo for β-chain synthesis. First round synthesis was performed using Phusion polymerase (New England Biolabs) and a 5:1 mixture of UPM-Short:UPM-LTS_T forward priming and TCRA-3 or C9B reverse primers to amplify α- or β-chain cDNA, respectively. A second round nested PCR was then performed on 1st round products using Phusion polymerase and NLTS_III_T_trunc forward primer and TCRA-2_tail or C14_III_Tail reverse primers for α-chain PCR and β-chain PCR, respectively. Recovered TCR amplicons were gel-purified, A-tailed with Platinum Taq polymerase (Invitrogen), and inserted by TOPO-TA cloning into pCR4 plasmid (Invitrogen). TOPO-cloned TCR chains were transformed into DH10B-T1R E. coli (Invitrogen) and, after overnight growth, 12 colonies of each α and β were selected for liquid culture. Liquid cultures were grown overnight, processed by standard plasmid miniprep procedures, and submitted for Sanger sequencing (Genewiz). The obtained reads were analyzed using MiTCR to unambiguously determine the α- and β- clonotypes of the Wick TCR.
To stitch on the complete constant regions for each chain and add restriction enzyme tails to facilitate insertion into pMND-Multi (Appendix B.1.3), Phusion polymerase was used in conjunction with the following primer pairs: α-chains were amplified from TOPO-TCRα plasmid using TRAV26-201.f_BamHI and TRAC(33-53).r; β-chains were amplified from TOPO-TCRβ plasmid using TRBV6-601.f_MluI and TRBC2(23-38).r; α constant regions were amplified from HSDL1-reactive T cell total cDNA using TRAC.f and TRAC.r_NheI; β constant regions were amplified from HSDL1-reactive T cell total cDNA using TRBC2.f and TRBC2.r_EcoRI. The two α amplicons and two β amplicons were then combined in separate overlap extension PCR reactions with Phusion polymerase and TRAV26-201.f_BamHI/TRAC.r_NheI or TRBV6-601.f_MluI/TRBC2.r_EcoRI primer pairs, respectively. After overlap extension, completed chains were gel-purified, restriction enzyme digested, and sequentially cloned into the recipient plasmid vector. All primers referenced here are described in Appendix B.3. 4.4.4 Virus production For production of all VSV pseudotyped lentivirus, 40 µg of transfer plasmid was combined with 36 µg of pCMV-ΔR8.91 and 4 µg of pCMV-VSV-G plasmids. For production of MeV pseudotyped lentivirus, 30 µg of transfer plasmid was combined with 30 µg of pCMV-ΔR8.91, 10 μg of pCMV-HΔ24, and 10 μg of pCMV-FΔ30 plasmids (courtesy of Dr. A. Gagliardi, BC Cancer Agency; originally a gift of Dr. E. Verhoeyen, ENS Lyon). For production of GALV pseudotyped lentivirus, 40 µg of transfer plasmid was combined with 30 µg of pCMV-ΔR8.91 and 10 µg of pCMV-GALV/TR plasmid (a gift of Dr. D. Nègre, ENS Lyon). All DNA mixes were incubated with 7.5 mL OptiMEM (Gibco) and 0.5 mL of TransIT-LT1 reagent (Mirus) for 30 minutes at room temperature. To 8 x 10 cm culture plates containing 40% confluent HEK293T cells, 1 mL of transfection mix was added per plate.
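As a worked check of the transfection mixes just described, the sketch below tallies the DNA mass in each mix and the share delivered to a single plate. It is an illustrative Python calculation rather than part of the original protocol: the constant and function names are hypothetical, and the volume contributed by the plasmid DNA itself is assumed to be negligible, so that each mix totals 8 mL and divides evenly across the 8 plates.

```python
# DNA masses (µg) per transfection mix, as listed in the text above.
MIXES_UG = {
    'VSV-G': {'transfer': 40, 'pCMV-ΔR8.91': 36, 'pCMV-VSV-G': 4},
    'MeV': {'transfer': 30, 'pCMV-ΔR8.91': 30, 'pCMV-HΔ24': 10, 'pCMV-FΔ30': 10},
    'GALV': {'transfer': 40, 'pCMV-ΔR8.91': 30, 'pCMV-GALV/TR': 10},
}

OPTIMEM_ML = 7.5       # OptiMEM added to each DNA mix
TRANSIT_LT1_ML = 0.5   # TransIT-LT1 reagent added to each DNA mix
ML_PER_PLATE = 1.0     # volume of transfection mix added to each 10 cm plate


def per_plate_ug(mix):
    '''DNA mass (µg) of each plasmid delivered to a single plate.'''
    # Assumption: the volume of the plasmid DNA itself is negligible.
    total_volume_ml = OPTIMEM_ML + TRANSIT_LT1_ML
    return {name: ug * ML_PER_PLATE / total_volume_ml for name, ug in mix.items()}


for envelope, mix in MIXES_UG.items():
    plate = per_plate_ug(mix)
    print(envelope, sum(mix.values()), 'µg total;', sum(plate.values()), 'µg per plate')
```

Under these assumptions all three pseudotypes use the same total DNA mass, so per-plate DNA load is held constant and only the plasmid ratios differ between envelopes.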
In all cases, media was replaced 18 hours post-transfection and viral supernatants were then collected at 48 and 72 hours post-transfection. To concentrate virus, supernatants were ultracentrifuged (110,000 RCF, 90 minutes, 4°C) and pellets were resuspended in 1 mL OptiMEM (Gibco). Titers of viruses were determined by testing (in duplicate) 1, 2, 4, 8, 16, or 32 μL of concentrated virus on 1x10^5 K562 cells in 24-well format with a final volume of 500 μL of complete culture media and measuring the % of fluorescent cells detected (>5σ above negative control) by flow cytometry 48 hours later. 4.4.5 Viral transduction For transduction of B cells with MeV pseudotyped virus, 1x10^5 B cells, removed from CD40L stimulation <24 hours after incubation with fresh feeder cells, were combined with 1x10^5 TU of lentivirus in 1 well of a 24-well plate in a final volume of 500 μL complete media. After 18 hours, B cells were returned to a fresh plate of irradiated CD40L feeder cells. For transduction of B cells with GALV pseudotyped virus, 2 wells of a non-tissue culture-treated 24-well plate were coated with Retronectin (Takara) according to manufacturer protocol. After washing coated wells, 1.5x10^5 TU of virus was loaded onto one of the coated wells and a mock well was prepared by loading blank OptiMEM to the other coated well. The plate was centrifuged for 2 hours, 2000 x g, at 4℃. Virus- or mock-charged wells were washed 1X with PBS + 2% BSA and 1x10^5 B cells, removed from CD40L stimulation <24 hours after incubation with fresh feeder cells, were added in a final volume of 500 μL of complete media. After 18 hours, B cells were returned to a fresh plate of irradiated CD40L feeder cells. 4.4.6 Effector/target co-cultures Cell populations were enumerated using the Countess automated cell counter (Invitrogen) to determine input cell number for effector/target co-cultures.
All FRET-shift assays were performed for 4 hours at a 1:1 effector:target ratio in U-bottom 96-well plate wells containing 1x10^5 effector cells + 1x10^5 target cells in a final volume of 100 μL/well of pre-warmed complete RPMI. In all cases, co-cultures were maintained at 37°C, in 5% CO2 atmosphere. 4.4.7 Flow cytometry/FACS All mAb and viability dye staining in this chapter was carried out according to manufacturer’s specifications. Virally transduced APC lines were purity-sorted to remove untransduced targets and cells carrying non-productive minigene-reporter transgenes using BD FACSAria Fusion. Single-cell determination was done by FSC-A vs. FSC-W and SSC-H vs. SSC-W gating. Prior to gating on markers-of-interest, live cells were selected by gating on PI- populations or populations negative by Fixable Viability Dye eFluor780 (FVD780) (Thermo Fisher Scientific) (ex. 640, em. 780). FVD780 viability dye was only used in instances in which subject cell lines contained RFP and, thus, could not be assessed by PI. For all sorts, cells were collected in 15 mL conical Falcon tubes containing 3 mL of 55% FBS/45% culture medium. Flow cytometric analyses of effector/target co-cultures, viral titering curves, and B-cell transductions were performed on BD LSR II Fortessa. In all cases, to prepare samples for flow cytometry or FACS, cells were harvested, washed 1X with PBS, stained with FVD780 and/or mAb where applicable, washed 1X more with PBS, resuspended in 300 μL PBS, filtered through 40 μm nylon mesh, and kept on ice for the duration of analysis. 4.4.8 Data analysis Gel and CFU analyses were performed using ImageQuant TL v8.1. Flow cytometry analyses were performed using FlowJo V10.0.8r1. Quantitative PCR analyses were performed using SDS2.4. Sequence data handling was performed using Geneious v8.1.2 and Bioconductor package seqinr. Other data handling and statistical analyses were performed using R v3.1.2 and RStudio.
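For readers who wish to reproduce the style of analysis reported in Figure 4-7 (a one-way ANOVA across experiments and a linear regression of mean FRET-shift against days in culture), the following is a minimal sketch of the underlying calculations. It is written in Python for illustration and is not the R code used for this thesis; the function names are hypothetical and all values are invented. It returns only the F statistic and R², since a p-value additionally requires the F distribution (available, for example, through scipy.stats.f_oneway).

```python
from statistics import mean


def linear_fit(x, y):
    '''Ordinary least-squares line through (x, y); returns slope, intercept, R^2.'''
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def one_way_anova_f(groups):
    '''F statistic for a one-way ANOVA over a list of groups of measurements.'''
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values, for illustration only (not data from this thesis).
days_since_thaw = [5, 10, 15]
mean_delta_fret = [60.0, 41.0, 20.0]
slope, intercept, r_squared = linear_fit(days_since_thaw, mean_delta_fret)

# One group of replicate measurements per experiment date.
by_experiment = [[60.0, 61.0, 59.0], [41.0, 40.0, 42.0], [20.0, 21.0, 19.0]]
f_statistic = one_way_anova_f(by_experiment)
print(f'slope={slope:.2f} per day, R^2={r_squared:.4f}, F={f_statistic:.1f}')
```

Grouping the same measurements by HLA condition instead of by experiment date gives the complementary comparison described in the Figure 4-7 caption.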
Chapter 5: Conclusions and future directions High-throughput epitope discovery methods are a necessity for further understanding T-cell biology. There is a severe mismatch in the volume of known TCR epitope data uploaded in public repositories such as the Immune Epitope Database (IEDB) (Vita et al., 2014), which contains ~8x10^4 T-cell activating epitope sequences, and TCR-seq studies, which are routinely used to reveal millions of unique TCR α- and/or β-chains per individual across dozens of individuals per study (Woodsworth et al., 2013). On top of this, the vast majority of this data is unlinked: the number of interacting TCR/pMHC pairs reported at the sequence level is very low, totaling <2000 complexes in curated databases such as VDJdb (Shugay et al., 2018) and ATLAS (Borrman et al., 2017). Novel strategies are needed to uncover T-cell epitopes at a pace that matches the pace at which TCR sequences are being discovered. Methods for rationally screening individual TCR clonotypes-of-interest against vast and unbiased libraries of peptides will be indispensable in generating linked TCR/pMHC data that can then be used to generate novel insights into adaptive immunity. Here, I have developed a suite of in vitro tools that provides a powerful strategy for performing high-throughput TCR antigen screening. This approach involves the use of a novel FRET-based reporter system that, counter to existing paradigms, is encoded in target cells and can detect T-cell reactivity prior to the initiation of apoptosis. This configuration allows for simultaneous signal detection and capture of epitope-encoding minigene sequences for downstream characterization. In Chapter 2, this system was constructed and validated as an efficient and highly selective read-out of granzyme B mediated T-cell cytotoxicity.
In Chapter 3, the FRET-reporter was then applied to the detection of relevant minigenes out of large libraries of random minigenes by sorting putatively antigenic minigenes on the basis of FRET-shift and identifying them by deep amplicon sequencing. The FRET-shift/amplicon sequencing method was validated as a sensitive approach capable of detecting relevant epitopes at low abundance in libraries. For perspective, the limit of detection in gold-standard ELISPOT methods is considered to be 5-10 spots per 1x10^5 effector cells (Moodie et al., 2010), which approximately matches the 1:10,000 signal detection threshold that was observed in FRET-shift library assays. Furthermore, it was shown in Chapter 3 that the FRET-shift/amplicon sequencing strategy is robust enough to achieve this threshold of detection from polyclonal mixtures of input T cells, rapid enough to detect low abundance antigens without the requirement to deconvolute or iteratively pan libraries for hits, and can readily be scaled to comprehensively query >1x10^6 minigene sequences in parallel. In Chapter 4, I developed a synthetic framework in which recombinant TCR and pMHC complexes can be assayed by FRET-shift screening. Successful recapitulation of TCR/pMHC dependent cytotoxicity in a non-T cell cytotoxic cell line served as a demonstration that TCR sequences can be queried for reactivity ab initio. That is, starting from only sequence data, the reactivity of a given TCR can be rapidly interrogated for function in the absence of any primary tissue. In this chapter, I will discuss immediately relevant applications of the technology presented in this thesis as well as the opportunities that exist to further develop it.
In Section 5.1, I will survey some of the outstanding problems faced in understanding adaptive immunity from a basic science and translational standpoint and how a functional, rapid, and scalable method like the FRET-shift/amplicon sequencing approach should provide a key to addressing these problems. Together with this, in Section 5.2, I will explore the next steps in technology development that, when implemented, should expand the capabilities of the novel T-cell epitope discovery approach described here. 5.1 Future directions in applying high-throughput T-cell epitope screening 5.1.1 Whole microbial proteome screening of infectious pathogen-reactive T cells As discussed in Section 1.3.3, a barrier to the rational design of vaccine strategies is the inability to test all possible pathogen targets for their ability to initiate robust T-cell responses. Though obtaining whole genome sequence data from microbial strains or microbial communities is, today, an easily realizable goal, testing individual peptide-coding sequences from this data vis-à-vis T-cell reactivity is severely restricted. Methods that enable high-throughput T-cell epitope screening would be a valuable tool in understanding the interplay between host immunity and microbial pathogenesis. FRET-shift/amplicon sequencing, as demonstrated in this thesis, is one such method that could potentially be mobilized towards the characterization of antimicrobial immunity. The rapidity with which libraries of peptide-coding sequences could be screened by FRET-shift/amplicon sequencing in comparison to other emerging high-throughput methods is a potentially very meaningful feature in the study of particular infectious pathogens. Examples of sudden pandemics are numerous, even in recent history. Influenza viruses, by virtue of their extreme strain variation, are consistently a moving target capable of mediating severe pathogenesis in the right circumstances.
Alternatively, pathogens that have lain dormant for decades or centuries can achieve renewed virulence and result in widespread disease, as evidenced by the 2014-2016 outbreak of Ebola Virus Disease in West Africa (Kaner and Schaack, 2016). In these types of situations, high-throughput T-cell antigen discovery methods with rapid turnaround could prove to be a key to developing vaccine strategies for containing epidemics and treating infected individuals (Sakabe et al., 2018). 5.1.2 Whole tumor exome screening of tumor-associated T cells T-cell epitope discovery is important to the development of novel, precision cancer immunotherapies. It is known that increasing levels of tumor-infiltrating lymphocytes (TIL) correlate significantly with enhanced survival outcomes and this has prompted the development of methods such as checkpoint blockade, cancer vaccination, and TIL therapy. Success has so far been restricted to relatively few cancer types – primarily highly mutated cancer types with substantial T-cell infiltrates (Brown et al., 2014). To develop more potent T cell-based therapeutics, tumor-specific T-cell epitopes are needed to selectively expand or engineer therapeutic products tailored to individual tumors. Current approaches to T-cell epitope identification for cancer immunotherapy employ up-front tumor exome sequencing and somatic mutation calling. Peptide sequences containing all expressed tumor mutations are subjected to in silico peptide binding prediction algorithms to generate a list of possible T-cell epitopes predicted to bind to patient MHC-I molecules. Candidate epitopes (generally fewer than 1,000 peptides) are then tested for their ability to elicit reactivity in TIL by conventional low-throughput methods. Typically, however, and for unknown reasons, only a small percentage of these candidates verify as bona fide cancer epitopes (Linnemann et al., 2015; Robbins et al., 2013; Wick et al., 2014) by this approach.
Exhaustive and unbiased screening of tumor-expressed T-cell epitopes against TIL populations by the methods I have described here would, in principle, allow for a more complete view of the landscape of tumor-associated T-cell epitopes.
5.1.3 Exploring T-cell receptor cross-reactivity in libraries of random minigenes
Based on the idea that a finite number of T-cell clonotypes in a single individual is able to provide adequate protection against a universe of possible epitopes that is several orders of magnitude larger, it has been theorized for many years that T-cell receptor cross-reactivity is a crucial mechanism by which the TCR repertoire operates (Mason, 1998; Nikolich-Zugich et al., 2004). This has been evidenced in vivo by the finding that, frequently, adults naïve to certain pathogens harbor memory T-cell responses to pathogen-derived antigens despite never having been exposed. These memory T cells have been shown to also react to antigens from common environmental microbes, indicating that the acquisition of the memory phenotype is prompted by cross-reactive TCR/pMHC binding and activation (Su et al., 2013; Yu et al., 2015a). In vitro approaches and mathematical modeling have provided evidence that a single CTL clone is capable of recognizing over a million unique epitopes from unbiased peptide space (Wooldridge et al., 2012). Other large-scale screening approaches have also noted experimentally that a large degree of cross-reactivity exists and that this TCR promiscuity follows patterns. Cross-reactive epitopes, it has been observed, can come in tightly related clusters of sequence similarity (Birnbaum et al., 2014) but, at the same time, can interact with TCR CDR3 loops in multiple different orientations or “registers” (Riley et al., 2018; Sewell, 2012). This is suggestive of a binding degeneracy that would likely lead to numerous distinct identity clusters of cross-reactive epitopes per TCR in peptide space.
Though TCR cross-reactivity is, by this point, a widely acknowledged property of the T-cell repertoire, characterization of it is still in its infancy. The main challenge that has hampered the study of this phenomenon to date is the inability of current methods to measure T-cell responses directly and simultaneously against large unbiased libraries of possible pMHC antigens in a physiologically relevant manner. By scaling up the random minigene screening approaches used in Chapter 3, the FRET-shift/amplicon sequencing method should facilitate the mapping of TCR sequences to cognate epitopes and help define cross-reactivity between the T-cell receptor and T-cell epitope repertoires. High-throughput functional studies of cross-reactivity offer the possibility for investigators to reveal novel insights into the physical mechanisms of engagement and activation of T cells, and insights into the mechanisms by which the TCR repertoire successfully balances immune protection against auto-reactivity.
5.1.4 Assessing safety of engineered TCR therapeutics
An immediate practical consequence of TCR cross-reactivity was observed in a recent adoptive cell therapy trial in melanoma. T-cell receptor sequences with known reactivity to the common cancer testis self-antigen, MAGE-A3, were engineered in vitro to have enhanced affinity for the target pMHC complex. Extensive preclinical validation was done to confirm that the engineered TCR did, in fact, have increased binding affinity for HLA-A*01-presented MAGE-A3 epitopes and that off-target effects were not observed when testing for reactivity against common laboratory cell lines (Cameron et al., 2013). However, upon infusion of activated T cells transduced with engineered TCR DNA, rapid and fatal off-target toxicity was observed in both patients receiving the treatment (Linette et al., 2013).
It was found, retrospectively, that the cause of the toxicity was a potent cross-reactive T-cell response against cardiomyocytes, directed at an HLA-A*01 epitope derived from the protein Titin. Though the affinity-enhanced MAGE-A3 TCR in the above example was tested in vitro for reactivity to self-antigens using cell lines, this testing was still unable to predict the observed cross-reactivity to Titin. This is in large part due to the tissue-specific expression of Titin, as it is only expressed in striated muscle tissue. In view of this, the methods used to pre-screen the off-target reactivity of this engineered therapeutic product were completely inadequate. The only way to confirm with certainty that off-target cross-reactivity would not occur in T-cell transfer procedures would be to functionally test every possible peptide in the human proteome for its ability to elicit T-cell reactivity. To mitigate gaps presented by tissue-specific expression, these peptides could be screened by expressing an exome-derived fragment library. The experiments performed in Chapter 3 indicated that the FRET-shift/amplicon sequencing approach should be able to test a fragment library of 1 × 10^6 unique peptide-coding fragments (without applying any industrial-scale methods to increase throughput). Given a human exome size of ~30 Mb and an average fragment size of ~90 bp, a library of 1 million unique fragments would result in approximately three-fold coverage, suggesting that screening a fragment library of this size by FRET-shift/amplicon sequencing would be sufficient to query the entire healthy human exome. Therefore, assessing the safety of novel TCR-based immunotherapeutics with respect to their cross-reactivity to self-tissue is a potentially major translational application of the methods presented here.
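The three-fold coverage estimate in Section 5.1.4 can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (~30 Mb exome, ~90 bp fragments, 1 × 10^6 fragments). The second function is an added assumption that is not part of the thesis: it treats fragment start positions as uniformly random (a Poisson approximation) to estimate what fraction of bases would be covered at least once, and it ignores strand and reading-frame effects.

```python
import math

# Figures quoted in the text
EXOME_SIZE_BP = 30_000_000   # ~30 Mb human exome
FRAGMENT_SIZE_BP = 90        # ~90 bp average fragment
LIBRARY_SIZE = 1_000_000     # 1 x 10^6 unique fragments


def fold_coverage(n_fragments: int, fragment_bp: int, target_bp: int) -> float:
    '''Total bases in the library divided by the size of the target.'''
    return n_fragments * fragment_bp / target_bp


def fraction_covered(coverage: float) -> float:
    '''Assumption (not from the thesis): with uniformly random fragment
    placement, the chance a base is hit at least once is 1 - exp(-coverage).'''
    return 1.0 - math.exp(-coverage)


coverage = fold_coverage(LIBRARY_SIZE, FRAGMENT_SIZE_BP, EXOME_SIZE_BP)
print(f'{coverage:.1f}-fold coverage')
print(f'{fraction_covered(coverage):.3f} of bases covered at least once')
```

Under the random-placement assumption, three-fold coverage leaves a small fraction of bases unsampled, so the estimate is best read as an approximate guide to the library size needed and not as a guarantee of complete representation.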
5.1.5 Characterizing the T-cell receptor reactivity of T-cell lymphomas
T cells are able to initiate and sustain rounds of massive proliferation and/or differentiate into very long-lived memory phenotypes as part of their natural biology, both of which are consistent with the hallmarks of cancer (Hanahan and Weinberg, 2000, 2011). Therefore, T cells are vulnerable to carcinogenesis. The T-cell lymphomas that result from this process lose their T-cell functionality but retain their germline encoded rearranged TCR. Indeed, this property has previously been leveraged as a genomic barcode that can be used to track malignant clonal expansion or minimal residual disease in T-cell (Brown et al., 2017) or B-cell (via their rearranged BCR loci) lymphomas (Sala Torra et al., 2017). Beyond characterizing the identity of lymphoma-associated TCR, however, is the opportunity to gain insights into the processes by which lymphomagenesis occurs. It is known that one of the contributing factors to the transformation of healthy T-cell clones into malignancy is chronic antigenic stimulation by ubiquitous pathogens or self-antigen (Malcolm et al., 2016). Identifying the specific antigens that underpin this chronic antigen stimulation could lead to novel methods of mitigating or eliminating it, representing a means of managing T-cell lymphoma by prophylaxis. The challenges in accomplishing this task, however, are nuanced. Isolating the epitopes responsible for delivering chronic stimulatory signal necessitates the use of a functional T-cell antigen identification assay. However, in most cases, T-cell functions are highly dysregulated in T-lymphomas, meaning the lymphoma clones themselves would not be suitable for functional assays.
As a solution, FRET-shift/amplicon sequencing-based antigen discovery performed in the context of the rCTL/aAPC platform could provide a way to resurrect the TCR of T-lymphoma clones and functionally test them against minigene libraries derived from the whole human exome (as suggested in Section 5.1.4) or candidate pathogen genomes (as suggested in Section 5.1.1).
5.2 Future directions in technology development for FRET-shift/amplicon sequencing methodology
5.2.1 Validation of reconstituted CTL platform in library screening contexts
The synthetic screening approach developed in Chapter 4 requires further validation to demonstrate its utility in TCR antigen discovery. The initial proof-of-concept experiments served to show that cytotoxic function could be detected by FRET-shift using clonal populations of APC bearing a single minigene. To harness the power of the FRET-shift/amplicon sequencing library screening approach and apply it to recombinant TCR/pMHC, experiments analogous to those performed in Chapter 3 need to be reproduced with rCTL/aAPC. That is, the sensitivity of the FRET-shift assay in isolating known epitope-coding minigenes out of complex minigene pools needs to be studied using a decreasing spike-in design similar to the one used previously. Successfully screening minigene libraries using rCTL would be a crucial development towards embarking on the studies proposed in Section 5.1. At many research centers, including our own, finding primary clinical samples containing T cells relevant to a particular disease state is often a difficult, or even prohibitive, task. This is particularly true in the case of researchers who may be interested in studying a diverse spectrum of diseases; in most cases, clinical units from which tissues could be obtained will be highly specialized and only be a source of T cells relevant to one particular context.
In contrast, rCTL library screening, if a viable option, would make any T cell with a literature-documented TCR sequence accessible for TCR antigen discovery studies.
5.2.2 Overcoming library bottlenecks
A particular hurdle to the screening of very large libraries (>1 × 10^6 unique minigenes) is the presence of multiple technical bottlenecks that limit the diversity of expressed minigene libraries in target cell populations. Even though specialized molecular cloning techniques can be applied to yield plasmid pools containing ~2 × 10^7 cfu from as little as 24 ng of library insert DNA (Appendix A), achieving this level of richness in lentiviral libraries and in target cell populations post-transduction is a non-trivial prospect. One of the most immediately implementable steps towards mitigating virus-associated bottlenecks would be to explore advanced strategies for contending with the stop codon or frame-shift issues that arise during screening library construction, as discussed in Section 3.1. In random minigene libraries, 54% of minigenes are expected to be lost due to stop codons. This problem is not restricted to random minigene libraries: in libraries derived from sheared cellular gDNA or cDNA, 83% of all inserted minigenes would be expected to be frame-shifted and/or non-productive, because each individual minigene may be ligated in 1 of 6 possible reading frames, only one of which would be correct. The experiments carried out in Chapter 3 were done using library cell populations that were pre-sorted by FACS for purity of resting FRET signature; the principle being that, should an upstream minigene contain a stop codon, the resultant transduced host cell would be fluorescently silent and removed during this purity sorting step.
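The two attrition figures quoted above can be sanity-checked with a short calculation. This is a sketch under stated assumptions, not the derivation used in the thesis: it assumes each codon of a random minigene is drawn uniformly from the 64 codons of the standard genetic code (3 of which are stops), and the minigene length of 16 codons is a hypothetical value chosen here only because it gives a loss close to the quoted 54%. The frame-shift figure follows directly from 5 of 6 reading frames being incorrect.

```python
from fractions import Fraction

STOP_CODONS = 3     # TAA, TAG, TGA in the standard genetic code
TOTAL_CODONS = 64


def fraction_with_stop(n_codons: int) -> float:
    '''Probability that n uniformly random codons contain at least one stop.'''
    p_sense = Fraction(TOTAL_CODONS - STOP_CODONS, TOTAL_CODONS)
    return float(1 - p_sense ** n_codons)


def fraction_out_of_frame(n_frames: int = 6) -> float:
    '''Fraction of inserts ligated in any frame other than the single correct one.'''
    return float(1 - Fraction(1, n_frames))


# ASSUMED_CODONS is a hypothetical length, not a value stated in this section.
ASSUMED_CODONS = 16
print(f'stop-codon loss at {ASSUMED_CODONS} codons: {fraction_with_stop(ASSUMED_CODONS):.1%}')
print(f'out-of-frame inserts: {fraction_out_of_frame():.1%}')
```

Because the stop-codon loss grows with insert length, the same function can be used to see how quickly longer random minigenes would erode the productive fraction of a library.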
However, this approach requires carrying a significant burden of stop codon-containing minigenes through virus production and infection steps, thereby reducing the available bandwidth for productive minigene sequences to be produced and delivered to cells. To avoid this, it would be highly desirable to perform bacteria-level selection of productive minigenes. Preliminary attempts to accomplish this were made by incorporating a kanamycin/neomycin drug-selection marker between the minigene and FRET2 sites of the 2nd generation library destination plasmid vector and equipping the plasmid with an upstream T7 promoter sequence (Appendix B.1). I expected that, upon delivering library plasmid to a T7 polymerase-inducible expression strain of E. coli (BL21(DE3)), productive minigenes would allow for translation of the kanamycin-resistance gene while stop codon-containing minigenes would result in a truncated peptide that would not result in kanR expression and, thus, fail to form colonies. These initial attempts failed to produce any colonies, in retrospect most likely due to the incompatibility of the 2A ribosomal-skipping signal peptide with prokaryotic systems and the inability of the kanamycin resistance protein to function as a fusion (Reiss et al., 1984). Investigating alternative drug selection markers capable of functioning as part of a fusion protein or identifying constitutive bacterial peptidase recognition motifs that would enable polycistronic expression and subsequent liberation of a functional drug resistance protein would contribute to resolving bottleneck issues in library construction. Another strategy for contending with attrition due to frameshift issues in minigene libraries derived from sheared gDNA or cDNA endogenous sequences is to construct them synthetically as oligonucleotide pools.
Numerous strategies have been devised to accurately generate vast numbers of unique oligonucleotide molecules in parallel on microarray slides (Kosuri and Church, 2014). These array-immobilized DNA fragments can then be chemically cleaved and collected as a pool into a single tube. The major advantage of this technology is the ability to strictly control which sequences are included in the final library and, as a corollary, ensure that every minigene in the library design would be inserted into viral transfer plasmid in-frame. Remarkably, the capacity of modern, commercially available array synthesis platforms allows for routine synthesis of >1 × 10^5 unique single-stranded DNA molecules in a single synthesis run.
5.2.3 Class II peptide-MHC antigens
The granzyme/perforin reporter system central to the T-cell antigen identification tools described here is contingent on utilizing effector cells capable of delivering GzmB in response to antigen recognition. When inputting primary T-cell populations for use in FRET-shift assays, the utility of the approach is, currently, limited to assessing activated CD8+ T cells. A means to perform an analogous version of FRET-shift/amplicon sequencing-based screening to discover class II pMHC antigens would represent an enormous opportunity to apply the method to a whole new spectrum of challenges in the study of adaptive immunity. Interestingly, previous reports have documented the existence of CD4+ cytotoxic T cells (CD4 CTL). Originally observed in vitro and thought to be an artifact of long-term culture, these cells have since been detected in vivo as a new subset of T cells that contributes to antiviral immunity (Marshall and Swain, 2011; Takeuchi and Saito, 2017). These CD4 CTL function similarly to conventional CD8 CTL by delivering granzyme B and perforin to target cells at the site of immune synapse formation to invoke direct cell killing of infected targets. Moreover, they do this in a TCR/pMHC II-dependent fashion.
Cytotoxic CD4 cells, thus, provide a tantalizing opportunity to adapt the GzmB-sensitive FRET-shift assay to interrogate libraries of class II antigens. Though the mechanisms of differentiation leading to CD4 CTL populations have not yet been fully elucidated, this area of research should be monitored carefully. An in vitro method to induce CD4 T-cell clonotypes-of-interest to acquire cytotoxic activity would provide a means by which class II-restricted TCR could be input into FRET-shift based library screening. The observation, however, that engagement of class II-restricted TCR with pMHC-II complexes in the presence of CD4 can initiate intracellular Ca2+ release and GzmB/PFN degranulation implies that a primary CD4 CTL may not necessarily be required to apply the FRET-shift assay to class II antigens. It could be possible to use the rCTL/aAPC approach described in Chapter 4 to test pMHC-II complexes for their ability to activate T cells by switching the CD8αβ helper cassette for a CD4-coding transgene and reconfiguring the HLA cassette to accommodate the α- and β-chains needed for MHC-II expression.
5.2.4 Tools to facilitate construction of immune networks
T-cell receptor repertoire profiling has become an established technology and, to complement it, a variety of approaches, including those presented in this thesis, have been developed to characterize the repertoire of antigens recognized by T-cell populations. However, still absent in the sphere of T-cell immunoprofiling is a scalable methodology for generating linked TCR/epitope sequence data. Currently, linking TCR/pMHC identity can only be done by holding one side of the immune synapse fixed (i.e. selecting one known TCR or pMHC antigen to test) and screening against panels of potential interaction partners. It follows, then, that constructing a network map of TCR/pMHC interactions involves iteratively performing these types of “one-by-all” screening approaches until the desired scale is reached.
However, to link, for example, the entire human T-cell repertoire to the entire set of 8-14mer peptides comprising the human proteome, this type of strategy is infeasible. One avenue for linking TCR/pMHC data together is to employ in silico predictive modeling of these interactions (Dash et al., 2017; Glanville et al., 2017). At present, the tools that have been developed to do this are limited as they are only able to classify TCR into coarse-grained specificity groups on the basis of their binding to known epitopes in MHC-tetramer format. These algorithms are not at the level of being able to accurately map a given TCR to the constellation of peptide epitopes that it is capable of reacting to or vice versa. It is unclear if that level could ever be achieved but, in order to find out, much more function-based training data is required to refine predictive algorithms further. The uptake of high-throughput epitope discovery methods, including FRET-shift/amplicon sequencing, will yield increasing volumes of T-cell epitope sequence data for use in data mining and meta-analysis, analogous to the growth of many other high-dimensional genomic data resources, and facilitate the development of sophisticated tools capable of accurately modeling and predicting interactions between TCR and pMHC molecules. Another avenue towards mapping the network of TCR/pMHC interactions in a given context is to obtain TCR sequences, antigen sequences, and linkage data all at once using “all-by-all” omnibus-style experiments. Achieving a method to accomplish this will require a totally novel approach that will need to solve the problems associated with co-localizing each query antigen and query TCR such that positively interacting pairs can be captured and characterized together. To this end, one potential scheme would be to leverage the property of T cells to process and present antigen on their own cell surfaces while using their TCR to inspect other cells.
Antigen can be delivered to a clonal population of T cells and clonal self-reactivity can be monitored as a measure of TCR/pMHC activity (Prommersberger et al., 2015). Expanding on this principle, delivering libraries of minigenes such that each individual transduced T cell contains one TCR and one minigene would allow for query TCR/pMHC pairs to be physically linked together, in the same cell, during the course of a screening experiment. However, further challenges to be addressed with this concept include how to grow monoclonal transduced T-cell populations prior to screening without premature self-destruction and how to keep TCRα, TCRβ, and minigene sequences linked during the identification steps after screening.
5.3 Final Remarks
In this thesis I have developed a suite of methods for the identification of T-cell receptor epitopes from large libraries of peptide-coding sequences and shown that these methods are functionally relevant, specific, sensitive, robust, scalable, and rapid. I have discussed some examples wherein this unique combination of advantages offers new ways to arrive at the answers to a variety of outstanding questions that exist in the field of T-cell biology today. I have also explored a few of the ways in which this technology can be improved, adapted, or extended. Faced with the daunting complexity associated with understanding and engineering the T-cell receptor repertoire’s ability to protect the body from harm, I am confident that the work presented here will be a positive contribution towards this goal.
Bibliography
Alexandrov, L.B., Nik-Zainal, S., Wedge, D.C., Aparicio, S.A.J.R., Behjati, S., Biankin, A. V, Bignell, G.R., Bolli, N., Borg, A., Børresen-Dale, A.-L., et al. (2013). Signatures of mutational processes in human cancer. Nature 500, 415–421. Allison, J.P., Mcintyre, B.W., and Bloch, D. (1982). Tumor-specific antigen of murine T-lymphoma defined with monoclonal antibody. J. Immunol. 129, 2293–2300.
Almeida, C.A.M., Roberts, S.G., Laird, R., McKinnon, E., Ahmed, I., Pfafferott, K., Turley, J., Keane, N.M., Lucas, A., Rushton, B., et al. (2009). Automation of the ELISpot assay for high-throughput detection of antigen-specific T-cell responses. J. Immunol. Methods 344, 1–5. Altman, J.D., Moss, P.A.H., Goulder, P.J.R., Barouch, D.H., Mcheyzer-Williams, M.G., Bell, J.I., Mcmichael, A.J., and Davis, M.M. (1996). Phenotypic Analysis of Antigen-Specific T Lymphocytes. Science 274, 94–96. Altschul, S.F., Gish, W., Miller, W., Myers, E.R., and Lipman, D.J. (1990). Basic Local Alignment Search Tool. J. Mol. Biol. 215, 403–410. Amirache, F., Levy, C., Costa, C., Mangeot, P.-E., Torbett, B.E., Wang, C.X., Negre, D., Cosset, F.-L., and Verhoeyen, E. (2014). Mystery solved: VSV-G-LVs do not allow efficient gene transfer into unstimulated T cells, B cells, and HSCs because they lack the LDL receptor. Blood 123, 1422–1424. Anderson, M.S., and Miller, J. (1992). Invariant chain can function as a chaperone protein for class II major histocompatibility complex molecules. Proc. Natl. Acad. Sci. USA 89, 2282–2286. Anmole, G., Kuang, X.T., Toyoda, M., Martin, E., Shahid, A., Le, A.Q., Markle, T., Baraki, B., Jones, R.B., Ostrowski, M.A., et al. (2015). A robust and scalable TCR-based reporter cell assay to measure HIV-1 Nef-mediated T cell immune evasion. J. Immunol. Methods 426, 104–113. Bajar, B.T., Wang, E.S., Zhang, S., Lin, M.Z., and Chu, J. (2016). A guide to fluorescent protein FRET pairs. Sensors 16, 1–24. Bakker, A.H., Hoppes, R., Linnemann, C., Toebes, M., Rodenko, B., Berkers, C.R., Hadrup, S.R., van Esch, W.J.E., Heemskerk, M.H.M., Ovaa, H., et al. (2008). Conditional MHC class I ligands and peptide exchange technology for the human MHC gene products HLA-A1, -A3, -A11, and -B7. Proc. Natl. Acad. Sci. USA 105, 3825–3830. Bassani-Sternberg, M., Pletscher-Frankild, S., Jensen, L.J., and Mann, M. (2015). 
Mass spectrometry of HLA-I peptidomes reveals strong effects of protein abundance and turnover on antigen presentation. Mol. Cell. Proteomics 14, 658–673. Bassing, C., Swat, W., and Alt, F. (2002). The mechanism and regulation of chromosomal V(D)J recombination. Cell 109, 45–55. Benson, D.A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., and Sayers, E.W. (2017). GenBank. Nucleic Acids Res. 45, D37–D42. Bentzen, A.K., Marquard, A.M., Lyngaa, R., Saini, S.K., Ramskov, S., Donia, M., Such, L., Furness, A.J.S., McGranahan, N., Rosenthal, R., et al. (2016). Large-scale detection of antigen-specific T cells using peptide-MHC-I multimers labeled with DNA barcodes. Nat. Biotechnol. 34, 1037–1045. Bevan, M.J. (1977). In a radiation chimaera, host H–2 antigens determine immune responsiveness of donor cytotoxic cells. Nature 269, 417–418. Birnbaum, M.E., Mendoza, J.L., Sethi, D.K., Dong, S., Glanville, J., Dobbins, J., Özkan, E., Davis, M.M., Wucherpfennig, K.W., and Garcia, K.C. (2014). Deconstructing the peptide-MHC specificity of T cell recognition. Cell 157, 1073–1087. Boegel, S., Löwer, M., Bukur, T., Sorn, P., Castle, J.C., and Sahin, U. (2018). HLA and proteasome expression body map. BMC Med. Genomics 11, 1–12. Bogin, Y., Ainey, C., Beach, D., and Yablonski, D. (2007). SLP-76 mediates and maintains activation of the Tec family kinase ITK via the T cell antigen receptor-induced association between SLP-76 and ITK. Proc. Natl. Acad. Sci. USA 104, 6638–6643. Bolotin, D.A., Shugay, M., Mamedov, I.Z., Putintseva, E.V., Turchaninova, M.A., Zvyagin, I.V., Britanova, O.V., and Chudakov, D.M. (2013). MiTCR: software for T-cell receptor sequencing data analysis. Nat. Methods 10, 813–814. Bolotin, D.A., Poslavsky, S., Mitrophanov, I., Shugay, M., Mamedov, I.Z., Putintseva, E. V., and Chudakov, D.M. (2015). MiXCR: Software for comprehensive adaptive immunity profiling. Nat. Methods 12, 380–381.
Borràs, E., Martin, R., Judkowski, V., Shukaliak, J., Zhao, Y., Rubio-Godoy, V., Valmori, D., Wilson, D., Simon, R., Houghten, R., et al. (2002). Findings on T cell specificity revealed by synthetic combinatorial libraries. J. Immunol. Methods 267, 79–97. Borrman, T., Cimons, J., Cosiano, M., Purcaro, M., Pierce, B.G., Baker, B.M., and Weng, Z. (2017). ATLAS: A database linking binding affinities with structures for wild-type and mutant TCR-pMHC complexes. Proteins 85, 908–916. Bouvier, M. (2003). Accessory proteins and the assembly of human class I MHC molecules: a molecular and structural perspective. Mol Immunol 39, 697–706. Boyle, W. (1968). An extension of the 51Cr-release assay for the estimation of mouse cytotoxins. Transplantation 6, 761–764. Bridgeman, J.S., Sewell, A.K., Miles, J.J., Price, D.A., and Cole, D.K. (2012). Structural and biophysical determinants of αβ T-cell antigen recognition. Immunology 135, 9–18. Broek, M.E. Van Den, Kagi, D., Ossendorpfl, F., Toes, R., Spiros, V., Lutz, W.K., Melief, C.J.M., Zinkernagel, R.M., and Hengartner, H. (1996). Decreased tumor surveillance in perforin-deficient mice. J Exp. Med. 184, 1781–1790. Brooks, S.E., Bonney, S.A., Lee, C., Publicover, A., Khan, G., Smits, E.L., Sigurdardottir, D., Arno, M., Li, D., Mills, K.I., et al. (2015). Application of the pMHC array to characterise tumour antigen specific T cell populations in leukaemia patients at disease diagnosis. PLoS One 10, 1–19. Brown, S.D., Warren, R.L., Gibb, E.A., Martin, S.D., Spinelli, J.J., Nelson, B.H., and Holt, R.A. (2014). Neo-antigens predicted by tumor genome meta-analysis correlate with increased patient survival. Genome Res. 24, 743–750. Brown, S.D., Hapgood, G., Steidl, C., Weng, A.P., Savage, K.J., and Holt, R.A. (2017). Defining the clonality of peripheral T cell lymphomas using RNA-seq. Bioinformatics 33, 1111–1115. van der Bruggen, P., Traversari, C., Chomez, P., Lurquin, C., De Plaen, E., Van den Eynde, B.J., Knuth, A., and Boon, T.
(1991). A gene encoding an antigen recognized by cytolytic T lymphocytes on a human melanoma. Science 254, 1643–1647. Brunner, K.T., Mauel, J., Cerottini, J.C., and Chapuis, B. (1968). Quantitative assay of the lytic action of immune lymphoid cells on 51-Cr-labelled allogeneic target cells in vitro; inhibition by isoantibody and by drugs. Immunology 14, 181–196. Bruno, L., Cortese, M., Rappuoli, R., and Merola, M. (2015). Lessons from Reverse Vaccinology for viral vaccine design. Curr. Opin. Virol. 11, 89–97. Buferne, M., Luton, F., Letourneur, F., Hoeveler, A., Couez, D., Barad, M., Malissen, B., Schmitt-Verhulst, A.M., and Boyer, C. (1992). Role of CD3 delta in surface expression of the TCR/CD3 complex and in activation for killing analyzed with a CD3 delta-negative cytotoxic T lymphocyte variant. J. Immunol. 148, 657–664. Burnet, M. (1957). Cancer—a biological approach. Br. Med. J. 1, 841–847. Butler, M.O., and Hirano, N. (2014). Human cell-based artificial antigen-presenting cells for cancer immunotherapy. 257, 191–209. Call, M.E., Pyrdol, J., Wiedmann, M., and Wucherpfennig, K.W. (2002). The Organizing Principle in the Formation of the T Cell Receptor-CD3 Complex. Cell 111, 967–979. Cameron, B.J., Gerry, A.B., Dukes, J., Harper, J. V, Kannan, V., Bianchi, F.C., Grand, F., Brewer, J.E., Gupta, M., Plesa, G., et al. (2013). Identification of a Titin-derived HLA-A1-presented peptide as a cross-reactive target for engineered MAGE A3-directed T cells. Sci. Transl. Med. 5, 1–11. Castellarin, M., Milne, K., Zeng, T., Tse, K., Mayo, M., Zhao, Y., Webb, J.R., Watson, P.H., Nelson, B.H., and Holt, R.A. (2013). Clonal evolution of high-grade serous ovarian carcinoma from primary to recurrent disease. J. Pathol. 229, 515–524. Chan, A.C., Iwashima, M., Turck, C.W., and Weiss, A. (1992). ZAP-70: A 70 kd protein-tyrosine kinase that associates with the TCR ζ chain. Cell 71, 649–662.
Chen, Y.T., Scanlan, M.J., Sahin, U., Türeci, O., Gure, A.O., Tsang, S., Williamson, B., Stockert, E., Pfreundschuh, M., and Old, L.J. (1997). A testicular antigen aberrantly expressed in human cancers detected by autologous antibody screening. Proc. Natl. Acad. Sci. USA. Chevalier, M.F., Bobisse, S., Costa-nunes, C., Cesson, V., Jichlinski, P., Speiser, D.E., Harari, A., Coukos, G., Romero, P., Nardelli-Haefliger, D., et al. (2015). High-throughput monitoring of human tumor-specific T-cell responses with large peptide pools. Oncoimmunology 4, e1029702. Christodoulopoulos, I., and Cannon, P.M. (2001). Sequences in the Cytoplasmic Tail of the Gibbon Ape Leukemia Virus Envelope Protein That Prevent Its Incorporation into Lentivirus Vectors. J. Virol. 75, 4129–4138. Christodoulopoulos, I., Droniou-Bonzom, M.E., Oldenburg, J.E., and Cannon, P.M. (2010). Vpu-dependent block to incorporation of GaLV Env into lentiviral vectors. Retrovirology 7, 4. Clarke, S.Rm., Barnden, M., Kurts, C., Carbone, F.R., Miller, J.F., and Heath, W.R. (2000). Characterization of the ovalbumin-specific TCR transgenic line OT-I: MHC elements for positive and negative selection. Immunol. Cell Biol. 78, 110–117. Clevers, H., Alarcon, B., Wileman, T., and Terhorst, C. (1988). The T Cell Receptor/CD3 Complex: A Dynamic Protein Ensemble. Annu. Rev. Immunol. 6, 629–662. Cohen, R.N., van der Aa, M.A.E.M., Macaraeg, N., Lee, A.P., and Szoka, F.C. (2009). Quantification of plasmid DNA copies in the nucleus after lipoplex and polyplex transfection. J. Control. Release 135, 166–174. Cousens, L., Najafian, N., Martin, W.D., and De Groot, A.S. (2014). Tregitope: Immunomodulation Powerhouse. Hum Immunol 75, 1139–1146. Cousens, L.P., Su, Y., McClaine, E., Li, X., Terry, F., Smith, R., Lee, J., Martin, W., Scott, D.W., and De Groot, A.S. (2013).
Application of IgG-derived natural Treg epitopes (IgG Tregitopes) to antigen-specific tolerance induction in a murine model of type 1 diabetes. J. Diabetes Res. Article ID 621693, 17 pages. Cox, A.L., Skipper, J., Chen, Y., Henderson, R.A., Darrow, T.L., Shabanowitz, J., Engelhard, V.H., Hunt, D.F., and Slingluff, C.L. (1994). Identification of a peptide recognized by five melanoma-specific human cytotoxic T cell lines. Science 264, 716–719. Crawford, F., Jordan, K.R., Stadinski, B., Wang, Y., Huseby, E., Marrack, P., Slansky, J.E., and Kappler, J.W. (2006). Use of baculovirus MHC/peptide display libraries to characterize T-cell receptor ligands. Immunol. Rev. 210, 156–170. Czerkinsky, C.C., Nilsson, L.-Å., Nygren, H., Ouchterlony, Ö., and Tarkowski, A. (1983). A solid-phase enzyme-linked immunospot (ELISPOT) assay for enumeration of specific antibody-secreting cells. J. Immunol. Methods 65, 109–121. Das, D.K., Feng, Y., Mallis, R.J., Li, X., Keskin, D.B., Hussey, R.E., Brady, S.K., Wang, J.-H., Wagner, G., Reinherz, E.L., et al. (2015). Force-dependent transition in the T-cell receptor β-subunit allosterically regulates peptide discrimination and pMHC bond lifetime. Proc. Natl. Acad. Sci. USA 112, 1517–1522. Dash, P., Fiore-Gartland, A.J., Hertz, T., Wang, G.C., Sharma, S., Souquette, A., Crawford, J.C., Clemens, E.B., Nguyen, T.H.O., Kedzierska, K., et al. (2017). Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93. Davis, M.M., and Bjorkman, P.J. (1988). T-cell antigen receptor genes and T-cell recognition. Nature 334, 395–402. Degano, M., Garcia, K.C., Apostolopoulos, V., Rudolph, M.G., Teyton, L., and Wilson, I.A. (2000). A Functional Hot Spot for Antigen Recognition in a Superagonist TCR/MHC Complex. Immunity 12, 251–261. Degen, E., Cohen-Doyle, M.F., and Williams, D.B. (1992).
Efficient dissociation of the p88 chaperone from major histocompatibility complex class I molecules requires both beta 2-microglobulin and peptide. J. Exp. Med. 175, 1653–1661. Denzin, L.K., and Cresswell, P. (1995). HLA-DM induces CLIP dissociation from MHC class II αβ dimers and facilitates peptide loading. Cell 82, 155–165. Derbinski, J., Schulte, A., Kyewski, B., and Klein, L. (2001). Promiscuous gene expression in medullary thymic epithelial cells mirrors the peripheral self. Nat. Immunol. 2, 1032–1039. Deviren, G., Gupta, K., Paulaitis, M.E., and Schneck, J.P. (2007). Detection of antigen-specific T cells on p/MHC microarrays. J Mol Recognit 20, 32–38. Dietrich, J., Neisig, A., Hou, X., Wegener, A.M., Gajhede, M., and Geisler, C. (1996). Role of CD3 gamma in T cell receptor assembly. J. Cell Biol. 132, 299–310. Dighe, A.S., Richards, E., Old, L.J., and Schreiber, R.D. (1994). Enhanced in vivo growth and resistance to rejection of tumor cells expressing dominant negative IFN gamma receptors. Immunity 1, 447–456. DuBridge, R.B., Tang, P., Hsia, H.C., Leong, P.M., Miller, J.H., and Calos, M.P. (1987). Analysis of Mutation in Human Cells by Using an Epstein-Barr Virus Shuttle System. Mol Cell Biol 7, 379–387. Dunn, G.P., Bruce, A.T., Ikeda, H., Old, L.J., and Schreiber, R.D. (2002). Cancer immunoediting: from immunosurveillance to tumor escape. Nat. Immunol. 3, 991–998. Ekeruche-Makinde, J., Clement, M., Cole, D.K., Edwards, E.S.J., Ladell, K., Miles, J.J., Matthews, K.K., Fuller, A., Lloyd, K.A., Madura, F., et al. (2012). T-cell receptor-optimized peptide skewing of the T-cell repertoire can enhance antigen targeting. J. Biol. Chem. 287, 37269–37281. Ekeruche-Makinde, J., Miles, J.J., Van Den Berg, H.A., Skowera, A., Cole, D.K., Dolton, G., Schauenburg, A.J.A., Tan, M.P., Pentier, J.M., Llewellyn-Lacey, S., et al. (2013). Peptide length determines the outcome of TCR/peptide-MHCI engagement. Blood 121, 1112–1123. 
Falk, K., Rötzschke, O., Stevanović, S., Jung, G., and Rammensee, H.G. (1991). Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature 351, 290–296. Finkelshtein, D., Werman, A., Novick, D., Barak, S., and Rubinstein, M. (2013). LDL receptor and its family members serve as the cellular receptors for vesicular stomatitis virus. Proc. Natl. Acad. Sci. USA 110, 7306–7311. Folch, G., and Lefranc, M.P. (2000a). The human T cell receptor beta diversity (TRBD) and beta joining (TRBJ) genes. Exp Clin Immunogenet 17, 107–114. Folch, G., and Lefranc, M.P. (2000b). The human T cell receptor beta variable (TRBV) genes. Exp Clin Immunogenet 17, 42–54. Forster, V.T. (1948). Zwischenmolekulare Energiewanderung und Fluoreszenz. Ann. Phys. 33, 55–76. Frecha, C., Levy, C., Costa, C., Negre, D., Amirache, F., Buckland, R., Russell, S.J., Cosset, F.-L., and Verhoeyen, E. (2011). Measles Virus Glycoprotein-Pseudotyped Lentiviral Vector-Mediated Gene Transfer into Quiescent Lymphocytes Requires Binding to both SLAM and CD46 Entry Receptors. J. Virol. 85, 5975–5985. Freeman, G.J., Long, A.J., Iwai, Y., Bourque, K., Chernova, T., Nishimura, H., Fitz, L.J., Malenkovich, N., Okazaki, T., Byrne, M.C., et al. (2000). Engagement of the PD-1 Immunoinhibitory Receptor by a Novel B7 Family Member Leads to Negative Regulation of Lymphocyte Activation. J. Exp. Med. 192, 1027–1034. Freeman, J.D., Warren, R.L., Webb, J.R., Nelson, B.H., and Holt, R.A. (2009). Profiling the T-cell receptor beta-chain repertoire by massively parallel sequencing. Genome Res. 19, 1817–1824. Galon, J., Costes, A., Sanchez-Cabo, F., Kirilovsky, A., Mlecnik, B., Lagorce-Pagès, C., Tosolini, M., Camus, M., Berger, A., Wind, P., et al. (2006). Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science 313, 1960–1964. 
Gee, M.H., Han, A., Lofgren, S.M., Beausang, J.F., Mendoza, J.L., Birnbaum, M.E., Bethune, M.T., Fischer, S., Yang, X., Gomez-Eerland, R., et al. (2018). Antigen Identification for Orphan T Cell Receptors Expressed on Tumor-Infiltrating Lymphocytes. Cell 172, 549–563. Germain, R.N., and Hendrix, L.R. (1991). MHC class II structure, occupancy and surface expression determined by post-endoplasmic reticulum antigen binding. Nature 353, 134–139. Giannoni, F., Hardee, C.L., Wherley, J., Gschweng, E., Senadheera, S., Kaufman, M.L., Chan, R., Bahner, I., Gersuk, V., Wang, X., et al. (2013). Allelic exclusion and peripheral reconstitution by TCR transgenic T cells arising from transduced human hematopoietic stem/progenitor cells. Mol Ther 21, 1044–1054. Glanville, J., Huang, H., Nau, A., Hatton, O., Wagar, L.E., Rubelt, F., Ji, X., Han, A., Krams, S.M., Pettus, C., et al. (2017). Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98. Glover, D.J., Leyton, D.L., Moseley, G.W., and Jans, D.A. (2010). The efficiency of nuclear plasmid DNA delivery is a critical determinant of transgene expression at the single cell level. J. Gene Med. 12, 77–85. Gong, J.H., Maki, G., and Klingemann, H.-G. (1994). Characterization of a human cell line (NK-92) with phenotypical and functional characteristics of activated natural killer cells. Leukemia 8, 652–658. Gorer, P.A. (1950). Studies in antibody response of mice to tumour inoculation. Br. J. Cancer 4, 372–381. Gubin, M.M., Zhang, X., Schuster, H., Caron, E., Ward, J.P., Noguchi, T., Ivanova, Y., Hundal, J., Arthur, C.D., Krebber, W.-J., et al. (2014). Checkpoint blockade cancer immunotherapy targets tumour-specific mutant antigens. Nature 515, 577–581. Hadrup, S.R., Bakker, A.H., Shu, C.J., Andersen, R.S., van Veluw, J., Hombrink, P., Castermans, E., Straten, P. thor, Blank, C., Haanen, J.B.A.G., et al. (2009). 
Parallel detection of antigen-specific T-cell responses by multidimensional encoding of MHC multimers. Nat. Methods 6, 520–526. Hall, C., Berkhout, B., Alarcon, B., Sancho, J., Wileman, T., and Terhorst, C. (1991). Requirements for Cell-Surface Expression of the Human TCR/CD3 Complex in Non-T-Cells. Int. Immunol. 3, 359–368. Han, A., Glanville, J., Hansmann, L., and Davis, M.M. (2014). Linking T-cell receptor sequence to functional phenotype at the single cell level. Nat Biotechnol 32, 684–692. Hanahan, D., and Weinberg, R.A. (2000). The Hallmarks of Cancer. Cell 100, 57–70. Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of Cancer: The Next Generation. Cell 144, 646–674. Hannon, G.J. (2010). FastX-Toolkit v0.0.13. Harndahl, M., Rasmussen, M., Roder, G., Dalgaard Pedersen, I., Sørensen, M., Nielsen, M., and Buus, S. (2012). Peptide-MHC class I stability is a better predictor than peptide affinity of CTL immunogenicity. Eur. J. Immunol. 42, 1405–1416. Haskins, K., Kubo, R., White, J., Pigeon, M., Kappler, J., and Marrack, P. (1983). The major histocompatibility complex-restricted antigen receptor on T cells. J Exp. Med. 157, 1149–1169. Hedrick, S.M., Cohen, D.I., Nielsen, E.A., and Davis, M.M. (1984). Isolation of cDNA clones encoding T cell-specific membrane-associated proteins. Nature 308, 149–153. Hochberg, D., Souza, T., Catalina, M., Sullivan, J.L., Luzuriaga, K., and Thorley-Lawson, D.A. (2004). Acute infection with Epstein-Barr virus targets and overwhelms the peripheral memory B-cell compartment with resting, latently infected cells. J. Virol. 78, 5194–5204. Hogquist, K.A., Jameson, S.C., Heath, W.R., Howard, J.L., Bevan, M.J., and Carbone, F.R. (1994). T Cell Receptor Antagonist Peptides Induce Positive Selection. Cell 76, 17–27. Holčapek, M., Jirásko, R., and Lísa, M. (2012). Recent developments in liquid chromatography-mass spectrometry and related techniques. J. Chromatogr. A 1259, 3–15. Hondowicz, B.D., Schwedhelm, K. 
V., Kas, A., Tasch, M.A., Rawlings, C., Ramchurren, N., McIntosh, M., D’Amico, L.A., Sanda, S., Standifer, N.E., et al. (2012). Discovery of T cell antigens by high-throughput screening of synthetic minigene libraries. PLoS One 7, e29949. Hornstein, B.D., Roman, D., Arévalo-Soliz, L.M., Engevik, M.A., and Zechiedrich, L. (2016). Effects of circular DNA length on transfection efficiency by electroporation into HeLa cells. PLoS One 11, e0167537. Howie, B., Sherwood, A.M., Berkebile, A.D., Berka, J., Emerson, R.O., Williamson, D.W., Kirsch, I., Vignali, M., Rieder, M.J., Carlson, C.S., et al. (2015). High-throughput pairing of T cell receptor α and β sequences. Sci. Transl. Med. 7, 301ra131. Huang, J., Zarnitsyna, V.I., Liu, B., Edwards, L.J., Jiang, N., Evavold, B.D., and Zhu, C. (2010). The kinetics of two-dimensional TCR and pMHC interactions determine T-cell responsiveness. Nature 464, 932–936. Hui-Yuen, J., McAllister, S., Koganti, S., Hill, E., and Bhaduri-McIntosh, S. (2011). Establishment of Epstein-Barr Virus Growth-transformed Lymphoblastoid Cell Lines. J. Vis. Exp. 57, e3321. Imai, C., Mihara, K., Andreansky, M., Nicholson, I.C., Pui, C.H., Geiger, T.L., and Campana, D. (2004). Chimeric receptors with 4-1BB signaling capacity provoke potent cytotoxicity against acute lymphoblastic leukemia. Leukemia 18, 676–684. Imboden, J.B., and Stobo, J.D. (1985). Transmembrane signaling by the T cell antigen receptor. J Exp. Med. 161, 446–456. Imboden, J.B., Weiss, A., and Stobo, J.D. (1985). The antigen receptor on a human T cell line initiates activation by increasing cytoplasmic free calcium. J Immunol 134, 663–665. Jäger, E., Chen, Y.T., Drijfhout, J.W., Karbach, J., Ringhoffer, M., Jäger, D., Arand, M., Wada, H., Noguchi, Y., Stockert, E., et al. (1998). Simultaneous humoral and cellular immune response against cancer-testis antigen NY-ESO-1: definition of human histocompatibility leukocyte antigen (HLA)-A2-binding peptide epitopes. J. Exp. Med. 187, 265–270. 
James, J.R., and Vale, R.D. (2012). Biophysical mechanism of T-cell receptor triggering in a reconstituted system. Nature 487, 64–69. Kaluza, K.M., Kottke, T., Diaz, R.M., Rommelfanger, D., Thompson, J., and Vile, R. (2012). Adoptive Transfer of Cytotoxic T Lymphocytes Targeting Two Different Antigens Limits Antigen Loss and Tumor Escape. Hum. Gene Ther. 23, 1054–1064. Kanaseki, T., Blanchard, N., Hammer, G.E., Gonzalez, F., and Shastri, N. (2006). ERAAP Synergizes with MHC Class I Molecules to Make the Final Cut in the Antigenic Peptide Precursors in the Endoplasmic Reticulum. Immunity 25, 795–806. Kaner, J., and Schaack, S. (2016). Understanding Ebola: The 2014 epidemic. Global. Health 12, 53. Kappler, J.W., Roehm, N., and Marrack, P. (1987). T cell tolerance by clonal elimination in the thymus. Cell 49, 273–280. Karttunen, J., and Shastri, N. (1991). Measurement of ligand-induced activation in single viable T cells using the lacZ reporter gene. Proc. Natl. Acad. Sci. USA 88, 3972–3976. Karunakaran, K.P., Rey-Ladino, J., Stoynov, N., Berg, K., Shen, C., Jiang, X., Gabel, B.R., Yu, H., Foster, L.J., and Brunham, R.C. (2008). Immunoproteomic discovery of novel T cell antigens from the obligate intracellular pathogen Chlamydia. J. Immunol. 180, 2459–2465. Kawakami, Y., Eliyahu, S., Delgado, C.H., Robbins, P.F., Sakaguchi, K., Appella, E., Yannelli, J.R., Adema, G.J., Miki, T., and Rosenberg, S.A. (1994). Identification of a human melanoma antigen recognized by tumor-infiltrating lymphocytes associated with in vivo tumor rejection. Proc. Natl. Acad. Sci. USA 91, 6458–6462. Kersh, E.N., Shaw, A.S., and Allen, P.M. (1998a). Fidelity of T Cell Activation Through Multistep T Cell Receptor Zeta Phosphorylation. Science 281, 572–575. Kersh, G.J., Kersh, E.N., Fremont, D.H., and Allen, P.M. (1998b). High- and low-potency ligands with similar affinities for the TCR: The importance of kinetics in TCR signaling. Immunity 9, 817–826. 
Kim, J.H., Lee, S.R., Li, L.H., Park, H.J., Park, J.H., Lee, K.Y., Kim, M.K., Shin, B.A., and Choi, S.Y. (2011). High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One 6, e18556. Klarenbeek, P.L., De Hair, M.J.H., Doorenspleet, M.E., Van Schaik, B.D.C., Esveldt, R.E.E., Van De Sande, M.G.H., Cantaert, T., Gerlag, D.M., Baeten, D., Van Kampen, A.H.C., et al. (2012). Inflamed target tissue provides a specific niche for highly expanded T-cell clones in early human autoimmune disease. Ann. Rheum. Dis. 71, 1088–1093. Kleijmeer, M.J., Kelly, A., Geuze, H.J., Slot, J.W., Townsend, A., and Trowsdale, J. (1992). Location of MHC-encoded transporters in the endoplasmic reticulum and cis-Golgi. Nature 357, 342–344. Kosuri, S., and Church, G.M. (2014). Large-scale de novo DNA synthesis: Technologies and applications. Nat. Methods 11, 499–507. Kwakkenbos, M.J., Diehl, S.A., Yasuda, E., Bakker, A.Q., Van Geelen, C.M.M., Lukens, M.V., Van Bleek, G.M., Widjojoatmodjo, M.N., Bogers, W.M.J.M., Mei, H., et al. (2010). Generation of stable monoclonal antibody-producing B cell receptor-positive human memory B cells by genetic programming. Nat. Med. 16, 123–128. Kwong, G.A., Radu, C.G., Hwang, K., Shu, C.J., Chao, M., Koya, R.C., Comin-Anduix, B., Hadrup, S.R., Bailey, R.C., Witte, O.N., et al. (2009). Modular nucleic acid assembled p/MHC microarrays for multiplexed sorting of antigen-specific T cells. J. Am. Chem. Soc. 131, 9695–9703. Lawrence, M.S., Stojanov, P., Polak, P., Kryukov, G.V., Cibulskis, K., Sivachenko, A., Carter, S.L., Stewart, C., Mermel, C.H., Roberts, S.A., et al. (2013). Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218. Leach, D.R., Krummel, M.F., and Allison, J.P. (1996). Enhancement of Antitumor Immunity by CTLA-4 Blockade. Science 271, 1734–1736. Lee, D., Long, S.A., Adams, J.L., Chan, G., Vaidya, K.S., Francis, T.A., 
Kikly, K., Winkler, J.D., Sung, C.M., Debouck, C., et al. (2000). Potent and selective nonpeptide inhibitors of caspases 3 and 7 inhibit apoptosis and maintain cell functionality. J. Biol. Chem. 275, 16007–16014. Lefranc, M.P., Giudicelli, V., Duroux, P., Jabado-Michaloud, J., Folch, G., Aouinti, S., Carillon, E., Duvergey, H., Houles, A., Paysan-Lafosse, T., et al. (2015). IMGT, the international ImMunoGeneTics information system 25 years on. Nucleic Acids Res. 43, D413–D422. Legoux, F.P., Lim, J.B., Cauley, A.W., Dikiy, S., Ertelt, J., Mariani, T.J., Sparwasser, T., Way, S.S., and Moon, J.J. (2015). CD4+ T Cell Tolerance to Tissue-Restricted Self Antigens Is Mediated by Antigen-Specific Regulatory T Cells Rather Than Deletion. Immunity 43, 896–908. Lennerz, V., Fatho, M., Gentilini, C., Frye, R.A., Lifke, A., Ferel, D., Wolfel, C., Huber, C., and Wolfel, T. (2005). The response of autologous T cells to a human melanoma is dominated by mutated neoantigens. Proc. Natl. Acad. Sci. USA 102, 16013–16018. Li, M., Husic, N., Lin, Y., Christensen, H., Malik, I., McIver, S., Daniels, C.M.L.P., Harris, D.A., Kotzbauer, P.T., Goldberg, M.P., et al. (2010). Optimal promoter usage for lentiviral vector-mediated transduction of cultured central nervous system cells. J. Neurosci. Methods 189, 56–64. Lickliter, J.D., Cox, J., McCarron, J., Martinez, N.R., Schmidt, C.W., Lin, H., Nieda, M., and Nicol, A.J. (2007). Small-molecule Bcl-2 inhibitors sensitise tumour cells to immune-mediated destruction. Br. J. Cancer 96, 600–608. Liebig, T.M., Fiedler, A., Zoghi, S., Shimabukuro-Vornhagen, A., and von Bergwelt-Baildon, M.S. (2009). Generation of Human CD40-activated B cells. J. Vis. Exp. 32, 1373. Linette, G.P., Stadtmauer, E.A., Maus, M.V., Rapoport, A.P., Levine, B.L., Litzky, L., Bagg, A., Carreno, B.M., Cimino, P.J., Binder-Scholl, G.K., et al. (2013). Cardiovascular toxicity and titin cross-reactivity of affinity-enhanced T cells in myeloma and melanoma. 
Blood 122, 863–871. Linnemann, C., van Buuren, M.M., Bies, L., Verdegaal, E.M., Schotte, R., Calis, J.J., Behjati, S., Velds, A., Hilkmann, H., Atmioui, D.E., et al. (2015). High-throughput epitope discovery reveals frequent recognition of neo-antigens by CD4+ T cells in human melanoma. Nat Med 21, 81–85. Loftus, B., Newsom, B., Montgomery, M., Von Gynz-Rekowski, K., Riser, M., Inman, S., Garces, P., Rill, D., Zhang, J., and Williams, J.C. (2009). Autologous attenuated T-cell vaccine (Tovaxin) dose escalation in multiple sclerosis relapsing-remitting and secondary progressive patients nonresponsive to approved immunomodulatory therapies. Clin. Immunol. 131, 202–215. Lopez, J.A., Susanto, O., Jenkins, M.R., Lukoyanova, N., Sutton, V.R., Law, R.H.P., Johnston, A., Bird, C.H., Bird, P.I., Whisstock, J.C., et al. (2013). Perforin forms transient pores on the target cell plasma membrane to facilitate rapid access of granzymes during killer cell attack. Blood 121, 2659–2668. Lozzio, C., and Lozzio, B. (1975). Human Chronic Myelogenous Leukemia Cell-Line With Positive Philadelphia Chromosome. Blood 45, 321–334. Lu, Y.C., Yao, X., Crystal, J.S., Li, Y.F., El-Gamil, M., Gross, C., Davis, L., Dudley, M.E., Yang, J.C., Samuels, Y., et al. (2014). Efficient identification of mutated cancer antigens recognized by T cells associated with durable tumor regressions. Clin. Cancer Res. 20, 3401–3410. Lundegaard, C., Lamberth, K., Harndahl, M., Buus, S., Lund, O., and Nielsen, M. (2008). NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11. Nucleic Acids Res. 36, 509–512. Lyubchenko, T.A., Wurth, G.A., and Zweifach, A. (2001). Role of Calcium Influx in Cytotoxic T Lymphocyte Lytic Granule Exocytosis during Target Cell Killing. Immunity 15, 847–859. Magoč, T., and Salzberg, S.L. (2011). FLASH: Fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963. 
Malcolm, T.I.M., Hodson, D.J., Macintyre, E.A., and Turner, S.D. (2016). Challenging perspectives on the cellular origins of lymphoma. Open Biol. 6, 160232. Marek-Trzonkowska, N., Mysliwiec, M., Dobyszuk, A., Grabowska, M., Techmanska, I., Juscinska, J., Wujtewicz, M.A., Witkowski, P., Mlynarski, W., Balcerska, A., et al. (2012). Administration of CD4+CD25high CD127- Regulatory T Cells Preserves Beta-Cell Function in Type 1 Diabetes in Children. Diabetes Care 35, 1817–1820. Marshall, N.B., and Swain, S.L. (2011). Cytotoxic CD4 T cells in antiviral immunity. J. Biomed. Biotechnol. 2011. Martin, S.D., Wick, D.A., Nielsen, J.S., Little, N., Holt, R.A., and Nelson, B.H. (2017). A library-based screening method identifies neoantigen-reactive T cells in peripheral blood prior to relapse of ovarian cancer. Oncoimmunology 7, e1371895. Martinez, R.J., Andargachew, R., Martinez, H.A., and Evavold, B.D. (2016). Low-affinity CD4+ T cells are major responders in the primary immune response. Nat. Commun. 7, 13848. Mason, D. (1998). A very high level of crossreactivity is an essential feature of the T-cell receptor. Immunol Today 19, 395–404. Mazzei, G.J., Edgerton, M.D., Losberger, C., Lecoanet-Henchoz, S., Graber, P., Durandy, A., Gauchat, J.-F., Bernard, A., Allet, B., and Bonnefoy, J.-Y. (1995). Recombinant soluble trimeric CD40 ligand is biologically active. J. Biol. Chem. 270, 7025–7028. Meuer, S.C., FitzGerald, K.A., Hussey, R.E., Hodgdon, J.C., Schlossman, S.F., and Reinherz, E.L. (1983). Clonotypic structures involved in antigen-specific human T cell function. J Exp. Med. 157, 705–719. Mock, U., Thiele, R., Uhde, A., Fehse, B., and Horn, S. (2012). Efficient lentiviral transduction and transgene expression in primary human B cells. Hum. Gene Ther. Methods 23, 408–415. Mohammed, F., Cobbold, M., Zarling, A.L., Salim, M., Barrett-Wilt, G.A., Shabanowitz, J., Hunt, D.F., Engelhard, V.H., and Willcox, B.E. (2008). 
Phosphorylation-dependent interaction between antigenic peptides and MHC class I: a molecular basis for the presentation of transformed self. Nat. Immunol. 9, 1236–1243. Moise, L., McMurry, J.A., Buus, S., Frey, S., Martin, W.D., and De Groot, A.S. (2009). In silico-accelerated identification of conserved and immunogenic variola/vaccinia T-cell epitopes. Vaccine 27, 6471–6479. Montel, A.H., Morse, P.A., and Brahmi, Z. (1995). Upregulation of B7 molecules by the Epstein-Barr virus enhances susceptibility to lysis by a human NK-like cell line. Cell. Immunol. 160, 104–114. Moodie, Z., Price, L., Gouttefangeas, C., Mander, A., Janetzki, S., Löwer, M., Welters, M.J.P., Ottensmeier, C., Van Der Burg, S.H., and Britten, C.M. (2010). Response definition criteria for ELISPOT assays revisited. Cancer Immunol. Immunother. 59, 1489–1501. Moore, M.W., Carbone, F.R., and Bevan, M.J. (1988). Introduction of soluble protein into the class I pathway of antigen processing and presentation. Cell 54, 777–785. Morrison, L.A., Lukacher, A.E., Braciale, V.L., Fan, D.P., and Braciale, T.J. (1986). Differences in antigen presentation to MHC class I- and class II-restricted influenza virus-specific cytolytic T lymphocyte clones. J. Exp. Med. 163, 903–921. Newell, E.W., Klein, L.O., Yu, W., and Davis, M.M. (2009). Simultaneous detection of many T-cell specificities using combinatorial tetramer staining. Nat. Methods 6, 497–499. Newell, E.W., Sigal, N., Nair, N., Kidd, B.A., Greenberg, H.B., and Davis, M.M. (2013). Combinatorial tetramer staining and mass cytometry analysis facilitate T-cell epitope mapping and characterization. Nat. Biotechnol. 31, 623–629. Ng, D.T.W., and Sarkar, C.A. (2014). NP-Sticky: A web server for optimizing DNA ligation with non-palindromic sticky ends. J. Mol. Biol. 426, 1861–1869. Nguyen, A.W., and Daugherty, P.S. (2005). Evolutionary optimization of fluorescent proteins for intracellular FRET. Nat. Biotechnol. 23, 355–360. 
Nikolich-Zugich, J., Slifka, M.K., and Messaoudi, I. (2004). The many important facets of T-cell repertoire diversity. Nat. Rev. Immunol. Ortmann, B., Androlewicz, M.J., and Cresswell, P. (1994). MHC class I/beta 2-microglobulin complexes associate with TAP transporters before peptide binding. Nature 368, 864–867. Ortmann, B., Copeman, J., Lehner, P.J., Sadasivan, B., Herbert, J.A., Grandea, A.G., Riddell, S.R., Tampe, R., Spies, T., Trowsdale, J., et al. (1997). A critical role for tapasin in the assembly and function of multimeric MHC class I TAP complexes. Science 277, 1306–1309. Ott, P.A., Hu, Z., Keskin, D.B., Shukla, S.A., Sun, J., Bozym, D.J., Zhang, W., Luoma, A., Giobbie-Hurder, A., Peter, L., et al. (2017). An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221. Overwijk, W.W., Theoret, M.R., Finkelstein, S.E., Surman, D.R., de Jong, L.A., Vyth-Dreese, F.A., Dellemijn, T.A., Antony, P.A., Spiess, P.J., Palmer, D.C., et al. (2003). Tumor Regression and Autoimmunity after Reversal of a Functionally Tolerant State of Self-reactive CD8+ T Cells. J. Exp. Med. 198, 569–580. Packard, B.Z., Telford, W.G., Komoriya, A., and Henkart, P.A. (2007). Granzyme B activity in target cells detects attack by cytotoxic lymphocytes. J. Immunol. 179, 3812–3820. Pender, M.P., and Burrows, S.R. (2014). Epstein–Barr virus and multiple sclerosis: potential opportunities for immunotherapy. Clin. Transl. Immunol. 3, e27. Pender, M.P., Csurhes, P.A., Smith, C., Beagley, L., Hooper, K.D., Raj, M., Coulthard, A., Burrows, S.R., and Khanna, R. (2014). Epstein-Barr virus-specific adoptive immunotherapy for progressive multiple sclerosis. Mult. Scler. J. 20, 6–10. Pender, M.P., Csurhes, P.A., Burrows, J.M., and Burrows, S.R. (2017). Defective T-cell control of Epstein–Barr virus infection in multiple sclerosis. Clin. Transl. Immunol. 6, e147. Pentier, J.M., Sewell, A.K., and Miles, J.J. (2013). Advances in T-cell epitope engineering. Front. Immunol. 4, 133. 
Peters, P.J., Neefjes, J.J., Oorschot, V., Ploegh, H.L., and Geuze, H.J. (1991). Segregation of MHC class II molecules from MHC class I molecules in the Golgi complex for transport to lysosomal compartments. Nature 349, 669–676. Petersen, J., Purcell, A.W., and Rossjohn, J. (2009). Post-translationally modified T cell epitopes: Immune recognition and immunotherapy. J. Mol. Med. 87, 1045–1051. Pinkoski, M.J., Waterhouse, N.J., Heibein, J.A., Wolf, B.B., Kuwana, T., Goldstein, J.C., Newmeyer, D.D., Bleackley, R.C., and Green, D.R. (2001). Granzyme B-mediated Apoptosis Proceeds Predominantly through a Bcl-2-inhibitable Mitochondrial Pathway. J. Biol. Chem. 276, 12060–12067. Prakash, M.D., Bird, C.H., and Bird, P.I. (2009). Active and zymogen forms of granzyme B are constitutively released from cytotoxic lymphocytes in the absence of target cell engagement. Immunol. Cell Biol. 87, 249–254. Prommersberger, S., Höfflin, S., Schuler-Thurner, B., Schuler, G., Schaft, N., and Dörrie, J. (2015). A new method to monitor antigen-specific CD8+ T cells, avoiding additional target cells and the restriction to human leukocyte antigen haplotype. Gene Ther. 6, 516–520. Rappuoli, R. (2000). Reverse Vaccinology. Curr. Opin. Microbiol. 3, 445–450. Reinherz, E.L. (2015). αβ TCR-Mediated Recognition: Relevance to Tumor-Antigen Discovery and Cancer Immunotherapy. Cancer Immunol. Res. 3, 305–312. Reiss, B., Sprengel, R., and Schaller, H. (1984). Protein fusions with the kanamycin resistance gene from transposon Tn5. EMBO J. 3, 3317–3322. Reth, M. (1989). Antigen receptor tail clue. Nature 338, 383–384. Riley, T.P., Hellman, L.M., Gee, M.H., Mendoza, J.L., Alonso, J.A., Foley, K.C., Nishimura, M.I., Vander Kooi, C.W., Garcia, K.C., and Baker, B.M. (2018). T cell receptor cross-reactivity expanded by dramatic peptide–MHC adaptability. Nat. Chem. Biol. 14, 934–942. Rizvi, N.A., Hellmann, M.D., Snyder, A., Kvistborg, P., Makarov, V., Havel, J.J., Lee, W., Yuan, J., Wong, P., Ho, T.S., et al. 
(2016). Mutational landscape determines sensitivity to PD-1 blockade in non–small cell lung cancer. Science 348, 124–128. Robbins, P.F., Lu, Y.-C., El-Gamil, M., Li, Y.F., Gross, C., Gartner, J., Lin, J.C., Teer, J.K., Cliften, P., Tycksen, E., et al. (2013). Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat. Med. 19, 747–752. Robins, H.S., Campregher, P.V., Srivastava, S.K., Wacher, A., Turtle, C.J., Kahsai, O., Riddell, S.R., Warren, E.H., and Carlson, C.S. (2009). Comprehensive assessment of T-cell receptor beta-chain diversity in alphabeta T cells. Blood 114, 4099–4107. Robinson, J., Halliwell, J.A., Hayhurst, J.D., Flicek, P., Parham, P., and Marsh, S.G.E. (2015). The IPD and IMGT/HLA database: Allele variant databases. Nucleic Acids Res. 43, D423–D431. Roby, K.F., Taylor, C.C., Sweetwood, J.P., Cheng, Y., Pace, J.L., Tawfik, O., Persons, D.L., Smith, P.G., and Terranova, P.F. (2000). Development of a syngeneic mouse model for events related to ovarian cancer. Carcinogenesis 21, 585–591. Romanski, A., Uherek, C., Bug, G., Seifried, E., Klingemann, H., Wels, W.S., Ottmann, O.G., and Tonn, T. (2016). CD19-CAR engineered NK-92 cells are sufficient to overcome NK cell resistance in B-cell malignancies. J. Cell. Mol. Med. 20, 1287–1294. La Rosa, C., Krishnan, R., Markel, S., Schneck, J.P., Houghten, R., Pinilla, C., and Diamond, D.J. (2001). Enhanced immune activity of cytotoxic T-lymphocyte epitope analogs derived from positional scanning synthetic combinatorial libraries. Blood 97, 1776–1786. Rosenberg, S.A., and Restifo, N.P. (2015). Adoptive cell transfer as personalized immunotherapy for human cancer. Science 348, 62–68. Rosenberg, S.A., Packard, B.S., Aebersold, P.M., Solomon, D., Topalian, S.L., Toy, S.T., Simon, P., Lotze, M.T., Yang, J.C., Seipp, C.A., et al. (1988). Use of Tumor-Infiltrating Lymphocytes and Interleukin-2 in the Immunotherapy of Patients with Metastatic Melanoma. N Engl J Med 319, 1676–1680. 
Rush, J.S., and Hodgkin, P.D. (2001). B cells activated via CD40 and IL-4 undergo a division burst but require continued stimulation to maintain division, survival and differentiation. Eur. J. Immunol. 31, 1150–1159. Sadelain, M., Brentjens, R., and Rivière, I. (2013). The basic principles of chimeric antigen receptor design. Cancer Discov. 3, 388–398. Sahin, U., and Türeci, Ö. (2018). Personalized vaccines for cancer immunotherapy. Science 359, 1355–1360. Sahin, U., Türeci, O., Schmitt, H., Cochlovius, B., Johannes, T., Schmits, R., Stenner, F., Luo, G., Schobert, I., and Pfreundschuh, M. (1995). Human neoplasms elicit multiple specific immune responses in the autologous host. Proc. Natl. Acad. Sci. USA 92, 11810–11813. Sakabe, S., Sullivan, B.M., Hartnett, J.N., Robles-Sikisaka, R., Gangavarapu, K., Cubitt, B., Ware, B.C., Kotliar, D., Branco, L.M., Goba, A., et al. (2018). Analysis of CD8+ T cell response during the 2013–2016 Ebola epidemic in West Africa. Proc. Natl. Acad. Sci. USA 115, E7578–E7586. Sala Torra, O., Othus, M., Williamson, D.W., Wood, B., Kirsch, I., Robins, H., Beppu, L., O’Donnell, M.R., Forman, S.J., Appelbaum, F.R., et al. (2017). Next-Generation Sequencing in Adult B Cell Acute Lymphoblastic Leukemia Patients. Biol. Blood Marrow Transplant. 23, 691–696. Sato, E., Olson, S.H., Ahn, J., Bundy, B., Nishikawa, H., Qian, F., Jungbluth, A.A., Frosina, D., Gnjatic, S., Ambrosone, C., et al. (2005). Intraepithelial CD8+ tumor-infiltrating lymphocytes and a high CD8+/regulatory T cell ratio are associated with favorable prognosis in ovarian cancer. Proc. Natl. Acad. Sci. USA 102, 18538–18543. Scaviner, D., and Lefranc, M.P. (2000a). The human T cell receptor alpha joining (TRAJ) genes. Exp Clin Immunogenet 17, 97–106. Scaviner, D., and Lefranc, M.P. (2000b). The human T cell receptor alpha variable (TRAV) genes. Exp Clin Immunogenet 17, 83–96. 
Sedelies, K.A., Ciccone, A., Clarke, C.J.P., Oliaro, J., Sutton, V.R., Scott, F.L., Silke, J., Susanto, O., Green, D.R., Johnstone, R.W., et al. (2008). Blocking granule-mediated death by primary human NK cells requires both protection of mitochondria and inhibition of caspase activity. Cell Death Differ. 15, 708–717. Serafini, M., Naldini, L., and Introna, M. (2004). Molecular evidence of inefficient transduction of proliferating human B lymphocytes by VSV-pseudotyped HIV-1-derived lentivectors. Virology 325, 413–424. Sewell, A.K. (2012). Why must T cells be cross-reactive? Nat Rev Immunol 12, 669–677. Shankaran, V., Ikeda, H., Bruce, A.T., White, J.M., Swanson, P.E., Old, L.J., and Schreiber, R.D. (2001). IFNgamma and lymphocytes prevent primary tumour development and shape tumour immunogenicity. Nature 410, 1107. Shepherd, J.C., Schumacher, T.N.M., Ashton-Rickardt, P.G., Imaeda, S., Ploegh, H.L., Janeway, C.A., and Tonegawa, S. (1993). TAP1-dependent peptide translocation in vitro is ATP dependent and peptide selective. Cell 74, 577–584. Shimizu, Y., Geraghty, D.E., Koller, B.H., Orr, H.T., and DeMars, R. (1988). Transfer and expression of three cloned human non-HLA-A,B,C class I major histocompatibility complex genes in mutant lymphoblastoid cells. Proc. Natl. Acad. Sci. USA 85, 227–231. Shugay, M., Bagaev, D.V., Zvyagin, I.V., Vroomans, R.M., Crawford, J.C., Dolton, G., Komech, E.A., Sycheva, A.L., Koneva, A.E., Egorov, E.S., et al. (2018). VDJdb: A curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 46, D419–D427. Sibener, L.V., Fernandes, R.A., Kolawole, E.M., Carbone, C.B., Liu, F., McAffee, D., Birnbaum, M.E., Yang, X., Su, L.F., Yu, W., et al. (2018). Isolation of a Structural Mechanism for Uncoupling T Cell Receptor Signaling from Peptide-MHC Binding. Cell 174, 672–687. Siewert, K., Malotka, J., Kawakami, N., Wekerle, H., Hohlfeld, R., and Dornmair, K. (2012). 
Unbiased identification of target antigens of CD8+ T cells with combinatorial libraries coding for short peptides. Nat. Med. 18, 824–828. Slifka, M.K., and Whitton, J.L. (2000). Antigen-Specific Regulation of T Cell–Mediated Cytokine Production. Immunity 12, 451–457. Snyder, A., Makarov, V., Merghoub, T., Yuan, J., Zaretsky, J.M., Desrichard, A., Walsh, L.A., Postow, M.A., Wong, P., Ho, T.S., et al. (2014). Genetic Basis for Clinical Response to CTLA-4 Blockade in Melanoma. N. Engl. J. Med. 371, 2189–2199. Soen, Y., Chen, D.S., Kraft, D.L., Davis, M.M., and Brown, P.O. (2003). Detection and Characterization of Cellular Immune Responses Using Peptide–MHC Microarrays. PLoS Biol. 1, 429–438. Stone, J.D., and Kranz, D.M. (2013). Role of T cell receptor affinity in the efficacy and specificity of adoptive T cell therapies. Front. Immunol. 4, 244. Stone, J.D., Demkowicz, W.E., and Stern, L.J. (2005). HLA-restricted epitope identification and detection of functional T cell responses by using MHC–peptide and costimulatory microarrays. PNAS 102, 3744–3749. Stone, J.D., Chervin, A.S., and Kranz, D.M. (2009). T-cell receptor binding affinities and kinetics: impact on T-cell activity and specificity. Immunology 126, 165–176. Su, L.F., Kidd, B.A., Han, A., Kotzin, J.J., and Davis, M.M. (2013). Virus-Specific CD4+ Memory-Phenotype T Cells Are Abundant in Unexposed Adults. Immunity 38, 373–383. Sun, J., and Kavathas, P.B. (1997). Comparison of the roles of CD8 alpha alpha and CD8 alpha beta in interaction with MHC class I. J. Immunol. 159, 6077–6082. Sun, Q., Burton, R.L., Dai, L.J., Britt, W.J., and Lucas, K.G. (2000). B lymphoblastoid cell lines as efficient APC to elicit CD8+ T cell responses against a cytomegalovirus antigen. J. Immunol. Taguchi, T., McGhee, J.R., Coffman, R.L., Beagley, K.W., Eldridge, J.H., Takatsu, K., and Kiyono, H. (1990). Detection of individual mouse splenic T cells producing IFN-gamma and IL-5 using the enzyme-linked immunospot (ELISPOT) assay. J. Immunol. 
Methods 128, 65–73.
Takeuchi, A., and Saito, T. (2017). CD4 CTL, a cytotoxic subset of CD4+T cells, their differentiation and function. Front. Immunol. 8, 194.
Tam, Y.K., Maki, G., Miyagawa, B., Hennemann, T., and Klingemann, H.-G. (1999). Characterization of genetically altered, interleukin 2-independent natural killer cell lines suitable for adoptive cellular immunotherapy. Hum. Gene Ther. 10, 1359–1373.
Tan, C.T., Croft, N.P., Dudek, N.L., Williamson, N.A., and Purcell, A.W. (2011). Direct quantitation of MHC-bound peptide epitopes by selected reaction monitoring. Proteomics 11, 2336–2340.
Taylor, M.J., Husain, K., Gartner, Z.J., Mayor, S., and Vale, R.D. (2017). A DNA-Based T Cell Receptor Reveals a Role for Receptor Clustering in Ligand Discrimination. Cell 169, 108–119.
The MHC sequencing consortium (1999). Complete sequence and gene map of a human major histocompatibility complex. Nature 401, 921–923.
Theaker, S.M., Rius, C., Greenshields-Watson, A., Lloyd, A., Trimby, A., Fuller, A., Miles, J.J., Cole, D.K., Peakman, M., Sewell, A.K., et al. (2016). T-cell libraries allow simple parallel generation of multiple peptide-specific human T-cell clones. J. Immunol. Methods 430, 43–50.
Thomas, N., Heather, J., Ndifon, W., Shawe-Taylor, J., and Chain, B. (2013). Decombinator: A tool for fast, efficient gene assignment in T-cell receptor sequences using a finite state machine. Bioinformatics 29, 542–550.
Toebes, M., Coccoris, M., Bins, A., Rodenko, B., Gomez, R., Nieuwkoop, N.J., van de Kasteele, W., Rimmelzwaan, G.F., Haanen, J.B.A.G., Ovaa, H., et al. (2006). Design and use of conditional MHC class I ligands. Nat. Med. 12, 246–251.
Townsend, A.R.M., Rothbard, J., Gotch, F.M., Bahadur, G., Wraith, D., and McMichael, A.J. (1986). The Epitopes of Influenza Nucleoprotein Recognized by Cytotoxic T Lymphocytes Can Be Defined with Short Synthetic Peptides. Cell 44, 959–968.
Tumeh, P.C., Harview, C.L., Yearley, J.H., Shintaku, I.P., Taylor, E.J.M., Robert, L., Chmielowski, B., Spasic, M., Henry, G., Ciobanu, V., et al. (2014). PD-1 blockade induces responses by inhibiting adaptive immune resistance. Nature 515, 568–571.
Turchaninova, M.A., Britanova, O.V., Bolotin, D.A., Shugay, M., Putintseva, E.V., Staroverov, D.B., Sharonov, G., Shcherbo, D., Zvyagin, I.V., Mamedov, I.Z., et al. (2013). Pairing of T-cell receptor chains via emulsion PCR. Eur. J. Immunol. 43, 2507–2515.
Turner, M.J., Abdul-Alim, C.S., Willis, R.A., Fisher, T.L., Lord, E.M., and Frelinger, J.G. (2001). T-cell antigen discovery (T-CAD) assay: A novel technique for identifying T cell epitopes. J. Immunol. Methods 256, 107–119.
Uhlen, M., Fagerberg, L., Hallstrom, B.M., Lindskog, C., Oksvold, P., Mardinoglu, A., Sivertsson, A., Kampf, C., Sjostedt, E., Asplund, A., et al. (2015). Tissue-based map of the human proteome. Science 347, 1260419.
Valentino, M.D., Abdul-Alim, C.S., Maben, Z.J., Skrombolas, D., Hensley, L.L., Kawula, T.H., Dziejman, M., Lord, E.M., Frelinger, J.A., and Frelinger, J.G. (2011). A broadly applicable approach to T cell epitope identification: Application to improving tumor associated epitopes and identifying epitopes in complex pathogens. J. Immunol. Methods 373, 111–126.
Valitutti, S., Müller, S., Cella, M., Padovan, E., and Lanzavecchia, A. (1995). Serial triggering of many T-cell receptors by a few peptide MHC complexes. Nature 375, 148–151.
Veillette, A., Bookman, M.A., Horak, E.M., and Bolen, J.B. (1988). The CD4 and CD8 T cell surface antigens are associated with the internal membrane tyrosine-protein kinase p56lck. Cell 55, 301–308.
Vita, R., Overton, J.A., Greenbaum, J.A., Ponomarenko, J., Clark, J.D., Cantrell, J.R., Wheeler, D.K., Gabbard, J.L., Hix, D., Sette, A., et al. (2014). The immune epitope database (IEDB) 3.0. Nucleic Acids Res. 43, D405–D412.
Wardemann, H. (2003).
Predominant Autoantibody Production by Early Human B Cell Precursors. Science 301, 1374–1377.
Wick, D.A., Webb, J.R., Nielsen, J.S., Martin, S.D., Kroeger, D.R., Milne, K., Castellarin, M., Twumasi-Boateng, K., Watson, P.H., Holt, R.A., et al. (2014). Surveillance of the tumor mutanome by T cells during progression from primary to recurrent ovarian cancer. Clin. Cancer Res. 20, 1125–1134.
Woodsworth, D.J., Castellarin, M., and Holt, R.A. (2013). Sequence analysis of T-cell repertoires in health and disease. Genome Med 5, 98.
Woodsworth, D.J., Dunsing, V., and Coombs, D. (2015). Design Parameters for Granzyme-Mediated Cytotoxic Lymphocyte Target-Cell Killing and Specificity. Biophys. J. 109, 477–488.
Wooldridge, L., Ekeruche-Makinde, J., Van Den Berg, H.A., Skowera, A., Miles, J.J., Tan, M.P., Dolton, G., Clement, M., Llewellyn-Lacey, S., Price, D.A., et al. (2012). A single autoimmune T cell receptor recognizes more than a million different peptides. J. Biol. Chem. 287, 1168–1177.
Yablonski, D., Kadlecek, T., and Weiss, A. (2001). Identification of a Phospholipase C-γ1 (PLC-γ1) SH3 Domain-Binding Site in SLP-76 Required for T-Cell Receptor-Mediated Activation of PLC-γ1 and NFAT. Mol Cell Biol 21, 4208–4218.
Yanagi, Y., Yoshikai, Y., Leggett, K., Clark, S.P., Aleksander, I., and Mak, T.W. (1984). A human T cell-specific cDNA clone encodes a protein having extensive homology to immunoglobulin chains. Nature 310, 145–149.
Yanelly, J.R., Sullivan, J.A., Mandell, J.L., and Engelhard, V.H. (1986). Reorientation and fusion of cytotoxic T lymphocyte granules after interaction with target cells as determined by high resolution cinemicrography. J Immunol 136, 377.
Yodoi, J., Teshigawara, K., Nikaido, T., Fukui, K., Noma, T., Honjo, T., Takigawa, M., Sasaki, M., Minato, N., and Tsudo, M. (1985). Regulation of IL 2 receptor on a natural killer-like cell line (YT cells). J. Immunol. 134, 1623–1630.
Yu, W., Jiang, N., Ebert, P.J., Kidd, B.A., Muller, S., Lund, P.J., Juang, J., Adachi, K., Tse, T., Birnbaum, M.E., et al. (2015a). Clonal Deletion Prunes but Does Not Eliminate Self-Specific αβ CD8+ T Lymphocytes. Immunity 42, 929–941.
Yu, Y., Ceredig, R., and Seoighe, C. (2015b). LymAnalyzer: A tool for comprehensive analysis of next generation sequencing data of T cell receptors and immunoglobulins. Nucleic Acids Res. 44, e31.
Zhang, W., Sloan-Lancaster, J., Kitchen, J., Trible, R.P., and Samelson, L.E. (1998). LAT: the ZAP-70 tyrosine kinase substrate that links T-cell receptor to cellular activation. Cell 92, 83–92.
Zhou, C.-Y., Wen, Q., Chen, X.-J., Wang, R.-N., He, W.-T., Zhang, S.-M., Du, X.-L., and Ma, L. (2016). Human CD8+ T cells transduced with an additional receptor bispecific for both Mycobacterium tuberculosis and HIV-1 recognize both epitopes. J. Cell. Mol. Med. 10, 1984–1988.
Zinkernagel, R.M., and Doherty, P.C. (1975). H-2 compatability requirement for T-cell-mediated lysis of target cells infected with lymphocytic choriomeningitis virus. Different cytotoxic T-cell specificities are associated with structures coded for in H-2K or H-2D. J Exp. Med. 141, 1427–1436.
Zorita, E., Cuscó, P., and Filion, G.J. (2015). Starcode: Sequence clustering based on all-pairs search. Bioinformatics 31, 1913–1919.
Zufferey, R., Nagy, D., Mandel, R.J., Naldini, L., and Trono, D. (1997). Multiply attenuated lentiviral vector achieves efficient gene delivery in vivo. Nat. Biotechnol. 15, 871–875.
Appendices
Appendix A - Enhanced protocol for the production of high-diversity, high-quality minigene plasmid libraries
Inserting high-diversity populations of short DNA fragments into plasmid vectors carries with it two main technical hurdles. First, low ligation/transformation efficiency of the desired recombinant plasmid product can impose an upper limit on the achievable diversity of plasmid libraries.
Second, depending on the insertion strategy employed, high proportions of artifact species may be introduced into the library population, potentially drastically decreasing the efficiency of downstream library applications. These artifacts include recircularized parental plasmid, concatemers, and inadvertently included contaminants. Described here is a novel protocol that I have developed for conducting endonuclease-based cloning of diverse DNA fragment libraries into appropriate plasmid vectors with high efficiency and accuracy. Key to this method is the use of three different endonucleases and stepwise ligation of inserts and vector, in contrast to the conventional two-endonuclease, single-ligation approach. Importantly, the endonucleases designated to prepare library inserts for insertion into plasmid vector are the homing endonucleases I-SceI and PI-SceI. These enzymes are extremely rare cutters, which reduces the bias in final library preparations that stems from undesired digestion of library inserts by conventional 6- or 8-cutter restriction enzymes. In the first round of ligation, linearized vector and digested inserts are ligated under high DNA concentration conditions (~25 nM total DNA) to ensure maximal ligation efficiency. Normally, such a high concentration is not used in conventional cloning because of the likelihood of vector concatemerization or of vector “capping”, in which insert molecules ligate to both ends of the linearized vector and prevent recircularization. However, in the method demonstrated here, a third endonuclease producing non-compatible sticky ends is used to ensure that the insert ligates only to the 3’ end of the plasmid at this stage. Once insert/vector chimeric molecules are generated, they are subjected to a second round of digestion in which the sticky end necessary for recircularization is exposed. At this point, a low-concentration ligation reaction (~3 nM) is performed to recircularize complete library plasmid molecules.
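The molar targets quoted for the two ligation rounds (~25 nM and ~3 nM total DNA) can be checked from DNA mass, fragment length, and reaction volume. The following is a minimal sketch, assuming the common approximation of 650 g/mol per base pair of double-stranded DNA. The function names, the fragment lengths (9,000 bp vector, 75 bp insert), and the 40.6 μL volume are illustrative assumptions, and the masses echo those listed in Section A.2.6.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (common approximation)

def ng_to_pmol(mass_ng, length_bp):
    '''Convert a mass of double-stranded DNA (ng) into picomoles of molecules.'''
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)

def total_nanomolar(pmol_list, volume_ul):
    '''Total DNA concentration in nM; pmol per uL is umol/L, hence the factor 1000.'''
    return sum(pmol_list) / volume_ul * 1000.0

# Assumed lengths: 9,000 bp linearized vector and 75 bp digested insert (hypothetical).
vector_pmol = ng_to_pmol(2980.0, 9000)
insert_pmol = ng_to_pmol(24.0, 75)
concentration = total_nanomolar([vector_pmol, insert_pmol], 40.6)
print(f'vector {vector_pmol:.2f} pmol, insert {insert_pmol:.2f} pmol, total {concentration:.1f} nM')
```

With these assumed lengths, both species come out near 0.5 pmol, consistent with a 1:1 molar ratio; with other fragment lengths the masses would need to be rescaled.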
In addition to avoiding capping artifacts, this triple-endonuclease/double-ligation approach reduces the probability that partially digested plasmid carried over into ligation forms re-ligated parental plasmid. In this procedure, for a plasmid molecule to form re-ligated parental vector, 2 out of 3 endonucleases must fail to cut and the molecule must carry over into the ligation reactions; in contrast, in a conventional cloning scheme, only 1 out of 2 endonucleases needs to fail in order to produce background parental plasmid that can contaminate final library preparations.
As an additional means of boosting library diversity in this protocol, the efficiency of bacterial transformation is maximized by purifying circular DNA prior to electroporation. In addition to buffer exchange by column or DNA precipitation, I have also incorporated a DNase treatment step with an enzyme to which circular DNA molecules are resistant. By performing this step, it is possible to remove unligated linear DNA and precisely quantitate the amount of productive vector present in the final library preparation. This allows the optimal amount of DNA to be used for transformation to generate maximum colony numbers.
The enhanced plasmid library construction protocol described in this appendix is summarized in Figure A-1. Detailed steps provided in Section A.2 are in reference to constructing a random minigene library derived from synthetic degenerate oligonucleotide DNA. For applications involving sheared DNA from endogenous sources, Section A.4.1 contains a description of a custom adapter strategy that can be used to introduce sheared, end-polished, A-tailed, double-stranded DNA fragments into the pMND-libDest-FRET2 host plasmid vector used in this appendix.
Appendix Figure A-1. Graphical overview of the enhanced plasmid library construction strategy for accurately and efficiently inserting high-diversity minigene libraries in lentiviral transfer plasmid backbones.
The sequential digestion and ligation steps outlined in the detailed procedure below are summarized here. Numbered red boxes indicate the order of the steps to be performed and key notes regarding each step. Schematic diagrams of the desired starting materials and intermediates are included at each step.
A.1 Materials
• Minigene library DNA generated synthetically or by adapterization of sheared cDNA/gDNA fragments (see supplementary information, Section A.4.1)
• pMND-libDest-FRET2 plasmid DNA (see Section B.1.2)
• Autoclaved, 18.2 MΩ·cm H2O
• 10X Platinum Taq PCR buffer (Thermo Fisher Scientific)
    ◦ 1X composition: 20 mM Tris-HCl, 50 mM KCl; pH 8.4
• MgCl2 solution (50 mM) (Thermo Fisher Scientific)
• dNTP mix (25 mM each) (New England Biolabs)
• Platinum Taq polymerase (10,000 U/mL) (Thermo Fisher Scientific)
• Anhydrous ethanol (Commercial Alcohols Ltd.)
• GlycoBlue coprecipitant (15 mg/mL) (Thermo Fisher Scientific)
• Ammonium acetate (Sigma)
• Econospin Micro Volume Columns (Epoch Life Science) + relevant column buffers:
    ◦ Buffer P1 (Qiagen): 50 mM Tris-HCl, 100 μg/mL RNaseA; pH 8.0
    ◦ Buffer P2 (Qiagen): 200 mM NaOH, 1% SDS
    ◦ Buffer P3 (Qiagen): 3.0 M potassium acetate; pH 5.5
    ◦ Buffer N3 (Qiagen): 4.2 M guanidine hydrochloride, 0.9 M potassium acetate; pH 4.8
    ◦ Buffer PB (Qiagen): 5 M guanidine hydrochloride, 30% isopropanol
    ◦ Buffer QG (Qiagen): 5.5 M guanidinium thiocyanate, 20 mM Tris-HCl; pH 6.6
    ◦ Buffer PE (Qiagen): 10 mM Tris-HCl, 80% ethanol; pH 7.5
    ◦ Buffer EB (Qiagen): 10 mM Tris-HCl; pH 8.5
• I-SceI (5,000 U/mL) (New England Biolabs)
• PI-SceI (5,000 U/mL) (New England Biolabs)
• XcmI (5,000 U/mL) (New England Biolabs)
• 10X CutSmart Buffer concentrate (New England Biolabs)
    ◦ 1X composition: 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate, 100 μg/mL BSA; pH 7.9
• 10X NEBuffer PI-SceI concentrate (New England Biolabs)
    ◦ 1X composition: 100 mM KCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT; pH 8.6
• Purified BSA (New England Biolabs)
• NuSieve GTG Agarose (Lonza)
• 50X TAE buffer concentrate (Bio-Rad)
    ◦ 1X composition: 40 mM Tris-acetate, 1 mM EDTA; pH 8.3
• 10X β-agarase I reaction buffer (New England Biolabs)
    ◦ 1X composition: 10 mM Bis-Tris-HCl, 1 mM EDTA; pH 6.5
• β-agarase I (1,000 U/mL) (New England Biolabs)
• SYBR Safe DNA Gel Stain (Thermo Fisher Scientific)
• Low Molecular Weight DNA Ladder (New England Biolabs)
• 1Kb DNA Ladder (FroggaBio)
• 6X Gel Loading Dye, Purple (New England Biolabs)
• E-gel (1.2%) pre-cast agarose gels – 12 lanes (Thermo Fisher Scientific)
• E-gel (1%) pre-cast agarose gels – 4 lanes (Thermo Fisher Scientific)
• E-gel PowerBase (Thermo Fisher Scientific)
• Qubit dsDNA BR Assay Kit (Thermo Fisher Scientific)
• Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific)
• Qubit 2.0 Fluorometer (Thermo Fisher Scientific)
• Typhoon FLA9500 biomolecular imager (GE Healthcare)
• T4 DNA ligase (400,000 U/mL) (New England Biolabs)
• ATP (25 mM) (Epicentre)
• PlasmidSafe ATP-Dependent DNase (Epicentre)
• 10X PlasmidSafe Reaction Buffer concentrate (Epicentre)
    ◦ 1X composition: 33 mM Tris-acetate, 66 mM potassium acetate, 10 mM magnesium acetate, 500 nM DTT; pH 7.5
• ElectroMAX DH10B T1R E. coli (Thermo Fisher Scientific)
• SOC medium (Thermo Fisher Scientific)
• LB-Agar plates + Carbenicillin (100 μg/mL)
• GenePulser II electroporation system (Bio-Rad) + GenePulser electroporation cuvettes, 0.1 cm gap (Bio-Rad)
A.2 Procedure
A.2.1. Amplification of insert
1) Random minigene PCR:
• Reaction setup:
    312 μL H2O
    + 8 μL RM6_FWD oligonucleotide (10 μM)
    + 8 μL RM6_REV oligonucleotide (10 μM)
    + 8 μL RM6_template oligonucleotide (1 μM)
    + 40 μL Taq buffer
    + 8 μL dNTP
    + 12 μL MgCl2
    + 4 μL Taq polymerase
• Split across 8 PCR tubes prior to cycling
• Thermocycling:
    1. 94℃, 2 minutes
    2. 94℃, 30 seconds
    3. 57℃, 30 seconds
    4. 72℃, 15 seconds
    5. Repeat steps 2-4 29 more times
    6. 72℃, 5 minutes
A.2.2. Digestion of insert
1) Pool and concentrate random minigene (RM) PCR reactions (1.8 μg product) by spin column cleanup.
    i. Add 5 volumes of Buffer PB and 10 μL of Buffer P3 to reaction mixture.
    ii. Vortex well and add mixture to micro volume spin columns (no more than 800 μL per column).
    iii. Centrifuge columns at 1,800 x g, 2 minutes.
    iv. Collect flow-through from columns and reload it onto the column membrane; centrifuge again at 18,000 x g, 1 minute.
    v. Discard flow-through, load 750 μL of Buffer PE onto each column membrane.
    vi. Centrifuge at 18,000 x g, 1 minute.
    vii. Discard flow-through, load a second wash of 750 μL of Buffer PE onto each column membrane.
    viii. Centrifuge at 18,000 x g, 1 minute.
    ix. Discard flow-through, dry columns by centrifuging with no buffer at 18,000 x g, 1 minute.
    x. Transfer columns to fresh 1.5 mL microcentrifuge tube and add 10 μL of Buffer EB to each column.
    xi. Elute by incubating for 5 minutes at room temperature, followed by 5 minutes at 37℃, followed by centrifugation at 18,000 x g for 1 minute. Pool eluates if desired.
2) I-SceI digest:
• Reaction setup:
    280 μL H2O
    + 40 μL 10X CutSmart buffer
    + 40 μL RM amplicon (100% of eluted PCR products)
    + 40 μL I-SceI
• Incubation:
    1. 37℃, 1 hour
    2. 70℃, 10 minutes
3) PI-SceI digest:
• Reaction setup:
    1,380 μL H2O
    + 200 μL 10X PI-SceI buffer
    + 20 μL BSA
    + 400 μL I-SceI digestion mix
    + 40 μL PI-SceI
• Incubation:
    1. 37℃, 3 hours
    2. 70℃, 10 minutes
4) To digestion mix, add 400 μL of ammonium acetate (10 M), 8 μL GlycoBlue, and 6 mL of anhydrous ethanol.
5) Vortex well and incubate overnight at 4℃.
6) Allow mixture to reach room temperature and then centrifuge at 18,000 x g, 60 minutes.
7) Remove supernatants, combine and wash pellets with 2 mL of freshly prepared 75% ethanol.
8) Centrifuge at 18,000 x g, 60 minutes.
9) Remove supernatant completely and air-dry pellet <30 seconds.
10) Resuspend pellet in 100 μL of Buffer EB.
A.2.3. Gel purification of insert
Prepare gel:
1) Slowly sprinkle 1.6 g of NuSieve GTG agarose into 40 mL of 1X TAE buffer, stirring on a magnetic hot plate, to make a 4% agarose solution.
2) Allow agarose to rehydrate at room temperature for 30 minutes with continued stirring.
3) Engage hotplate and warm solution until a gentle boil is achieved.
4) Allow solution to cool to the touch and add 4 μL of SYBR Safe DNA gel stain.
5) Pipet 30 mL of molten agarose into a gel tray (7.2 cm x 10 cm) for a ~4 mm thick gel.
6) Allow gel to set at room temperature and chill for 30 minutes at 4℃ for further solidification.
Run gel:
7) Place gel in running tank (20 cm inter-electrode distance) containing 250 mL 1X TAE buffer for an approximate 4 mm buffer overlay.
8) Add 20 μL of 6X loading buffer to digested insert eluate.
9) Load DNA into gel wells (30 μL/well x 4 wells).
10) Mix 100 ng NEB low MW ladder and 5 μL loading buffer in a final volume of 30 μL and load into reference lane of gel.
11) Run gel at 80 V, 2.5 hours.
Isolate desired DNA:
12) Image gel using Typhoon FLA 9500. Excise band containing double-cut random minigene library as shown in Fig. A-2 below.
Appendix Figure A-2. Demonstration of minigene library excision from 4% agarose gel. Random minigene inserts from I-SceI/PI-SceI digestion were loaded across 4 wells of a high-resolution agarose gel to reveal a number of distinct DNA species present in the mixture. Gel bands labeled above as double-cut insert were excised with a scalpel and carried forward for further preparation.
13) Weigh recovered gel slices and add 4 volumes H2O to dilute agarose to 1% (1 mg = 1 μL).
14) Melt gel at 72℃; once completely liquefied, equilibrate gel to 42℃ for a minimum of 15 minutes.
15) Add 0.1 volume (relative to diluted agarose volume) of 10X β-agarase I buffer and 2 U/100 μL of β-agarase I. Mix well.
16) Incubate at 42℃, 30 minutes; inactivate enzyme at 65℃, 15 minutes.
17) Add 0.2 volumes of 10 M ammonium acetate and GlycoBlue to a final concentration of 50 μg/mL.
18) Add 2.5 volumes of anhydrous ethanol, vortex mixture, and store overnight at 4℃.
19) Allow mixture to reach room temperature and then centrifuge at 18,000 x g, 60 minutes.
20) Remove supernatants, combine and wash pellets with 2 mL of freshly prepared 75% ethanol.
21) Centrifuge at 18,000 x g, 60 minutes.
22) Remove supernatant completely and air-dry pellet <30 seconds.
23) Resuspend pellet in 15 μL of Buffer EB.
A.2.4. Primary digestion of vector
1) PI-SceI digest:
• Reaction setup:
    301 μL H2O
    + 40 μL 10X PI-SceI buffer
    + 4 μL BSA
    + 15 μL purified circular pMND-libDest-FRET2 DNA (1.8 μg/μL)
    + 40 μL PI-SceI
• Incubation:
    1. 37℃, 3 hours
    2. 70℃, 10 minutes
2) XcmI digest:
• Reaction setup:
    1,380 μL H2O
    + 200 μL 10X CutSmart buffer
    + 400 μL PI-SceI digestion mix
    + 40 μL XcmI
• Incubation:
    1. 37℃, 1 hour
    2. 70℃, 10 minutes
3) To digestion mix, add 400 μL of ammonium acetate (10 M), 8 μL GlycoBlue (15 μg/μL), and 6 mL of anhydrous ethanol.
4) Vortex well and incubate overnight at 4℃.
5) Allow mixture to reach room temperature and then centrifuge at 18,000 x g, 60 minutes.
6) Remove supernatants, combine and wash pellets with 2 mL of freshly prepared 75% ethanol.
7) Centrifuge at 18,000 x g, 60 minutes.
8) Remove supernatant completely and air-dry pellet <30 seconds.
9) Resuspend pellet in 200 μL of Buffer EB.
A.2.5. Gel purification of vector
1) Load digested plasmid DNA into 11 wells of 12-well 1.2% E-gel (18 μL/well).
2) Into the 12th well, load 100 ng of 1Kb DNA ladder in a final volume of 18 μL.
3) Run gel in E-gel PowerBase for 45 minutes at 60 V.
4) Break open plastic E-gel casing and excise bands corresponding to linear plasmid DNA.
5) Pool bands and extract DNA with modified spin-column gel purification procedure:
    i. Weigh gel slices and add 0.3 volumes of Buffer QG (1 mg = 1 μL).
    ii. Melt gel at 72℃. May require >15 minutes.
    iii. Homogenize agarose/Buffer QG mix by pipetting and vortexing. Add 6 volumes (relative to gel/Buffer QG mix) of Buffer PB.
    iv. Vortex well and load gel solution to DNA-binding spin columns (max 800 μL per column, use multiple parallel columns if necessary).
    v. Centrifuge columns at 18,000 x g, 1 minute.
    vi. Collect flow-through from columns and reload it onto the column membrane; centrifuge again at 18,000 x g, 1 minute.
    vii. Discard flow-through, load 750 μL of Buffer PE onto each column membrane.
    viii. Centrifuge at 18,000 x g, 1 minute.
    ix. Discard flow-through, load a second wash of 750 μL of Buffer PE onto each column membrane.
    x. Centrifuge at 18,000 x g, 1 minute.
    xi. Discard flow-through, dry columns by centrifuging with no buffer at 18,000 x g, 1 minute.
    xii. Transfer columns to fresh 1.5 mL microcentrifuge tube and add 10 μL of Buffer EB to each column.
    xiii. Elute by incubating for 5 minutes at room temperature, followed by 5 minutes at 37℃, followed by centrifugation at 18,000 x g for 1 minute. Pool eluates if desired.
A.2.6. High concentration ligation
1) Quantitate recovered insert and vector DNA using Qubit dsDNA BR assay.
• Insert: 4 ng/μL, 15 μL total volume
• Vector: 119 ng/μL, 25 μL total volume
2) T4 DNA ligase reaction:
• Reaction setup (1:1 insert:vector molar ratio, 25 nM total DNA concentration):
    25 μL prepared digested vector from Section A.2.5 (2.98 μg, 0.5 pmol)
    + 6 μL prepared digested insert from Section A.2.3 (24 ng, 0.5 pmol)
    + 4 μL CutSmart buffer
    + 1.6 μL ATP
    + 4 μL T4 DNA ligase
• Incubation:
    1. 16℃, 18 hours
    2. 70℃, 10 minutes
A.2.7. Secondary digestion of vector
1) I-SceI digest:
• Reaction setup:
    44 μL H2O
    + 6 μL 10X CutSmart buffer
    + 40 μL ligation mix from A.2.6
    + 10 μL I-SceI
• Incubation:
    1. 37℃, 1 hour
    2. 70℃, 10 minutes
A.2.8. Secondary gel purification
1) Add 15 μL of 6X loading dye (containing SDS) and 15 μL H2O to the A.2.7 reaction mixture.
2) Load DNA into 5 wells of 12-well 1.2% E-gel (18 μL/well).
3) Into 1 well, load 100 ng of 1Kb DNA ladder in a final volume of 18 μL.
4) Into the remaining 6 wells, load 18 μL H2O.
5) Run gel in E-gel PowerBase for 45 minutes at 60 V.
6) Break open plastic E-gel casing and excise bands corresponding to linear plasmid DNA.
7) Pool bands and extract DNA with the modified spin-column gel purification procedure described in A.2.5, step 5).
A.2.9. Low concentration ligation
1) T4 DNA ligase reaction:
• Reaction setup (recircularization reaction, <3 nM total DNA concentration):
    208 μL H2O
    + 30 μL CutSmart buffer
    + 12 μL ATP
    + 20 μL prepared DNA from A.2.8 (100% of eluted product)
    + 30 μL T4 DNA ligase
• Incubation:
    1. 16℃, 18 hours
    2. 70℃, 10 minutes
A.2.10. PlasmidSafe DNase treatment
1) PlasmidSafe DNase reaction:
• Reaction setup:
    288 μL ligation reaction from A.2.9 (12 μL sampled for gel shown in Fig. A-3)
    + 12 μL ATP
    + 1 μL PlasmidSafe DNase
• Incubation:
    1. 37℃, 1 hour
    2. 70℃, 10 minutes
2) To DNase reaction mixture, add 0.2 volumes of 10 M ammonium acetate and GlycoBlue to a final concentration of 50 μg/mL.
3) Add 2.5 volumes of anhydrous ethanol, vortex mixture, and store overnight at 4℃.
4) Allow mixture to reach room temperature and then centrifuge at 18,000 x g, 60 minutes.
5) Remove supernatants, combine and wash pellets with 1 mL of freshly prepared 75% ethanol.
6) Centrifuge at 18,000 x g, 60 minutes.
7) Remove supernatant completely and air-dry pellet <30 seconds.
8) Resuspend pellet in 18 μL of Buffer EB.
A.2.11. Electroporation of library plasmid into bacterial hosts
1) Library plasmid quantification on-gel:
• Run 1% E-gel in HR mode for 30 minutes:
    ◦ Load 100 ng and 10 ng 1Kb DNA ladder in parallel with 2 μL of final prepared plasmid
    ◦ Gel imaged on Typhoon FLA 9500
    ◦ Perform gel pixelometry using standardized masses provided with 1Kb ladder and ImageQuant TL v8.1.0.0
2) Also quantify DNA using Qubit HS kit. Average values from both methods to determine final library concentration.
3) Add 1 ng of library plasmid to 20 μL of ElectroMAX DH10B bacterial cells on ice.
4) Immediately transfer DNA/bacteria mix to pre-chilled 0.1 cm GenePulser cuvette.
5) Insert cuvette into GenePulser II with associated Pulse Controller Plus module and pulse with 2,000 V, 25 μF, 200 Ω.
6) Monitor time constant to ensure that delivered pulse was ≈ 4.5 msec.
7) Recover bacteria from cuvette using 975 μL of SOC media and shake at 37℃, 1 hour in 14 mL round-bottom culture tube.
8) After recovery, library can be amplified by plating transformants on LB-Agar + Carbenicillin in a 576 cm2 culture tray and incubating ~18 hours at 37℃.
9) Resultant colonies are recovered by plate scraping.
10) Bacterially amplified library DNA can then be isolated by standard maxiprep procedures.
A.3 Results
To characterize the yield of circular DNA obtained after performing the steps described in Sections A.2.1 to A.2.9, a sample of low-concentration ligation mix was taken before and after treatment of ligation product with PlasmidSafe DNase. Both samples were quantitated by gel electrophoresis (Fig. A-3). In parallel, a positive control sample of linear DNA was used to confirm that DNase digestion efficiency was near 100%. The results show that 30% of the DNA present in the secondary ligation mixture from Section A.2.9 was in circular form prior to final ethanol precipitation and bacterial transformation.
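The gel-based quantitation behind this circular-DNA estimate can be sketched as a standard-curve interpolation: ladder bands of known mass calibrate band intensity against mass, and the circular fraction is the DNase-resistant mass divided by the pre-treatment mass. This is a minimal sketch assuming a linear intensity response. It is not the algorithm implemented in ImageQuant TL, the function names are hypothetical, and every intensity value below is made up for illustration.

```python
def fit_line(xs, ys):
    '''Ordinary least-squares fit of y = slope * x + intercept.'''
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mass_from_intensity(intensity, slope, intercept):
    '''Invert the standard curve to estimate the mass (ng) in a band.'''
    return (intensity - intercept) / slope

def circular_fraction(intensity_before, intensity_after, slope, intercept):
    '''Ratio of DNase-resistant mass to the mass present before treatment.'''
    before = mass_from_intensity(intensity_before, slope, intercept)
    after = mass_from_intensity(intensity_after, slope, intercept)
    return after / before

# Hypothetical calibration: known ladder band masses (ng) and measured intensities.
ladder_mass_ng = [5.0, 10.0, 20.0, 40.0]
ladder_intensity = [510.0, 1010.0, 2010.0, 4010.0]
slope, intercept = fit_line(ladder_mass_ng, ladder_intensity)

# Hypothetical intensities for equal-volume samples taken before and after DNase treatment.
fraction = circular_fraction(3010.0, 910.0, slope, intercept)
print(f'Estimated circular fraction: {fraction:.0%}')
```

A linear fit is only reasonable while band intensities stay within the range spanned by the ladder; samples outside that range would need to be diluted or re-imaged.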
In comparison, conventional ligation methods would be expected to yield <5% in the correct circular form (Ng and Sarkar, 2014).
Appendix Figure A-3. Quantitation of DNA after PlasmidSafe DNase treatment. Equivalent sample volumes of recircularized plasmid reaction mixture taken before and after PlasmidSafe DNase treatment were analyzed on a 1% E-gel. Using the gel quantitation tool included in ImageQuant TL and the standardized masses of FroggaBio 1Kb DNA ladder, a standard curve was calibrated and the quantity of recovered circular DNA was determined.
Purified circular DNA was transformed and counts of colony-forming units (CFU) were obtained by plating 100 μL of 100X diluted transformants on LB-Agar (Figure A-4a). Colony counting was done using ImageQuant TL to determine that 1,488 colonies resulted from incubation of this 0.1% sample and, therefore, 1.5 × 10^6 CFU were present in the entire batch of transformants. Given equivalent parallel transformation conditions for the remaining ligated DNA, this can be extrapolated to 2.1 × 10^7 potential CFU present in the final DNA product obtained from an input of 24 ng of digested, gel-purified library insert DNA.
Appendix Figure A-4. Quantitation and quality assessment of library generated using enhanced library construction protocol. a) 0.1% sample of transformants generated from electroporation of 1 ng (6.7% of total yield) of prepared minigene library plasmid results in 1,488 colonies by automated colony counting. b) 20 colonies were selected for overnight culture/plasmid mini-prep and Sanger sequenced using MND313.f primer (Section A.4.2). Reads were all >80% Q20 quality and were aligned to reference sequence using Geneious map-to-reference algorithm (http://desktop-links.geneious.com/assets/documentation/geneious/GeneiousReadMapper.pdf). c) An overnight culture of bulk transformants was grown and plasmids were extracted. Bulk library plasmid DNA was Sanger sequenced and aligned to reference sequence as above.
The 75 bp flanking the random minigene site at each end were >85% Q30 quality while the random minigene itself was <5% Q20 quality.
To assess the quality of the resultant library, 20 colonies and 1 μL of bulk transformants were inoculated into 3 mL liquid cultures. The following day, plasmids were extracted from all cultures by standard plasmid mini-prep procedures and all recovered DNA samples were submitted for Sanger sequencing (Genewiz). In Figure A-4b, all 20 clones contained correctly inserted random minigenes, and no concatemers, re-ligated parental plasmid, or common in-house contaminant inserts were detected. The bulk sequencing read (Figure A-4c) indicates that this high quality is maintained across the entire library population, as conserved flanking regions are sequenced with high base call quality while the random minigene site exhibits complete degeneracy as expected.
A.4 Supplementary information
A.4.1. Custom forked adapters for sheared gDNA/cDNA minigene libraries
The detailed protocol described above assumes the input of synthetically generated oligonucleotide libraries. When the intended source material is DNA derived from endogenous sources such as gDNA or cDNA, a modified version of the Y-shaped adapter strategy commonly used for Illumina sequencing library preparations can be used. The approach is summarized in Figure A-5. Input DNA should first be end-repaired and A-tailed. This can be achieved with numerous commercially available enzyme blends. Adapters are prepared by annealing libDestAdapter-topFork and libDestAdapter-bottomFork oligos (Section A.4.2), each at 25 μM in a 5 mM TE buffer + 2.5 M NaCl (pH 8.0). The oligo mix is heated to 95℃, held for 1 minute, and then cooled to 14℃ at a rate of -0.1℃/sec. The adapters can then be ligated to prepared DNA fragments at a 10:1 adapter:fragment molar ratio.
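The annealing ramp and the 10:1 molar ratio described above translate into a run time and a pipetting volume. The following is a minimal sketch, assuming 650 g/mol per base pair of double-stranded DNA and an annealed adapter stock at 25 μM duplex (that is, complete annealing of the two 25 μM oligos). The function names and the example input (100 ng of fragments averaging 200 bp) are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (common approximation)

def ramp_seconds(start_c, end_c, rate_c_per_s):
    '''Time needed to move between two temperatures at a constant ramp rate.'''
    return abs(end_c - start_c) / abs(rate_c_per_s)

def adapter_volume_ul(fragment_ng, mean_fragment_bp, adapter_um=25.0, ratio=10.0):
    '''Volume of annealed adapter stock giving the requested adapter:fragment molar ratio.'''
    fragment_pmol = fragment_ng * 1000.0 / (mean_fragment_bp * AVG_BP_MASS)
    adapter_pmol = ratio * fragment_pmol
    return adapter_pmol / adapter_um  # 1 uM equals 1 pmol per uL

# Cooling from 95 C to 14 C at 0.1 C per second.
print(f'Cooling ramp: {ramp_seconds(95.0, 14.0, -0.1) / 60:.1f} min')
# Hypothetical input: 100 ng of sheared fragments with a mean length of 200 bp.
print(f'Adapter volume: {adapter_volume_ul(100.0, 200.0):.2f} uL')
```

Because sheared fragments span a range of lengths, the mean length is only an estimate, so the resulting ratio is approximate.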
Following adapter ligation, PCR amplification is performed using MND313.f and LTR21.r primers to resolve the forks. The adapterized DNA fragments can then be inserted into pMND-libDest-FRET2 plasmid backbones as described starting from Section A.2.3.
Appendix Figure A-5. Adaptation of Illumina Y-shaped adapter strategy to inserting sheared DNA fragments into plasmid vectors. To insert double-stranded DNA fragments from randomly sheared endogenous sources into plasmid vector, it is necessary to add flanking DNA sequences containing the necessary cloning sites by ligation. The strategy illustrated above is a modification of an Illumina sequencing library preparation strategy described previously. In this embodiment, forked adapters are constructed consisting of two largely non-complementary strands held together by a short annealing region. The tails of this custom adapter contain the appropriate strands to add I-SceI/PI-SceI recognition sites in the correct position and orientation as well as outer priming sites for library amplification using trusted primers MND313.f and LTR21.r. This forked adapter can ligate to either end of the fragment and, after PCR cycling, will generate adapterized DNA fragments with identical 5’ and 3’ adapter regions, with the fragment itself being oriented in both possible directions (orientation indicated by blue gradient shading).
A.4.2. Oligonucleotide sequences referenced in this appendix
Oligo name: RM6_FWD
    Oligo description: Forward primer for random minigene library amplification. Tailed with partial I-SceI sequence and arbitrary tailing bases to facilitate endonuclease function.
    Oligo sequence (5'-3'): TCAACTCCAGTAGGGATAACAGGGTAATA
Oligo name: RM6_REV
    Oligo description: Reverse primer for random minigene library amplification. Tailed with partial PI-SceI sequence and arbitrary tailing bases to facilitate endonuclease function.
    Oligo sequence (5'-3'): GTGTGCCACTCCATTTCATTACCTCTTTCTCCGCACCCGACATAG
Oligo name: RM6_template
    Oligo description: 2nd iteration of random minigene encoding library. Degenerate sequence (48bp) flanked by 5’-spacer base + partial I-SceI recognition site and partial PI-SceI recognition site at 3’ end. N-nucleotides were synthesized by specifying hand-mixed ratios for balanced incorporation of bases at each position.
    Oligo sequence (5'-3'): CAGTAGGGATAACAGGGTAATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATCTATGTCGGGTGCGGAG
Oligo name: libDestAdapter-topFork
    Oligo description: Top strand of forked adapter pair intended for capture and insertion of sheared DNA fragments into pMND-libDest vector backbone. Asterisk (*) denotes phosphorothioate bond.
    Oligo sequence (5'-3'): GGCTAAGATCTACAGCTGCCTCCATTTCATTACCTCTTTCTCCGCACCCGACATAGATGCTCTTCCGATC*T
Oligo name: libDestAdapter-bottomFork
    Oligo description: Bottom strand of forked adapter pair intended for capture and insertion of sheared DNA fragments into pMND-libDest vector backbone. ’P’ denotes 5’-phosphorylation.
    Oligo sequence (5'-3'): PGATCGGAAGAGCATTACCCTGTTATCCCTAAACAGAAGCGAGAAGCGAAC
Oligo name: MND313.f
    Oligo description: Forward primer for amplifying adapter-ligated sheared DNA fragments
    Oligo sequence (5'-3'): GTTCGCTTCTCGCTTCTGTT
Oligo name: LTR21.r
    Oligo description: Reverse primer for amplifying adapter-ligated sheared DNA fragments
    Oligo sequence (5'-3'): GGCTAAGATCTACAGCTGCCT
Appendix B - Nucleic acid sequences used
B.1 Annotated plasmid backbones
B.1.1. pMND-silent-FRET
LOCUS       pMND-silent-FRET        9049 bp    DNA     circular SYN 09-AUG-2018
DEFINITION  3rd-generation lentiviral transfer vector with non-productive stuffer sequence + FRET reporter. Intended as recipient vector for minigene libraries; productive minigenes restore frame and result in functional fluorescent protein genes.
ACCESSION KEYWORDS SOURCE synthetic construct ORGANISM synthetic construct FEATURES Location/Qualifiers misc_feature 1..6 /Recognition_pattern=\"G^AATTC\" /note=\"restriction site\" /label=\"EcoRI\" sig_peptide 7..72 /label=\"P2A sequence\" misc_feature 73..80 /Recognition_pattern=\"GGCCGG^CC\" /note=\"restriction site\" /label=\"FseI\" CDS 82..801 /label=\"Cerulean fluorescent protein\" misc_feature 802..807 /Recognition_pattern=\"A^CTAGT\" /label=\"SpeI\" /note=\"restriction site\" sig_peptide 808..828 /label=\"GzmB Cleavage Substrate\" misc_feature 829..834 /Recognition_pattern=\"AT^CGAT\" /label=\"ClaI\" /note=\"restriction site\" CDS 835..1554 /label=\"Enhanced YFP\" misc_feature 1555..1562 /Recognition_pattern=\"GG^CGCGCC\" /note=\"restriction site\" /label=\"AscI\" misc_feature 1563..1568 /Recognition_pattern=\"C^TCGAG\" /note=\"restriction site\" /label=\"XhoI\" LTR 1630..1847 /label=\"3' SIN LTR\" CDS 3087..3947 /note=\"selection marker\" /label=\"AmpR\" rep_origin 4042..4274 /label=\"ColE1\" promoter 5324..5627 /note=\"promoter eukaryotic\" /label=\"CMV enhancer\" promoter 5628..5831 /note=\"promoter eukaryotic\" /label=\"CMV immediate-early promoter\" LTR 5846..6027 /label=\"5' LTR\" CDS 6182..6546 /label=\"dGAG\" motif 6692..6933 /label=\"RRE\" motif 7449..7566 /label=\"cPPT\" promoter 7758..8149 /note=\"promoter eukaryotic\" /label=\"MNDU3 Promoter\" misc_feature 8168..8175 /Recognition_pattern=\"TTAAT^TAA\" /note=\"restriction site\" /label=\"PacI\" motif 8179..8187 /note=\"translational start site eukaryotic\" /label=\"Kozak sequence\" misc_feature 8188..8193 /Recognition_pattern=\"G^GATCC\" /note=\"restriction site\" /label=\"BamHI\" sig_peptide 8194..8256 /label=\"T2A sequence\" misc_feature 8257..9049 /note=\"Arbitrary stuffer sequence\" ORIGIN 1 gaattcggaa gcggagctac taacttcagc ctgctgaagc aggctggaga cgtggaggag 61 aaccctggac ctggccggcc tatggtgagc aagggcgagg agctgttcac cggggtggtg 121 cccatcctgg tcgagctgga cggcgacgta aacggccaca agttcagcgt 
gtccggcgag 181 ggcgagggcg atgccaccta cggcaagctg accctgaagt tcatctgcac caccggcaag 241 ctgcccgtgc cctggcccac cctcgtgacc accctgacct ggggcgtgca gtgcttcgcc 301 cgctaccccg accacatgaa gcagcacgac ttcttcaagt ccgccatgcc cgaaggctac 361 gtccaggagc gcaccatctt cttcaaggac gacggcaact acaagacccg cgccgaggtg 421 aagttcgagg gcgacaccct ggtgaaccgc atcgagctga agggcatcga cttcaaggag 481 gacggcaaca tcctggggca caagctggag tacaacgcca tcagcgacaa cgtctatatc 541 accgccgaca agcagaagaa cggcatcaag gccaacttca agatccgcca caacatcgag 601 gacggcagcg tgcagctcgc cgaccactac cagcagaaca cccccatcgg cgacggcccc 661 gtgctgctgc ccgacaacca ctacctgagc acccagtcca agctgagcaa agaccccaac 721 gagaagcgcg atcacatggt cctgctggag ttcgtgaccg ccgccgggat cactctcggc 781 atggacgagc tgtacaagtt aactagtgtg ggccccgact tcggcaggat cgatatggtg 841 agcaagggcg aggagctgtt caccggggtg gtgcccatcc tggtcgagct ggacggcgac 901 gtaaacggcc acaagttcag cgtgtccggc gagggcgagg gcgatgccac ctacggcaag 961 ctgaccctga agttcatctg caccaccggc aagctgcccg tgccctggcc caccctcgtg 1021 accaccttcg gctacggcct gcagtgcttc gcccgctacc ccgaccacat gaagcagcac 1081 gacttcttca agtccgccat gcccgaaggc tacgtccagg agcgcaccat cttcttcaag 1141 gacgacggca actacaagac ccgcgccgag gtgaagttcg agggcgacac cctggtgaac 145 1201 cgcatcgagc tgaagggcat cgacttcaag gaggacggca acatcctggg gcacaagctg 1261 gagtacaact acaacagcca caacgtctat atcatggccg acaagcagaa gaacggcatc 1321 aaggtgaact tcaagatccg ccacaacatc gaggacggca gcgtgcagct cgccgaccac 1381 taccagcaga acacccccat cggcgacggc cccgtgctgc tgcccgacaa ccactacctg 1441 agctaccagt ccgccctgag caaagacccc aacgagaagc gcgatcacat ggtcctgctg 1501 gagttcgtga ccgccgccgg gatcactctc ggcatggacg agctgtacaa gtaaggcgcg 1561 ccctcgagag atcccccggg gtcgactgat caaattcgag ctcggtacct ttaagaccaa 1621 tgacttacaa ggcagctgta gatcttagcc actttttaaa agaaaagggg ggactggaag 1681 ggctaattca ctcccaacga agacaagatc tgctttttgc ttgtactggg tctctctggt 1741 tagaccagat ctgagcctgg gagctctctg gctaactagg gaacccactg cttaagcctc 1801 aataaagctt gccttgagtg cttcaagtag tgtgtgcccg tctgttgtgt gactctggta 1861 
actagagatc cctcagaccc ttttagtcag tgtggaaaat ctctagcagt agtagttcat 1921 gtcatcttat tattcagtat ttataacttg caaagaaatg aatatcagag agtgagagga 1981 acttgtttat tgcagcttat aatggttaca aataaagcaa tagcatcaca aatttcacaa 2041 ataaagcatt tttttcactg cattctagtt gtggtttgtc caaactcatc aatgtatctt 2101 atcatgtctg gctctagcta tcccgcccct aactccgccc atcccgcccc taactccgcc 2161 cagttccgcc cattctccgc cccatggctg actaattttt tttatttatg cagaggccga 2221 ggccgcctcg gcctctgagc tattccagaa gtagtgagga ggcttttttg gaggcctagg 2281 cttttgcgtc gagacgtacc caattcgccc tatagtgagt cgtattacgc gcgctcactg 2341 gccgtcgttt tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt 2401 gcagcacatc cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct 2461 tcccaacagt tgcgcagcct gaatggcgaa tggcgcgacg cgccctgtag cggcgcatta 2521 agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag cgccctagcg 2581 cccgctcctt tcgctttctt cccttccttt ctcgccacgt tcgccggctt tccccgtcaa 2641 gctctaaatc gggggctccc tttagggttc cgatttagtg ctttacggca cctcgacccc 2701 aaaaaacttg attagggtga tggttcacgt agtgggccat cgccctgata gacggttttt 2761 cgccctttga cgttggagtc cacgttcttt aatagtggac tcttgttcca aactggaaca 2821 acactcaacc ctatctcggt ctattctttt gatttataag ggattttgcc gatttcggcc 2881 tattggttaa aaaatgagct gatttaacaa aaatttaacg cgaattttaa caaaatatta 2941 acgtttacaa tttcccaggt ggcacttttc ggggaaatgt gcgcggaacc cctatttgtt 3001 tatttttcta aatacattca aatatgtatc cgctcatgag acaataaccc tgataaatgc 3061 ttcaataata ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc gcccttattc 3121 ccttttttgc ggcattttgc cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa 3181 aagatgctga agatcagttg ggtgcacgag tgggttacat cgaactggat ctcaacagcg 3241 gtaagatcct tgagagtttt cgccccgaag aacgttttcc aatgatgagc acttttaaag 3301 ttctgctatg tggcgcggta ttatcccgta ttgacgccgg gcaagagcaa ctcggtcgcc 3361 gcatacacta ttctcagaat gacttggttg agtactcacc agtcacagaa aagcatctta 3421 cggatggcat gacagtaaga gaattatgca gtgctgccat aaccatgagt gataacactg 3481 cggccaactt acttctgaca acgatcggag gaccgaagga gctaaccgct tttttgcaca 3541 acatggggga 
tcatgtaact cgccttgatc gttgggaacc ggagctgaat gaagccatac 3601 caaacgacga gcgtgacacc acgatgcctg tagcaatggc aacaacgttg cgcaaactat 3661 taactggcga actacttact ctagcttccc ggcaacaatt aatagactgg atggaggcgg 3721 ataaagttgc aggaccactt ctgcgctcgg cccttccggc tggctggttt attgctgata 3781 aatctggagc cggtgagcgt gggtctcgcg gtatcattgc agcactgggg ccagatggta 3841 agccctcccg tatcgtagtt atctacacga cggggagtca ggcaactatg gatgaacgaa 3901 atagacagat cgctgagata ggtgcctcac tgattaagca ttggtaactg tcagaccaag 3961 tttactcata tatactttag attgatttaa aacttcattt ttaatttaaa aggatctagg 4021 tgaagatcct ttttgataat ctcatgacca aaatccctta acgtgagttt tcgttccact 4081 gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg agatcctttt tttctgcgcg 4141 taatctgctg cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc 4201 aagagctacc aactcttttt ccgaaggtaa ctggcttcag cagagcgcag ataccaaata 4261 ctgtccttct agtgtagccg tagttaggcc accacttcaa gaactctgta gcaccgccta 4321 catacctcgc tctgctaatc ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc 4381 ttaccgggtt ggactcaaga cgatagttac cggataaggc gcagcggtcg ggctgaacgg 4441 ggggttcgtg cacacagccc agcttggagc gaacgaccta caccgaactg agatacctac 146 4501 agcgtgagct atgagaaagc gccacgcttc ccgaagggag aaaggcggac aggtatccgg 4561 taagcggcag ggtcggaaca ggagagcgca cgagggagct tccaggggga aacgcctggt 4621 atctttatag tcctgtcggg tttcgccacc tctgacttga gcgtcgattt ttgtgatgct 4681 cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc ggccttttta cggttcctgg 4741 ccttttgctg gccttttgct cacatgttct ttcctgcgtt atcccctgat tctgtggata 4801 accgtattac cgcctttgag tgagctgata ccgctcgccg cagccgaacg accgagcgca 4861 gcgagtcagt gagcgaggaa gcggaagagc gcccaatacg caaaccgcct ctccccgcgc 4921 gttggccgat tcattaatgc agctggcacg acaggtttcc cgactggaaa gcgggcagtg 4981 agcgcaacgc aattaatgtg agttagctca ctcattaggc accccaggct ttacacttta 5041 tgcttccggc tcgtatgttg tgtggaattg tgagcggata acaatttcac acaggaaaca 5101 gctatgacca tgattacgcc aagcgcgcaa ttaaccctca ctaaagggaa caaaagctgg 5161 agctgcaagc ttggccattg catacgttgt atccatatca taatatgtac atttatattg 5221 gctcatgtcc 
aacattaccg ccatgttgac attgattatt gactagttat taatagtaat 5281 caattacggg gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg 5341 taaatggccc gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt 5401 atgttcccat agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac 5461 ggtaaactgc ccacttggca gtacatcaag tgtatcatat gccaagtacg ccccctattg 5521 acgtcaatga cggtaaatgg cccgcctggc attatgccca gtacatgacc ttatgggact 5581 ttcctacttg gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt 5641 ggcagtacat caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc 5701 ccattgacgt caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc 5761 gtaacaactc cgccccattg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata 5821 taagcagagc tcgtttagtg aaccggggtc tctctggtta gaccagatct gagcctggga 5881 gctctctggc taactaggga acccactgct taagcctcaa taaagcttgc cttgagtgct 5941 tcaagtagtg tgtgcccgtc tgttgtgtga ctctggtaac tagagatccc tcagaccctt 6001 ttagtcagtg tggaaaatct ctagcagtgg cgcccgaaca gggacctgaa agcgaaaggg 6061 aaaccagagg agctctctcg acgcaggact cggcttgctg aagcgcgcac ggcaagaggc 6121 gaggggcggc gactggtgag tacgccaaaa attttgacta gcggaggcta gaaggagaga 6181 gatgggtgcg agagcgtcag tattaagcgg gggagaatta gatcgcgatg ggaaaaaatt 6241 cggttaaggc cagggggaaa gaaaaaatat aaattaaaac atatagtatg ggcaagcagg 6301 gagctagaac gattcgcagt taatcctggc ctgttagaaa catcagaagg ctgtagacaa 6361 atactgggac agctacaacc atcccttcag acaggatcag aagaacttag atcattatat 6421 aatacagtag caaccctcta ttgtgtgcat caaaggatag agataaaaga caccaaggaa 6481 gctttagaca agatagagga agagcaaaac aaaagtaaga ccaccgcaca gcaagcggcc 6541 gctgatcttc agacctggag gaggagatat gagggacaat tggagaagtg aattatataa 6601 atataaagta gtaaaaattg aaccattagg agtagcaccc accaaggcaa agagaagagt 6661 ggtgcagaga gaaaaaagag cagtgggaat aggagctttg ttccttgggt tcttgggagc 6721 agcaggaagc actatgggcg cagcctcaat gacgctgacg gtacaggcca gacaattatt 6781 gtctggtata gtgcagcagc agaacaattt gctgagggct attgaggcgc aacagcatct 6841 gttgcaactc acagtctggg gcatcaagca gctccaggca agaatcctgg ctgtggaaag 6901 atacctaaag gatcaacagc 
tcctggggat ttggggttgc tctggaaaac tcatttgcac 6961 cactgctgtg ccttggaatg ctagttggag taataaatct ctggaacaga ttggaatcac 7021 acgacctgga tggagtggga cagagaaatt aacaattaca caagcttaat acactcctta 7081 attgaagaat cgcaaaacca gcaagaaaag aatgaacaag aattattgga attagataaa 7141 tgggcaagtt tgtggaattg gtttaacata acaaattggc tgtggtatat aaaattattc 7201 ataatgatag taggaggctt ggtaggttta agaatagttt ttgctgtact ttctatagtg 7261 aatagagtta ggcagggata ttcaccatta tcgtttcaga cccacctccc aaccccgagg 7321 ggacccgaca ggcccgaagg aatagaagaa gaaggtggag agagagacag agacagatcc 7381 attcgattag tgaacggatc tcgacggtat cgataagcta attcacaaat ggcagtattc 7441 atccacaatt ttaaaagaaa aggggggatt ggggggtaca gtgcagggga aagaatagta 7501 gacataatag caacagacat acaaactaaa gaattacaaa aacaaattac aaaaattcaa 7561 aattttcggg tttattacag ggacagcaga gatccagttt gggaattagc ttgatcgatt 7621 agtccaattt gttaaagaca ggatatcagt ggtccaggct ctagttttga ctcaacaata 7681 tcaccagctg aagcctatag agtacgagcc atagatagaa taaaagattt tatttagtct 7741 ccagaaaaag gggggaatga aagaccccac ctgtaggttt ggcaagctag gatcaaggtt 147 7801 aggaacagag agacagcaga atatgggcca aacaggatat ctgtggtaag cagttcctgc 7861 cccggctcag ggccaagaac agttggaaca gcagaatatg ggccaaacag gatatctgtg 7921 gtaagcagtt cctgccccgg ctcagggcca agaacagatg gtccccagat gcggtcccgc 7981 cctcagcagt ttctagagaa ccatcagatg tttccagggt gccccaagga cctgaaatga 8041 ccctgtgcct tatttgaact aaccaatcag ttcgcttctc gcttctgttc gcgcgcttct 8101 gctccccgag ctcaataaaa gagcccacaa cccctcactc ggcgcgatct agatctcgaa 8161 tcgaatttta attaaattgc cgccatggga tccggaagcg gagagggcag aggaagtctg 8221 ctaacatgcg gtgacgtcga ggagaatcct ggacctatga ttgaacaaga tggattgcac 8281 gcaggttctc cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca 8341 atcggctgct ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt 8401 gtcaagaccg acctgtccgg tgccctgaat gaactgcaag acgaggcagc gcggctatcg 8461 tggctggcca cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga 8521 agggactggc tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct 8581 cctgccgaga aagtatccat 
catggctgat gcaatgcggc ggctgcatac gcttgatccg 8641 gctacctgcc cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg 8701 gaagccggtc ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc 8761 gaactgttcg ccaggctcaa ggcgagcatg cccgacggcg aggatctcgt cgtgacccat 8821 ggcgatgcct gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac 8881 tgtggccggc tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt 8941 gctgaagagc ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct 9001 cccgattcgc agcgcatcgc cttctatcgc cttcttgacg agttcttct // B.1.2. pMND-libDest-FRET2 LOCUS pMND-libDest-FRET2 9996 bp DNA circular SYN 09-AUG-2018 DEFINITION 3rd-generation lentiviral transfer vector with non-productive stuffer sequence + neoR selection cassette + FRET2 reporter. Intended as recipient vector for minigene libraries; productive minigenes restore frame and result in functional drug resistance and fluorescent protein genes. ACCESSION KEYWORDS SOURCE synthetic construct ORGANISM synthetic construct FEATURES Location/Qualifiers misc_feature 1..37 /Recognition_pattern=\"ATCTATGTCGGGTGCGGAGAAAGAGGTAATGAAATGG(-22/-26)\" /note=\"homing endonuclease site\" /label=\"PI-SceI\" misc_feature 40..45 /Recognition_pattern=\"G^CTAGC\" /note=\"restriction site\" /label=\"NheI\" sig_peptide 46..108 /label=\"T2A sequence\" misc_feature 109..114 /Recognition_pattern=\"A^CGCGT\" /note=\"restriction site\" /label=\"MluI\" CDS 115..906 /note=\"selection marker\" /label=\"neoR\" misc_feature 907..912 /Recognition_pattern=\"G^AATTC\" /note=\"restriction site\" /label=\"EcoRI\" sig_peptide 913..978 /label=\"P2A sequence\" misc_feature 979..986 /Recognition_pattern=\"GGCCGG^CC\" /note=\"restriction site\" /label=\"FseI\" CDS 988..1701 /label=\"CyPet\" misc_feature 1702..1707 /Recognition_pattern=\"A^CCGGT\" /note=\"restriction site\" /label=\"AgeI\" sig_peptide 1708..1728 /label=\"GzmB Cleavage Substrate\" misc_feature 1729..1734 /Recognition_pattern=\"GGGCC^C\" /note=\"restriction site\" /label=\"ApaI\" gene 
1735..2451 /label=\"YPet\" misc_feature 2452..2459 /Recognition_pattern=\"GG^CGCGCC\" /note=\"restriction site\" /label=\"AscI\" misc_feature 2460..2465 /Recognition_pattern=\"C^TCGAG\" /note=\"restriction site\" /label=\"XhoI\" LTR 2527..2805 /label=\"3' SIN LTR\" CDS 3984..4844 /note=\"selection marker\" /label=\"AmpR\" rep_origin 4939..5621 /label=\"ColE1\" promoter 6221..6524 /note=\"promoter eukaryotic\" /label=\"CMV enhancer\" promoter 6525..6728 /note=\"promoter eukaryotic\" /label=\"CMV promoter\" LTR 6743..6924 /label=\"5' LTR\" CDS 7079..7443 /label=\"dGAG\" motif 7589..7830 /label=\"RRE\" motif 8346..8463 /label=\"cPPT\" promoter 8655..9046 /note=\"promoter eukaryotic\" /label=\"MNDU3 Promoter\" misc_feature 9065..9072 /Recognition_pattern=\"TTAAT^TAA\" /note=\"restriction site\" /label=\"PacI\" misc_feature 9085..9090 /Recognition_pattern=\"G^GATCC\" /note=\"restriction site\" /label=\"BamHI\" promoter 9091..9109 /label=\"T7 promoter\" motif 9116..9121 /note=\"ribosomal binding site prokaryotic\" /label=\"Shine-Dalgarno sequence\" motif 9122..9131 /note=\"translational start site eukaryotic\" /label=\"Kozak sequence\" misc_feature 9133..9150 /Recognition_pattern=\"TAGGGATAACAGGGTAAT(-9/-13)\" /note=\"homing endonuclease site\" /label=\"I-SceI\" misc_feature 9151 /note=\"required for maintaining open reading frame\" /label=\"Spacer\" misc_feature 9152..9996 /note=\"arbitrary stuffer sequence\" /label=\"mActB.Ex6\" misc_feature 9646..9660 /Recognition_pattern=\"CCANNNNN^NNNNTGG\" /note=\"restriction site\" /label=\"XcmI\" ORIGIN 1 atctatgtcg ggtgcggaga aagaggtaat gaaatggttg ctagcggaag cggagagggc 61 agaggaagtc tgctaacatg cggtgacgtc gaggagaatc ctggacctac gcgtatgatt 121 gaacaagatg gattgcacgc aggttctccg gccgcttggg tggagaggct attcggctat 181 gactgggcac aacagacaat cggctgctct gatgccgccg tgttccggct gtcagcgcag 241 gggcgcccgg ttctttttgt caagaccgac ctgtccggtg ccctgaatga actgcaagac 301 gaggcagcgc ggctatcgtg gctggccacg acgggcgttc cttgcgcagc tgtgctcgac 361 
gttgtcactg aagcgggaag ggactggctg ctattgggcg aagtgccggg gcaggatctc 421 ctgtcatctc accttgctcc tgccgagaaa gtatccatca tggctgatgc aatgcggcgg 481 ctgcatacgc ttgatccggc tacctgccca ttcgaccacc aagcgaaaca tcgcatcgag 541 cgagcacgta ctcggatgga agccggtctt gtcgatcagg atgatctgga cgaagagcat 601 caggggctcg cgccagccga actgttcgcc aggctcaagg cgagcatgcc cgacggcgag 661 gatctcgtcg tgacccatgg cgatgcctgc ttgccgaata tcatggtgga aaatggccgc 721 ttttctggat tcatcgactg tggccggctg ggtgtggcgg accgctatca ggacatagcg 781 ttggctaccc gtgatattgc tgaagagctt ggcggcgaat gggctgaccg cttcctcgtg 841 ctttacggta tcgccgctcc cgattcgcag cgcatcgcct tctatcgcct tcttgacgag 901 ttcttcgaat tcggaagcgg agctactaac ttcagcctgc tgaagcaggc tggagacgtg 961 gaggagaacc ctggacctgg ccggcctatg tctaaaggtg aagaattatt cggcggtatc 1021 gtcccaattt tagttgaatt agagggtgat gttaatggtc acaaattttc tgtctccggt 1081 gaaggtgaag gtgatgctac gtacggtaaa ttgaccttaa aatttatttg tactactggt 1141 aaattgccag ttccatggcc aaccttagtc actactctga cttggggtgt tcaatgtttt 1201 tctagatacc cagatcatat gaaacaacat gactttttca agtctgtcat gccagaaggt 1261 tatgttcaag aaagaactat ttttttcaaa gatgacggta actacaagac cagagctgaa 1321 gtcaagtttg aaggtgatac cttagttaat agaatcgaat taaaaggtat tgattttaaa 1381 gaagatggta acattttagg tcacaaattg gaatacaact atatctctca caatgtttac 150 1441 atcaccgctg acaaacaaaa gaatggtatc aaagctaact tcaaagccag acacaacatt 1501 accgatggtt ctgttcaatt agctgaccat tatcaacaaa atactccaat tggtgatggt 1561 ccagtcatct tgccagacaa ccattactta tccactcaat ctgccttatc taaagatcca 1621 aacgaaaaga gagaccacat ggtcttgctc gaatttgtta ctgctgctgg tattacccat 1681 ggtatggatg aattgtacaa aaccggtgtc ggccccgact tcggcagggg gcccatgtct 1741 aaaggtgaag aattattcac tggtgttgtc ccaattttgg ttgaattaga tggtgatgtt 1801 aatggtcaca aattttctgt ctccggtgaa ggtgaaggtg atgctacgta cggtaaattg 1861 accttaaaat tactctgtac tactggtaaa ttgccagttc catggccaac cttagtcact 1921 actttaggtt atggtgttca atgttttgct agatacccag atcatatgaa acaacatgac 1981 tttttcaagt ctgccatgcc agaaggttat gttcaagaaa gaactatttt tttcaaagat 2041 gacggtaact 
acaagaccag agctgaagtc aagtttgaag gtgatacctt agttaataga 2101 atcgaattaa aaggtattga ttttaaagaa gatggtaaca ttttaggtca caaattggaa 2161 tacaactata actctcacaa tgtttacatc actgctgaca aacaaaagaa tggtatcaaa 2221 gctaacttca aaattagaca caacattgaa gatggtggtg ttcaattagc tgaccattat 2281 caacaaaata ctccaattgg tgatggtcca gtcttgttac cagacaacca ttacttatcc 2341 tatcaatctg ccttattcaa agatccaaac gaaaagagag accacatggt cttgttagaa 2401 tttttgactg ctgctggtat taccgagggt atgaatgaat tgtacaaata aggcgcgccc 2461 tcgagagatc ccccggggtc gactgatcaa attcgagctc ggtaccttta agaccaatga 2521 cttacaaggc agctgtagat cttagccact ttttaaaaga aaagggggga ctggaagggc 2581 taattcactc ccaacgaaga caagatctgc tttttgcttg tactgggtct ctctggttag 2641 accagatctg agcctgggag ctctctggct aactagggaa cccactgctt aagcctcaat 2701 aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac tctggtaact 2761 agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagtagt agttcatgtc 2821 atcttattat tcagtattta taacttgcaa agaaatgaat atcagagagt gagaggaact 2881 tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata 2941 aagcattttt ttcactgcat tctagttgtg gtttgtccaa actcatcaat gtatcttatc 3001 atgtctggct ctagctatcc cgcccctaac tccgcccatc ccgcccctaa ctccgcccag 3061 ttccgcccat tctccgcccc atggctgact aatttttttt atttatgcag aggccgaggc 3121 cgcctcggcc tctgagctat tccagaagta gtgaggaggc ttttttggag gcctaggctt 3181 ttgcgtcgag acgtacccaa ttcgccctat agtgagtcgt attacgcgcg ctcactggcc 3241 gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca 3301 gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc 3361 caacagttgc gcagcctgaa tggcgaatgg cgcgacgcgc cctgtagcgg cgcattaagc 3421 gcggcgggtg tggtggttac gcgcagcgtg accgctacac ttgccagcgc cctagcgccc 3481 gctcctttcg ctttcttccc ttcctttctc gccacgttcg ccggctttcc ccgtcaagct 3541 ctaaatcggg ggctcccttt agggttccga tttagtgctt tacggcacct cgaccccaaa 3601 aaacttgatt agggtgatgg ttcacgtagt gggccatcgc cctgatagac ggtttttcgc 3661 cctttgacgt tggagtccac gttctttaat agtggactct tgttccaaac tggaacaaca 3721 ctcaacccta tctcggtcta 
ttcttttgat ttataaggga ttttgccgat ttcggcctat 3781 tggttaaaaa atgagctgat ttaacaaaaa tttaacgcga attttaacaa aatattaacg 3841 tttacaattt cccaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat 3901 ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga taaatgcttc 3961 aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 4021 tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 4081 atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 4141 agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 4201 tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 4261 tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 4321 atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 4381 ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 4441 tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 4501 acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 4561 ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg gaggcggata 4621 aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 4681 ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca gatggtaagc 151 4741 cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 4801 gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 4861 actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 4921 agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 4981 cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 5041 tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 5101 agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 5161 tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 5221 acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 5281 ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 5341 gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 5401 gtgagctatg agaaagcgcc 
acgcttcccg aagggagaaa ggcggacagg tatccggtaa 5461 gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 5521 tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 5581 caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 5641 tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc 5701 gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg 5761 agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt 5821 ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc 5881 gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta cactttatgc 5941 ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca ggaaacagct 6001 atgaccatga ttacgccaag cgcgcaatta accctcacta aagggaacaa aagctggagc 6061 tgcaagcttg gccattgcat acgttgtatc catatcataa tatgtacatt tatattggct 6121 catgtccaac attaccgcca tgttgacatt gattattgac tagttattaa tagtaatcaa 6181 ttacggggtc attagttcat agcccatata tggagttccg cgttacataa cttacggtaa 6241 atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata atgacgtatg 6301 ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggag tatttacggt 6361 aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc cctattgacg 6421 tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta tgggactttc 6481 ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg cggttttggc 6541 agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt ctccacccca 6601 ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca aaatgtcgta 6661 acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag gtctatataa 6721 gcagagctcg tttagtgaac cggggtctct ctggttagac cagatctgag cctgggagct 6781 ctctggctaa ctagggaacc cactgcttaa gcctcaataa agcttgcctt gagtgcttca 6841 agtagtgtgt gcccgtctgt tgtgtgactc tggtaactag agatccctca gaccctttta 6901 gtcagtgtgg aaaatctcta gcagtggcgc ccgaacaggg acctgaaagc gaaagggaaa 6961 ccagaggagc tctctcgacg caggactcgg cttgctgaag cgcgcacggc aagaggcgag 7021 gggcggcgac tggtgagtac gccaaaaatt ttgactagcg gaggctagaa ggagagagat 7081 gggtgcgaga gcgtcagtat taagcggggg 
agaattagat cgcgatggga aaaaattcgg 7141 ttaaggccag ggggaaagaa aaaatataaa ttaaaacata tagtatgggc aagcagggag 7201 ctagaacgat tcgcagttaa tcctggcctg ttagaaacat cagaaggctg tagacaaata 7261 ctgggacagc tacaaccatc ccttcagaca ggatcagaag aacttagatc attatataat 7321 acagtagcaa ccctctattg tgtgcatcaa aggatagaga taaaagacac caaggaagct 7381 ttagacaaga tagaggaaga gcaaaacaaa agtaagacca ccgcacagca agcggccgct 7441 gatcttcaga cctggaggag gagatatgag ggacaattgg agaagtgaat tatataaata 7501 taaagtagta aaaattgaac cattaggagt agcacccacc aaggcaaaga gaagagtggt 7561 gcagagagaa aaaagagcag tgggaatagg agctttgttc cttgggttct tgggagcagc 7621 aggaagcact atgggcgcag cctcaatgac gctgacggta caggccagac aattattgtc 7681 tggtatagtg cagcagcaga acaatttgct gagggctatt gaggcgcaac agcatctgtt 7741 gcaactcaca gtctggggca tcaagcagct ccaggcaaga atcctggctg tggaaagata 7801 cctaaaggat caacagctcc tggggatttg gggttgctct ggaaaactca tttgcaccac 7861 tgctgtgcct tggaatgcta gttggagtaa taaatctctg gaacagattg gaatcacacg 7921 acctggatgg agtgggacag agaaattaac aattacacaa gcttaataca ctccttaatt 7981 gaagaatcgc aaaaccagca agaaaagaat gaacaagaat tattggaatt agataaatgg 152 8041 gcaagtttgt ggaattggtt taacataaca aattggctgt ggtatataaa attattcata 8101 atgatagtag gaggcttggt aggtttaaga atagtttttg ctgtactttc tatagtgaat 8161 agagttaggc agggatattc accattatcg tttcagaccc acctcccaac cccgagggga 8221 cccgacaggc ccgaaggaat agaagaagaa ggtggagaga gagacagaga cagatccatt 8281 cgattagtga acggatctcg acggtatcga taagctaatt cacaaatggc agtattcatc 8341 cacaatttta aaagaaaagg ggggattggg gggtacagtg caggggaaag aatagtagac 8401 ataatagcaa cagacataca aactaaagaa ttacaaaaac aaattacaaa aattcaaaat 8461 tttcgggttt attacaggga cagcagagat ccagtttggg aattagcttg atcgattagt 8521 ccaatttgtt aaagacagga tatcagtggt ccaggctcta gttttgactc aacaatatca 8581 ccagctgaag cctatagagt acgagccata gatagaataa aagattttat ttagtctcca 8641 gaaaaagggg ggaatgaaag accccacctg taggtttggc aagctaggat caaggttagg 8701 aacagagaga cagcagaata tgggccaaac aggatatctg tggtaagcag ttcctgcccc 8761 ggctcagggc caagaacagt tggaacagca 
gaatatgggc caaacaggat atctgtggta 8821 agcagttcct gccccggctc agggccaaga acagatggtc cccagatgcg gtcccgccct 8881 cagcagtttc tagagaacca tcagatgttt ccagggtgcc ccaaggacct gaaatgaccc 8941 tgtgccttat ttgaactaac caatcagttc gcttctcgct tctgttcgcg cgcttctgct 9001 ccccgagctc aataaaagag cccacaaccc ctcactcggc gcgatctaga tctcgaatcg 9061 aattttaatt aaattgccgc catgggatcc taatacgact cactataggg actgtaggag 9121 ggccaccatg gctagggata acagggtaat tcttgtcttg ctttcttcag atcattgctc 9181 ctcctgagcg caagtactct gtgtggatcg gtggctccat cctggcctca ctgtccacct 9241 tccagcagat gtggatcagc aagcaggagt acgatgagtc cggcccctcc atcgtgcacc 9301 gcaagtgctt ctaggcggac tgttactgag ctgcgtttta caccctttct ttgacaaaac 9361 ctaacttgcg cagaaaaaaa aaaaataaga gacaacattg gcatggcttt gtttttttaa 9421 atttttttta aagttttttt tttttttttt tttttttttt ttaagttttt ttgttttgtt 9481 ttggcgcttt tgactcagga tttaaaaact ggaacggtga aggcgacagc agttggttgg 9541 agcaaacatc ccccaaagtt ctacaaatgt ggctgaggac tttgtacatt gttttgtttt 9601 tttttttttt tggttttgtc tttttttaat agtcattcca agtatccatg aaataagtgg 9661 ttacaggaag tccctcaccc tcccaaaagc cacccccact cctaagagga ggatggtcgc 9721 gtccatgccc tgagtccacc ccggggaagg tgacagcatt gcttctgtgt aaattatgta 9781 ctgcaaaaat ttttttaaat cttccgcctt aatacttcat ttttgttttt aatttctgaa 9841 tggcccaggt ctgaggcctc cctttttttt gtccccccaa cttgatgtat gaaggctttg 9901 gtctccctgg gagggggttg aggtgttgag gcagccaggg ctggcctgta cactgacttg 9961 agaccaataa aagtgcacac cttaccttac acaaac // B.1.3. pMND-Multi LOCUS pMND-Multi 7506 bp DNA circular SYN 09-AUG-2018 DEFINITION Multi-purpose 3rd-generation lentiviral transfer vector. 
ACCESSION KEYWORDS SOURCE synthetic construct ORGANISM synthetic construct FEATURES Location/Qualifiers misc_feature 1..8 /Recognition_pattern=\"GG^CGCGCC\" /note=\"restriction site\" /label=\"AscI\" misc_feature 9..14 /Recognition_pattern=\"C^TCGAG\" /note=\"restriction site\" /label=\"XhoI\" LTR 76..354 /label=\"3' SIN LTR\" CDS 1533..2393 /label=\"AmpR\" rep_origin 2488..3170 /label=\"ColE1\" promoter 3770..4073 /note=\"promoter eukaryotic\" /label=\"CMV enhancer\" promoter 4074..4277 /note=\"promoter eukaryotic\" /label=\"CMV promoter\" LTR 4292..4473 /label=\"5' LTR\" CDS 4628..4992 /label=\"dGAG\" motif 5138..5379 /label=\"RRE\" motif 5895..6012 /label=\"cPPT\" promoter 6204..6595 /note=\"promoter eukaryotic\" /label=\"MNDU3 Promoter\" misc_feature 6614..6621 /Recognition_pattern=\"TTAAT^TAA\" /note=\"restriction site\" /label=\"PacI\" misc_feature 6634..6639 /Recognition_pattern=\"G^GATCC\" /note=\"restriction site\" /label=\"BamHI\" misc_feature 6640..6645 /Recognition_pattern=\"G^CTAGC\" /note=\"restriction site\" /label=\"NheI\" sig_peptide 6646..6708 /label=\"T2A sequence\" misc_feature 6709..6714 /Recognition_pattern=\"A^CGCGT\" /note=\"restriction site\" /label=\"MluI\" misc_feature 6715..6720 /Recognition_pattern=\"G^AATTC\" /note=\"restriction site\" /label=\"EcoRI\" sig_peptide 6721..6786 /label=\"P2A sequence\" misc_feature 6787..6794 /Recognition_pattern=\"GGCCGG^CC\" /note=\"restriction site\" /label=\"FseI\" CDS 6796..7506 /label=\"mStrawberry fluorescent protein\" ORIGIN 1 ggccggccta tggtgagcaa gggcgaggag aataacatgg ccatcatcaa ggagttcatg 61 cgcttcaagg tgcgcatgga gggctccgtg aacggccacg agttcgagat cgagggcgag 121 ggcgagggcc gcccctacga gggcacccag accgccaagc tgaaggtgac caagggtggc 181 cccctgccct tcgcctggga catcctaacc cccaacttca cctacggctc caaggcctac 241 gtgaagcacc ccgccgacat ccccgactac ttgaagctgt ccttccccga gggcttcaag 301 tgggagcgcg tgatgaactt cgaggacggc ggcgtggtga ccgtgaccca ggactcctcc 361 ctgcaggacg gcgagttcat ctacaaggtg aagctgcgcg 
gcaccaactt cccctccgac 421 ggccccgtaa tgcagaagaa gaccatgggc tgggaggcct cctccgagcg gatgtacccc 481 gaggacggcg ccctgaaggg cgagatcaag atgaggctga agctgaagga cggcggccac 541 tacgacgctg aggtcaagac cacctacaag gccaagaagc ccgtgcagct gcccggcgcc 601 tacatcgtcg gcatcaagtt ggacatcacc tcccacaacg aggactacac catcgtggaa 661 ctgtacgaac gcgccgaggg ccgccactcc accggcggca tggacgagct gtacaagtaa 721 ggcgcgccct cgagagatcc cccggggtcg actgatcaaa ttcgagctcg gtacctttaa 781 gaccaatgac ttacaaggca gctgtagatc ttagccactt tttaaaagaa aaggggggac 841 tggaagggct aattcactcc caacgaagac aagatctgct ttttgcttgt actgggtctc 901 tctggttaga ccagatctga gcctgggagc tctctggcta actagggaac ccactgctta 961 agcctcaata aagcttgcct tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact 1021 ctggtaacta gagatccctc agaccctttt agtcagtgtg gaaaatctct agcagtagta 1081 gttcatgtca tcttattatt cagtatttat aacttgcaaa gaaatgaata tcagagagtg 1141 agaggaactt gtttattgca gcttataatg gttacaaata aagcaatagc atcacaaatt 1201 tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa ctcatcaatg 1261 tatcttatca tgtctggctc tagctatccc gcccctaact ccgcccatcc cgcccctaac 1321 tccgcccagt tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga 1381 ggccgaggcc gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg 1441 cctaggcttt tgcgtcgaga cgtacccaat tcgccctata gtgagtcgta ttacgcgcgc 1501 tcactggccg tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat 1561 cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat 1621 cgcccttccc aacagttgcg cagcctgaat ggcgaatggc gcgacgcgcc ctgtagcggc 1681 gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc 1741 ctagcgcccg ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc 1801 cgtcaagctc taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc 1861 gaccccaaaa aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg 1921 gtttttcgcc ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact 1981 ggaacaacac tcaaccctat ctcggtctat tcttttgatt tataagggat tttgccgatt 2041 tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa 
2101 atattaacgt ttacaatttc ccaggtggca cttttcgggg aaatgtgcgc ggaaccccta 2161 tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat 2221 aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc 2281 ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga 2341 aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca 2401 acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg atgagcactt 2461 ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa gagcaactcg 2521 gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc 2581 atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata 2641 acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt 2701 tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag 2761 ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca 2821 aactattaac tggcgaacta cttactctag cttcccggca acaattaata gactggatgg 2881 aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg 2941 ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag 3001 atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg 3061 aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag 3121 accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa tttaaaagga 3181 tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt gagttttcgt 3241 tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat cctttttttc 3301 tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg gtttgtttgc 3361 cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga gcgcagatac 3421 caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac tctgtagcac 3481 cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt ggcgataagt 155 3541 cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag cggtcgggct 3601 gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc gaactgagat 3661 acctacagcg tgagctatga gaaagcgcca cgcttcccga agggagaaag gcggacaggt 3721 atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca gggggaaacg 3781 
cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt cgatttttgt 3841 gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc tttttacggt 3901 tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc cctgattctg 3961 tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg 4021 agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa ccgcctctcc 4081 ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac tggaaagcgg 4141 gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac 4201 actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa tttcacacag 4261 gaaacagcta tgaccatgat tacgccaagc gcgcaattaa ccctcactaa agggaacaaa 4321 agctggagct gcaagcttgg ccattgcata cgttgtatcc atatcataat atgtacattt 4381 atattggctc atgtccaaca ttaccgccat gttgacattg attattgact agttattaat 4441 agtaatcaat tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac 4501 ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa 4561 tgacgtatgt tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt 4621 atttacggta aactgcccac ttggcagtac atcaagtgta tcatatgcca agtacgcccc 4681 ctattgacgt caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttat 4741 gggactttcc tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc 4801 ggttttggca gtacatcaat gggcgtggat agcggtttga ctcacgggga tttccaagtc 4861 tccaccccat tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa 4921 aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg taggcgtgta cggtgggagg 4981 tctatataag cagagctcgt ttagtgaacc ggggtctctc tggttagacc agatctgagc 5041 ctgggagctc tctggctaac tagggaaccc actgcttaag cctcaataaa gcttgccttg 5101 agtgcttcaa gtagtgtgtg cccgtctgtt gtgtgactct ggtaactaga gatccctcag 5161 acccttttag tcagtgtgga aaatctctag cagtggcgcc cgaacaggga cctgaaagcg 5221 aaagggaaac cagaggagct ctctcgacgc aggactcggc ttgctgaagc gcgcacggca 5281 agaggcgagg ggcggcgact ggtgagtacg ccaaaaattt tgactagcgg aggctagaag 5341 gagagagatg ggtgcgagag cgtcagtatt aagcggggga gaattagatc gcgatgggaa 5401 aaaattcggt taaggccagg gggaaagaaa aaatataaat taaaacatat agtatgggca 5461 agcagggagc 
tagaacgatt cgcagttaat cctggcctgt tagaaacatc agaaggctgt 5521 agacaaatac tgggacagct acaaccatcc cttcagacag gatcagaaga acttagatca 5581 ttatataata cagtagcaac cctctattgt gtgcatcaaa ggatagagat aaaagacacc 5641 aaggaagctt tagacaagat agaggaagag caaaacaaaa gtaagaccac cgcacagcaa 5701 gcggccgctg atcttcagac ctggaggagg agatatgagg gacaattgga gaagtgaatt 5761 atataaatat aaagtagtaa aaattgaacc attaggagta gcacccacca aggcaaagag 5821 aagagtggtg cagagagaaa aaagagcagt gggaatagga gctttgttcc ttgggttctt 5881 gggagcagca ggaagcacta tgggcgcagc ctcaatgacg ctgacggtac aggccagaca 5941 attattgtct ggtatagtgc agcagcagaa caatttgctg agggctattg aggcgcaaca 6001 gcatctgttg caactcacag tctggggcat caagcagctc caggcaagaa tcctggctgt 6061 ggaaagatac ctaaaggatc aacagctcct ggggatttgg ggttgctctg gaaaactcat 6121 ttgcaccact gctgtgcctt ggaatgctag ttggagtaat aaatctctgg aacagattgg 6181 aatcacacga cctggatgga gtgggacaga gaaattaaca attacacaag cttaatacac 6241 tccttaattg aagaatcgca aaaccagcaa gaaaagaatg aacaagaatt attggaatta 6301 gataaatggg caagtttgtg gaattggttt aacataacaa attggctgtg gtatataaaa 6361 ttattcataa tgatagtagg aggcttggta ggtttaagaa tagtttttgc tgtactttct 6421 atagtgaata gagttaggca gggatattca ccattatcgt ttcagaccca cctcccaacc 6481 ccgaggggac ccgacaggcc cgaaggaata gaagaagaag gtggagagag agacagagac 6541 agatccattc gattagtgaa cggatctcga cggtatcgat aagctaattc acaaatggca 6601 gtattcatcc acaattttaa aagaaaaggg gggattgggg ggtacagtgc aggggaaaga 6661 atagtagaca taatagcaac agacatacaa actaaagaat tacaaaaaca aattacaaaa 6721 attcaaaatt ttcgggttta ttacagggac agcagagatc cagtttggga attagcttga 6781 tcgattagtc caatttgtta aagacaggat atcagtggtc caggctctag ttttgactca 156 6841 acaatatcac cagctgaagc ctatagagta cgagccatag atagaataaa agattttatt 6901 tagtctccag aaaaaggggg gaatgaaaga ccccacctgt aggtttggca agctaggatc 6961 aaggttagga acagagagac agcagaatat gggccaaaca ggatatctgt ggtaagcagt 7021 tcctgccccg gctcagggcc aagaacagtt ggaacagcag aatatgggcc aaacaggata 7081 tctgtggtaa gcagttcctg ccccggctca gggccaagaa cagatggtcc ccagatgcgg 7141 tcccgccctc 
agcagtttct agagaaccat cagatgtttc cagggtgccc caaggacctg 7201 aaatgaccct gtgccttatt tgaactaacc aatcagttcg cttctcgctt ctgttcgcgc 7261 gcttctgctc cccgagctca ataaaagagc ccacaacccc tcactcggcg cgatctagat 7321 ctcgaatcga attttaatta aattgccgcc atgggatccg ctagcggaag cggagagggc 7381 agaggaagtc tgctaacatg cggtgacgtc gaggagaatc ctggacctac gcgtgaattc 7441 ggaagcggag ctactaactt cagcctgctg aagcaggctg gagacgtgga ggagaaccct 7501 ggacct // B.1.4. pCMV-ΔR8.91 LOCUS pCMV-ΔR8.91 12150 bp DNA circular SYN 22-AUG-2017 DEFINITION 2nd-generation lentiviral packaging vector ACCESSION KEYWORDS SOURCE synthetic construct ORGANISM synthetic construct FEATURES Location/Qualifiers promoter 1..580 /label=\"CMV enhancer/immediate-early promoter\" ORF 855..2354 /note=\"polyprotein\" /label=\"gag\" CDS 855..1250 /note=\"MA, matrix protein\" /label=\"p17\" CDS 1251..1943 /note=\"CA, capsid protein\" /label=\"p24\" CDS 1944..1985 /label=\"spacer peptide 1\" CDS 1986..2150 /note=\"NC, nucleocapsid protein\" /label=\"p7\" ORF 2150..5159 /note=\"polyprotein\" /label=\"pol\" CDS 2151..2173 /label=\"spacer peptide 2\" CDS 2174..2317 /label=\"P6 protein\" CDS 2318..2614 /note=\"PR, protease\" /label=\"p10\" CDS 2619..4294 /note=\"RT, reverse transcriptase/RNase H\" /label=\"p51\" CDS 4295..5159 /note=\"IN, integrase\" /label=\"p32\" 157 misc_difference 5106..5176 /label=\"Δvif\" misc_difference 5169..5217 /label=\"Δvpr\" CDS 5255..5469 /label=\"tat\" CDS 5394..5469 /label=\"rev\" misc_difference 5489..5580 /label=\"Δvpu\" misc_difference 5589..6759 /label=\"Δenv\" motif 5746..5950 /label=\"RRE\" CDS 6345..6617 /label=\"rev\" CDS 6345..6388 /label=\"tat\" polyA_signal 6782..6989 /label=\"Rat INS2 polyA\" CDS 8052..8908 /note=\"selection marker\" /label=\"AmpR\" rep_origin 9006..9685 /note=\"ColE1\" rep_origin 9929..10072 /label=\"SV40\" CDS 10246..11679 /note=\"selection marker\" /label=\"gpt\" ORIGIN 1 ttgattattg actagttatt aatagtaatc aattacgggg tcattagttc atagcccata 61 tatggagttc cgcgttacat aacttacggt 
aaatggcccg cctggctgac cgcccaacga 121 cccccgccca ttgacgtcaa taatgacgta tgttcccata gtaacgccaa tagggacttt 181 ccattgacgt caatgggtgg agtatttacg gtaaactgcc cacttggcag tacatcaagt 241 gtatcatatg ccaagtacgc cccctattga cgtcaatgac ggtaaatggc ccgcctggca 301 ttatgcccag tacatgacct tatgggactt tcctacttgg cagtacatct acgtattagt 361 catcgctatt accatggtga tgcggttttg gcagtacatc aatgggcgtg gatagcggtt 421 tgactcacgg ggatttccaa gtctccaccc cattgacgtc aatgggagtt tgttttggca 481 ccaaaatcaa cgggactttc caaaatgtcg taacaactcc gccccattga cgcaaatggg 541 cggtaggcgt gtacggtggg aggtctatat aagcagagct cgtttagtga accgtcagat 601 cgcctggaga cgccatccac gctgttttga cctccataga agacaccggg accgatccag 661 cctccgcggc cgggaacggt gcattggaac gcggattccc cgtgccaaga gtgacgtaag 721 taccgcctat agagtctata ggcccacccc cttggcttct tatgcgacgg atcgatcccg 781 taataagctt cgaggtccgc ggccggccgc gttgacgcgc acggcaagag gcgaggggcg 841 gcgactggtg agagatgggt gcgagagcgt cagtattaag cgggggagaa ttagatcgat 901 gggaaaaaat tcggttaagg ccagggggaa agaaaaaata taaattaaaa catatagtat 961 gggcaagcag ggagctagaa cgattcgcag ttaatcctgg cctgttagaa acatcagaag 1021 gctgtagaca aatactggga cagctacaac catcccttca gacaggatca gaagaactta 1081 gatcattata taatacagta gcaaccctct attgtgtgca tcaaaggata gagataaaag 1141 acaccaagga agctttagac aagatagagg aagagcaaaa caaaagtaag aaaaaagcac 1201 agcaagcagc agctgacaca ggacacagca atcaggtcag ccaaaattac cctatagtgc 1261 agaacatcca ggggcaaatg gtacatcagg ccatatcacc tagaacttta aatgcatggg 1321 taaaagtagt agaagagaag gctttcagcc cagaagtgat acccatgttt tcagcattat 1381 cagaaggagc caccccacaa gatttaaaca ccatgctaaa cacagtgggg ggacatcaag 158 1441 cagccatgca aatgttaaaa gagaccatca atgaggaagc tgcagaatgg gatagagtgc 1501 atccagtgca tgcagggcct attgcaccag gccagatgag agaaccaagg ggaagtgaca 1561 tagcaggaac tactagtacc cttcaggaac aaataggatg gatgacacat aatccaccta 1621 tcccagtagg agaaatctat aaaagatgga taatcctggg attaaataaa atagtaagaa 1681 tgtatagccc taccagcatt ctggacataa gacaaggacc aaaggaaccc tttagagact 1741 atgtagaccg attctataaa actctaagag ccgagcaagc ttcacaagag 
gtaaaaaatt 1801 ggatgacaga aaccttgttg gtccaaaatg cgaacccaga ttgtaagact attttaaaag 1861 cattgggacc aggagcgaca ctagaagaaa tgatgacagc atgtcaggga gtggggggac 1921 ccggccataa agcaagagtt ttggctgaag caatgagcca agtaacaaat ccagctacca 1981 taatgataca gaaaggcaat tttaggaacc aaagaaagac tgttaagtgt ttcaattgtg 2041 gcaaagaagg gcacatagcc aaaaattgca gggcccctag gaaaaagggc tgttggaaat 2101 gtggaaagga aggacaccaa atgaaagatt gtactgagag acaggctaat tttttaggga 2161 agatctggcc ttcccacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa 2221 cagccccacc agaagagagc ttcaggtttg gggaagagac aacaactccc tctcagaagc 2281 aggagccgat agacaaggaa ctgtatcctt tagcttccct cagatcactc tttggcagcg 2341 acccctcgtc acaataaaga taggggggca attaaaggaa gctctattag atacaggagc 2401 agatgataca gtattagaag aaatgaattt gccaggaaga tggaaaccaa aaatgatagg 2461 gggaattgga ggttttatca aagtaagaca gtatgatcag atactcatag aaatctgcgg 2521 acataaagct ataggtacag tattagtagg acctacacct gtcaacataa ttggaagaaa 2581 tctgttgact cagattggct gcactttaaa ttttcccatt agtcctattg agactgtacc 2641 agtaaaatta aagccaggaa tggatggccc aaaagttaaa caatggccat tgacagaaga 2701 aaaaataaaa gcattagtag aaatttgtac agaaatggaa aaggaaggaa aaatttcaaa 2761 aattgggcct gaaaatccat acaatactcc agtatttgcc ataaagaaaa aagacagtac 2821 taaatggaga aaattagtag atttcagaga acttaataag agaactcaag atttctggga 2881 agttcaatta ggaataccac atcctgcagg gttaaaacag aaaaaatcag taacagtact 2941 ggatgtgggc gatgcatatt tttcagttcc cttagataaa gacttcagga agtatactgc 3001 atttaccata cctagtataa acaatgagac accagggatt agatatcagt acaatgtgct 3061 tccacaggga tggaaaggat caccagcaat attccagtgt agcatgacaa aaatcttaga 3121 gccttttaga aaacaaaatc cagacatagt catctatcaa tacatggatg atttgtatgt 3181 aggatctgac ttagaaatag ggcagcatag aacaaaaata gaggaactga gacaacatct 3241 gttgaggtgg ggatttacca caccagacaa aaaacatcag aaagaacctc cattcctttg 3301 gatgggttat gaactccatc ctgataaatg gacagtacag cctatagtgc tgccagaaaa 3361 ggacagctgg actgtcaatg acatacagaa attagtggga aaattgaatt gggcaagtca 3421 gatttatgca gggattaaag taaggcaatt atgtaaactt cttaggggaa ccaaagcact 
3481 aacagaagta gtaccactaa cagaagaagc agagctagaa ctggcagaaa acagggagat 3541 tctaaaagaa ccggtacatg gagtgtatta tgacccatca aaagacttaa tagcagaaat 3601 acagaagcag gggcaaggcc aatggacata tcaaatttat caagagccat ttaaaaatct 3661 gaaaacagga aagtatgcaa gaatgaaggg tgcccacact aatgatgtga aacaattaac 3721 agaggcagta caaaaaatag ccacagaaag catagtaata tggggaaaga ctcctaaatt 3781 taaattaccc atacaaaagg aaacatggga agcatggtgg acagagtatt ggcaagccac 3841 ctggattcct gagtgggagt ttgtcaatac ccctccctta gtgaagttat ggtaccagtt 3901 agagaaagaa cccataatag gagcagaaac tttctatgta gatggggcag ccaataggga 3961 aactaaatta ggaaaagcag gatatgtaac tgacagagga agacaaaaag ttgtccccct 4021 aacggacaca acaaatcaga agactgagtt acaagcaatt catctagctt tgcaggattc 4081 gggattagaa gtaaacatag tgacagactc acaatatgca ttgggaatca ttcaagcaca 4141 accagataag agtgaatcag agttagtcag tcaaataata gagcagttaa taaaaaagga 4201 aaaagtctac ctggcatggg taccagcaca caaaggaatt ggaggaaatg aacaagtaga 4261 taaattggtc agtgctggaa tcaggaaagt actattttta gatggaatag ataaggccca 4321 agaagaacat gagaaatatc acagtaattg gagagcaatg gctagtgatt ttaacctacc 4381 acctgtagta gcaaaagaaa tagtagccag ctgtgataaa tgtcagctaa aaggggaagc 4441 catgcatgga caagtagact gtagcccagg aatatggcag ctagattgta cacatttaga 4501 aggaaaagtt atcttggtag cagttcatgt agccagtgga tatatagaag cagaagtaat 4561 tccagcagag acagggcaag aaacagcata cttcctctta aaattagcag gaagatggcc 4621 agtaaaaaca gtacatacag acaatggcag caatttcacc agtactacag ttaaggccgc 4681 ctgttggtgg gcggggatca agcaggaatt tggcattccc tacaatcccc aaagtcaagg 159 4741 agtaatagaa tctatgaata aagaattaaa gaaaattata ggacaggtaa gagatcaggc 4801 tgaacatctt aagacagcag tacaaatggc agtattcatc cacaatttta aaagaaaagg 4861 ggggattggg gggtacagtg caggggaaag aatagtagac ataatagcaa cagacataca 4921 aactaaagaa ttacaaaaac aaattacaaa aattcaaaat tttcgggttt attacaggga 4981 cagcagagat ccagtttgga aaggaccagc aaagctcctc tggaaaggtg aaggggcagt 5041 agtaatacaa gataatagtg acataaaagt agtgccaaga agaaaagcaa agatcatcag 5101 ggattatgga aaacagatgg caggtgatga ttgtgtggca agtagacagg atgaggatta 5161 
acacatggaa ttctgcaaca actgctgttt atccatttca gaattgggtg tcgacatagc 5221 agaataggcg ttactcgaca gaggagagca agaaatggag ccagtagatc ctagactaga 5281 gccctggaag catccaggaa gtcagcctaa aactgcttgt accaattgct attgtaaaaa 5341 gtgttgcttt cattgccaag tttgtttcat gacaaaagcc ttaggcatct cctatggcag 5401 gaagaagcgg agacagcgac gaagagctca tcagaacagt cagactcatc aagcttctct 5461 atcaaagcag taagtagtac atgtaatgca acctataata gtagcaatag tagcattagt 5521 agtagcaata ataatagcaa tagttgtgtg gtccatagta atcatagaat ataggaaaat 5581 ggccgctgat cttcagacct ggaggaggag atatgaggga caattggaga agtgaattat 5641 ataaatataa agtagtaaaa attgaaccat taggagtagc acccaccaag gcaaagagaa 5701 gagtggtgca gagagaaaaa agagcagtgg gaataggagc tttgttcctt gggttcttgg 5761 gagcagcagg aagcactatg ggcgcagcgt caatgacgct gacggtacag gccagacaat 5821 tattgtctgg tatagtgcag cagcagaaca atttgctgag ggctattgag gcgcaacagc 5881 atctgttgca actcacagtc tggggcatca agcagctcca ggcaagaatc ctggctgtgg 5941 aaagatacct aaaggatcaa cagctcctgg ggatttgggg ttgctctgga aaactcattt 6001 gcaccactgc tgtgccttgg aatgctagtt ggagtaataa atctctggaa cagatttgga 6061 atcacacgac ctggatggag tgggacagag aaattaacaa ttacacaagc ttaatacact 6121 ccttaattga agaatcgcaa aaccagcaag aaaagaatga acaagaatta ttggaattag 6181 ataaatgggc aagtttgtgg aattggttta acataacaaa ttggctgtgg tatataaaat 6241 tattcataat gatagtagga ggcttggtag gtttaagaat agtttttgct gtactttcta 6301 tagtgaatag agttaggcag ggatattcac cattatcgtt tcagacccac ctcccaaccc 6361 cgaggggacc cgacaggccc gaaggaatag aagaagaagg tggagagaga gacagagaca 6421 gatccattcg attagtgaac ggatccttgg cacttatctg ggacgatctg cggagcctgt 6481 gcctcttcag ctaccaccgc ttgagagact tactcttgat tgtaacgagg attgtggaac 6541 ttctgggacg cagggggtgg gaagccctca aatattggtg gaatctccta caatattgga 6601 gtcaggagct aaagaatagt gctgttagct tgctcaatgc cacagccata gcagtagctg 6661 aggggacaga tagggttata gaagtagtac aaggagcttg tagagctatt cgccacatac 6721 ctagaagaat aagacagggc ttggaaagga ttttgctata agctcgaggc cgccccggtg 6781 accttcagac cttggcactg gaggtggccc ggcagaagcg cggcatcgtg gatcagtgct 6841 gcaccagcat 
ctgctctctc taccaactgg agaactactg caactaggcc caccactacc 6901 ctgtccaccc ctctgcaatg aataaaacct ttgaaagagc actacaagtt gtgtgtacat 6961 gcgtgcatgt gcatatgtgg tgcgggggga acatgagtgg ggctggctgg agtggcgatg 7021 ataagctgtc aaacatgaga attaattctt gaagacgaaa gggcctcgtg atacgcctat 7081 ttttataggt taatgtcatg ataataatgg tttcttagtc tagaattaat tccgtgtatt 7141 ctatagtgtc acctaaatcg tatgtgtatg atacataagg ttatgtatta attgtagccg 7201 cgttctaacg acaatatgta caagcctaat tgtgtagcat ctggcttact gaagcagacc 7261 ctatcatctc tctcgtaaac tgccgtcaga gtcggtttgg ttggacgaac cttctgagtt 7321 tctggtaacg ccgtcccgca cccggaaatg gtcagcgaac caatcagcag ggtcatcgct 7381 agccagatcc tctacgccgg acgcatcgtg gccggcatca ccggcgccac aggtgcggtt 7441 gctggcgcct atatcgccga catcaccgat ggggaagatc gggctcgcca cttcgggctc 7501 atgagcgctt gtttcggcgt gggtatggtg gcaggccccg tggccggggg actgttgggc 7561 gccatctcct tgcatgcacc attccttgcg gcggcggtgc tcaacggcct caacctacta 7621 ctgggctgct tcctaatgca ggagtcgcat aagggagagc gtcgaatggt gcactctcag 7681 tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga 7741 cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc 7801 cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg 7861 cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc 7921 aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca 7981 ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa 160 8041 aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 8101 ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca 8161 gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag 8221 ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc 8281 ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca 8341 gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 8401 aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct 8461 gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt 8521 aactcgcctt 
gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga 8581 caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact 8641 tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 8701 acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga 8761 gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt 8821 agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga 8881 gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact 8941 ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 9001 taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 9061 agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 9121 aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 9181 ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc ttctagtgta 9241 gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 9301 aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 9361 aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 9421 gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 9481 aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 9541 aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 9601 cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 9661 cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 9721 tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 9781 tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 9841 ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 9901 atgcagctgt ggaatgtgtg tcagttaggg tgtggaaagt ccccaggctc cccagcaggc 9961 agaagtatgc aaagcatgca tctcaattag tcagcaacca ggtgtggaaa gtccccaggc 10021 tccccagcag gcagaagtat gcaaagcatg catctcaatt agtcagcaac catagtcccg 10081 cccctaactc cgcccatccc gcccctaact ccgcccagtt ccgcccattc tccgccccat 10141 ggctgactaa ttttttttat ttatgcagag gccgaggccg cctcggcctc tgagctattc 10201 cagaagtagt 
gaggaggctt ttttggaggc ctaggctttt gcaaaaagct tggacacaag 10261 acaggcttgc gagatatgtt tgagaatacc actttatccc gcgtcaggga gaggcagtgc 10321 gtaaaaagac gcggactcat gtgaaatact ggtttttagt gcgccagatc tctataatct 10381 cgcgcaacct attttcccct cgaacacttt ttaagccgta gataaacagg ctgggacact 10441 tcacatgagc gaaaaataca tcgtcacctg ggacatgttg cagatccatg cacgtaaact 10501 cgcaagccga ctgatgcctt ctgaacaatg gaaaggcatt attgccgtaa gccgtggcgg 10561 tctgtaccgg gtgcgttact ggcgcgtgaa ctgggtattc gtcatgtcga taccgtttgt 10621 atttccagct acgatcacga caaccagcgc gagcttaaag tgctgaaacg cgcagaaggc 10681 gatggcgaag gcttcatcgt tattgatgac ctggtggata ccggtggtac tgcggttgcg 10741 attcgtgaaa tgtatccaaa agcgcacttt gtcaccatct tcgcaaaacc ggctggtcgt 10801 ccgctggttg atgactatgt tgttgatatc ccgcaagata cctggattga acagccgtgg 10861 gatatgggcg tcgtattcgt cccgccaatc tccggtcgct aatcttttca acgcctggca 10921 ctgccgggcg ttgttctttt taacttcagg cgggttacaa tagtttccag taagtattct 10981 ggaggctgca tccatgacac aggcaaacct gagcgaaacc ctgttcaaac cccgctttaa 11041 acatcctgaa acctcgacgc tagtccgccg ctttaatcac ggcgcacaac cgcctgtgca 11101 gtcggccctt gatggtaaaa ccatccctca ctggtatcgc atgattaacc gtctgatgtg 11161 gatctggcgc ggcattgacc cacgcgaaat cctcgacgtc caggcacgta ttgtgatgag 11221 cgatgccgaa cgtaccgacg atgatttata cgatacggtg attggctacc gtggcggcaa 11281 ctggatttat gagtgggccc cggatctttg tgaaggaacc ttacttctgt ggtgtgacat 161 11341 aattggacaa actacctaca gagatttaaa gctctaaggt aaatataaaa tttttaaccc 11401 ggatctttgt gaaggaacct tacttctgtg gtgtgacata attggacaaa ctacctacag 11461 agatttaaag ctctaaggta aatataaaat ttttaagtgt ataatgtgtt aaactactga 11521 ttctaattgt ttgtgtattt tagattccaa cctatggaac tgatgaatgg gagcagtggt 11581 ggaatgcctt taatgaggaa aacctgtttt gctcagaaga aatgccatct agtgatgatg 11641 aggctactgc tgactctcaa cattctactc ctccaaaaaa gaagagaaag gtagaagacc 11701 ccaaggactt tccttcagaa ttgctaagtt ttttgagtca tgctgtgttt agtaatagaa 11761 ctcttgcttg ctttgctatt tacaccacaa aggaaaaagc tgcactgcta tacaagaaaa 11821 ttatggaaaa atattctgta acctttataa gtaggcataa cagttataat 
cataacatac 11881 tgttttttct tactccacac aggcatagag tgtctgctat taataactat gctcaaaaat 11941 tgtgtacctt tagcttttta atttgtaaag gggttaataa ggaatatttg atgtatagtg 12001 ccttgactag agatcataat cagccatacc acatttgtag aggttttact tgctttaaaa 12061 aacctcccac acctccccct gaacctgaaa cataaaatga atgcaattgt tgttgttggg 12121 ctgcaggaat taattcgagc tcgcccgaca // B.1.5. pCMV-VSV-G LOCUS pCMV-VSV-G 5822 bp DNA circular SYN 09-AUG-2018 DEFINITION plasmid vector encoding vesicular stomatitis virus (VSV) G protein for pseudotyping engineered lentivirus particles ACCESSION KEYWORDS SOURCE synthetic construct ORGANISM synthetic construct FEATURES Location/Qualifiers promoter 5..581 /note=\"promoter eukaryotic\" /label=\"CMV enhancer/immediate-early promoter\" intron 858..1245 /label=\"Human B-globin intron\" CDS 1271..2803 /label=\"VSV-G CDS\" 3'UTR 2804..2913 /label=\"VSV-G UTR\" 3'UTR 2926..3694 /label=\"Human B-globin UTR\" rep_origin 3896..4575 /label=\"ColE1\" CDS complement(4670..5530) /note=\"selection marker\" /label=\"AmpR\" ORIGIN 1 gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 61 gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 121 ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 181 ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 241 atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 301 cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 361 tattagtcat cgctattacc atggtgatgc ggttttggca gtacatcaat gggcgtggat 421 agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 481 tttggcacca aaatcaacgg gactttccaa aatgtcgtaa caactccgcc ccattgacgc 541 aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 601 gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 661 gatccagcct cccctcgaag cttacatgtg gtaccgagct cggatcctga gaacttcagg 162 721 gtgagtctat gggacccttg atgttttctt tccccttctt ttctatggtt aagttcatgt 781 cataggaagg ggagaagtaa cagggtacac atattgacca aatcagggta 
attttgcatt 841 tgtaatttta aaaaatgctt tcttctttta atatactttt ttgtttatct tatttctaat 901 actttcccta atctctttct ttcagggcaa taatgataca atgtatcatg cctctttgca 961 ccattctaaa gaataacagt gataatttct gggttaaggc aatagcaata tttctgcata 1021 taaatatttc tgcatataaa ttgtaactga tgtaagaggt ttcatattgc taatagcagc 1081 tacaatccag ctaccattct gcttttattt tatggttggg ataaggctgg attattctga 1141 gtccaagcta ggcccttttg ctaatcatgt tcatacctct tatcttcctc ccacagctcc 1201 tgggcaacgt gctggtctgt gtgctggccc atcactttgg caaagcacgt gagatctgaa 1261 ttctgacact atgaagtgcc ttttgtactt agccttttta ttcattgggg tgaattgcaa 1321 gttcaccata gtttttccac acaaccaaaa aggaaactgg aaaaatgttc cttctaatta 1381 ccattattgc ccgtcaagct cagatttaaa ttggcataat gacttaatag gcacagcctt 1441 acaagtcaaa atgcccaaga gtcacaaggc tattcaagca gacggttgga tgtgtcatgc 1501 ttccaaatgg gtcactactt gtgatttccg ctggtatgga ccgaagtata taacacattc 1561 catccgatcc ttcactccat ctgtagaaca atgcaaggaa agcattgaac aaacgaaaca 1621 aggaacttgg ctgaatccag gcttccctcc tcaaagttgt ggatatgcaa ctgtgacgga 1681 tgccgaagca gtgattgtcc aggtgactcc tcaccatgtg ctggttgatg aatacacagg 1741 agaatgggtt gattcacagt tcatcaacgg aaaatgcagc aattacatat gccccactgt 1801 ccataactct acaacctggc attctgacta taaggtcaaa gggctatgtg attctaacct 1861 catttccatg gacatcacct tcttctcaga ggacggagag ctatcatccc tgggaaagga 1921 gggcacaggg ttcagaagta actactttgc ttatgaaact ggaggcaagg cctgcaaaat 1981 gcaatactgc aagcattggg gagtcagact cccatcaggt gtctggttcg agatggctga 2041 taaggatctc tttgctgcag ccagattccc tgaatgccca gaagggtcaa gtatctctgc 2101 tccatctcag acctcagtgg atgtaagtct aattcaggac gttgagagga tcttggatta 2161 ttccctctgc caagaaacct ggagcaaaat cagagcgggt cttccaatct ctccagtgga 2221 tctcagctat cttgctccta aaaacccagg aaccggtcct gctttcacca taatcaatgg 2281 taccctaaaa tactttgaga ccagatacat cagagtcgat attgctgctc caatcctctc 2341 aagaatggtc ggaatgatca gtggaactac cacagaaagg gaactgtggg atgactgggc 2401 accatatgaa gacgtggaaa ttggacccaa tggagttctg aggaccagtt caggatataa 2461 gtttccttta tacatgattg gacatggtat gttggactcc gatcttcatc ttagctcaaa 
2521 ggctcaggtg ttcgaacatc ctcacattca agacgctgct tcgcaacttc ctgatgatga 2581 gagtttattt tttggtgata ctgggctatc caaaaatcca atcgagcttg tagaaggttg 2641 gttcagtagt tggaaaagct ctattgcctc ttttttcttt atcatagggt taatcattgg 2701 actattcttg gttctccgag ttggtatcca tctttgcatt aaattaaagc acaccaagaa 2761 aagacagatt tatacagaca tagagatgaa ccgacttgga aagtaactca aatcctgcac 2821 aacagattct tcatgtttgg accaaatcaa cttgtgatac catgctcaaa gaggcctcaa 2881 ttatatttga gtttttaatt tttatgaaaa aaaaaaaaaa aaacggaatt caccccacca 2941 gtgcaggctg cctatcagaa agtggtggct ggtgtggcta atgccctggc ccacaagtat 3001 cactaagctc gctttcttgc tgtccaattt ctattaaagg ttcctttgtt ccctaagtcc 3061 aactactaaa ctgggggata ttatgaaggg ccttgagcat ctggattctg cctaataaaa 3121 aacatttatt ttcattgcaa tgatgtattt aaattatttc tgaatatttt actaaaaagg 3181 gaatgtggga ggtcagtgca tttaaaacat aaagaaatga agagctagtt caaaccttgg 3241 gaaaatacac tatatcttaa actccatgaa agaaggtgag gctgcaaaca gctaatgcac 3301 attggcaaca gcccctgatg cctatgcctt attcatccct cagaaaagga ttcaagtaga 3361 ggcttgattt ggaggttaaa gttttgctat gctgtatttt acattactta ttgttttagc 3421 tgtcctcatg aatgtctttt cactacccat ttgcttatcc tgcatctctc agccttgact 3481 ccactcagtt ctcttgctta gagataccac ctttcccctg aagtgttcct tccatgtttt 3541 acggcgagat ggtttctcct cgcctggcca ctcagcctta gttgtctctg ttgtcttata 3601 gaggtctact tgaagaagga aaaacagggg gcatggtttg actgtcctgt gagcccttct 3661 tccctgcctc ccccactcac agtgacccgg aatccctcga catggcagtc tagcactagt 3721 gcggccgcag atctgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc 3781 gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg 3841 caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt 3901 tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa 3961 gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct 163 4021 ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc 4081 cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg 4141 tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct 4201 
tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag 4261 cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga 4321 agtggtggcc taactacggc tacactagaa gaacagtatt tggtatctgc gctctgctga 4381 agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg 4441 gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag 4501 aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag 4561 ggattttggt catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat 4621 gaagttttaa atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct 4681 taatcagtga ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac 4741 tccccgtcgt gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa 4801 tgataccgcg agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg 4861 gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt 4921 gttgccggga agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca 4981 ttgctacagg catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt 5041 cccaacgatc aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct 5101 tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg 5161 cagcactgca taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg 5221 agtactcaac caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg 5281 cgtcaatacg ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa 5341 aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt 5401 aacccactcg tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt 5461 gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt 5521 gaatactcat actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca 5581 tgagcggata catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat 5641 ttccccgaaa agtgccacct gacgtggatc ccctgagggg gcccccatgg gctagaggat 5701 ccggcctcgg cctctgcata aataaaaaaa attagtcagc catgagcttg gcccattgca 5761 tacgttgtat ccatatcata atatgtacat ttatattggc tcatgtccaa cattaccgcc 5821 at // 164 B.2 Nucleic acid sequences inserted into plasmid backbones Insert name 
Plasmid backbone | Flanking sites | Final plasmid name | Position in final | Insert source material | # cloning steps
Ova(241-280) | pMND-silent-FRET | BamHI/EcoRI | pMND-minOva-FRET | 8,194 – 8,313 | cDNA from EG7 cells | 1
Ova(241-280|SCR) | pMND-silent-FRET | BamHI/EcoRI | pMND-minOScr-FRET | 8,194 – 8,313 | Synthesized | 1
hgp100(9-59) | pMND-silent-FRET | BamHI/EcoRI | pMND-min.hgp100-FRET | 8,194 – 8,316 | Synthesized | 1
RM library | pMND-silent-FRET | BamHI/EcoRI | pMND-RM-FRET | 8,194 – 8,241 | Synthesized | 1
HSDL1(4-43)_WT | pMND-libDest-FRET2 | I-SceI/PI-SceI | pMND-minHSDL1(WT)-FRET2 | 9,152 – 9,271 | Synthesized | 1
HSDL1(4-43)_L25Vmut | pMND-libDest-FRET2 | I-SceI/PI-SceI | pMND-minHSDL1(L25V)-FRET2 | 9,152 – 9,271 | Synthesized | 1
CD3δγεζ | pMND-Multi | PacI/AscI | pMND-hCD3-RFP | 6,622 – 9,043* | Synthesized | 2
CD8αβ | pMND-Multi | BamHI/AscI | pMND-hCD8-RFP | 6,640 – 8,076† | Synthesized/primary cDNA | 2
Wick-TCRαβ | pMND-Multi | BamHI/EcoRI | pMND-Wick.TCR-RFP | 6,640 – 8,460‡ | Primary cDNA | 2
HLAC1403 | pMND-Multi | BamHI/MluI | pMND-HLAC1403-RFP | 6,640 – 7,737§ | cDNA clone from Riken DNA Bank | 1

*Annotated BamHI and mStrawberry fluorescent protein sequence from pMND-Multi are absent in final plasmid (BamHI occurs naturally in CD3δ CDS).
†Annotated mStrawberry fluorescent protein sequence from pMND-Multi is absent in final plasmid.
‡Position of EcoRI-P2A-FseI-mStrawberry coding sequences changes from 6,715 – 7,506 in pMND-Multi to 8,461 – 9,252 in pMND-Wick.TCR-RFP.
§Position of MluI-EcoRI-P2A-FseI-mStrawberry coding sequences changes from 6,709 – 7,506 in pMND-Multi to 7,738 – 9,252 in pMND-HLAC1403-RFP.
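The position ranges in the table can be cross-checked against the insert sequences listed in this section. The sketch below is illustrative only: it takes the Ova(241-280) record, assumes (this is an inference, not something the table states) that the tabulated range 8,194 – 8,313 covers just the bases between the BamHI and EcoRI sites, and then translates the insert with the standard genetic code to locate the SIINFEKL minimal epitope. The helper names span_length and translate are hypothetical and do not come from the thesis.

```python
# Illustrative sketch only: helper names are hypothetical, and the sequence is
# copied from the >Ova(241-280) record in this section.
from itertools import product

BASES = 'TCAG'
AMINO = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
CODON_TABLE = {''.join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def span_length(span):
    # Length in bp of an inclusive range written like the table entries.
    start, end = (int(part.strip().replace(',', ''))
                  for part in span.split('–'))
    return end - start + 1


def translate(dna):
    # Translate in frame 1, ignoring any trailing partial codon.
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return ''.join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


OVA_RECORD = (
    'GGATCC'                                            # BamHI site
    'ATGTTGGTGCTGTTGCCTGATGAAGTCTCAGGCCTTGAGCAGCTTGAG'
    'AGTATAATCAACTTTGAAAAACTG'
    'ACTGAATGGACCAGTTCTAATGTTATGGAAGAGAGGAAGATCAAAGTG'
    'GAATTC'                                            # EcoRI site
)

# Assumption: the tabulated span excludes the two 6-bp flanking sites.
insert = OVA_RECORD[6:-6]
peptide = translate(insert)

print(span_length('8,194 – 8,313'), len(insert))  # tabulated vs. actual length
print(peptide)
print(peptide.find('SIINFEKL'))  # 0-based offset of the minimal epitope
```

Under these assumptions the tabulated span and the insert length agree, and the epitope offset added to the first residue number of the minigene gives the epitope position in the Ova numbering. The same check can be repeated for the other minigene records by swapping in their sequence and table range.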
Legend for Section B.2
BamHI | CD8α | HLA-C*14:03
EcoRI | CD8β | Ova(264-281) minigene
PacI | CD3δ | SIINFEKL minimal epitope
AscI | CD3γ | Scrambled SIINFEKL epitope
MluI | CD3ε | hgp100(9-59) minigene
NheI | CD3ζ | KVPRNQDWL minimal epitope
FseI | Wick TCRα | HSDL1(4-43) minigene
I-SceI | Wick TCRβ | CYMEALAL minimal epitope
PI-SceI | P2A | CYMEAVAL minimal epitope
AgeI | E2A
AsiSI | T2A

>Ova(241-280)
GGATCCATGTTGGTGCTGTTGCCTGATGAAGTCTCAGGCCTTGAGCAGCTTGAGAGTATAATCAACTTTGAAAAACTGACTGAATGGACCAGTTCTAATGTTATGGAAGAGAGGAAGATCAAAGTGGAATTC
>Ova(241-280|SCR)
GGATCCATGTTGGTGCTGTTGCCTGATGAAGTCTCAGGCCTTGAGCAGCTTGAGCTGAAAAACTTTATCAGTGAAATAACTGAATGGACCAGTTCTAATGTTATGGAAGAGAGGAAGATCAAAGTGGAATTC
>hgp100(9-59)
GGATCCCTTCTTCATTTGGCTGTGATAGGTGCTTTGCTGGCTGTGGGGGCTACAAAAGTACCCAGAAACCAGGACTGGCTTGGTGTCTCAAGGCAACTCAGAACCAAAGCCTGGAACAGGCAGCTGTATGAATTC
>RM library
GGATCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGAATTC
>HSDL1(4-43)_WT
TAGGGATAACAGGGTAATAGTTGACAGTTTCTACCTCTTGTACAGGGAAATCGCCAGGTCTTGCAATTGCTATATGGAAGCTCTAGCTTTGGTTGGAGCCTGGTATACGGCCAGAAAAAGCATCACTGTCATCTGTGACATCTATGTCGGGTGCGGAGAAAGAGGTAATGAAATGG
>HSDL1(4-43)_L25Vmut
TAGGGATAACAGGGTAATAGTTGACAGTTTCTACCTCTTGTACAGGGAAATCGCCAGGTCTTGCAATTGCTATATGGAAGCTGTAGCTTTGGTTGGAGCCTGGTATACGGCCAGAAAAAGCATCACTGTCATCTGTGACATCTATGTCGGGTGCGGAGAAAGAGGTAATGAAATGG
>CD3δγεζ
TTAATTAACGCCGCCACCATGGAACATAGCACGTTTCTCTCTGGCCTGGTACTGGCTACCCTTCTCTCGCAAGTGAGCCCCTTCAAGATACCTATAGAGGAACTTGAGGACAGAGTGTTTGTGAATTGCAATACCAGCATCACATGGGTAGAGGGAACGGTGGGAACACTGCTCTCAGACATTACAAGACTGGACCTGGGAAAACGCATCCTGGACCCACGAGGAATATATAGGTGTAATGGGACAGATATATACAAGGACAAAGAATCTACCGTGCAAGTTCATTATCGAATGTGCCAGAGCTGTGTGGAGCTGGATCCAGCCACCGTGGCTGGCATCATTGTCACTGATGTCATTGCCACTCTGCTCCTTGCTTTGGGAGTCTTCTGCTTTGCTGGACATGAGACTGGAAGGCTGTCTGGGGCTGCCGACACACAAGCTCTGTTGAGGAATGACCAGGTCTATCAGCCCCTCCGAGATCGAGATGATGCTCAGTACAGCCACCTTGGAGGAAACTGGGCTCGGAACAAGACCGGTGGAAGCGGACAGTGTACTAATTATGCTCTCTTGAAATTGGCTGGAGATGTTGAGAGCAACCCTGGACCTCGCGATCGCATGGAACAGGGGAAGGGCCTGGCTGTCCTCATCCTGGCTATCATTCTTCTTCAAGGTACTTTGGCCCAGTCAATCAAAGGAAACCACTTGGTTAAGGTGTATGACTATCAAGAAGATGGTTCGGTACTTCTGACTTGTGATGCAGAAGCCAAAAATATCACATGGTTTAAAGATGGGAAGATGATCGGCTTCCTAACTGAAGATAAAAAAAAATGGAATCTGGGAAGTAATGCCAAGGACCCTCGAGGGATGTATCAGTGTAAAGGATCACAGAACAAGTCAAAACCACTCCAAGTGTATTACAGAATGTGTCAGAACTGCATTGAACTAAATGCAGCCACCATATCTGGCTTTCTCTTTGCTGAAATCGTCAGCATTTTCGTCCTTGCTGTTGGGGTCTACTTCATTGCTGGACAGGATGGAGTTCGCCAGTCGAGAGCTTCAGACAAGCAGACTCTGTTGCCCAATGACCA166 
GCTCTACCAGCCCCTCAAGGATCGAGAAGATGACCAGTACAGCCACCTTCAAGGAAACCAGTTGAGGAGGAATGCTAGCGGAAGCGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGACCTACGCGTATGCAGTCGGGCACTCACTGGAGAGTTCTGGGCCTCTGCCTCTTATCAGTTGGTGTTTGGGGGCAAGATGGTAATGAAGAAATGGGTGGTATTACACAGACACCATATAAAGTCTCCATCTCTGGAACCACAGTAATATTGACATGCCCTCAGTATCCTGGATCTGAAATACTATGGCAACACAATGATAAAAACATAGGCGGTGATGAGGATGATAAAAACATAGGCAGTGATGAGGATCACCTGTCACTGAAGGAATTTTCAGAATTGGAGCAAAGTGGTTATTATGTCTGCTACCCCAGAGGAAGCAAACCAGAAGATGCGAACTTTTATCTCTACCTGAGGGCAAGAGTGTGTGAGAACTGCATGGAGATGGATGTGATGTCGGTGGCCACAATTGTCATAGTGGACATCTGCATCACTGGGGGCTTGCTGCTGCTGGTTTACTACTGGAGCAAGAATAGAAAGGCCAAGGCCAAGCCTGTGACACGAGGAGCGGGTGCTGGCGGCAGGCAAAGGGGACAAAACAAGGAGAGGCCACCACCTGTTCCCAACCCAGACTATGAGCCCATCCGGAAAGGCCAGCGGGACCTGTATTCTGGCCTGAATCAGAGACGCATCGAATTCGGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTAGGCCGGCCATGAAGTGGAAGGCGCTTTTCACCGCGGCCATCCTGCAGGCACAGTTGCCGATTACAGAGGCACAGAGCTTTGGCCTGCTGGATCCCAAACTCTGCTACCTGCTGGATGGAATCCTCTTCATCTATGGTGTCATTCTCACTGCCTTGTTCCTGAGAGTGAAGTTCAGCAGGAGCGCAGACGCCCCCGCGTACCAGCAGGGCCAGAACCAGCTCTATAACGAGCTCAATCTAGGACGAAGAGAGGAGTACGATGTTTTGGACAAGAGACGTGGCCGGGACCCTGAGATGGGGGGAAAGCCGCAGAGAAGGAAGAACCCTCAGGAAGGCCTGTACAATGAACTGCAGAAAGATAAGATGGCGGAGGCCTACAGTGAGATTGGGATGAAAGGCGAGCGCCGGAGGGGCAAGGGGCACGATGGCCTTTACCAGGGTCTCAGTACAGCCACCAAGGACACCTACGACGCCCTTCACATGCAGGCCCTGCCCCCTCGCGGCGCGCC >CD8αβ 
GGATCCATGGCCTTACCAGTGACCGCCTTGCTCCTGCCGCTGGCCTTGCTGCTCCACGCCGCCAGGCCGAGCCAGTTCCGGGTGTCGCCGCTGGATCGGACCTGGAACCTGGGCGAGACAGTGGAGCTGAAGTGCCAGGTGCTGCTGTCCAACCCGACGTCGGGCTGCTCGTGGCTCTTCCAGCCGCGCGGCGCCGCCGCCAGTCCCACCTTCCTCCTATACCTCTCCCAAAACAAGCCCAAGGCGGCCGAGGGGCTGGACACCCAGCGGTTCTCGGGCAAGAGGTTGGGGGACACCTTCGTCCTCACCCTGAGCGACTTCCGCCGAGAGAACGAGGGCTACTATTTCTGCTCGGCCCTGAGCAACTCCATCATGTACTTCAGCCACTTCGTGCCGGTCTTCCTGCCAGCGAAGCCCACCACGACGCCAGCGCCGCGACCACCAACACCGGCGCCCACCATCGCGTCGCAGCCCCTGTCCCTGCGCCCAGAGGCGTGCCGGCCAGCGGCGGGGGGCGCAGTGCACACGAGGGGGCTGGACTTCGCCTGTGATATCTACATCTGGGCGCCCTTGGCCGGGACTTGTGGGGTCCTTCTCCTGTCACTGGTTATCACCCTTTACTGCAACCACAGGAACCGAAGACGTGTTTGCAAATGTCCCCGGCCTGTGGTCAAATCGGGAGACAAGCCCAGCCTTTCGGCGAGATACGTCGCTAGCGGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGAGACGTGGAGGAGAACCCTGGACCTACGCGTATGCGGCCGCGGCTGTGGCTCCTCTTGGCCGCGCAGCTGACAGTTCTCCATGGCAACTCAGTCCTCCAGCAGACCCCTGCATACATAAAGGTGCAAACCAACAAGATGGTGATGCTGTCCTGCGAGGCTAAAATCTCCCTCAGTAACATGCGCATCTACTGGCTGAGACAGCGCCAGGCACCGAGCAGTGACAGTCACCACGAGTTCCTGGCCCTCTGGGATTCCGCAAAAGGGACTATCCACGGTGAAGAGGTGGAACAGGAGAAGATAGCTGTGTTTCGGGATGCAAGCCGGTTCATTCTCAATCTCACAAGCGTGAAGCCGGAAGACAGTGGCATCTACTTCTGCATGATCGTCGGGAGCCCCGAGCTGACCTTCGGGAAGGGAACTCAGCTGAGTGTGGTTGATTTCCTTCCCACCACTGCCCAGCCCACCAAGAAGTCCACCCTCAAGAAGAGAGTGTGCCGGTTACCCAGGCCAGAGACCCAGAAGGGCCCACTTTGTAGCCCCATCACCCTTGGCCTGCTGGTGGCTGGCGTCCTGGTTCTGCTGGTT167 TCCCTGGGAGTGGCCATCCACCTGTGCTGCCGGCGGAGGAGAGCCCGGCTTCGTTTCATGAAACAATTTTACAAAGAATTCAATATGGCCGGCCTTTAAGGCGCGCC >Wick-TCRαβ 
GGATCCATGAAGTTGGTGACAAGCATTACTGTACTCCTATCTTTGGGTATTATGGGTGATGCTAAGACCACACAGCCAAATTCAATGGAGAGTAACGAAGAAGAGCCTGTTCACTTGCCTTGTAACCACTCCACAATCAGTGGAACTGATTACATACATTGGTATCGACAGCTTCCCTCCCAGGGTCCAGAGTACGTGATTCATGGTCTTACAAGCAATGTGAACAACAGAATGGCCTCTCTGGCAATCGCTGAAGACAGAAAGTCCAGTACCTTGATCCTGCACCGTGCTACCTTGAGAGATGCTGCTGTGTACTACTGCATCCTGTACTACTATGGAGGAAGCCAAGGAAATCTCATCTTTGGAAAAGGCACTAAACTCTCTGTTAAACCAAATATCCAGAACCCTGACCCTGCCGTGTACCAGCTGAGAGACTCTAAATCCAGTGACAAGTCTGTCTGCCTATTCACCGATTTTGATTCTCAAACAAATGTGTCACAAAGTAAGGATTCTGATGTGTATATCACAGACAAAACTGTGCTAGACATGAGGTCTATGGACTTCAAGAGCAACAGTGCTGTGGCCTGGAGCAACAAATCTGACTTTGCATGTGCAAACGCCTTCAACAACAGCATTATTCCAGAAGACACCTTCTTCCCCAGCCCAGAAAGTTCCTGTGATGTCAAGCTGGTCGAGAAAAGCTTTGAAACAGATACGAACCTAAACTTTCAAAACCTGTCAGTGATTGGGTTCCGAATCCTCCTCCTGAAAGTGGCCGGGTTTAATCTGCTCATGACGCTGCGGCTGTGGTCCAGCGCTAGCGGAAGCGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGACCTACGCGTATGAGCATCAGCCTCCTGTGCTGTGCAGCCTTTCCTCTCCTGTGGGCAGGTCCAGTGAATGCTGGTGTCACTCAGACCCCAAAATTCCGCATCCTGAAGATAGGACAGAGCATGACACTGCAGTGTACCCAGGATATGAACCATAACTACATGTACTGGTATCGACAAGACCCAGGCATGGGGCTGAAGCTGATTTATTATTCAGTTGGTGCTGGTATCACTGATAAAGGAGAAGTCCCGAATGGCTACAACGTCTCCAGATCAACCACAGAGGATTTCCCGCTCAGGCTGGAGTTGGCTGCTCCCTCCCAGACATCTGTGTACTTCTGTGCCAGCAGTTGGGGCGGGAGGGCCGTAGAGACCCAGTACTTCGGGCCAGGCACGCGGCTCCTGGTGCTCGAGGACCTGAAAAACGTGTTCCCACCCGAGGTCGCTGTGTTTGAGCCATCAGAAGCAGAGATCTCCCACACCCAAAAGGCCACACTGGTGTGCCTGGCCACAGGCTTCTACCCCGACCACGTGGAGCTGAGCTGGTGGGTGAATGGGAAGGAGGTGCACAGTGGGGTCAGCACAGACCCGCAGCCCCTCAAGGAGCAGCCCGCCCTCAATGACTCCAGATACTGCCTGAGCAGCCGCCTGAGGGTCTCGGCCACCTTCTGGCAGAACCCCCGCAACCACTTCCGCTGTCAAGTCCAGTTCTACGGGCTCTCGGAGAATGACGAGTGGACCCAGGATAGGGCCAAACCTGTCACCCAGATCGTCAGCGCCGAGGCCTGGGGTAGAGCAGACTGTGGCTTCACCTCCGAGTCTTACCAGCAAGGGGTCCTGTCTGCCACCATCCTCTATGAGATCTTGCTAGGGAAGGCCACCTTGTATGCCGTGCTGGTCAGTGCCCTCGTGCTGATGGCCATGGTCAAGAGAAAGGATTCCAGAGGCGAATTC >HLAC1403 
GGATCCATGCGGGTCATGGCGCCCCGAACCCTCATCCTGCTGCTCTCGGGAGCCCTGGCCCTGACCGAGACCTGGGCCTGCTCCCACTCCATGAGGTATTTCTCCACATCCGTGTCCCGGCCCGGCCGCGGGGAGCCCCACTTCATCGCAGTGGGCTACGTGGACGACACGCAGTTCGTGCGGTTCGACAGCGACGCCGCGAGTCCAAGAGGGGAGCCGCGGGCGCCGTGGGTGGAGCAGGAGGGGCCGGAGTATTGGGACCGGGAGACACAGAAGTACAAGCGCCAGGCACAGACTGACCGAGTGAGCCTGCGGAACCTGCGCGGCTACTACAACCAGAGCGAGGCCGGGTCTCACACCCTCCAGTGGATGTTTGGCTGCGACCTGGGGCCCGACGGGCGCCTCCTCCGCGGGTATGACCAGTCCGCCTACGACGGCAAGGATTACATCGCCCTGAACGAGGATCTGCGCTCCTGGACCGCCGCGGACACGGCGGCTCAGATCACCCAGCGCAAGTGGGAGGCGGCCCGTGAGGCGGAGCAGCGGAGAGCCTACCTGGAGGGCACGTGCGTGGAGTGGCTCCGCAGATACCTGGAGAACGGGAAGGAGACGCTGCAGCGCGCGGAACACCCAAAGACACACGTGACCCACCATCCCGTCTCTGACCATGAGGCCACCCTGAGGTGCTGGGCCCTGGGCTTCTA168 CCCTGCGGAGATCACACTGACCTGGCAGTGGGATGGGGAGGACCAAACTCAGGACACCGAGCTTGTGGAGACCAGGCCAGCAGGAGATGGAACCTTCCAGAAGTGGGCAGCTGTGGTGGTGCCTTCTGGAGAAGAGCAGAGATACACGTGCCATGTGCAGCACGAGGGGCTGCCGGAGCCCCTCACCCTGAGATGGGAGCCGTCTTCCCAGCCCACCATCCCCATCGTGGGCATCGTTGCTGGCCTGGCTGTCCTGGCTGTCCTAGCTGTCCTAGGAGCTGTGGTGGCTGTTGTGATGTGTAGGAGGAAGAGCTCAGGTGGAAAAGGAGGGAGCTGCTCTCAGGCTGCGTCCAGCAACAGTGCCCAGGGCTCTGATGAGTCTCTCATCGCTTGTAAAGCCACGCGT 169 B.3 Relevant oligonucleotide sequences used Oligo name Oligo description Oligo sequence (5'-3') Ova(241-280)_FWD Forward primer for the amplification of Ova minigene from OVAL (Gallus gallus) coding sequence. Tailed with BamHI site. ATGCGGATCCATGTTGGTGCTGTTGCCT Ova(241-280)_REV Reverse primer for the amplification of Ova minigene from OVAL (Gallus gallus) coding sequence. Tailed with EcoRI site. ATGCGAATTCCACTTTGATCTTCCTCTCTTCC Ova(241-280|SCR)_FWD Forward overlap extension oligo for the construction of Ova minigene containing scrambled epitope (LKNFISEI). Tailed with a BamHI restriction site. ATGCGGATCCATGTTGGTGCTGTTGCCTGATGAAGTCTCAGGCCTTGAGCAGCTTGAGCTGAAAAACTTTATCAGTGAAA Ova(241-280|SCR)_REV Reverse overlap extension oligo for the construction of Ova minigene containing scrambled epitope (LKNFISEI). Tailed with an EcoRI restriction site. 
ATGCGAATTCCACTTTGATCTTCCTCTCTTCCATAACATTAGAACTGGTCCATTCAGTTATTTCACTGATAAAGTTTTTC
hgp100(9-59)_FWD | Forward overlap extension oligo for construction of human gp100 minigene encoding amino acid position 9 to 49. Tailed with a BamHI restriction site. | ATGCGGATCCCTTCTTCATTTGGCTGTGATAGGTGCTTTGCTGGCTGTGGGGGCTACAAAAGTACCCAGAAACCAGGACT
hgp100(9-59)_REV | Reverse overlap extension oligo for construction of human gp100 minigene encoding amino acid position 9 to 49. Tailed with an EcoRI restriction site. | CATGGAATTCATACAGCTGCCTGTTCCAGGCTTTGGTTCTGAGTTGCCTTGAGACACCAAGCCAGTCCTGGTTTCTGGGTAC
RM_FWD | Forward primer for random minigene library amplification. Tailed with arbitrary sequence to facilitate resolution of uncut, single-cut, and double-cut fragments on agarose gels. | CGTAGTTATCCTGTATCGGATGAGAATTCTGCATCGGGCCAGCCACGTTTGGTGGAATTC
RM_REV | Reverse primer for random minigene library amplification. Tailed with arbitrary sequence to facilitate resolution of uncut, single-cut, and double-cut fragments on agarose gels. | CTGTACTAATAGCACACACGGGGGATTCCCAGCACAAGCTAGTCATGCAGCTCCGGATCC
RM_template | Degenerate oligo for construction of minigene library. Encodes BamHI/EcoRI restriction sites immediately flanking degenerate regions. | CCACGTTTGGTGGAATTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGATCCGGAGCTGCAT
MND25_FWD | Forward primer for qPCR detection of integrated viral cassettes. Primes at position 25 of MND promoter. | GCAAGCTAGGATCAAGGTTAGG
MND148_REV | Reverse primer for qPCR detection of integrated viral cassettes. Primes at position 148 of MND promoter. | TGGCCCATATTCTGCTGTTC
Minigene_Illumina_FWD | Illumina-adapterized forward PCR primer for single round amplicon library preparation and direct-seq of minigenes encoded in pMND-minigene-FRET reporter constructs. Primes at position 313 of MND promoter. | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGGTTCGCTTCTCGCTTCTGTT
Minigene_Illumina+index_REV | Illumina-adapterized reverse PCR primer for single round amplicon library preparation and direct-seq of minigenes encoded in pMND-minigene-FRET reporter constructs. Primes at position 18 of P2A sequence. Variable index sequence is shown as NNNNNN (underlined in the original). | CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGACGTTAGTAGCTCCGCTTCC
Minigene_FWD_IA+stagger | Forward primer for the amplification of minigenes from recovered gDNA in the 1st round of 2-round PCR library preparation scheme. Tailed with annealing region for 2nd round primers containing complete Illumina adapter sequence. Stock primer consists of an equimolar mix of oligos containing 1-9 arbitrary staggering bases located between minigene annealing region and Illumina adapter region. | CGCTCTTCCGATCTCTG(N)1-9GTTCGCTTCTCGCTTCTGTT
Minigene_REV_IA+stagger | Reverse primer for the amplification of minigenes from recovered gDNA in the 1st round of 2-round PCR library preparation scheme. Tailed with annealing region for 2nd round primers containing complete Illumina adapter sequence. Stock primer consists of an equimolar mix of oligos containing 1-9 arbitrary staggering bases located between minigene annealing region and Illumina adapter region. | TGCTCTTCCGATCTGAC(N)1-9GTTAGTAGCTCCGCTTCC
Illumina_FWD | Forward primer for 2nd round PCR in Illumina library preparation. | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCTG
Illumina_REV | Reverse primer for 2nd round PCR in Illumina library preparation. Contains variable index sequence (NNNNNN, underlined in the original). | CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGAC
HSDL1(4-43)_FWD | Forward strand for the construction of HSDL1 minigenes by overlap extension. Tailed with BamHI. | ATGCGGATCCGTTGACAGTTTCTACCTCTTGTACAGGGAAATCGCCAGGTCTTGCAATTGCTATATGGAAGCT
HSDL1(4-43)_REV | Reverse strand for the construction of HSDL1 wildtype minigene by overlap extension. Tailed with EcoRI. | ATGCGAATTCGTCACAGATGACAGTGATGCTTTTTCTGGCCGTATACCAGGCTCCAACCAAAGCTAGAGCTTCCATATAGCAATTGC
HSDL1-L25V(4-43)_REV | Reverse strand for the construction of HSDL1 L25V mutant minigene by overlap extension. Tailed with EcoRI. | ATGCGAATTCGTCACAGATGACAGTGATGCTTTTTCTGGCCGTATACCAGGCTCCAACCAAAGCTACAGCTTCCATATAGCAATTGC
minHSDL1.f-ISceI | Forward primer for amplifying HSDL1 wt/L25V minigenes. Tailed with I-SceI sites compatible for cloning into pMND-libDest-FRET2 backbone. | TAGCATTAGGGATAACAGGGTAATAGTTGACAGTTTCTACCTCTTG
minHSDL1.r-PISceI | Reverse primer for amplifying HSDL1 wt/L25V minigenes. Tailed with PI-SceI sites compatible for cloning into pMND-libDest-FRET2 backbone. | ATGTTATCTGCGTGTCCAACCTTAGGATTACATCCCATTTCATTACCTCTTTCTCCGCACCCGACATAGATGTCACAGATGACAGTGATG
C6 | 1st strand synthesis TRBC gene-specific primer for TCRβ transcript cDNA synthesis and subsequent sequencing. | CACGTGGTCGGGGWAGAAGC
TCRA-5 | 1st strand synthesis TRAC gene-specific primer for TCRα transcript cDNA synthesis and subsequent sequencing. | CATTTGTTTGAGAATCAAAATCGGTGA
Template switching oligo | Template switching oligo for 1st strand synthesis TCRα and TCRβ chain cDNA | BiotinC6-AAGCAGTGGTATCAACGCAGAGTACrGrG-BNA(+G)-C3-3
TCRA-3 | Reverse primer for 1st round PCR amplification of TCRα chains. | AGGCAGACAGACTTGTCACTGGATT
C9B | Reverse primer for 1st round PCR amplification of TCRβ chains. | TCTCTGCTTCTGATGGCTCAAAC
UPM-Short | Forward primer for 1st round PCR amplification of template-switched TCRα and TCRβ chains. To be used in combination with UPM-LTS_T primer. | CTAATACGACTCACTATAGGGC
UPM-LTS_T | Forward primer for 1st round PCR amplification of template-switched TCRα and TCRβ chains. To be used in combination with UPM-Short primer. | CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT
NLTS_III_T_trunc | Forward primer for 2nd round nested PCR amplification of template-switched TCRα and TCRβ chains. | CGCTCTTCCGATCTCTGGCAGTGGTATCAACGCAGAGTA
TCRA-2_tail | Reverse primer for 2nd round nested PCR amplification of TCRα chains. | TGCTCTTCCGATCTGACCACTGGATTTAGAGTCTCTCAGCTGGT
C14_III_Tail | Reverse primer for 2nd round PCR amplification of TCRβ chains. | TGCTCTTCCGATCTGACAGCGACCTCGGGTGGGAACA
TRAC.f | Forward primer annealing to 5’-end of TCRα constant region. | ATATCCAGAACCCTGACCCTG
TRAC.r_NheI | Reverse primer annealing to 3’-end of TCRα constant region and tailed with NheI restriction site. | ACTGGCTAGCGCTGGACCACAGCCGC
TRAV26-201.f_BamHI | Forward primer annealing to 5’-end of TRAV26-2*01 allele. Tailed with BamHI restriction site. | GTATGGATCCGATGCTAAGACCACACAGCCAA
TRAC(33-53).r | Reverse primer annealing from position 33-53 of TCRα constant region. Used to generate VJ-truncated-C amplicons from sequencing TOPO clones for use in stitching PCR to attach full C-region. | ACTGGATTTAGAGTCTCTCAG
TRBC2.f | Forward primer annealing to 5’-end of TCRβ constant region 2. | GAGGACCTGAAAAACGTGT
TRBC2.r_EcoRI | Reverse primer annealing to 3’-end of TCRβ constant region 2 and tailed with EcoRI restriction site. | TACGGAATTCGCCTCTGGAATCCTTTCTC
TRBV6-601.f_MluI | Forward primer annealing to 5’-end of TRBV6-6*01 allele. Tailed with MluI restriction site. | TGATACGCGTAATGCTGGTGTCACTCAGAC
TRBC2(23-38).r | Reverse primer annealing from position 23-38 of TCRβ constant region 2. Used to generate VDJ-truncated-C amplicons from sequencing TOPO clones for use in stitching PCR to attach full C-region.
ACAGCGACCTCGGGTG

Appendix C - Code

C.1 R Code for functional lentivirus titering

library(ggplot2)
library(gridExtra)

Titer<-function(vols=c(),cellNum,transdEffs=c(),sampleName,plotsDir){
  x<-unlist(vols)
  y<-unlist(transdEffs)
  y<-y/100 #convert % fluorescent cells to a proportion
  c<-as.numeric(cellNum)
  #Fit y = 1 - e^(-x*Titer/c) by non-linear least squares
  fit<-nls(y~1-exp(1)^(-x*Titer/c),
           data=data.frame(x,y),start=list(Titer=1))
  titer<-summary(fit)$coefficients[1]
  meanFunc<-function(x){1-exp(1)^(-x*titer/c)}
  ggplot(data=data.frame(x,y), environment=environment()) +
    geom_point(aes(x,y), size=3) +
    scale_y_continuous(limits = c(0,1)) +
    stat_function(fun=meanFunc) +
    annotation_custom(
      grob=tableGrob(as.data.frame(round(summary(fit)$coefficients,digits=2))),
      xmin=x[grep(max(x),x)[1]-1], xmax=max(x), ymin=-0.1, ymax=.1) +
    xlab(\"Amount of virus added (uL)\") +
    ylab(\"Proportion of fluorescent cells\") +
    ggtitle(sampleName)
  ggsave(filename=paste(plotsDir,sampleName,\"-Titering.png\",sep=\"\"))
  summary(fit)
}

# Arguments required:
## volumes of virus added in microlitres as comma-separated list
## number of cells/per well used in experiment
## % of fluorescent cells detected as comma-separated list (each % should
## correspond to a volume)
## Sample name and directory to save titering curve
# e.g.
# Titer(vols=c(0.5,1,2,4,8,16), cellNum=100000,
#       transdEffs=c(1.83,4.16,7.76,13.6,23.7,38.8),
#       sampleName=\"minhgp100-16aa(M)\",
#       plotsDir=\"/Data/Flow/2018/Aug17_2018/\")

C.1.1. Lentivirus titering example output

Appendix Figure C-1. Example output of lentivirus functional titering script. After inputting the volumes of virus added to cells, the number of cells per well used in titering samples, and % fluorescent cells measured, a non-linear least squares regression is applied to fit a value of T to eq. 2-4. The output is a plot of viral volumes versus fluorescence measurements overlaid with a line of the fit obtained. Also included is a table of summary statistics including the fitted value of T, the standard error, and associated t-statistic of the fit (all rounded to 2 decimal places for clarity).

C.2 R Code for FRET-shift assay sequencing data processing & analysis pipeline

library(\"seqinr\")
library(\"grid\")
library(\"gridExtra\")
library(\"gtools\")
library(\"reshape\")
library(\"ggplot2\")

#fastqs were moved to working directory using:
# mkdir /projects/gsharma_prj/Tope-Seq-4
# cp -r /projects/analysis30/PX0768/ /projects/gsharma_prj/Tope-Seq-4
#and extracted using:
# gunzip -r /projects/gsharma_prj/Tope-Seq-4/

#To run this script, you need:
## 1) Directory where all the raw reads are:
Dir<-\"/projects/gsharma_prj/Tope-Seq-4/\"
GSC_ID<-\"PX0768\"
## 2) A .csv key linking index sequences used to sample names (Example
##provided):
key<-read.csv(\"~/Data/Sequencing/Tope-Seq_4/Indexes-key.csv\",
              header=TRUE,stringsAsFactors=FALSE)
## 3) Directory where FLASH/ subfolder and FASTX-Toolkit bin/ subfolder are
##installed:
ToolsDir<-\"/projects/gsharma_prj/Tarballs/\"
setwd(Dir)

#Get filenames
fastQs<-list.files(paste(Dir,GSC_ID,sep=\"\"),recursive=TRUE)
fastQs<-fastQs[grep(\".fastq$\",fastQs)]

#Extract indexes from filenames
fastQs_split<-strsplit(fastQs,\"_\")
indexes<-unique(sapply(1:length(fastQs), function(element){
  fastQs_split[[element]][5]
}))
lanes<-unique(sapply(1:length(fastQs), function(element){
  fastQs_split[[element]][4]
}))
rm(fastQs_split)

#Group fastqs from different lanes to make \"poolingKey\" for pooler function
#below
##(the following code meant for paired end data)
matchedFq<-matrix(ncol=length(lanes),nrow=length(indexes)*2)
for(i in 1:length(indexes)){
  test<-fastQs[grep(indexes[i],fastQs)]
  for(j in 1:length(lanes)){
    I<-2*i-1
    matchedFq[I,j]<-test[grep(paste(\"_\",lanes[j],\"_\",sep=\"\"),test)][1]
    I<-I+1
    matchedFq[I,j]<-test[grep(paste(\"_\",lanes[j],\"_\",sep=\"\"),test)][2]
    I<-I+1
  }
}
matchedFq<-data.frame(matchedFq,stringsAsFactors=FALSE)
rm(I,test,lanes)

#Pool matched fastqs from different lanes
##Pooling key should have each file to be pooled as individual columns of
##each row
pooler<-function(outputDir,poolingKey){
  print(\"Initializing\")
  Indexes<-unique(sapply(1:nrow(poolingKey),function(element){
    strsplit(poolingKey[element,1],split=\"_\")[[1]][5]
  }))
  #Above lines may need to be tweaked from experiment-to-experiment
  #depending how files are named by production group
  pb<-txtProgressBar(min=0,max=nrow(poolingKey),initial=0,style=3)
  prog<-0
  for(i in 1:length(Indexes)){
    setTxtProgressBar(pb,prog)
    filename_read1<-paste(outputDir,Indexes[i],
                          \"_pooledRepsR1.fastq\",sep=\"\")
    filename_read2<-paste(outputDir,Indexes[i],
                          \"_pooledRepsR2.fastq\",sep=\"\")
    #Rows 2*i-1 and 2*i of the pooling key hold the R1 and R2 files
    #(read from the poolingKey argument rather than the global matchedFq)
    R1catCall<-paste(\"cat\",paste(poolingKey[2*i-1,],
                     collapse=\" \"),\">\",filename_read1,sep=\" \")
    R2catCall<-paste(\"cat\",paste(poolingKey[2*i,],
                     collapse=\" \"),\">\",filename_read2,sep=\" \")
    system(R1catCall)
    prog<-prog+1
    system(R2catCall)
    prog<-prog+1
  }
  close(pb)
  print(\"Finished\")
}
pooler(Dir,matchedFq)

#Call FLASH for each library
pooledReps<-list.files(Dir,pattern=\"pooledReps\")
pairedReads<-data.frame(
  pooledReps[grep(x=pooledReps,pattern=\"R1\")],
  pooledReps[grep(x=pooledReps,pattern=\"R2\")],
  sapply(1:length(indexes),function(element){
    as.character(subset(key,key$Sequence.RC==indexes[element],select=Tope.Seq.ID))
  }),
  stringsAsFactors=FALSE)
colnames(pairedReads)<-c(\"Read1\",\"Read2\",\"Extended\")
for(i in 1:nrow(pairedReads)){
  system(paste(ToolsDir,\"FLASH/flash \",pairedReads$Read1[i],\" \",pairedReads$Read2[i],\" --max-overlap=304 --allow-outies --output-prefix='\",pairedReads$Extended[i],\"'\",sep=\"\"))
  print(paste(i,\"of\",nrow(pairedReads),\"libraries assembled\"))
}

#Filter extended reads using FASTX-toolkit
setwd(paste(ToolsDir,\"bin/\",sep=\"\"))
for(i in 1:nrow(pairedReads)){
  system(paste(\"./fastq_quality_filter -Q33 -q 20 -p 90 -i \",Dir,pairedReads$Extended[i],\".extendedFrags.fastq -o \",Dir,pairedReads$Extended[i],\".fastq -v\",sep=\"\"))
  print(paste(pairedReads$Extended[i],\"complete.\",nrow(pairedReads)-i,\"libraries remaining\"))
}

#Make .fasta files
for(i in 1:nrow(pairedReads)){
  system(paste(\"./fastq_to_fasta -Q33 -n -i \",Dir,pairedReads$Extended[i],\".fastq -o \",Dir,pairedReads$Extended[i],\".fasta -v\",sep=\"\"))
  print(paste(pairedReads$Extended[i],\"complete.\",nrow(pairedReads)-i,\"libraries remaining\"))
}

#Remove adapter sequences using FASTX-Toolkit
##Fastx-clipper removes sequence after specified adapter sequence. Need to
##reverse complement reads prior to removing 5' flanking region
setwd(paste(ToolsDir,\"bin/\",sep=\"\"))
fwdAdapt<-\"GGATCC\" #Make sure this is the RC of the 5' flanking sequence
revAdapt<-\"GAATTC\" #Do not use reverse complement here
for(i in 1:nrow(key)){
  print(paste(\"Now processing\",key$Tope.Seq.ID[i]))
  print(\"Trimming 5' flanking region...\")
  system(paste(\"./fastx_reverse_complement -i \",Dir,key$Tope.Seq.ID[i],\".fasta -o \",Dir,key$Tope.Seq.ID[i],\"_revComp.fasta\",sep=\"\"))
  system(paste(\"./fastx_clipper -ncva \",fwdAdapt,\" -i \",Dir,key$Tope.Seq.ID[i],\"_revComp.fasta -o \",Dir,key$Tope.Seq.ID[i],\"_revComp_trim1.fasta\",sep=\"\"))
  print(\"Trimming 3' flanking region...\")
  system(paste(\"./fastx_reverse_complement -i \",Dir,key$Tope.Seq.ID[i],\"_revComp_trim1.fasta -o \",Dir,key$Tope.Seq.ID[i],\"_preTrim2.fasta\",sep=\"\"))
  system(paste(\"./fastx_clipper -ncva \",revAdapt,\" -i \",Dir,key$Tope.Seq.ID[i],\"_preTrim2.fasta -o \",Dir,key$Tope.Seq.ID[i],\"_trim3.fasta\",sep=\"\"))
  print(\"Cleaning up...\")
  system(paste(\"rm \",Dir,key$Tope.Seq.ID[i],\"_revComp.fasta \",Dir,key$Tope.Seq.ID[i],\"_revComp_trim1.fasta \",Dir,key$Tope.Seq.ID[i],\"_preTrim2.fasta\",sep=\"\"))
  print(paste(key$Tope.Seq.ID[i],\"complete.\",i,\"samples complete\"))
}
rm(i,fwdAdapt,revAdapt)

#Use starcode to collapse highly similar reads into clusters with
#representative sequences and associated read counts
setwd(Dir)
for(i in 1:nrow(key)){
  print(paste(\"Now processing\",key$Tope.Seq.ID[i]))
  system(paste(ToolsDir,\"starcode/starcode -i \",key$Tope.Seq.ID[i],\"_trim3.fasta -o \",key$Tope.Seq.ID[i],\"_clustered\",sep=\"\"))
  print(paste(key$Tope.Seq.ID[i],\"complete.\",i,\"samples complete\"))
}

#Translate all sequences
for(i in 1:nrow(key)){
  nucleicTab<-read.table(paste(key$Tope.Seq.ID[i],\"_clustered\",sep=\"\"),
                         stringsAsFactors=FALSE)
  translated<-sapply(1:nrow(nucleicTab),function(element){
    paste(translate(s2c(nucleicTab[element,1])),collapse=\"\")
  })
  aminoTab<-data.frame(translated,nucleicTab$V2,stringsAsFactors=FALSE)
  colnames(aminoTab)<-c(\"Minigene\",\"readCount\")
  #Re-collapse translated files (so that identical peptides derived from
  #different DNA sequences are grouped together)
  aminoTab<-aggregate(readCount ~ Minigene,data=aminoTab,FUN=sum)
  write.table(aminoTab,paste(key$Tope.Seq.ID[i],\"_clustered_translated\",sep=\"\")
              ,row.names=FALSE)
  print(paste(key$Tope.Seq.ID[i],\"completed\",i,\"of\",nrow(key),
              \"libraries translated\"))
}
rm(nucleicTab,aminoTab,translated)

#Filter stop codons
for(i in 1:nrow(key)){
  aminoTab<-read.table(paste(key$Tope.Seq.ID[i],\"_clustered_translated\",
                             sep=\"\"),stringsAsFactors=FALSE,header=TRUE)
  #Keep only rows whose translated minigene contains no stop codon (*)
  test<-sapply(1:nrow(aminoTab),function(element){
    if(TRUE %in% grepl(\"*\",aminoTab[element,1],fixed=TRUE)==FALSE){
      return(element)
    }
  })
  test<-unlist(test[!sapply(test,is.null)])
  aminoTab<-aminoTab[test,]
  write.table(aminoTab,paste(key$Tope.Seq.ID[i],\"_stopCodonFiltered\",
                             sep=\"\"),row.names=FALSE)
  print(paste(key$Tope.Seq.ID[i],\"completed\",i,\"of\",nrow(key),
              \"libraries filtered\"))
}
rm(aminoTab,test)

#Make a summary table for each gate
Summaries<-list()
for(i in 1:nrow(pairedReads)){
  Summaries[[i]]<-data.frame(matrix(ncol=1,nrow=4,dimnames=list(c(\"Raw\",
    \"Filtered\",\"Unique\",\"Productive\"),\"Count\")))
  Summaries[[i]][1,1]<-as.numeric(system(paste(\"wc -l < \",pairedReads$Read1[i],sep=\"\"),intern=TRUE))/4 #Each entry of a fastq has 4 lines
  Summaries[[i]][2,1]<-as.numeric(system(paste(\"wc -l < \",pairedReads$Extended[i],\"_trim3.fasta\",sep=\"\"),intern=TRUE))/2 #Each entry of a fasta has 2 lines
  Summaries[[i]][3,1]<-as.numeric(system(paste(\"wc -l < \",pairedReads$Extended[i],\"_clustered\",sep=\"\"),intern=TRUE))
  Summaries[[i]][4,1]<-as.numeric(system(paste(\"wc -l < \",pairedReads$Extended[i],\"_stopCodonFiltered\",sep=\"\"),
                                         intern=TRUE))
}
names(Summaries)<-pairedReads$Extended
rm(pairedReads)

#Plot clonal frequency distributions for each gate
ClonFreq<-function(gateName,plotDir,highlightSeqs=c(NULL),whichFile=NULL){
  diversityTab<-read.table(paste(gateName,whichFile,sep=\"_\"),header=TRUE,
                           stringsAsFactors=FALSE)
  colnames(diversityTab)<-c(\"Minigene\",\"readCount\")
  depth<-sum(diversityTab$readCount)
  #Normalize read counts to frequency per million reads
  diversityTab$readCount<-sapply(1:nrow(diversityTab),function(element){
    round(diversityTab$readCount[element]/depth,digits=8)*1000000
  })
  diversityTab<-diversityTab[order(diversityTab$readCount,
                                   decreasing=TRUE),]
  row.names(diversityTab)<-1:nrow(diversityTab)
  highlight<-vector(\"character\",length=nrow(diversityTab))
  #seq_along() so that an empty highlightSeqs skips the loop
  for(i in seq_along(highlightSeqs)){
    highlight[grep(highlightSeqs[i],diversityTab$Minigene)]<-highlightSeqs[i]
  }
  diversityTab<-data.frame(diversityTab,highlight,stringsAsFactors=FALSE)
  unhighlighted<-subset(diversityTab,diversityTab$highlight==\"\")
  highlighted<-subset(diversityTab,diversityTab$highlight!=\"\")
  p<-ggplot(diversityTab,environment=environment()) +
    geom_point(data=unhighlighted,aes(x=as.numeric(
      row.names(unhighlighted)),y=readCount),pch=16) +
    geom_point(data=highlighted,aes(x=as.numeric(row.names(
      highlighted)),y=readCount,color=highlight),pch=16) +
    xlab(\"Unique Sequences (in order of abundance)\") +
    scale_y_log10() +
    ylab(\"Frequency per million reads\") +
    ggtitle(gateName) +
    theme_classic() +
    theme(axis.text.x=element_blank(),axis.title.x=element_blank()) +
    annotation_custom(
      grob=tableGrob(Summaries[[gateName]]),
      xmin=0.6*(nrow(diversityTab)),
      xmax=0.95*(nrow(diversityTab)),
      ymin=0.87*log10(max(diversityTab$readCount)),
      ymax=0.95*log10(max(diversityTab$readCount)))
  invisible(p)
}
plotDir<-\"~/Data/Sequencing/Tope-Seq_4/FreqDistributions2/\"
for(i in 1:length(Summaries)){
  p<-ClonFreq(names(Summaries)[i],highlightSeqs=c(\"KVPRNQDWL\",
              \"SIINFEKL\"),whichFile=\"stopCodonFiltered\")
  png(paste(plotDir,names(Summaries)[i],\".png\",sep=\"\"),type=\"cairo\")
  print(p)
  dev.off()
}
rm(p)

#Assemble read counts from matched gates into tables
coordinator<-function(Unshifted,Shifted,whichFile=\"clustered_translated\"){
  shiftedTab<-read.table(paste(Shifted,whichFile,sep=\"_\"),
                         stringsAsFactors=FALSE,header=TRUE)
  colnames(shiftedTab)<-c(\"Seq\",\"countInShifted\")
  depth<-sum(shiftedTab$countInShifted)
  #Normalize read counts to frequency per million reads
  shiftedTab$countInShifted<-sapply(1:nrow(shiftedTab),function(element){
    round(shiftedTab$countInShifted[element]/depth,digits=8)*1000000
  })
  shiftedTab<-shiftedTab[order(shiftedTab$countInShifted,
                               decreasing=TRUE),]
  row.names(shiftedTab)<-1:nrow(shiftedTab)
  unshiftedTab<-read.table(paste(Unshifted,whichFile,sep=\"_\"),
                           stringsAsFactors=FALSE,header=TRUE)
  colnames(unshiftedTab)<-c(\"Seq\",\"countInUnshifted\")
  depth<-sum(unshiftedTab$countInUnshifted)
  #Normalize read counts to frequency per million reads
  unshiftedTab$countInUnshifted<-sapply(1:nrow(unshiftedTab),
                                        function(element){
    round(unshiftedTab$countInUnshifted[element]/depth,digits=8)*1000000
  })
  unshiftedTab<-unshiftedTab[order(unshiftedTab$countInUnshifted,
                                   decreasing=TRUE),]
  row.names(unshiftedTab)<-1:nrow(unshiftedTab)
  #Build table of coordinates for all sequences across gates
  test<-merge(shiftedTab,unshiftedTab,all=TRUE)
  test[is.na(test)]<-0
  invisible(test)
}

#Make a key for pairing up relevant gates
##In my convention 'S+' indicates a FRET-shifted gate and 'S-' indicates an
##unshifted gate
x<-grep(\"S+\",key$Tope.Seq.ID,fixed=TRUE)
y<-grep(\"S-\",key$Tope.Seq.ID,fixed=TRUE)
z<-expand.grid(x,y)
test<-data.frame(apply(z,c(1,2),function(element){element<-key$Tope.Seq.ID[element]}),stringsAsFactors=FALSE)
colnames(test)<-c(\"shiftedGate\",\"unshiftedGate\")
rm(x,y,z)

#Optional at this step: remove irrelevant combinations from pairwise key
test2<-t(data.frame(sapply(1:nrow(test),function(element){
  unlist(strsplit(as.character(test[element,]),split=\"S\"))
})))
test3<-sapply(1:nrow(test2),function(element){
  if(length(grep(\"TRUE\",duplicated(test2[element,])))>0){
    return(element)
  }
})
test3<-as.numeric(unlist(test3))
pairedGates<-test[test3,]
rm(test,test2,test3)

#Do capture-recapture estimations based on diversity in unshifted gates
RCapSamps<-combinations(length(pairedGates$unshiftedGate),2,
                        pairedGates$unshiftedGate)
diversityEsts<-vector()
for(i in 1:nrow(RCapSamps)){
  unique1<-as.numeric(system(paste(\"wc -l < \",RCapSamps[i,1],
                                   \"_clustered\",sep=\"\"),intern=TRUE))
  unique2<-as.numeric(system(paste(\"wc -l < \",RCapSamps[i,2],
                                   \"_clustered\",sep=\"\"),intern=TRUE))
  A<-read.table(paste(RCapSamps[i,1],\"_clustered\",sep=\"\"))
  B<-read.table(paste(RCapSamps[i,2],\"_clustered\",sep=\"\"))
  overlap<-length(intersect(A$V1,B$V1))
  #Lincoln-Petersen estimator: N = (n1*n2)/m
  diversityEsts[i]<-(unique1*unique2)/overlap
}
rm(unique1,unique2,A,B,overlap,RCapSamps)
summary(diversityEsts)

#Generate coordinate tables from matched gates
Pairwise<-list()
for(i in 1:nrow(pairedGates)){
  Pairwise[[i]]<-coordinator(pairedGates$unshiftedGate[i],
                             pairedGates$shiftedGate[i],whichFile=\"stopCodonFiltered\")
}
names(Pairwise)<-sapply(1:nrow(pairedGates),function(element){
  paste(pairedGates$shiftedGate[element],\"versus\",
        pairedGates$unshiftedGate[element],sep=\"-\")
})

#Add deltas column to count tables
for(i in 1:length(Pairwise)){
  Delta<-Pairwise[[i]]$countInShifted-Pairwise[[i]]$countInUnshifted
  Pairwise[[i]]<-data.frame(Pairwise[[i]]$Seq,
                            Pairwise[[i]]$countInShifted,
                            Pairwise[[i]]$countInUnshifted,Delta,
                            stringsAsFactors=FALSE)
  colnames(Pairwise[[i]])<-c(\"Seq\",\"countInShifted\",
                             \"countInUnshifted\",\"Delta\")
}
rm(Delta)

plotter<-function(dataframe,highlightSeqs=c(NULL),y_axis=\"Delta\"){
  #Specify which sequences should be highlighted in plot
  highlight<-vector(\"character\",length=nrow(dataframe))
  #seq_along() so that an empty highlightSeqs skips the loop
  for(i in seq_along(highlightSeqs)){
    highlight[grep(highlightSeqs[i],dataframe$Seq)]<-highlightSeqs[i]
  }
  dataframe<-data.frame(dataframe,highlight,stringsAsFactors=FALSE)
  y<-grep(y_axis,colnames(dataframe))
  normal<-subset(dataframe,dataframe$highlight==\"\")
  highlighted<-subset(dataframe,dataframe$highlight!=\"\")
  p<-ggplot(dataframe,environment=environment()) +
    geom_jitter(data=normal,aes(x=Seq,y=normal[,y]),
                pch=16,position=position_jitter(height=10),size=2,
                alpha=0.4,colour=\"grey45\") +
    geom_point(data=highlighted,aes(x=Seq,y=highlighted[,y],
               color=highlight),pch=16,size=2) +
    ylab(\"Delta (Frequency per million reads)\") +
    theme(axis.title.x=element_blank(),axis.text.x=element_blank(),
          axis.ticks.x=element_blank())
  invisible(p)
}
plotDir<-\"~/Data/Sequencing/Tope-Seq_4/Enrichment2/\"
for(i in 1:length(Pairwise)){
  p<-plotter(Pairwise[[i]],highlightSeqs=c(\"KVPRNQDWL\",\"SIINFEKL\"))
  png(paste(plotDir,names(Pairwise)[i],\".png\",sep=\"\"))
  print(p)
  dev.off()
}
rm(p)

C.1.2.
Example key to be input to sequencing data analysis pipeline

Index.Well | Sequence | Sequence.RC | Tope.Seq.ID
C03 | AGTCTT | AAGACT | Ctrl1
C04 | TATCGT | ACGATA | Ctrl2
C05 | AATTAT | ATAATT | Ctrl3
C06 | CCGGTG | CACCGG | Ctrl4
C07 | CATGGG | CCCATG | O1S-
C08 | TCTGAG | CTCAGA | O2S-
C09 | AAGTGC | GCACTT | O3S-
C10 | ATTATA | TATAAT | O4S-
E03 | GTCCTT | AAGGAC | P1S-
E04 | ATCAGT | ACTGAT | P2S-
E05 | TAGGAT | ATCCTA | P3S-
E06 | TGAGTG | CACTCA | P4S-
E07 | TTGCGG | CCGCAA | P5S-
E08 | GGTTTC | GAAACC | O1S+
E09 | TAAGGC | GCCTTA | O2S+
E10 | TCGGGA | TCCCGA | O3S+
E11 | TTCGAA | TTCGAA | O4S+
G03 | GTTTGT | ACAAAC | P1S+
G04 | CTATCT | AGATAG | P2S+
G05 | GCTCAT | ATGAGC | P3S+
G06 | GCCATG | CATGGC | P4S+
G07 | TTCTCG | CGAGAA | P5S+

Appendix D - FACS gating schema

D.1 Representative gating strategy for FRET purity sorting in EL4 cells

Appendix Figure D-1. Representative gating strategy for FRET purity sorting in EL4 cells. The sequence of gates used to prepare pure populations of minigene-expressing EL4 with intact FRET reporter signal for use in FRET-shift assays. FRET+ gate is set by selecting stoichiometrically diagonal populations out of YFP+ populations. These populations can be visually determined as being parallel to autofluorescent cells in the negative control populations on FRET vs. CFP plot but substantially shifted up in FRET detection channel.

D.2 Representative gating strategy for FRET-shift assay sorting in EL4 cells

Appendix Figure D-2. Representative gating strategy for FRET-shift assay sorting in EL4 cells. The sequence of gates used to perform 2-way sorting of minigene-expressing EL4 APC cell lines at the conclusion of co-cultures with effector cells to obtain unshifted and shifted target populations for downstream Illumina sequencing. Gates are set using no-effector APC control samples.

D.3 Representative gating strategy for FRET2 purity sorting in K562 cells

Appendix Figure D-3. Representative gating strategy for FRET2 purity sorting in K562 cells.
The sequence of gates used to prepare pure populations of minigene-expressing K562 with intact FRET2 reporter signal for use in FRET-shift assays. The FRET+ gate is set by selecting stoichiometrically diagonal populations out of YFP+ populations. These populations can be visually determined as being parallel to autofluorescent cells in the negative control populations on the FRET vs. CFP plot but substantially shifted up in the FRET detection channel.

Appendix E - Cell and read count summary from FRET-shift FACS/amplicon sequencing experiments

For the FRET-shift/amplicon sequencing experimental conditions presented in Chapter 3, two independent experiments, internally designated Tope-Seq-1 and Tope-Seq-4, were conducted. Co-cultures prepared as part of the same experiment were processed on the same day in the same FACS session and sequenced in the same MiSeq runs. Tope-Seq-1 sequencing libraries were prepared using the 1-round PCR approach described in Section 3.4.9 with the Minigene_Illumina_FWD/Minigene_Illumina+index_REV primer pairs (Appendix B.3). Tope-Seq-4 sequencing libraries were prepared using the 2-round PCR strategy described in Section 3.4.9, with the Minigene_FWD_IA+stagger/Minigene_REV_IA+stagger primer pair (Appendix B.3) for 1st-round minigene amplification and the Illumina_FWD/Illumina_REV primer pair for 2nd-round PCR. Indexed amplicons for Tope-Seq-1 libraries were pooled by inputting balanced DNA quantities from all prepared libraries prior to the sequencing run. Indexed amplicons for Tope-Seq-4 libraries were pooled by inputting unbalanced DNA quantities from the prepared libraries: 90% of the final pool was composed of Unshifted-gate amplicons present in equimolar amounts and 10% of Shifted-gate amplicons present in equimolar amounts.
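The unbalanced Tope-Seq-4 pooling scheme implies a simple per-library share of the final pool. The following is a minimal sketch in R and is not part of the thesis pipeline: the function name pool_fractions is hypothetical, and the library counts passed to it are illustrative assumptions rather than values stated in the text.

```r
# Hypothetical helper, not taken from the analysis pipeline in Appendix C.
# A fixed share of the pool is split equimolarly among the Unshifted-gate
# libraries and the remainder equimolarly among the Shifted-gate libraries.
pool_fractions <- function(n_unshifted, n_shifted, unshifted_share = 0.9) {
  stopifnot(n_unshifted > 0, n_shifted > 0,
            unshifted_share >= 0, unshifted_share <= 1)
  c(per_unshifted_library = unshifted_share / n_unshifted,
    per_shifted_library = (1 - unshifted_share) / n_shifted)
}

# Library counts below are illustrative assumptions, not values from the thesis.
fr <- pool_fractions(n_unshifted = 7, n_shifted = 7)
print(round(fr, 4))

# Sanity check: the shares of all libraries should add up to the whole pool.
total <- 7 * fr[['per_unshifted_library']] + 7 * fr[['per_shifted_library']]
stopifnot(isTRUE(all.equal(total, 1)))
```

Under these assumptions, when both gates contribute the same number of libraries, each Unshifted-gate library supplies nine times as much DNA to the pool as each Shifted-gate library.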
Shown below are summary tables displaying the number of cells recovered and number of sequencing reads obtained from each gate of each co-culture condition of each FRET-shift/amplicon sequencing experiment conducted in Chapter 3. Filtering of raw reads was done as described in Section 3.4.9 and Appendix C.2. Unique minigenes refers to the number of distinct sequences detected in each gate and productive minigenes refers to the number of unique minigene sequences that did not contain a stop codon.

Experiment: Tope-Seq 1

                        OT-I CTL +        OT-I CTL +        OT-I CTL +        OT-I CTL +        OT-I CTL +
                        1:10 Ova:         1:100 Ova:        1:1,000 Ova:      1:10,000 Ova:     1:100,000 Ova:
                        random minigenes  random minigenes  random minigenes  random minigenes  random minigenes
Unshifted gate
# Cells Collected       1.98x10^5         1.45x10^5         1.48x10^5         1.65x10^5         1.54x10^5
# Raw reads obtained    238,837           672,513           58,663            564,387           241,645
# After filtering       126,915           356,366           31,689            301,229           123,897
# Unique minigenes      15,663            50,381            3,968             36,396            22,136
# Productive minigenes  6,431             20,782            1,644             15,022            9,154
Shifted gate
# Cells Collected       1.7x10^4          8.0x10^3          7.0x10^3          8x10^3            8.0x10^3
# Raw reads obtained    372,033           535,848           375,160           728,075           914,105
# After filtering       225,665           282,885           204,181           385,092           486,668
# Unique minigenes      13,492            25,935            22,282            50,135            69,222
# Productive minigenes  5,613             10,717            9,045             20,375            27,913

Experiment: Tope-Seq 4

                        1:30 OT-I CTL:      1:30 OT-I CTL:      1:3,000 OT-I CTL:
                        WT C57BL/6 CTL +    WT C57BL/6 CTL +    WT C57BL/6 CTL +
                        1:100 Ova:          1:10,000 Ova:       1:10,000 Ova:
                        random minigenes    random minigenes    random minigenes
Unshifted gate
# Cells Collected       1.21x10^6           9.06x10^5           6.07x10^5
# Raw reads obtained    986,434             1,550,627           1,586,028
# After filtering       526,413             775,400             912,583
# Unique minigenes      191,658             173,203             121,249
# Productive minigenes  108,552             97,386              69,271
Shifted gate
# Cells Collected       7.7x10^3            5.7x10^3            4.3x10^3
# Raw reads obtained    52,557              40,427              57,377
# After filtering       30,797              16,290              27,444
# Unique minigenes      3,373               2,990               2,351
# Productive minigenes  1,375               1,521               1,133

Experiment: Tope-Seq 4

                        pmel-1 TCR CTL +     1:30 pmel-1 TCR:     1:30 pmel-1 TCR:     1:3,000 pmel-1 TCR:
                        1:10,000 hgp100:     WT C57BL/6 CTL +     WT C57BL/6 CTL +     WT C57BL/6 CTL +
                        random minigenes     1:100 hgp100:        1:10,000 hgp100:     1:10,000 hgp100:
                                             random minigenes     random minigenes     random minigenes
Unshifted gate
# Cells Collected       3.67x10^5            5.18x10^5            4.29x10^5            4.50x10^5
# Raw reads obtained    1,324,555            1,489,613            1,303,835            1,463,938
# After filtering       655,449              846,459              699,221              779,659
# Unique minigenes      103,906              121,571              124,278              122,121
# Productive minigenes  60,318               69,739               71,602               70,366
Shifted gate
# Cells Collected       2.4x10^3             4.1x10^3             3.7x10^3             3.6x10^3
# Raw reads obtained    18,574               37,560               37,591               24,843
# After filtering       8,578                16,898               17,616               12,502
# Unique minigenes      2,456                2,019                1,580                1,808
# Productive minigenes  1,135                896                  718                  798
"@en ; edm:hasType "Thesis/Dissertation"@en ; vivo:dateIssued "2019-02"@en ; edm:isShownAt "10.14288/1.0375763"@en ; dcterms:language "eng"@en ; ns0:degreeDiscipline "Genome Science and Technology"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "University of British Columbia"@en ; dcterms:rights "Attribution-NonCommercial-NoDerivatives 4.0 International"@* ; ns0:rightsURI "http://creativecommons.org/licenses/by-nc-nd/4.0/"@* ; ns0:scholarLevel "Graduate"@en ; dcterms:title "Novel in vitro methods for the discovery of functional T-cell receptor epitopes from large peptide-coding libraries"@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/68080"@en .