"Medicine, Faculty of"@en . "Pathology and Laboratory Medicine, Department of"@en . "DSpace"@en . "UBCV"@en . "Xia, Zhouchunyang"@en . "2019-10-31T07:00:00+0000"@en . "2018"@en . "Master of Science - MSc"@en . "University of British Columbia"@en . "High grade serous ovarian cancer (HGSC), endometrioid ovarian cancer (ENOC) and clear cell ovarian cancer (CCOC) are the three most common subtypes of epithelial ovarian cancers (EOC). While HGSC arises from serous tubal intraepithelial carcinoma (STIC) lesions in the fallopian tube, ENOC and CCOC share a common precursor lesion, endometriosis (ectopic growth of uterine lining). Effective biomarkers of early cancer development and recurrence are lacking. We performed whole genome sequencing (WGS) and observed highly recurrent retrotransposition events originating from an active LINE-1 retrotransposon (L1) in the TTC28 gene in one-third of our ENOC and CCOC cohort. L1s are mobile genetic elements that encode their own protein machinery to \u201Ccopy-and-paste\u201D their sequences into random genomic loci. A process called 3\u2019 transduction occurs when L1s insert the unique downstream DNA sequences along with their own sequences. All these processes may fuel genomic instability; as such, L1s are epigenetically silenced in normal tissues but are found to be re-activated in cancers and cancer precursor lesions. We hypothesize that L1s activate early in EOC tumorigenesis and that TTC28-L1 3\u2019 transductions could be used as markers of tumor development and progression.\r\nUsing conventional and multiplex PCR on formalin-fixed paraffin-embedded (FFPE) tumor tissues, we found that TTC28-L1 3\u2019 transductions occurred early and preceded many somatic mutations. 
We developed a probe-based target capture sequencing method that could identify novel TTC28-L1 3\u2019 transductions in frozen tumor and FFPE tissues, and potentially in circulating tumor DNA. Using immunohistochemistry (IHC), we observed high L1 protein expressions in HGSC and its precursor lesions. \r\nOur results suggest that TTC28-L1 events occur early in EOC development and L1 protein expressions may reflect pre-malignant transformations. The use of L1 protein IHC and our target capture assay could be explored as a potential method to track such development."@en . "https://circle.library.ubc.ca/rest/handle/2429/67443?expand=metadata"@en . "LINE-1 RETROTRANSPOSITIONS IN EPITHELIAL OVARIAN CANCER: CAN WE USE DNA \u201CPARASITES\u201D FOR GOOD PURPOSE? by ZHOUCHUNYANG XIA Honours B.Sc., The University of Toronto, 2014 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Pathology and Laboratory Medicine) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) October 2018 \u00A9 Zhouchunyang Xia, 2018 ii The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, a thesis/dissertation entitled: LINE-1 retrotranspositions in epithelial ovarian cancer: can we use DNA \u201Cparasites\u201D for good purpose? submitted by Zhouchunyang Xia in partial fulfillment of the requirements for the degree of Master of Science in Pathology and Laboratory Medicine Examining Committee: David G. 
Huntsman Supervisor Christian Steidl Supervisory and Examining Committee Chair Dixie Mager Supervisory Committee Member Stephen Yip Department of Pathology Faculty Member Alex Wyatt Additional Examiner Additional Supervisory Committee Members: Jessica McAlpine Supervisory Committee Member William Lockwood Supervisory Committee Member iii Abstract High grade serous ovarian cancer (HGSC), endometrioid ovarian cancer (ENOC) and clear cell ovarian cancer (CCOC) are the three most common subtypes of epithelial ovarian cancers (EOC). While HGSC arises from serous tubal intraepithelial carcinoma (STIC) lesions in the fallopian tube, ENOC and CCOC share a common precursor lesion, endometriosis (ectopic growth of uterine lining). Effective biomarkers of early cancer development and recurrence are lacking. We performed whole genome sequencing (WGS) and observed highly recurrent retrotransposition events originating from an active LINE-1 retrotransposon (L1) in the TTC28 gene in one-third of our ENOC and CCOC cohort. L1s are mobile genetic elements that encode their own protein machinery to \u201Ccopy-and-paste\u201D their sequences into random genomic loci. A process called 3\u2019 transduction occurs when L1s insert the unique downstream DNA sequences along with their own sequences. All these processes may fuel genomic instability; as such, L1s are epigenetically silenced in normal tissues but are found to be re-activated in cancers and cancer precursor lesions. We hypothesize that L1s activate early in EOC tumorigenesis and that TTC28-L1 3\u2019 transductions could be used as markers of tumor development and progression. Using conventional and multiplex PCR on formalin-fixed paraffin-embedded (FFPE) tumor tissues, we found that TTC28-L1 3\u2019 transductions occurred early and preceded many somatic mutations. 
We developed a probe-based target capture sequencing method that could identify novel TTC28-L1 3\u2019 transductions in frozen tumor and FFPE tissues, and potentially in circulating tumor DNA. Using immunohistochemistry (IHC), we observed high L1 protein expressions in HGSC and its precursor lesions. Our results suggest that TTC28-L1 events occur early in EOC development and L1 protein expressions may reflect pre-malignant transformations. The use of L1 protein IHC and our target capture assay could be explored as a potential method to track such development. iv Lay Summary Ovarian cancer is the 5th leading cause of cancer related death in women in North America. During the effort to find markers within the ovarian cancer genome to help us identify cancer at its early stages, we observed a genetic element called the L1 retrotransposon (L1) that was activated in one-third of our patient cohort. L1s are known to be genetic \u201Cparasites\u201D that may alter pieces of our genome, so they are locked up by various mechanisms in the normal cell. However, cancers lose the ability to lock L1 up, and activated L1s may help us identify the presence of cancer cells. Here we confirm that L1s are active early in ovarian cancer development, and we develop a tool to detect a certain subpopulation of L1s in fresh tumor tissues and tumor tissues stored in wax. We propose the potential of using these genetic \u201Cparasites\u201D to recognize ovarian cancer presence. v Preface The research conducted in this thesis was approved by the BC Cancer Agency and the University of British Columbia Research Ethics Board (REB #H08-01411, #H09-02153, and #H18-01457). This research was supported by grants from the Canadian Cancer Society (Impact Grant), BC Cancer Foundation and the VGH & UBC Hospitals Foundations. A version of chapter 2 has been published. 
[Zhouchunyang Xia], Cochrane, D., Anglesio, M., Wang, Y.K., Nazeran, T., Tessier-Cloutier, B., McConechy, M., Senz, J., Lum, A., Bashashati, A., Shah, S., and Huntsman, D., LINE-1 retrotransposon-mediated DNA transductions in endometriosis associated ovarian cancers. Gynecologic Oncology. 2017; 147:642-7. DOI: 10.1016/j.ygyno.2017.09.032. I designed and conducted all experiments and data analysis in the manuscript. The manuscript was written by me and reviewed and edited by Dr. D. Huntsman. I designed and carried out all experiments in chapter 3 and modified the IDT capture protocol designed by Dr. M. Alcaide from the Morin lab at SFU to suit transduction event capturing. Patient tissues, including frozen tumor, buffy coat, FFPE and plasma samples, were identified by me and obtained through the BCCA Gynecology Tissue Bank. The bioinformatics scripts for the analysis pipeline were written by C. Rushton from the Morin lab, and the execution of the analysis and interpretation of the data were performed by me. I was responsible for initial identification of the L1ORF1p positivity in our ovarian cancer cohort, designing the study, and the interpretation of the TMA and IHC data in chapter 4. All tissue microarrays (TMAs) used in chapter 4 were pre-designed TMAs obtained from the Genetic Pathology Evaluation Centre (GPEC). IHC staining of the TMAs was performed by A. Cheng from GPEC. Evaluation of the IHC was completed by pathologist Dr. B. Tessier-Cloutier. Deconvolution of the TMA scoresheets and the subsequent survival analysis were performed by S. Leung from GPEC. vi Table of Contents Abstract ......................................................................................................................................... iii Lay Summary ............................................................................................................................... 
iv Preface ............................................................................................................................................ v Table of Contents ......................................................................................................................... vi List of Tables ................................................................................................................................ ix List of Figures ................................................................................................................................ x List of Abbreviations ................................................................................................................... xi Acknowledgements .................................................................................................................... xiv Dedication ................................................................................................................................... xvi Chapter 1: Introduction .............................................................................................................. 1 1.1 Epithelial Ovarian Cancers (EOC).................................................................................. 1 1.1.1 High Grade Serous Ovarian Cancers .......................................................................... 2 1.1.2 Clear Cell Ovarian Cancer (CCOC) ........................................................................... 6 1.1.3 Endometrioid Ovarian Cancer (ENOC) ...................................................................... 7 1.1.4 Endometriosis: A Common Origin of ENOC and CCOC ........................................ 11 1.1.5 The Role of Circulating Cell-free DNA (cfDNA) as Cancer Specific Biomarkers in Ovarian Cancer ..................................................................................................................... 
12 1.2 Mobile Genetic Elements and Cancer........................................................................... 14 1.2.1 Consequences of Somatic LINE-1 Retrotranspositions in Cancer ........................... 19 1.2.2 Timing of L1 Retrotranspositions in the Evolution of Tumors ................................ 22 1.2.3 L1 as a Biomarker ..................................................................................................... 22 1.3 Rationale and Aims....................................................................................................... 24 vii Chapter 2: L1-mediated DNA transductions in EAOCs ........................................................ 26 2.1 Background ................................................................................................................... 26 2.2 Methods......................................................................................................................... 26 2.2.1 Whole genome sequencing ....................................................................................... 26 2.2.2 Case Selection ........................................................................................................... 26 2.2.3 DNA extraction and FFPE blocks............................................................................. 27 2.2.4 PCR validation of TTC28-L1-mediated transduction events ................................... 27 2.2.5 Microfluidic PCR validation of selected SNV/Frameshift mutations ...................... 28 2.3 Results ........................................................................................................................... 28 2.3.1 TTC28-L1 mediated transductions are frequent events in clear cell and endometrioid ovarian cancer ....................................................................................................................... 
28 2.3.2 TTC28-L1 mediated transductions are early events in ENOC and CCOC oncogenesis ........................................................................................................................... 31 2.4 Discussion ..................................................................................................................... 35 Chapter 3: Detecting L1-mediated transductions ................................................................... 38 3.1 Background ................................................................................................................... 38 3.2 Methods......................................................................................................................... 39 3.2.1 Overview ................................................................................................................... 39 3.2.2 Sample selection ....................................................................................................... 42 3.2.3 DNA Extraction and Library Construction ............................................................... 42 3.2.4 Probe Hybridization Capture and Sequencing .......................................................... 44 3.2.5 Bioinformatics analysis ............................................................................................. 47 3.3 Results ........................................................................................................................... 49 viii 3.3.1 Detection of TTC28-L1 transductions in samples with WGS .................................. 49 3.3.2 Detection of L1 transductions in FFPE samples without WGS ................................ 53 3.3.3 Detection of L1 transductions in plasma samples..................................................... 58 3.4 Discussion ..................................................................................................................... 
62 Chapter 4: L1 protein expression in ovarian and endometrial cancers ............................... 67 4.1 Background ................................................................................................................... 67 4.2 Methods......................................................................................................................... 68 4.2.1 Immunohistochemistry ............................................................................................. 68 4.2.2 Statistical Analysis .................................................................................................... 69 4.3 Results ........................................................................................................................... 69 4.3.1 Cohort Description and the association of L1ORF1p expression with stage, grade, and subtypes of epithelial ovarian cancers ............................................................................ 69 4.3.2 Evaluating the association of L1ORF1p expression with survival in EOC .............. 75 4.3.3 Evaluating the association between L1ORF1p expression and p53 expression ....... 77 4.3.4 Evaluating L1ORF1p expression in endometrial cancers and its association with MMR and p53 mutation status. ............................................................................................. 83 4.4 Discussion ..................................................................................................................... 89 Chapter 5: Concluding Chapter: Summary and Future Direction ...................................... 92 5.1 Summary ....................................................................................................................... 92 5.2 Future Directions .......................................................................................................... 
94 Bibliography ................................................................................................................................ 96 Appendix A ................................................................................................................................ 107 Supporting Materials ............................................................................................................... 107 ix List of Tables Table 1. Comparison of putative driver mutation frequencies in genes common to CCOC and ENOC. ........................................................................................................................................... 10 Table 2. Hg 19 coordinates for the probe captured regions of the 6 L1s. ..................................... 40 Table 3. Library amplification cycling conditions. ....................................................................... 44 Table 4. Post-capture PCR reagents.............................................................................................. 45 Table 5. Post-capture PCR cycling conditions. ............................................................................ 46 Table 6. Comparison of the coordinates for TTC28-L1 mediated transduction detected in WGS and target capture. ......................................................................................................................... 51 Table 7. Summary table for cases without WGS. ......................................................................... 54 Table 8. Summary table for cases with plasma samples. CA-125 levels are listed when available........................................................................................................................................................ 60 Table 9. Distribution of clinical characteristics by L1ORF1p IHC status. ................................... 72 Table 10. Association of L1ORF1p expression (negative vs. any positive) with histotypes. 
HGSC correlated with more L1 positivity. Chi-square test, alpha = 0.05. ................................... 74 Table 11. Association between L1ORF1p and p53 IHC status. Complete absence, overexpression and cytoplasmic phenotype all indicate mutated p53. .................................................................. 78 Table 12. No association was found between L1ORF1p and p53 IHC status in HGSC. ............. 78 Table A 1. Validated TTC28-L1 transduction events in Chapter 2. ........................................... 107 Table A 2. SNV/Frameshift mutation coordinate and allelic frequency for each tumor block surveyed in Chapter 2. ................................................................................................................ 108 x List of Figures Figure 1. Copy Number Variation (CNV) landscape of the three most common EOC. ................ 5 Figure 2. Schematics of L1 structure and its retrotransposition mechanism. ............................... 17 Figure 3. Retrotranspositions originating from TTC28 on chromosome 22q12. .......................... 30 Figure 4. Comparison of TTC28-L1 presence and allelic frequencies. ........................................ 34 Figure 5. Schematic overview of the transduction capture protocol. ........................................... 41 Figure 6. Schematic overview of the in-silico analysis pipeline. ................................................. 48 Figure 7. An example of transduction events validated via PCR and Sanger sequencing in the FFPE sample 821A. ...................................................................................................................... 57 Figure 8. The number of events detected in 821A across the different follow-up visits displayed as months after the date of surgery. .............................................................................................. 61 Figure 9. 
Flowchart for the case selection process of the ovarian TMAs. .................................... 71 Figure 10. Distribution of L1ORF1p expression across the different EOC Histotypes. .............. 73 Figure 11. Kaplan-Meier analysis of L1ORF1p (L1RE1) expression for overall survival (OS), disease specific survival (DSS), and progression free survival (PFS). ......................................... 76 Figure 12. Comparison of p53 and L1ORF1p (L1RE1) expression in STIC lesions. .................. 80 Figure 13. IHC, p53, and L1ORF1p (L1RE1) stains for four normal fallopian tubes. ................. 82 Figure 14. Distribution of L1ORF1p expression across the two endometrial cohorts. ................ 85 Figure 15. Distribution of L1ORF1p expression within three molecular subgroups of the ProMisE classifier. ........................................................................................................................ 87 Figure 16. L1ORF1p expression in p53 wildtype cases. .............................................................. 88 Figure A 1. 
Insert size distribution for sequencing libraries .......................................................... 111 xi List of Abbreviations AKT/PKB protein kinase B APC adenomatous polyposis coli ARID1A AT-rich interactive domain 1A BAM binary alignment map BCCA British Columbia Cancer Agency BCCRC British Columbia Cancer Research Centre BRCA1/2 breast cancer genes 1/2 CA-125 cancer antigen 125 CCEC clear cell endometrial cancer CCNE1 Cyclin E1 CCOC clear cell ovarian cancer cfDNA cell free DNA CNV copy number variant CRC colorectal cancer ctDNA circulating tumor DNA CTNNB1 Catenin Beta 1 DLBCL diffuse large B cell lymphoma DNA deoxyribonucleic acid DSS disease specific survival EAOC endometriosis associated ovarian cancer EC endometrial cancer EDM exonuclease domain mutation (relating to POLE) ENEC endometrioid endometrial cancer ENOC endometrioid ovarian cancer EOC epithelial ovarian cancer FFPE formalin-fixed paraffin-embedded fs frame shift FT fallopian tube GCT granulosa cell tumor GI gastrointestinal H&E Haematoxylin and Eosin stain HE4 human epididymis protein 4 HGSC high grade serous ovarian cancer HR hazard ratio IHC Immunohistochemistry KRAS Kirsten Rat Sarcoma Viral Oncogene Homolog L1HS human L1 element L1ORF1p L1 open reading frame 1 protein L1ORF2p L1 open reading frame 2 protein L1RE1 An alternative name for L1ORF1p LGSC low grade serous ovarian cancer LINE-1/L1 Long Interspersed Nuclear Element 1 LOH loss of heterozygosity MAPK mitogen-activated protein kinase MECOM Myelodysplasia Syndrome-Associated Protein 1 MET Met Proto-Oncogene MLH1 mutL homolog 1 MLH2 mutL homolog 2 MMR mismatch repair MSH6 mutS homolog 6 MSI microsatellite instability MSS microsatellite stable mTOR mammalian target of rapamycin MYC Myc Proto-Oncogene Protein NF1 Neurofibromatosis type 1 OC ovarian 
cancer OS overall survival PCR polymerase chain reaction PFS progression free survival PI3K phosphoinositide 3-kinase PIK3CA phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha PMS2 PMS1 homolog 2 POLE polymerase epsilon PPP2R1A protein phosphatase 2, regulatory subunit A, alpha isoform PTEN phosphatase and tensin homolog QC quality control RB1 Retinoblastoma 1 RIP retrotransposon insertion polymorphisms RNA ribonucleic acid RNP Ribonucleoprotein SEC serous endometrial cancer SINE short interspersed nuclear elements SNV Single Nucleotide Variation ST18 suppression of tumorigenicity 18 STIC serous tubal intraepithelial carcinomas SV structural variant TCGA The Cancer Genome Atlas TIP-seq transposon insertion profiling TMA tissue microarray TP53 tumor protein 53 TPRT target-primed reverse transcription TTC28 Tetratricopeptide Repeat Domain 28 UTR untranslated region VCF variant call format VGH Vancouver General Hospital WFDC2 Major Epididymis-Specific Protein E4 WGS whole genome sequencing xiv Acknowledgements I would like to thank my supervisor, Dr. David Huntsman, for giving me the opportunity to join his amazing research team. He has been extremely patient and encouraging throughout this project, providing advice and guidance at each step that pointed me in the right direction. I am grateful not only for his support of my research project, but also for his support of my personal goal of pursuing a computer science/bioinformatics career. I would also like to thank my supervisory committee, Dr. Christian Steidl, Dr. Dixie Mager, Dr. Jessica McAlpine, and Dr. William Lockwood, for all the constructive questions, criticisms, support and suggestions they offered during each of my committee meetings, as well as in meetings outside of those times. I would like to thank Dr. 
Dawn Cochrane, whom we have agreed to call our \u201Clab mom\u201D, for training me and guiding me into the lab when I started out fresh and nervous. She has been extremely supportive, often accompanying me to meetings, and has patiently answered all my questions, from how to order reagents to how well an experiment should work. This study was supported by grants from the Canadian Cancer Society (Impact Grant # 701603), BC Cancer Foundation, and the VGH & UBC Hospitals Foundations. This research would not have been made possible without the gracious donation of tissues from the women with ovarian and endometrial cancers. This study would not have been possible without the amazing technical support from the Huntsman lab members, Winnie Yang, Amy Lum, and Janine Senz, and the guidance from Dr. Michael Anglesio and Dr. Yemin Wang. Nor could I have done without the bioinformatics support from our collaborators, Dr. Yikan Wang from Dr. Sohrab Shah\u2019s lab, and Dr. Miguel Alcaide and Christopher Rushton from Dr. Ryan Morin\u2019s lab. I owe a great deal of gratitude to our pathologist fellow, Dr. Basile Tessier-Cloutier, who patiently evaluated the countless H&E and IHC slides that I bugged him with, and who answered my cancer pathology questions and joked with a serious face. I am also grateful for all my friends in the lab and at the BCCRC who accompanied me through this journey and listened to my joys and my complaints with positive attitudes. Special thanks to my parents for all the mental, moral and meal support. xvi Dedication I would like to dedicate this thesis to the women who have fought against ovarian and endometrial cancers. It is their courage and generosity that enabled this research, and I sincerely hope the research I have done and the research I will continue to do could contribute, however minimally, to understanding and conquering cancer. 
1 Chapter 1: Introduction 1.1 Epithelial Ovarian Cancers (EOC) Ovarian cancer is the 5th leading cause of cancer related deaths in women in the United States and in Canada [1]. It is classified into four stages by the International Federation of Gynecology and Obstetrics (FIGO) system. Stage I ovarian cancers describe tumors confined within the ovaries or the fallopian tubes. The 5-year survival rate for stage I ovarian cancer (OC) is up to 90%; however, less than 20% of patients are diagnosed at this stage [2]. Stage II ovarian cancers involve metastasis that is confined within the pelvic regions. Stage III ovarian cancers involve metastasis into the abdominal region and the surrounding lymph nodes, and stage IV ovarian cancers involve metastasis into distant regions. Approximately 60-80% of patients present with advanced stage disease (stage III/IV) [3], for which the 5-year survival rate drops to less than 40% [2]. As such, the lack of effective early detection methods makes ovarian cancer the most lethal malignancy of the female reproductive tract. Ovarian cancer is classified into several categories. Rare OC types include sex cord-stromal and germ cell tumors, which account for 5% and 3% of all cases, respectively [2], and are not the subjects of this thesis. Here we focus on epithelial ovarian carcinomas (EOC), which account for over 90% of all ovarian cancer cases [2]. As advancing laboratory technologies reveal more molecular and genetic characteristics of EOC, it has become increasingly clear that it is a diverse cancer and can be stratified into different histotypes based on distinct morphological, pathological, molecular and genetic signatures. The five major histotypes of EOC are high-grade serous ovarian cancer (HGSC; 68-70%), clear cell ovarian cancer (CCOC; 9-12%), endometrioid ovarian cancer (ENOC; 8-11%), low-grade serous ovarian cancer (LGSC; 3-4%), and mucinous ovarian cancers (MC; 3%) [2, 3]. 
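As an illustrative aside, the FIGO staging summary given in this section can be sketched as a small lookup. This is a minimal Python sketch for orientation only and is not a clinical tool: the names FigoStage, is_advanced and describe are hypothetical, and the stage descriptions paraphrase this section rather than the full FIGO definitions.

```python
from enum import Enum


class FigoStage(Enum):
    # Descriptions paraphrase the staging summary in this section (assumption:
    # simplified wording, not the complete FIGO criteria).
    I = 'tumor confined to the ovaries or fallopian tubes'
    II = 'metastasis confined to the pelvic region'
    III = 'metastasis to the abdominal region and surrounding lymph nodes'
    IV = 'metastasis to distant regions'


def is_advanced(stage):
    # The text groups stages III and IV together as advanced stage disease.
    return stage in (FigoStage.III, FigoStage.IV)


def describe(stage):
    # Build a one-line summary for a given stage.
    label = 'advanced' if is_advanced(stage) else 'not advanced'
    return 'Stage {}: {} ({})'.format(stage.name, stage.value, label)


if __name__ == '__main__':
    for stage in FigoStage:
        print(describe(stage))
```

Grouping stages III and IV under a single predicate mirrors how the survival figures above are reported, with one rate for stage I and one for advanced stage disease.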
2 The standard treatment for ovarian cancers includes surgical staging and platinum-based chemotherapy. In patients with advanced-stage EOC, chemotherapy is often given first (neoadjuvant) to reduce tumor burden before continuing with surgery (cytoreduction/debulking of primary tumor and metastases) and further chemotherapy [4]. 1.1.1 High Grade Serous Ovarian Cancers High grade serous ovarian cancer is the most common subtype of EOC in North America, accounting for 68-70% of all cases [3]. Up to 75% of HGSC cases present at stage III/IV, and up to 87% of all EOC cases diagnosed at the advanced stage are of the HGSC histotype [3]. Morphologically, HGSC displays solid masses of cells with a highly stratified epithelium that appears fenestrated with slit-like spaces, and cribriform patterns accompanied by necrosis [5, 6]. Genetically, HGSC is characterized by near ubiquitous TP53 (tumor protein 53) mutations and high genomic instability that results in widespread DNA copy number changes and structural aberrations [7]. In addition, germline BRCA1/2 (breast cancer gene 1 and 2) mutations are one of the factors that increase susceptibility to HGSC, and have been found in up to 40% of HGSC [8]. Up to 97% of HGSC harbor TP53 mutations [7, 9, 10]. TP53 is a well-known tumor suppressor encoding the protein p53, which induces cell cycle arrest, initiates DNA repair upon DNA damage and induces apoptosis in cells with irreparable DNA damage [11]. Missense mutations are the most common class of TP53 mutations (62%), resulting in the translation of mutant p53 that accumulates in the tumor cells and can be detected with immunohistochemical (IHC) staining [10]. Nonsense mutations and deletions in TP53 result in an absence of protein production, and thus an absence of expression on IHC stains. Wildtype TP53 tends to have an intermediate level of staining [10, 12]. The clear and intense staining of mutant p53 makes it an ideal IHC marker for TP53 status in HGSC. 
However, HGSC with wildtype TP53 may have p53 expression that is indistinguishable from mutated p53 on IHC [10], and the exact mutational status of the tumor needs to be resolved by sequencing. While the status of TP53 mutation does not have prognostic significance, different types of TP53 mutations may affect treatment response [9, 10]. In addition to near ubiquitous TP53 mutations, HGSC is characterized by high numbers of structural variants (SV) and DNA copy number variations (CNV) stemming from aberrant DNA repair mechanisms, making it the most genomically unstable OC (Figure 1) [7]. The most common focal amplifications resulting from CNV include CCNE1, MYC and MECOM, and the most common focal deletions include PTEN, RB1, and NF1 [7, 13]. A recent study by Wang and colleagues used whole genome sequencing (WGS) data to stratify HGSC into two distinct subgroups based on SV signatures [13]. One HGSC subgroup exhibited a prevalence of duplications and deletions due to unbalanced rearrangements, while the other exhibited high proportions of foldback inversions with high-level amplifications resulting from breakage-fusion-bridge cycles. The latter subgroup was found to have poorer survival, suggesting the potential of using foldback inversion signatures to aid clinical management [13]. For a long time, HGSC was thought to arise from the ovaries; however, clear evidence of precursor lesions was never found on the ovarian surface epithelium. Instead, early studies examining the fallopian tube and ovaries from prophylactic surgeries in BRCA mutation carriers found pre-cancerous lesions on the fimbriae of the fallopian tube [14, 15]. These lesions, termed serous tubal intraepithelial carcinomas (STIC), resemble HGSC histologically and contain p53 mutations characteristic of HGSC [6]. Recent whole-exome and copy number profiling of concurrent STIC, HGSC, and metastasis showed a clonal relationship, in which phylogenetic 
Recent whole-exome and copy number profiling of concurrent STIC, HGSC, and metastases showed a clonal relationship, in which phylogenetic analysis of the mutations/variations showed that STIC lesions are ancestral clones that give rise to the ovarian cancers [16]. Mutations found in these STIC lesions include cancer-specific alterations such as TP53, BRCA1/2, and PTEN mutations. Further evidence was seen in mouse models, where direct transformation of fallopian tube (FT) secretory epithelial cells into HGSC was achieved via TP53, BRCA, and PTEN deletions [17]. As such, it is established that HGSC originates from the fimbriae of the fallopian tube.
Figure 1. Copy Number Variation (CNV) landscape (in log ratio) of the three most common EOC with a germ cell OC (GCT), for comparison purposes. HGSC displays the highest abundance of CNV.
1.1.2 Clear Cell Ovarian Cancer (CCOC)
Clear cell ovarian cancer is the second most common subtype of EOC in North America, accounting for 9-12% of cases [3], although it is much more prevalent in Japan (up to 25%) for reasons not yet clear [18]. Compared to the overall OC population, CCOC presents at an earlier age and is diagnosed at an earlier stage [19, 20]. When diagnosed at advanced stages, however, CCOC prognosis is inferior to that of other EOC histotypes due to a significantly lower response to platinum-based first-line chemotherapy [21]. Morphologically, CCOC is poorly differentiated and is always diagnosed as grade 3. It is characterized by multiple complex papillae with prominent hyaline basement membrane and hyaline bodies, and a clear, glycogen-rich cytoplasm with positive periodic-acid Schiff staining [5, 12, 22]. CCOC is genomically stable with few mitotic figures [5]. Frequent CCOC mutations occur in ARID1A (46-57%) and PIK3CA (33-46%) [23-25], as well as PPP2R1A (4-9%), KRAS (7%), PTEN (5-10%) and CTNNB1 (1%), albeit at a lower frequency [6, 24].
Wiegand and colleagues were the first to identify recurrent ARID1A mutations in CCOC in 2010 [26]. Since then, ARID1A (AT-rich interactive domain 1A) mutations have been found in 46-57% of CCOC cases and are typically mutually exclusive with TP53 mutations [23-27]. Over 97% of mutations found in ARID1A are inactivating mutations (i.e. nonsense, truncating), and not surprisingly ARID1A is located at 1p36.11, a region subject to recurrent deletion [26]. ARID1A encodes the protein ARID1A, also called BAF250, which is a part of the SWI-SNF chromatin remodeling complex involved in regulating gene expression and in tumor suppression. Loss of ARID1A protein expression on IHC correlates with the presence of inactivating mutations in this tumor suppressor gene [26]. Both in vivo and in vitro experiments have shown that re-expression of wildtype ARID1A is sufficient to inhibit tumor proliferation; however, a single ARID1A mutation alone is not sufficient to initiate tumorigenesis in vivo [28]. The PI3K/AKT/mTOR pathway, which is involved in cell cycle regulation, is implicated as another mechanism that drives tumorigenesis in CCOC, where its activation is associated with increased resistance to cytotoxic chemotherapy and with tumor cell invasion [24, 29]. Somatic mutations in PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha), found in 33-46% of CCOC cases, can directly activate AKT, resulting in cell proliferation [24, 25]. Inactivating mutations in the pathway inhibitor, PTEN (phosphatase and tensin homolog), are observed in 5-10% of CCOC cases [6, 24], and also result in pathway activation. Inhibitors targeting this pathway, including PI3K, AKT and mTOR inhibitors, are currently being evaluated in various clinical studies as possible new therapeutic options in combination with other chemotherapeutic or single-target agents [30].
A recent study suggests that co-occurring ARID1A and PIK3CA mutations initiate CCOC, as evidenced by mouse models in which the combined loss of ARID1A and activation of PIK3CA led to the rapid development of ovarian tumors resembling CCOC [28]. CCOC was also found to have the highest frequency (33%) of coexisting ARID1A and PIK3CA mutations among all cancers harboring the two mutations in The Cancer Genome Atlas (TCGA), further suggesting a role of co-occurring mutations in CCOC tumorigenesis [28].
1.1.3 Endometrioid Ovarian Cancer (ENOC)
Endometrioid ovarian cancer is the third most common subtype of EOC, accounting for 8-11% of the cases in North America [3]. Histologically, ENOC appears highly similar to endometrioid cancer in the uterus, with round or tubular glands with stratified non-mucin-containing epithelium and the appearance of squamous differentiation in half of the cases [5]. ENOC often presents at low grade and approximately 90% of ENOC cases are at stage I/II, which accounts for the improved prognosis compared to HGSC [31]. ENOC shares many genetic abnormalities with CCOC, and as will be discussed in section 1.1.4, these two diseases share a common precursor lesion: endometriosis. Common genetic abnormalities in ENOC include mutations in CTNNB1 (15-50%), ARID1A (30%), PIK3CA (20-40%), PTEN (15-20%), KRAS (9-30%) and microsatellite instability (MSI; 20%) [6, 25, 32-34]. CTNNB1 (catenin beta 1) is the most commonly mutated gene in ENOC, with mutations occurring in up to 50% of ENOC cases. It encodes beta-catenin, which is a key effector in the Wnt signaling pathway that is involved in cell proliferation, motility and survival [35]. Activating mutations in CTNNB1 lead to altered beta-catenin regulated signaling, and have been found to correlate with squamous differentiation, low tumor grade and favorable outcome [35]. The frequency of PIK3CA mutations in ENOC varies between studies but is generally similar to that in CCOC.
However, ENOC has more frequent loss-of-function mutations in PTEN, the PI3K/AKT/mTOR pathway regulator, compared to CCOC. Loss of heterozygosity (LOH) at the PTEN locus 10q23.3 has been found in about 40% of ENOC [36]. LOH alone has been found to reduce PTEN function, and in accordance with the two-hit hypothesis, 10q23.3 LOH often co-exists with PTEN mutations. Indeed, PTEN mutations appear to be the key co-occurring mutations that lead to endometrioid tumor differentiation in existing mouse models of ENOC [37, 38]. KRAS is a proto-oncogene that is part of the ras/raf/MEK/MAPK pathway, where activating mutations (in codon 12 specifically) result in sustained MAPK activation, leading to increased cell proliferation and inhibition of apoptosis. KRAS mutations have also been found in endometriosis tissues without tumor involvement [39], and occur more frequently in ENOC with concurrent endometriosis [33], suggesting a role of this pathway in the tumorigenesis of ENOC. MSI, or microsatellite instability, occurs in approximately 20% of ENOC cases [6]. It is a hypermutation signature whereby aberrant mismatch repair (MMR) leaves a string of repeating nucleotides, often seen as GT/CA pairs. Tumor MSI status is determined via a specific assay that assesses the mutation profile at five specific markers. Instability at two or more markers is considered MSI-high, instability at one is considered MSI-low, and if instability is not detected at any marker the tumor is microsatellite stable (MSS) [40]. MSI is associated with aberrant mutations in genes involved in the MMR pathway, including MLH1 (mutL homolog 1), MSH2 (mutS homolog 2), MSH6 (mutS homolog 6), and PMS2 (PMS1 homolog 2) [6]. Deficiency in MMR is seen as loss of MMR protein expression on IHC [41]. Studies of MSI in ovarian cancer have shown that risk for ovarian cancer does not differ between MSI and MSS cases [42], nor does survival [43].
Table 1.
Comparison of putative driver mutation frequencies in genes common to CCOC and ENOC.
Gene      CCOC      ENOC
ARID1A    46-57%    30%
PIK3CA    33-46%    20-40%
PTEN      5-10%     15-20%
KRAS      7%        9-30%
CTNNB1    1%        15-50%
1.1.4 Endometriosis: A Common Origin of ENOC and CCOC
Despite drastically different disease presentations, epidemiological and molecular genetic studies have made it increasingly clear that there exists an association and a clonal relationship between ENOC, CCOC and endometriosis. It is now established that both these subtypes arise from endometriosis, and more specifically, from blood- and tissue-filled cysts (endometriomas) on the surface of ovaries [36]. As such, ENOC and CCOC are also called endometriosis-associated ovarian cancers (EAOCs). Endometriosis is a common condition that is defined by ectopic, or displaced, endometrial epithelium and stroma. It occurs in 6–10% of the general female population of reproductive age [44]. The ectopic tissue can occur anywhere in the peritoneal cavity, where it grows and sheds with the monthly cycle, resulting in bleeding and inflammation, and often causing pain and infertility [44]. Women with endometriosis have a 3-fold increased risk of developing CCOC and a 2-fold increased risk of developing ENOC [45]. While there is an increased risk of EAOC with endometriosis, tumor development itself is rare, only occurring in about 1% of all cases [46]. Nonetheless, the ability to identify these high-risk cases would be valuable. Various efforts investigating the molecular link between endometriosis and ovarian cancers have revealed several shared mutations between these two diseases, implying that similar pathways could be involved in this malignant transformation. One of the earlier links found between endometriosis and EAOC is the shared LOH at 10q23.3 and PTEN mutations [47].
In this study, LOH at 10q23.3 occurred in 42% of ENOC, 27% of CCOC, and 57% of solitary endometriosis cases, while PTEN mutations occurred in 20% of ENOC, 8% of CCOC, and 20% of solitary endometriosis cases. In addition, the shared LOH and PTEN mutations were found in both EAOCs and concurrent endometriosis, implying that PTEN plays a role in the early development of EAOCs from endometriosis. Activating mutations in KRAS have been found in both EAOCs and concurrent endometriosis tissues, as well as in distant endometriosis without tumor [33, 39]. Mouse models of endometriosis have been developed from KRAS mutations alone, and a PTEN deletion in conjunction with KRAS activating mutations in the same mouse model led to the transformation of endometriosis into ENOC, indicating a role for both pathways in tumorigenesis [37]. Similarly, identical ARID1A mutations were found in CCOC tumors and adjacent endometriosis tissue, and cancer-associated ARID1A mutations were found in distant endometriosis tissues without cancer [26, 39]. Shared PIK3CA mutations have also been found in both CCOC and adjacent endometriosis tissues in 9 of 10 endometriosis-associated CCOC cases in one study [48]. In a study comparing the WGS profile of CCOC with targeted deep sequencing of concurrent endometriotic lesions, endometriotic lesions sharing a proportion of somatic mutations with the primary tumor were found [49].
1.1.5 The Role of Circulating Cell-free DNA (cfDNA) as Cancer Specific Biomarkers in Ovarian Cancer
The current FDA-approved serum biomarkers used in the clinical management of ovarian cancer are CA125 and HE4 [50]. CA125, cancer antigen 125, also known as mucin 16 or MUC16, is a transmembrane glycoprotein that is expressed by the coelomic epithelium. It has low specificity for malignant ovarian cancers, as other malignant cancers and benign gynecological conditions arising from the epithelium also show elevated CA125 expression detectable in the blood stream [51].
HE4, human epididymis protein 4, is a 13 kDa protein that is encoded by the WFDC2 gene. Overexpression of the WFDC2 gene and the HE4 protein was found in ovarian cancers compared to healthy controls [52, 53]. Despite their importance in evaluating treatment efficacy and detecting relapse, CA125 and HE4 have high false-positive and false-negative rates and low specificity; as such, additional biomarkers are required to supplement them. The use of circulating cell-free DNA (cfDNA) as an additional cancer biomarker has become a promising area of research. In a healthy subject, cfDNA is thought to originate from leukocytes and averages 30 ng/ml of blood, ranging up to 100 ng [54]. In cancer patients, it is thought that tumor cells disseminate into the blood stream and contribute to the level of cfDNA as circulating tumor DNA (ctDNA), yielding an average of 180 ng cfDNA per ml of blood, and up to 1000 ng [54]. More frequent apoptosis and necrosis in tumor cells, combined with deficient clearing of apoptotic bodies by the immune system, also contribute to the increased levels of ctDNA in the blood stream [55]. ctDNA is also highly fragmented, with an average size of 160-170 bp [56]. Studies in various cancers have shown that patient cfDNA and ctDNA levels correlate with tumor stage, grade, and size [56]. A study in ovarian cancer showed significantly elevated cfDNA levels in malignant ovarian cancer patients compared to healthy controls and to those harboring benign tumors [54]. In the same cohort, patients with advanced-stage ovarian cancer (stage III/IV) also displayed elevated cfDNA levels compared to those with lower stages (stage I/II). When compared with CA125 and HE4 levels, cfDNA alone had a greater sensitivity and specificity for OC detection. When cfDNA level was combined with CA125 and HE4, both sensitivity and specificity improved. Not surprisingly, patient cfDNA levels correlate with tumor burden and have been proposed as a possible prognostic marker [56].
Studies in colorectal cancers have shown that ctDNA level was a highly sensitive and specific predictor of post-surgery recurrence [57]. Beyond total cfDNA levels, genetic mutations found in solid tumors are also reflected in the ctDNA of the same patient. Various methods have been used to screen for point mutations in cfDNA, such as quantitative PCR, droplet digital PCR, and molecular barcoded target enrichment sequencing [56]. These techniques have been used to screen for many common ovarian cancer mutations, such as those in TP53, BRCA1/2, KRAS, and PIK3CA [54]. In addition to screening for point mutations, detecting chromosomal rearrangements in ctDNA has been proposed as a highly specific tumor marker for ovarian cancer [58]. In this study, whole-genome sequencing was performed in ten ovarian cancer patients and individualized chromosomal rearrangements were identified. Primers for each rearrangement event were designed and used to identify the corresponding event in plasma samples taken pre- and post-surgery. Patient relapse was correlated with the presence of detectable events in plasma samples post-surgery. The proposed advantage of using rearrangement events is the ability to exclude sequencing errors when detecting highly rare events (~0.01%), as rearrangement junctions are unlikely to result from sequencing errors, which is not the case for simple somatic mutations. Obtaining multiple plasma samples at recurrence has helped reveal the clonal dynamics of driver mutations in tumor evolution. cfDNA assays are highly advantageous for clinical management, as they are minimally invasive and cheap to perform, making it relatively easy to monitor residual disease and identify genetic mechanisms of treatment failure.
1.2 Mobile Genetic Elements and Cancer
Mobile genetic elements, also called transposable elements, have shaped the human genome throughout evolution by contributing to a significant fraction of observed structural variations [59]. The activation of mobile genetic elements can have detrimental effects on the genome, and human cells have developed numerous epigenetic and other mechanisms to suppress transposable element activities in the germline and somatic cells alike [60]. Cancer, in its genomic chaos and abnormal epigenetic states, provides a unique molecular environment permissive for the activation of these elements, such that we begin to see their somatic consequences littered across cancer genomes. Transposable elements account for approximately 45% of the human genome [61]. There are two major classes of transposable elements: DNA transposons and retrotransposons, which differ in their mechanism of propagation. DNA transposons propagate across the genome via protein machineries that “cut-and-paste” their original DNA sequence into a new genomic location. Retrotransposons propagate via protein machineries that “copy-and-paste” their DNA sequences into a new genomic location through an RNA intermediate, leaving the original sequence intact. While both classes of mobile genetic elements have contributed to the evolution of the human genome, DNA transposons are no longer active in humans, and make up only 3% of the genome [61]. Retrotransposons comprise approximately 42% of the human genome [61]. While there are several different classes of retrotransposons, the only autonomous retrotransposons (i.e. capable of propagating on their own) that are currently still active are LINE-1 (or L1) retrotransposons. L1 elements account for 17% of the human genome [62-65].
However, the majority of L1 elements have lost the capability to retrotranspose due to mutations, truncations and/or sequence inversion, and remain only as molecular fossils. To date, about 70-100 L1 retrotransposons remain at full length, with intact sequences encoding the proteins needed for autonomous retrotransposition [62]. Many insertional L1 polymorphisms in human populations have been documented, as well as somatic insertions in normal human tissues, particularly in the brain [66, 67]. L1 elements are generally suppressed via promoter methylation in normal cells, and additional mechanisms, such as the activation-induced deaminase (AID)/apolipoprotein B mRNA-editing catalytic polypeptide-like (APOBEC) deaminases, small interfering RNA (siRNA), micro RNA (miRNA), and piwi-interacting RNA (piRNA), act to restrict L1 activity. During cancer development these mechanisms often fail, allowing L1s to become active and retrotransposition to succeed [60].
Figure 2. Schematics of L1 structure and its retrotransposition mechanism. A full-length L1 is transcribed when the internal promoter in the 5’ UTR is unmethylated. The L1 RNA is carried into the cytoplasm for protein translation and formation of the RNP complex. The RNA is taken back into the nucleus and re-integrated into a new genomic location via the process of TPRT using the poly(A) tail. The RNA of other retroelements can also use the L1 protein machinery to integrate its own sequence into the genome. TSD (target site duplication), UTR (untranslated region), ORF1/2 (open reading frames 1/2), L1ORF1p/2p (L1 open reading frame proteins 1/2), RNP (ribonucleoprotein), TPRT (target-primed reverse transcription).
A full-length L1 element spans 6 kb (Figure 2). It comprises a 5’ untranslated region (UTR) with an internal bidirectional RNA polymerase II (Pol II) promoter that is methylated in normal cells [66]. Two open reading frames follow the promoter region, ORF1 and ORF2.
ORF1 encodes a 40 kDa RNA-binding protein (L1ORF1p) and ORF2 encodes a 149 kDa protein (L1ORF2p) with endonuclease and reverse transcriptase activities. At the 3’ end lie the 3’ UTR and a poly(A) signal that ends transcription. In cases where a stronger poly(A) signal exists further downstream of the L1, transcription and subsequent retrotransposition of the downstream sequences is possible, allowing identification of the source L1 element at the new insertion sites [66]. This process is called 3’ transduction, and has important utility in studying L1 insertions in cancer [62, 68]. Once L1 RNA is transcribed and transported out of the nucleus, L1ORF1p and L1ORF2p are translated and form a complex that preferentially binds to the L1 mRNA, forming a ribonucleoprotein (RNP) complex that can re-enter the nucleus. Upon re-entry into the nucleus, L1ORF2p nicks genomic DNA at a TTAAA sequence and reverse transcribes the RNA through target-primed reverse transcription (TPRT) starting at the 3’ poly(A) tail [69]. The resulting insertion is either a full-length copy of L1 or, much more commonly, a sequence truncated at the 5’ end, and is normally flanked by short duplications called target site duplications (TSDs). Inversion of the L1 sequences can also occur as a consequence of TPRT at the integration site [70]. The mRNAs of other retroelements, such as SINEs (short interspersed nuclear elements), can hijack the L1ORF1p and L1ORF2p protein complex, leading to the genomic insertion of these sequences [66].
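As a purely illustrative aside, the preference of the L1 endonuclease for a TTAAA sequence can be sketched as a simple motif scan. The sketch below is not part of the methods used in this thesis; the function names and the toy sequence are hypothetical, and it assumes an exact match to the TTAAA motif on either strand, whereas real nicking sites are likely more degenerate.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif_sites(seq: str, motif: str = "TTAAA") -> list[tuple[int, str]]:
    """List 0-based start positions where the motif occurs on either strand.

    A "+" hit matches the motif on the given strand; a "-" hit matches its
    reverse complement, i.e. the motif read on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        if window == motif_rc:
            hits.append((start, "-"))
    return hits


# Toy sequence (hypothetical): one forward-strand and one reverse-strand site.
print(find_motif_sites("GGTTAAACCTTTAAGG"))
```

A scan of this kind only lists candidate sites; it says nothing about whether an insertion actually occurred at any of them.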
1.2.1 Consequences of Somatic LINE-1 Retrotranspositions in Cancer
The direct consequences of somatic L1 retrotransposition are simple to deduce, as its integration mechanism introduces double-stranded breaks and results in sequence disruptions at the insertion sites; determining whether L1 retrotransposition plays an active role in tumorigenesis, however, is difficult in practice. Somatic L1 retrotranspositions occur in approximately 50% of human cancers, and epithelial cancers exhibit high numbers of L1 insertions, with lung squamous carcinoma, esophageal adenocarcinoma, and colorectal adenocarcinoma having the highest reported frequencies [62, 71-73]. Interestingly, L1 insertions in blood and brain cancers are almost completely absent, suggesting that epithelium undergoing transformation offers a unique environment for somatic L1 activation [73-75]. For L1 retrotransposons to be considered drivers of tumorigenesis, they need to affect the relevant pathways early. Despite numerous studies examining the role of L1 retrotransposition in cancer, there are only two reports of L1 insertions directly driving tumorigenesis. In both cases, L1 insertion into exon 16 of the tumor suppressor APC (adenomatous polyposis coli) gene initiated colorectal cancer (CRC) in accordance with the two-hit hypothesis [76, 77]. In the recent report by Scott and colleagues, expression of the source L1 that gave rise to the somatic insertion in APC was also found in the normal colon tissue, further supporting its tumorigenic role in this sporadic CRC patient [77]. While additional L1 insertions have been found in genes across different epithelial cancers, only a subset of these genes have previously been associated with cancer and their roles in tumorigenesis are not clear [75, 78].
One study in hepatocellular carcinoma showed an intronic L1 insertion within the ST18 (suppression of tumorigenicity 18) gene that resulted in its overexpression in the tumor tissue only, suggesting the potential of ST18 as an oncogene and the L1 insertion as a mutagenic trigger [78]. An exonic L1 insertion into PTEN was found in an endometrial carcinoma, resulting in the expression of a chimeric mRNA, which may have disrupted the function of this tumor suppressor gene [75]. Most recently, Nguyen et al. described an intronic L1 insertion into the STC1 gene in a case of high grade serous ovarian cancer, which provided a novel enhancer that elevated the expression of STC1 and led to chemoresistance in vivo [79]. However, such events are rare, and L1 somatic insertions are most often found within the intronic and intergenic spaces of the genome, without a clear link to tumorigenesis [62, 71]. In some studies, expression of genes harboring intronic L1 insertions was found to be down-regulated [73, 75], while in other studies no significant differences were observed [62]. Hypomethylation of L1 sequences has been linked with global hypomethylation in cancer genomes and is a widely accepted surrogate marker for the latter [80]. As global hypomethylation has been linked to genomic instability, hypomethylation of L1 elements is also correlated with increased genomic instability in non-small cell lung cancers [81]. However, only localized demethylation of the specific L1 5’ internal promoter sequence leads to L1 activation, while global hypomethylation is the likely predisposing condition [62, 82]. In addition, as the promoter region of the L1 sequence is bidirectional, hypomethylation can result in antisense transcription; such is the case with the generation of an alternate transcript of the proto-oncogene MET when an intronic L1 element within the gene became activated in bladder and colorectal cancers [82-85].
Given the recent advances in next generation sequencing technologies and the availability of whole genome sequencing at lower costs, bioinformatic efforts to map L1 retrotransposition events have revealed many of their effects on the cancer genome. In addition to correlating with genomic instability, a recent pan-cancer analysis of L1 retrotransposition revealed that, across 2,774 cancer genomes, L1 retrotransposition events were highly frequent, occurring at three times the number of unbalanced translocations observed [71]. Further analysis showed that L1 retrotransposon integration can be an alternative driver of genomic rearrangements, inducing events such as large gene deletions and breakage-fusion-bridge cycles, which have the potential to contribute to oncogenesis [71]. While these events were rare (96 of 19,705 total L1 events) and only occurred in cancers with a high level of somatic L1 activity, they reveal that L1 may play a larger role in tumorigenesis than previously assumed [71]. L1-mediated transduction of 3’ downstream sequences is another feature frequently observed in epithelial cancers, and may be misidentified as translocation events [62, 63, 71]. The prevalence of 3’ transduction events among L1 retrotranspositions ranges from 18-24% across the epithelial cancers surveyed [62, 71]. The presence of unique downstream sequences makes it easier to identify the specific locations of the active source L1, and such analysis revealed that somatic insertions in cancer tend to be dominated by a few extremely “hot” L1 loci [62, 63, 71]. These “hot” L1 loci include those located at 22q12, 6p24.1, 3q21, and 14q23 [62, 71], as well as 6p22.1, 6p14.1, and Xp22.2-1 [71].
Among these active loci, the L1 at 22q12, located within an intron of TTC28, is the most frequently active source L1 found in colon, head and neck, and lung cancer samples, and it was the only active L1 element detected in 93% of breast cancer samples in one study [62, 63, 68]. In summary, while certain L1 insertions may be cancer “drivers”, these are extremely rare. The majority of L1 retrotranspositions are considered passenger events in cancer, correlating with genomic instability while potentially contributing to it through their mechanism of genomic integration.
1.2.2 Timing of L1 Retrotranspositions in the Evolution of Tumors
The detection of L1 promoter hypomethylation and insertions in precursor lesions of several cancers implies that L1 activation is an early event. In EOC, a recent study compared the methylation level across different L1 loci in normal endometrium, contiguous endometriosis (endometriosis adjacent to tumor), and ENOCs/CCOCs [86], showing that L1 activation may be an early event in ENOCs. In gastrointestinal (GI) cancers, shared somatic L1 retrotranspositions (solo insertions) have been found between precancerous lesions and colon tumors [87], as well as between Barrett’s esophagus (precursor lesion of esophageal cancer) and concomitant esophageal tumors [88], both suggesting early activation of L1 during GI tumor development. As Tubio and colleagues demonstrated via the pattern of 3’ transductions, L1 activity appears to be phasic during tumor evolution, with different source L1s giving rise to different numbers of insertions within each clonal population over the course of tumor progression [62].
1.2.3 L1 as a Biomarker
Given the high level of L1 activation in certain cancer types, L1 has been proposed as a candidate biomarker of neoplasia [89].
As mentioned previously, hypomethylation at various L1 loci, and especially at the 5’ promoter regions, is the key phenomenon in cancer that indicates L1 transcription and subsequent retrotransposition. The methylation level of L1 in EOC was found to be significantly lower compared to normal tissues [90], and higher levels of hypomethylation were associated with higher grade and stage, as well as poor overall survival [91]. Additional studies in lung, colorectal, breast, prostate, liver, and esophageal cancers also found an association of decreased L1 methylation level with poorer prognosis and with increased insertion levels in metastatic tissues [82, 91]. A similar trend has been observed in studies directly examining L1 insertions in cancer development and progression in head and neck, prostate, colorectal, gastric, esophageal, pancreatic, and breast cancer [62, 63, 65, 86-88, 92-95]. Given that L1 activation can contribute to further genomic instability via double-stranded breaks created during its integration into the genome, its presence in the more aggressive subgroup of cancer cases is not surprising. Several studies have found an inverse relationship between L1 hypomethylation and MSI in colorectal cancers [96, 97]. Interestingly, although L1 hypomethylation alone is associated with worse prognosis and is inversely associated with MSI level, its association with adverse outcome is stronger in MSI-high patients than in MSS patients [98]. The success of L1 retrotransposition depends upon its protein machinery, and the expression of L1ORF1p and L1ORF2p has been assessed across cancers as a marker of L1 activity [82]. The IHC staining pattern for L1ORF1p is generally cytoplasmic across the cancers surveyed [72]. Nuclear staining has been observed in breast cancer and was found to be associated with increased incidence of recurrence and metastasis, and poorer overall survival [99].
Rodic and colleagues evaluated the association of L1 retrotransposon ORF1p expression with TP53 deficiency in secondary glioblastoma, lung, pancreatic and ovarian carcinomas [72]. In this study, L1ORF1p was immunoreactive in 93.5% of HGSC cases, and 86% of these cases were also TP53 deficient (mutation status established via sequencing). While there were no ovarian cancer cases with wild-type TP53 in this study, L1ORF1p expression was seen in wild-type TP53 lung carcinoma cases, albeit in a significantly lower percentage of cases and with lower overall staining intensity [72]. Compared to L1ORF1p, L1ORF2p is suppressed at the translation level, which makes it difficult to detect even when it is overexpressed [100]. As such, fewer studies have examined the expression of L1ORF2p [82, 101]. Nonetheless, high levels of L1ORF2p were found, as expected, in epithelial cancers, including colon, prostate, lung and breast cancers, as well as in preneoplastic lesions in colon and prostate tissues [101]. While additional studies are needed to further validate the findings, both L1ORF1p and L1ORF2p could potentially be clinically useful biomarkers in epithelial cancers.
1.3 Rationale and Aims
Through the whole genome sequencing of 29 ENOC and 35 CCOC cases, we observed frequent rearrangement events originating from an active LINE-1 (L1) retrotransposon located in the TTC28 gene on chromosome 22q12. Such events occurred in 34% (10/29) of ENOC and 31% (11/35) of CCOC cases, appearing as the most frequent event in these two cohorts. This L1 retrotransposon was the most active retroelement reported across different cancer types [62, 71], and it alone was found to give rise to a high frequency of 3’ transductions in colorectal cancers [63, 68]. While L1ORF1p expression and insertions have been studied in HGSC, this frequent transduction event from TTC28-L1 has not been explored in ENOC and CCOC.
Given the previous finding that L1 hypomethylation occurs in the precursor lesions of EAOC, and that L1 insertions were found in precursor lesions of other cancer types, we hypothesize that L1 activation occurs early in ovarian cancer tumorigenesis and that TTC28-L1 transductions could be used as a marker of tumor development and evolution. To explore this, we propose the following aims:

Aim 1: To establish the timing of TTC28-L1 insertions in our ENOC and CCOC cases. To do so, we will validate the presence of these TTC28-L1 events in selected ENOC and CCOC cases using conventional PCR and Sanger sequencing on multi-spatial sampling of FFPE tissues of each of the cases. Then, on the same FFPE tissue sampling, we will survey the frequencies of somatic SNV and frameshift mutations and compare the presence of TTC28-L1 insertions to these mutations.

Aim 2: To develop a sensitive screening tool to detect retrotranspositions from specific source L1s in FFPE materials and cell-free DNA using L1-mediated 3’ transductions, without performing whole genome sequencing (i.e. without prior knowledge of the target insertion sites). Despite the fact that 3’ transductions occur in one-quarter of all L1 insertion events, the presence of unique DNA sequences makes it easier to identify these events bioinformatically, as L1 sequences are highly repetitive and share high levels of sequence identity, and thus result in alignment artifacts. As retrotransposition events are highly specific to each individual patient, a potential clinical application would be to monitor treatment efficacy and to identify minimal residual disease.

Aim 3: Given the potential value of L1ORF1p expression as an IHC marker, the third aim is to characterize the expression of the L1ORF1 protein in the three major subtypes of ovarian cancers: HGSC, CCOC, and ENOC.
In addition, as MSI and L1 activities have been found to have an inverse relationship in colorectal cancers, it would be interesting to compare the expression of L1ORF1p with MSI/MMR status in endometrioid ovarian and endometrial cancers. As such, we also aim to assess L1ORF1p expression level and its association with MMR status, p53 expression and survival.

Chapter 2: L1-mediated DNA transductions in EAOCs

2.1 Background

Here we used PCR and capillary sequencing to validate the presence of TTC28-L1-mediated transductions in different tumor samplings of ENOC and CCOC, and we compared the results to targeted re-sequencing of selected single nucleotide variations (SNVs)/frameshift (fs) mutations in the same tissues to infer the biological timing of L1 transductions. Our data showed that these transduction events appeared with canonical driver mutations in the majority of the cases, suggesting that they likely occurred early in tumorigenesis. We hope to use TTC28-L1 mediated DNA transductions to gain insights into tumor development in ENOC and CCOC.

2.2 Methods

2.2.1 Whole genome sequencing

The cohort and methods have been described previously by Wang et al. [102]. Briefly, tumor (frozen tissue) and matched normal (buffy coat) DNA libraries were constructed for 29 ENOC and 35 CCOC cases and sequenced using Illumina HiSeq 2500 V4 chemistry. Bioinformatic analysis was performed using supplemental methods described in Wang et al. [102].

2.2.2 Case Selection

Four ENOC and three CCOC cases with the highest abundance of TTC28-L1-mediated transductions detected by WGS were selected from our previous study [102]. All materials were provided by the OVCARE gynecological tissue bank. H&E slides for all available formalin-fixed paraffin-embedded (FFPE) blocks were reviewed by expert pathologists (T.N. and B.T-C.). Five FFPE tumor tissue blocks with the highest cellularity (>90%) and one representative FFPE normal tissue block were selected for each of the seven cases.
2.2.3 DNA extraction and FFPE blocks

FFPE tissue blocks were cut 10 μm thick and applied on charged glass slides, where relevant areas, identified by a pathologist (T.N.), were macrodissected. DNA was extracted from FFPE tissues using the QIAamp DNA FFPE Tissue Kit (Qiagen) as per the manufacturer’s protocol. All DNA was quantified using the broad range DNA assay on the Qubit fluorometer (Life Technologies).

2.2.4 PCR validation of TTC28-L1-mediated transduction events

To validate all L1 transduction events, primers spanning each transduction junction were designed using Primer3 to generate 170-250bp amplicons. Primers were validated on whole genome amplified (WGA) DNA extracted from frozen tumor tissues (positive control) and buffy coats (negative control) for each of the cases. Primers were tested at an annealing temperature of 60°C, using PCR SuperMix High Fidelity mastermix (ThermoFisher). Presence of a single band at 200-300bp using gel electrophoresis was an indicator of a successful PCR. For failed PCRs (no band, a smear, or multiple bands), a gradient PCR from 55°C to 64°C was performed. If the PCRs were still unsuccessful, Platinum® Taq DNA Polymerase High Fidelity (ThermoFisher) was used instead. Successful PCRs were validated via sequencing using ABI 3130XL automated capillary sequencing as per the manufacturer’s protocol (for a list of validated transductions, see Appendix Table A1). Validated primers were then used on the FFPE-extracted DNA samples with 50ng input.

2.2.5 Microfluidic PCR validation of selected SNV/Frameshift mutations

For each case, six likely pathogenic missense and/or frameshift mutations, as identified from WGS data, were selected for validation on the FFPE tissue blocks. Primer sets were designed using Primer3 to amplify specific gene regions.
Forward primers were tagged with CS1 (5’-ACACTGACGACATGGTTCTACA-3’) and reverse primers with CS2 (5’-TACGGTAGCAGAGACTTGGTCT-3’) sequencing tags. PCR products (150-200bp) were amplified using the Fluidigm 48X48 Access Arrays, as per the manufacturer’s protocol, with 50ng of input FFPE-derived DNA. DNA barcodes (10bp) with Illumina cluster-generating adapters were added to the libraries post-Fluidigm harvest, and the libraries were cleaned up using Agencourt AMPure XP beads (Beckman Coulter). Barcoded PCR products were then quantified using the high sensitivity DNA assay on the Qubit fluorometer (Life Technologies) and pooled into one total library by normalizing to equal amounts of PCR product. In total, 48 samples were pooled and denatured according to Illumina standard protocols, and sequenced using a MiSeq 300 cycle V2 kit on the Illumina MiSeq for ultra-deep validations. Uni-directional barcode sequencing was performed. Downstream analysis was performed using the Binary Alignment Map (BAM) and Variant Call Format (VCF) files generated through Illumina MiSeq Reporter.

2.3 Results

2.3.1 TTC28-L1 mediated transductions are frequent events in clear cell and endometrioid ovarian cancer

In a previous study [102], whole genome shotgun sequencing was performed on 29 ENOC and 35 CCOC cases to a median coverage of 51x and 27x for the tumor and matched normal DNA, respectively. Here, we examined this same data set using a structural variant caller (deStruct [103]), and found that the only recurrent rearrangement events originated from the TTC28 gene at the chromosome 22q12 locus (Figure 3). Although this locus was previously thought to be a fragile site for translocation events, we now know that these are L1 retrotransposon-mediated 3’ transduction events, as indicated by the clustering of apparent breakpoints within 1 kb downstream of an active L1.
This retrotransposon-mediated transduction event was observed in 34% (10/29) of ENOC and 31% (11/35) of CCOC cases. None of the transductions targeted annotated exons. There was no association between TTC28-L1 mediated transduction and common mutations or landscape features (data not shown). Because L1 elements are highly repetitive throughout the genome, it is difficult to align short reads to the human reference genome unambiguously. Consequently, there could be additional L1 retrotransposition events that went undetected.

Figure 3. Circos plot showing the WGS results of retrotranspositions originating from TTC28 on chromosome 22q12 (hg19 coordinates 29,059,272 – 29,065,303). Blue lines indicate ENOC cases and red lines indicate CCOC cases.

2.3.2 TTC28-L1 mediated transductions are early events in ENOC and CCOC oncogenesis

We validated these TTC28-L1 transductions on high quality genomic DNA extracted from patient frozen tumors and buffy coats via PCR, with primers designed to flank the sequences around the insertion breakpoint, such that only samples with the transductions are amplified. These transductions were validated through Sanger sequencing of the amplified PCR products. Even after excluding TTC28-L1 transductions that failed at the PCR or Sanger validation stages, each case had at least one validated transduction event (for a list of validated TTC28-L1 transductions, see Appendix Table A2). As expected, validated TTC28-L1 transduction events were found only in the tumor and not in the buffy coat samples. To infer the cellular timing of these TTC28-L1-mediated transductions, we obtained archival FFPE tumor samplings from five different sites for each of the seven patients. We hypothesized that, if the transduction occurs early, the same event should be observed at multiple tumor sites.
For each case, we selected six somatic SNVs/fs mutations with the highest cellular frequency and functional impact, detected via WGS, and performed targeted re-sequencing to further characterize the timing of the TTC28-L1 mediated transductions relative to WGS-detected mutations (for a detailed list of SNVs/fs, see Appendix Table A2). In the majority of cases (4/4 ENOC, 1/3 CCOC, 5/7 total) (Figure 4), we detected TTC28-L1 insertions in every FFPE sample obtained from that patient. In these cases, SNVs and fs mutations were detected at varying allelic frequencies for each tumor block, reflecting both inter- and intra-tumour heterogeneity (Figure 4 and Appendix Table A2). For each case, SNVs/fs mutations that were found in every tumor block sampled tended to be canonical cancer driver mutations, such as mutations in ARID1A, PIK3CA, KRAS, PTEN, and CTNNB1 (Figure 4 and Appendix Table A2). These results suggest that TTC28-L1 was likely activated early during tumorigenesis and was mediating DNA transductions along with SNVs/fs mutations in the earlier tumor clones. In one case (ENOC 4), the TTC28-L1 transduction was also detected in the histologically normal cervical tissue block used as a normal FFPE control, while the insertion event was not found in the same patient’s buffy coat DNA. A second sampling from this tissue block yielded the same result, which indicates that L1 activation could occur very early, in cells that appear normal histologically but are clonally related to the tumor. In a single CCOC case (CCOC 2), we observed what appear to be two distinct clonal populations within the same ovarian tumor: three of the five tumor blocks surveyed had highly concordant SNVs/fs allelic frequencies in addition to the presence of the TTC28-L1 transduction event, while the other two tumor blocks contained neither the SNVs/fs nor the TTC28-L1 transduction events.
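The multi-site reasoning above (an event seen in every sampled block is more likely to be early and clonal) can be illustrated with a minimal sketch. This is not the analysis code used in this study: the `TumorBlock` structure, the block names, the gene `GENE_X`, and all allelic frequencies are hypothetical, and the bin boundaries simply reuse the 5%, 20% and 50% cut-offs from the Figure 4 legend, with the treatment of exact boundary values being an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class TumorBlock:
    """One FFPE sampling (hypothetical structure for illustration)."""
    name: str
    has_transduction: bool
    vaf_percent: Dict[str, float]  # gene -> variant allelic frequency (%)


def vaf_bin(vaf: float) -> str:
    """Bin a VAF using the Figure 4 cut-offs (boundary handling is assumed)."""
    if vaf > 50:
        return "high"
    if vaf >= 20:
        return "intermediate"
    if vaf >= 5:
        return "low"
    return "not detected"


def shared_across_blocks(blocks: List[TumorBlock]) -> Tuple[bool, Set[str]]:
    """Report whether the transduction and which mutations occur in every block."""
    transduction_everywhere = all(b.has_transduction for b in blocks)
    detected = [
        {gene for gene, vaf in b.vaf_percent.items() if vaf_bin(vaf) != "not detected"}
        for b in blocks
    ]
    shared = set.intersection(*detected) if detected else set()
    return transduction_everywhere, shared


# Hypothetical values, not data from this study.
example_blocks = [
    TumorBlock("A1", True, {"ARID1A": 62.0, "PIK3CA": 35.0, "GENE_X": 12.0}),
    TumorBlock("A2", True, {"ARID1A": 48.0, "PIK3CA": 22.0, "GENE_X": 0.0}),
    TumorBlock("A3", True, {"ARID1A": 55.0, "PIK3CA": 8.0}),
]
everywhere, shared = shared_across_blocks(example_blocks)
print(everywhere, sorted(shared))
```

In this toy example the transduction and two of the three mutations are present in all blocks, which is the pattern interpreted in the text as consistent with an early event; a block lacking both, as in CCOC 2, would make `everywhere` false.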
Four cases in our cohort (ENOC 1, ENOC 2, ENOC 4, and CCOC 3) had synchronous tumors in the ovary and the uterus. In the three endometrioid ovarian carcinoma cases (ENOC 1, ENOC 2, and ENOC 4), we observed the TTC28-L1 transduction event as well as several shared SNV mutations in both the ovarian and uterine tumors. The observation of the same TTC28-L1 transductions in tumors at anatomically different locations further suggests that L1 activation occurs early in this disease. In the clear cell ovarian carcinoma case (CCOC 3), SNVs/fs mutations and the TTC28-L1 event were found in all the samplings of the ovarian tumor but were not seen in the uterine tumor.

Figure 4. Comparison of TTC28-L1 presence and allelic frequencies. a) TTC28-L1 retrotransposition status, and the variant allelic frequencies for five selected SNV/fs mutations at six anatomical sites for each of the seven ENOC and CCOC cases. Three shades of grey portray differences in allelic frequencies: dark grey indicates high frequency (>50%) at the point mutation; lighter grey indicates a frequency between 20% and 50%; lightest grey indicates low frequency (5%-20%). A red box indicates the presence of TTC28-L1 retrotransposition, and a white box indicates no variant or retrotransposition detected. Numbers indicate the allelic frequencies in percentage. A.F., allelic frequencies; Ov, ovary; Cer, cervix; Ut, uterus; R, right; L, left; N, normal; ND, not detected. b) Circos plot displaying retrotransposition events originating from TTC28 (Chr.22q12) for the seven selected cases. All retrotransposition targets were in non-coding spaces.

2.4 Discussion

Herein we described the presence of DNA transductions mediated by a specific and active retrotransposon, TTC28-L1, in the endometriosis-associated ovarian carcinoma subtypes ENOC and CCOC. Our data showed that TTC28-L1 mediated transductions were detectable across all tumor sites in the majority of the cases.
The presence of a TTC28-L1 transduction was sometimes but not always accompanied by SNV/frameshift mutations at varying allelic frequencies. It is highly unlikely that the same retrotransposition event arose independently at each tumor site, which indicates that TTC28-L1 transductions were not subclonal events, but more likely clonal events that occur early in tumorigenesis. In addition, the presence of TTC28-L1 transductions was correlated with high allelic frequencies of driver mutations implicated in the development of these cancers, including mutations in ARID1A, CTNNB1, PIK3CA, and PTEN [104], supporting our hypothesis that TTC28-L1 retrotranspositions are early events in the development of ENOC and CCOC. Targeted resequencing was previously used by our group to describe a clonal relationship between the endometrial and ovarian cancers in three of the four synchronous endometrioid and ovarian tumors, ENOC 2, ENOC 4 and CCOC 3 [105]. We also found somatic mutations and TTC28-L1 events shared between the uterine and ovarian tumors in ENOC 2 and ENOC 4; however, we did not detect any of our selected mutations or the TTC28-L1 event in the uterine tumor from CCOC 3. CCOC 3 was unusual in several ways: the ovary (clear cell) and uterus (endometrioid) tumors displayed distinct histopathological appearances, and only a single somatic event was shared between the two (a different event from the ones we investigated), as reported by Anglesio et al. [105]. It is possible that this somatic event occurred before either the seeding of endometriosis or the activation of the TTC28-L1 retrotransposon.

It is interesting to note that 3’ transductions account for approximately one quarter of all somatic L1 retrotranspositions, and the majority of L1 retrotranspositions result in solo L1 insertions, either as a full L1 or truncated at the 5’ end [62]. This indicates that there are likely more L1 insertions which remain undetected in our cohort.
Previous studies on somatic L1 retrotranspositions in various cancer types reported a higher abundance of solo L1 insertions than 3’ transduction events; however, the lack of L1 3’ transductions detected could be due to bias in the L1 sequencing and analysis methods used in these studies [87, 88, 93, 94]. Nonetheless, we observed TTC28-L1 transduction events at a high frequency within our cohort, which could be a unique feature of EAOCs.

While somatic L1 insertions are common in certain tumor types, there is little evidence that they are oncogenic. Only two studies showed that somatic L1 retrotranspositions (solo L1 insertions) directly initiated oncogenesis in colorectal cancer, via somatic L1 insertion into an exon of the tumor suppressor gene APC [76, 77]. The majority of somatic L1 insertions target heterochromatic regions with no obvious functional impact [62, 64]. We also observed that the L1 transductions in our cohort targeted non-coding regions. Most research in the field of retrotransposition and cancer focuses on using L1 activity as a marker of tumor development, by assessing the global hypomethylation status of L1 loci and the expression of L1 mRNA and proteins throughout stages of cancer development and metastasis. The general conclusion is that somatic L1 retrotranspositions are passenger events that occur after a permissive environment, such as dysregulated epigenetic control, has been established during cancer development [62, 64, 86]. Evidence of L1 activation as an early event includes a recent study that found a stepwise decrease of methylation across different L1 loci between normal endometrium, contiguous endometriosis (endometriosis adjacent to tumor), and ENOCs/CCOCs [86].
In gastrointestinal (GI) cancers, shared somatic L1 retrotranspositions (solo insertions) have been found between precancerous lesions and colon tumors [87], as well as between Barrett’s esophagus (precursor lesion of esophageal cancer) and concomitant esophageal tumors [88], suggesting early activation of L1 during GI tumor development. This is likely the scenario in our cohorts, where TTC28-L1 was activated early during ENOC/CCOC tumorigenesis, but after a retrotransposition-permissive environment was established.

Characterizing the exact relationship between TTC28-L1 activation and malignant transformation in these two ovarian cancer subtypes is challenging. Further studies will need to assay endometriosis tissues that are adjacent to tumors and distant from tumors, as well as endometriosis without cancer involvement. Such studies could determine whether TTC28-L1 transduction is a useful marker of cancer risk in endometriosis.

Ovarian cancer suffers from a lack of effective early detection tools. TTC28-L1 and other L1s can potentially be used as biomarkers to identify early malignant transformations, especially in the subset of CCOC/ENOC cases where common coding mutations are not present. In addition, a better understanding of L1 activation and transduction patterns over the course of ENOC and CCOC progression may help elucidate the underlying mechanisms of ovarian carcinogenesis and progression.

Chapter 3: Detecting L1-mediated transductions

3.1 Background

As described in Chapter 2, we detected TTC28-L1 mediated transduction events in one-third of our ENOC and CCOC cohorts through WGS. Given the high frequency of these events, we propose using them as a potential biomarker for the clinical management of EOC. Such a screen will require a low-cost assay that can detect 3’ transduction events without prior knowledge of their target site, i.e. without performing WGS, which is a costly method [106].
To this end, we explored several methods to detect these transduction events in a cost-effective manner, as well as at a higher sequencing depth. Several methods of identifying L1 retrotransposition insertions by selectively amplifying and sequencing L1 insertions have been developed. These include LINE-1 Sequencing (L1-seq), retrotransposon capture sequencing (RC-seq), and transposon insertion profiling (TIP-seq) [93, 107-110]. TIP-seq [93] and L1-seq [108] share a similar technique, where primers specific to the 3’ end of L1HS (the youngest family of L1 elements) are used to specifically amplify fragments with L1 insertions, either by using ligated adapters (TIP-seq) or hemi-specific PCR (L1-seq). Next generation sequencers are used to sequence the amplified libraries. RC-seq is a probe-based capture assay, where probes that bind to the 5’ and 3’ ends of the L1 consensus sequence are used to isolate and enrich for fragments containing L1 retrotranspositions [110]. The captured libraries are also sequenced on next generation sequencers. These techniques have been applied to look for retrotransposon insertion polymorphisms (RIPs) that may contribute to an individual’s predisposition to certain diseases, including cancer [108]. While we hope to discover insertions that may be tumorigenic, L1 insertions have been shown to be passive events in cancer development, and thus our main goal is not to survey a tumor for all L1 insertions, but to use hot L1 transductions as tumor biomarkers. Here we employ a probe- and capture-based enrichment assay that takes advantage of the unique 3’ transduced DNA, using it as a molecular flag to find the retrotransposition targets.

3.2 Methods

3.2.1 Overview

The general schematic of our methodologies is shown in the figure below (Figure 5).
Given that L1-mediated 3’ transduction contains the unique sequences of TTC28, we used customized biotinylated probes that tile the sequences 1kb immediately downstream of the source L1 element. Six of the most active L1 loci were selected based on a previous report by Tubio et al. [62] (Table 2). Genomic DNA was first extracted and fragmented via sonication. Custom barcoded adapters were ligated and amplified with indices during library construction. Libraries were pooled together, and fragments containing the 3’ transduced sequence were pulled down using the custom probes and Streptavidin beads. Post-hybridization libraries were amplified and paired-end sequencing was performed using the Illumina MiSeq next generation sequencing platform. Following read mapping against the human reference genome, reads which originate from transduction events would appear as “split reads”, where one portion of a read maps to the unique sequence downstream of the source L1, while the other portion maps to the transduction target site. Conventional PCR and Sanger sequencing were used to validate the calls. The detailed protocols are described in the next section.

Table 2. Hg19 coordinates for the probe-captured regions of the 6 L1s.

L1 origin    Capture space (1kb downstream of L1 3’ end)
TTC28        Chr22:29065304-29066305
MYLK         Chr3:123590727-123591726
PHACTR       Chr6:13191598-13192597
14q23.1      Chr14:59220408-59221407
Xp22.2       ChrX:11952192-11953191
18q21.32     Chr18:57070172-57071171

Figure 5. Schematic overview of the transduction capture protocol. a) Simplified 3’ transduction scheme and capture probe location. b) DNA library construction. c) IDT probe capture.

3.2.2 Sample selection

All materials were provided by the OVCARE gynecological tissue bank. To validate that our assay can capture 3’ transductions, we surveyed 11 cases with gDNA and WGS data available.
Five of the 11 cases were the same samples from the previous chapter. We also processed 13 cases without WGS data, i.e. with no prior knowledge of L1 insertion status, based on FFPE tissue availability and tumor cellularity. H&E slides for all available tumor FFPE blocks were reviewed by an expert pathologist (B.T-C.), and cases with tumor blocks of cellularity >90% were selected. For the eight cases that met this criterion, DNA extraction was performed as described in the methods section of the previous chapter.

3.2.3 DNA Extraction and Library Construction

FFPE tissue blocks were cut into 10 μm slices, which were scrolled or applied on charged glass slides, where relevant areas, identified by a pathologist (T.N.), were macrodissected. DNA was extracted from the scrolled and macrodissected FFPE tissues using the QIAamp DNA FFPE Tissue Kit (Qiagen) as per the manufacturer’s protocol. Cell-free DNA was isolated from 1.5-2.5 mL of plasma using the QIAamp Circulating Nucleic Acid Kit (Qiagen) as per the manufacturer’s protocol. All DNA was quantified using the broad range DNA assay on the Qubit fluorometer (Life Technologies).

Genomic DNA (gDNA) extracted from frozen tumor, FFPE, and buffy coat was sheared using the Covaris S220 Sonicator according to the manufacturer’s protocol to generate fragments approximately 200-300bp in size. The starting material, 150ng to 200ng of DNA, was diluted to 50 μL in 10mM Tris-HCl (pH=8.5) for shearing. (It is recommended to start with at least 100ng DNA; however, as little as 25ng can be used if the DNA source is limiting.) Sheared DNA was either used immediately, stored at 4°C overnight for next-day use, or stored at -20°C for long-term storage. Cell-free DNA extracted from plasma samples is highly fragmented, so sonication was not needed. DNA library preparation was done using reagents from the NEBNext® Ultra™ II DNA Library Prep Kit for Illumina® (New England BioLabs, #E7645S).
DNA end-repair and A-tailing were performed by adding NEBNext Ultra II End Prep Enzyme Mix (3 μL) and NEBNext Ultra II End Prep Reaction Buffer (7 μL) to the sheared gDNA (50 μL), and incubating in a thermocycler with a heated lid (>75°C) using the program: 30 minutes at 20°C, 30 minutes at 65°C, hold at 4°C. Semi-degenerate barcoded adapters developed by Alcaide et al. were used in place of the Illumina TruSeq adapters [111]. Adapters (2.5 μL; concentrations are 10-fold molar excess for gDNA, and 100-fold molar excess for ctDNA) were added to the end-repair mix (60 μL), along with the NEBNext Ultra II Ligation Master Mix (30 μL) and the NEBNext Ligation Enhancer (1 μL) for adapter ligation. For gDNA extracted from frozen tumor, buffy coat and FFPE samples, the ligation mix was incubated at 20°C for 15 minutes, while for ctDNA extracted from plasma, the ligation mix was incubated overnight at 16°C. Adapter clean-up was performed with 0.8X Agencourt AMPure XP (Beckman Coulter) beads according to the manufacturer’s protocol. For PCR amplification, NEBNext Ultra II Q5 Master Mix (25 μL) and indexed PCR primers (5 μM, 5 μL) were added to the adapter-ligated DNA following clean-up (20 μL). Amplification was performed in a thermocycler using the following conditions (Table 3).

Table 3. Library amplification cycling conditions.

Step  Description           Temperature  Time
1     Initial denaturation  98°C         30 sec
2     Denaturation          98°C         10 sec
3     Annealing/Extension   65°C         75 sec
4     Go to step 2          7-8 cycles
5     Final extension       65°C         5 min
6     Hold                  4°C          Hold

Post-PCR clean-up was performed using 1X AMPure XP beads according to the manufacturer’s protocol. Library concentration was measured using the Qubit high sensitivity dsDNA assay (Life Technologies), and quality control (QC) was performed on the Agilent Bioanalyzer using High Sensitivity DNA chips (Agilent, #5067-4626).
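As a cross-check of the volumes stated above, the following minimal sketch sums the components of each library construction step and derives the bead volumes implied by the 0.8X and 1X clean-ups. It assumes that a bead ratio of "NX" means N times the sample volume and that the clean-up is applied to the full reaction volume; neither assumption is stated explicitly in the protocol text, and the variable names are illustrative only.

```python
# Volumes in microlitres, taken from the library construction steps above.
END_PREP_UL = {
    "sheared DNA": 50.0,
    "End Prep Enzyme Mix": 3.0,
    "End Prep Reaction Buffer": 7.0,
}
LIGATION_ADDITIONS_UL = {
    "adapters": 2.5,
    "Ligation Master Mix": 30.0,
    "Ligation Enhancer": 1.0,
}
LIBRARY_PCR_UL = {
    "adapter-ligated DNA": 20.0,
    "Q5 Master Mix": 25.0,
    "indexed PCR primers": 5.0,
}


def total_ul(components):
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def bead_volume_ul(sample_volume_ul, ratio):
    """Bead volume for an 'NX' clean-up, assuming N is relative to sample volume."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return sample_volume_ul * ratio


end_prep_total = total_ul(END_PREP_UL)
ligation_total = end_prep_total + total_ul(LIGATION_ADDITIONS_UL)
pcr_total = total_ul(LIBRARY_PCR_UL)

print(f"End-prep reaction:       {end_prep_total:.1f} uL")
print(f"Ligation reaction:       {ligation_total:.1f} uL")
print(f"0.8X beads for ligation: {bead_volume_ul(ligation_total, 0.8):.1f} uL")
print(f"Library PCR reaction:    {pcr_total:.1f} uL")
print(f"1X beads for PCR:        {bead_volume_ul(pcr_total, 1.0):.1f} uL")
```

The end-prep total reproduces the 60 μL end-repair mix quoted in the text, which suggests the stated component volumes are internally consistent.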
For libraries generated from plasma-extracted ctDNA, size selection for fragments less than 400bp in length was performed following library amplification, according to the AMPure XP manufacturer’s protocol. This was to remove large fragments originating from normal cells.

3.2.4 Probe Hybridization Capture and Sequencing

IDT xGen Lockdown Probes and xGen Blocking Oligos, along with xGen Hybridization and Wash kits, were used for hybridization capture, following the IDT manufacturer’s protocol with minor modifications. To generate a balanced pool of the NEBNext Ultra II libraries (up to 12 libraries can be pooled per capture, if they have different indices), 100ng of cleaned-up PCR product per library was pooled into a low-binding Eppendorf tube. Blocking Oligo 1 (2 μL at 2nmol concentration), Blocking Oligo 2 (2 μL at 2nmol concentration) and xGen Universal Blocking oligo mix (0.5 μL, Ts-p7 6nt + i5-6nt custom blocking oligo) were added to bind to the Illumina adapters and reduce off-target capture. Human Cot-1 DNA, which serves to block nonspecific hybridization to repetitive elements such as Alu and LINE-1, was omitted. The contents of the tube were dried down completely using a SpeedVac, and the dried pellets were resuspended in xGen 2X Hybridization Buffer (8.5 μL), Hybridization Buffer Enhancer (2.7 μL), and nuclease-free H2O (1.8 μL), and incubated at room temperature for 5 to 10 minutes. This mixture was then transferred to a 0.2 mL low-bind PCR tube and incubated at 95°C for 10 minutes in a thermocycler. Afterwards, 4 μL of IDT xGen Lockdown probes (total concentration 3-4pmol) were added to the tube, which was briefly vortexed to mix. Hybridization occurred while the mixture was incubated in a thermocycler overnight at 65°C with the heated lid at 75°C (4 hours minimum). The xGen Wash Buffers were diluted and prepared according to the manufacturer’s protocol.
Briefly, 10X Wash Buffers (I, II, III, and Stringent buffer) and 2X Bead Wash Buffer were diluted to 1X working solutions, which are stable at room temperature for up to 4 weeks. For each round of capture, 400 μL of 1X Stringent Wash Buffer and 100 μL of 1X Wash Buffer I were equilibrated at 65°C in a water bath or thermoblock (with water in the wells) for a minimum of 2 hours before washing the captured DNA. Streptavidin M-270 Dynabeads (Thermo Fisher Scientific, #65305) were prepared prior to the addition of the hybridization mixture according to the manufacturer’s protocol, with a slight deviation: instead of 100 μL of beads per capture, 75 μL was used. Binding of the probes to the Streptavidin beads and subsequent washes were performed according to the manufacturer’s protocol, except that a 30 μL elution volume was used. Post-capture PCR was performed in the presence of the Streptavidin beads with 2X KAPA HiFi™ HotStart ReadyMix (#KK2602) under the following conditions (Tables 4 and 5).

Table 4. Post-capture PCR reagents.

Reagent                             Volume
2X KAPA HiFi™ HotStart ReadyMix     35 μL
10 μM Illumina P5 Primer            2.5 μL
10 μM Illumina P7 Primer            2.5 μL
Beads plus captured DNA             30 μL
Total Volume                        70 μL

Table 5. Post-capture PCR cycling conditions.

Step  Description            Temperature  Time
1     Polymerase activation  98°C         45 sec
2     Denaturation           98°C         15 sec
3     Annealing              65°C         30 sec
4     Extension              72°C         30 sec
5     Go to step 2           16 cycles*
6     Final extension        72°C         1 min
7     Hold                   4°C          Hold

*Two rounds of capture were performed to achieve higher specificity, given the small capture space in the genome. For the 2nd round of capture, 12 cycles suffice.

Agencourt AMPure XP beads were used for post-PCR purification according to the manufacturer’s protocol. A 1.5X bead volume was used for frozen tumor/buffy coat samples, while a 1X bead volume was used for plasma or FFPE samples.
An elution volume of 22 μL was used. Qubit was used to measure the capture concentration. The expected yield is 2-30 ng/μL for the 1st round of capture and 80-200 ng/μL for the 2nd round. Agilent traces were used for QC after the 2nd round of capture. Ultra-deep sequencing was performed on the Illumina MiSeq machine. Sample preparation and loading were done according to the Illumina MiSeq loading protocol. The MiSeq Reagent Kit V3 (600-cycle) was used, and sequencing was set to read 301 bases from each end of the DNA fragment (i.e. all reagents are used and, with an average insert size of 150-250bp, this option sequences the entire fragment from both ends). Workflow was set to Generate FASTQ and Application to FASTQ only (i.e. no manifest is needed).

3.2.5 Bioinformatics analysis

FASTQ files were downloaded from the MiSeq machine after an MD5sum check and uploaded onto a secure server. Barcodes were trimmed from both ends of the reads using a previously published pipeline, Dellingr [111], and the reads were then aligned using BWA-MEM. PCR and sequencing duplicates were removed using Picard MarkDuplicates. Given that reads which overlap transduction breakpoints would likely result in a portion of the read being unmapped (“soft-clipped”), we used a published soft-clip analysis tool, Socrates [112], to identify the structural variants. Post-filtering was performed to exclude calls for which neither split-read end mapped within the captured region of the genome, and sequences with AT content greater than 80%; for cases with matched normal specimens, variants with more support in the normal than in the tumor were considered germline and were filtered out. One case with multiple high-confidence transduction events was selected for validation, and served as the training set for a filter built using a random forest model. Validation was performed using conventional PCR with primers designed to flank the transduction junctions identified by Socrates.
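The three rule-based post-filters described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the `Call` structure and its field names are hypothetical and do not reflect the Socrates output format, the example calls are invented, the capture regions are copied from Table 2 with chromosome names lower-cased, and the random forest filter is not reproduced.

```python
from dataclasses import dataclass
from typing import Optional

# Capture regions from Table 2 (hg19), keyed by source L1.
CAPTURE_REGIONS = {
    "TTC28": ("chr22", 29065304, 29066305),
    "MYLK": ("chr3", 123590727, 123591726),
    "PHACTR": ("chr6", 13191598, 13192597),
    "14q23.1": ("chr14", 59220408, 59221407),
    "Xp22.2": ("chrX", 11952192, 11953191),
    "18q21.32": ("chr18", 57070172, 57071171),
}


@dataclass
class Call:
    """A candidate split-read junction (hypothetical structure)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    junction_seq: str
    tumor_support: int
    normal_support: Optional[int] = None  # None when no matched normal exists


def in_capture_region(chrom: str, pos: int) -> bool:
    """True if the position lies inside any probe-captured region."""
    return any(
        chrom == c and start <= pos <= end
        for c, start, end in CAPTURE_REGIONS.values()
    )


def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq) if seq else 0.0


def passes_filters(call: Call, max_at: float = 0.80) -> bool:
    """Apply the three post-filters described in the text."""
    # 1. At least one split-read end must map within a captured region.
    if not (in_capture_region(call.chrom1, call.pos1)
            or in_capture_region(call.chrom2, call.pos2)):
        return False
    # 2. Exclude sequences with AT content greater than 80%.
    if at_fraction(call.junction_seq) > max_at:
        return False
    # 3. With a matched normal, more normal than tumor support means germline.
    if call.normal_support is not None and call.normal_support > call.tumor_support:
        return False
    return True


# Invented example calls, not results from this study.
calls = [
    Call("chr22", 29065800, "chr18", 1000000, "ACGTTGCAAGCT", 120, 3),
    Call("chr1", 5000, "chr2", 6000, "ACGTTGCAAGCT", 80, 0),
    Call("chr22", 29065800, "chr5", 2000000, "ATATATATAT", 60, 0),
    Call("chr22", 29065800, "chr2", 3000000, "ACGTTGCAAGCT", 10, 50),
]
print([passes_filters(c) for c in calls])
```

Only the first invented call passes; the others fail the capture-region, AT-content and germline checks respectively.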
Sanger sequencing was used to confirm junction locations. PCR conditions were the same as described in the methods section of the previous chapter.

Figure 6. Schematic overview of the in-silico analysis pipeline. a) Because the average insert size is shorter than the read length, the reads are split reads (soft-clipped); for the same reason, adapters needed to be trimmed from both ends of each read. b) Schematic overview.

3.3 Results

3.3.1 Detection of TTC28-L1 transductions in samples with WGS

To validate our assay for the detection of TTC28-L1 transductions, we processed 11 cases with WGS data generated from frozen tumor and buffy coat DNA, selected based on availability. At least one event was detected in seven of these 11 cases. Five of these seven cases were from the previous chapter (ENOC 1, ENOC 2, ENOC 4, CCOC 2, and CCOC 3). For CCOC 2 and CCOC 3, we also included FFPE tumor blocks in which PCR amplification of WGS-detected events had failed. As shown in Table 6, our transduction capture assay successfully captured the transduced DNA and reproduced the WGS calls, with minimal differences in the junction coordinates (<50 bp). Differences smaller than 5 nucleotides are likely the result of microhomologies at the junctions. Across all TTC28-L1 transduction events called in the capture assay, the frozen tumor samples displayed an average read support of 225.1 (±200), while FFPE samples displayed an average read support of 98.67 (±108.7). In CCOC 2, three transduction events detected by WGS were successfully identified via capture sequencing: chr22-chr18, chr22-chr2, and chr22-chr5. Because two of these events exhibited high AT content at the transduction junction, which hinders conventional PCR, we were only able to detect the chr22-chr18 transduction in Chapter 2. Using the transduction capture approach, we were able to confirm the presence of both the chr22-chr2 and chr22-chr5 transduction events.
In addition, while we were unable to detect any transduction in one tumor block (A11) in Chapter 2, we detected the chr22-chr5 transduction in this block via capture sequencing. This further supports our previous conclusion that TTC28-L1 transduction events precede certain SNVs, as no other mutations were detected in this block except for the chr22-chr5 transduction event. Interestingly, we found low-level read support (12 reads or fewer) for two of these events (chr22-chr18 and chr22-chr2) in the matched normal buffy coat DNA, but no support in the normal FFPE block B2. There are two possible explanations for this observation. First, these events are present at low levels in the germline but could not be detected in the FFPE normal because FFPE DNA is significantly fragmented, and thus any event will have a low number of supporting reads. Second, these supporting reads originated from circulating tumor DNA that was not separated from the normal buffy coat during DNA extraction. The latter is more likely, as probe-based capture is a highly sensitive assay that can reliably detect variants in cfDNA [111]. The use of our assay to detect L1 transduction events in ctDNA is explored in the last section of this chapter. Minimal read support for a TTC28-L1 event (chr22-chr18) was also observed in the buffy coat of CCOC 3, which likely originated from circulating tumor cells or contaminating ctDNA. Similar to the results from Chapter 2, we were not able to detect the event in the uterine tumor tissue.

Table 6. Comparison of the coordinates for TTC28-L1 mediated transduction detected in WGS and target capture.
# ID WGS Transduction Capture Coordinate difference Origin (TTC28) Target Origin (TTC28) Target MH R.S.
Tissue Type Origin Target 1 ENOC 1 chr22:29065556 chr6:97785114 chr22:29065556 chr6:97785114 2 221 Frozen Tumor (T) 0 0 0 Buffy Coat (N) chr22:29066036 chr5:145279611 chr22:29066036 chr5:145279611 - 79 Frozen Tumor (T) 0 0 0 Buffy Coat (N) 2 ENOC 2 chr22:29065579 chr8:140233819* chr22:29065579 chr8:140233819* - 96 Frozen Tumor (T) 0 0 0 Buffy Coat (N) chr22:29065619 chr11:32392341 chr22:29065614 chr11:32392346 5 19 Frozen Tumor (T) 5 5 0 Buffy Coat (N) 3 ENOC 4 chr22:29065948 chr5:85267680* chr22:29065947 chr5:85267679* 1 512 Frozen Tumor (T) 1 1 0 Buffy Coat (N) 108 FFPE A4 (T - Ov) 0 FFPE B6 (N - Cer) chr22:29065743 chr18:27804070 chr22:29065742 chr18:27803952 - 607 Frozen Tumor (T) 1 118 0 Buffy Coat (N) 143 FFPE A4 (T - Ov) 0 FFPE B6 (N - Cer) 4 CCOC 2 chr22:29065604 chr18:37426280 chr22:29065604 chr18:37426280 2 619 Frozen Tumor (T) 0 0 12 Buffy Coat (N) 84 FFPE A1 (T - Ov) 0 FFPE A11 (T - Ov) 0 FFPE B2 (N - Cer) chr22:29065483 chr2:124570519* chr22:29065485 chr2:124570517* 2 436 Frozen Tumor (T) 2 2 10 Buffy Coat (N) 28 FFPE A1 (T - Ov) 0 FFPE A11 (T - Ov) 0 FFPE B2 (N - Cer) chr22:29065637 chr5:144837667* chr22:29065627 chr5:144837717* 4 130 Frozen Tumor (T) 10 50 52 0 Buffy Coat (N) 17 FFPE A1 (T - Ov) 66 FFPE A11 (T - Ov) 0 FFPE B2 (N - Cer) 5 CCOC 3 chr22:29065433 chr18:38206983 chr22:29065433 chr18:38206983 2 434 Frozen Tumor (T) 0 0 5 Buffy Coat (N) 313 FFPE B6 (T - Ov) 0 FFPE C14 (T - Ut) ** 6 ENOC 5 chr22:29065835 chr8:86565957 chr22:29065834 chr8:86565958 2 104 Frozen Tumor (T) 1 1 0 Buffy Coat (N) 7 ENOC 6 chr22:29065807 chr7:145090255 chr22:29065810 chr7:145090252 3 113 Frozen Tumor (T) 3 3 0 Buffy Coat (N) 8 ENOC 7 no TTC28-L1 detected no TTC28-L1 detected Frozen Tumor, Buffy Coat - - 9 ENOC 8 no TTC28-L1 detected no TTC28-L1 detected Frozen Tumor, Buffy Coat - - 10 CCOC 5 no TTC28-L1 detected no TTC28-L1 detected Frozen Tumor, Buffy Coat - - 11 CCOC 6 no TTC28-L1 detected no TTC28-L1 detected Frozen Tumor, Buffy Coat - - T = tumor, N=normal, Ov = 
ovarian tissue, Cer = cervical tissue, Ut = uterine tissue, MH = microhomologies, R.S. = read support. The last two columns show the difference between the coordinates called in WGS and in the capture assay. *Conventional PCR amplification across the breakpoint failed to validate these events (Chapter 2) due to high AT content at the junction, which hindered both primer amplification and Sanger sequencing. **While this event was validated via PCR (Chapter 2), it was not detected in the uterine tumor block.

3.3.2 Detection of L1 transductions in FFPE samples without WGS

Once we established that the transduction capture assay was able to detect TTC28-L1 transductions in FFPE tissues, we selected a total of 13 CCOC and ENOC cases with FFPE tumor blocks available and without WGS data. For each case, DNA was extracted from the FFPE tumor block with the highest cellularity and the capture assay was performed. The results are summarized in Table 7, and an example alignment is shown in Figure 7. A total of 67 transduction events passed filters in 11/13 (85%) cases. Not surprisingly, transductions originating from TTC28 were the most frequent (40/67), followed by 18q21.32 (12/67), 14q23.1 (8/67), MYLK (4/67), Xp22.2 (2/67), and PHACTR (1/67). A single CCOC case had an exceptionally high number of TTC28 retrotranspositions, accounting for 35/40 events originating from TTC28, 3/12 from 18q21.32, and over half of all events overall. Excluding this extreme case, the average number of transduction events observed per case was three. Two of the 13 cases, both CCOC, had no transduction events that passed filtering. We did not detect any retrotransposon-mediated transductions targeting exonic regions. Over half of the transduction targets were in intergenic regions (58%, 39/67), while 39% (26/67) fell into intronic regions of protein-coding genes, and 3% (2/67) fell within long noncoding RNA regions.
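As a quick arithmetic check, the origin counts and the target-region counts reported above each sum to the 67 events that passed filters, and the quoted percentages follow from them. The Python sketch below only re-tallies numbers stated in the text; the dictionary and function names are illustrative.

```python
# Counts copied from the text above; variable names are illustrative only.
origin_counts = {
    'TTC28': 40, '18q21.32': 12, '14q23.1': 8,
    'MYLK': 4, 'Xp22.2': 2, 'PHACTR': 1,
}
target_counts = {'intergenic': 39, 'intronic': 26, 'lncRNA': 2}

TOTAL_EVENTS = 67


def percentages(counts, total):
    # Percentage of the total for each category, rounded to a whole percent.
    return {name: round(100 * n / total) for name, n in counts.items()}


# Both breakdowns should account for every event that passed filters.
assert sum(origin_counts.values()) == TOTAL_EVENTS
assert sum(target_counts.values()) == TOTAL_EVENTS

for name, pct in percentages(target_counts, TOTAL_EVENTS).items():
    print(f'{name}: {pct}%')
```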
For events with clear features at the transduction junctions, microhomology ranging from 1 bp to 10 bp was found in 52% (35/67) of events. Poly A's and AT repeats were found to be inserted at the transduction junctions in 15% (10/67) of the events. Blunt-end joining was seen in 12% (8/67) of the events. Among the 11 cases with events called, four (36%) had at least one event validated by PCR (796A, 821A, 867A, 901C); two of these four cases (796A and 901C) had one event validated by both PCR and Sanger sequencing, and one case (821A) had multiple events validated by both PCR and Sanger sequencing.

Table 7. Summary table for cases without WGS.
Sample Cohort # Passed Filter Target Junction Gene (Introns) Origin Junction Junction features Read Support PCR Validation Sanger Validation 555A ENOC 3 5:121749610 SNCAIP 18:57071173 Microhomology 2 failed n/a 1:239086955 Intergenic 18:57071133 Microhomology 11 failed n/a 11:67336270 Intergenic X:11952984 - 8 failed n/a 605A ENOC 3 8:139214467 FAM135B 18:5707111 Microhomology 9 failed n/a 15:88340668 Intergenic 6:13191761 Microhomology 6 failed n/a 7:18705391 HDAC9 14:59221070 Sequence insertion 2 failed n/a 785B ENOC 5 5:29087758 Intergenic 18:57071103 - 10 failed n/a 12:129039139 TMEM132C 22:29065753 Microhomology 8 failed n/a 5:29087758 Intergenic 18:57071105 - 3 failed n/a 14:47352605 MDGA2 18:57071182 - 3 failed n/a 1:102060560 Intergenic 14:59220914 Microhomology 3 failed n/a 796A ENOC 8 6:115261650 Intergenic 22:29065468 Sequence insertion 96 passed failed 1:102060560 Intergenic 14:59220914 Microhomology 94 passed passed 5:63167862 Intergenic 14:59220699 - 37 passed failed 1:84682382 PRKACB 3:123590918 Sequence insertion 15 failed n/a 3:27932305 Intergenic 22:29066033 Blunt-end joining 4 failed n/a 1:31924154 Intergenic 3:123590713 - 3 failed n/a 8:5189545 Intergenic 22:29065799 Microhomology 1 failed n/a 3:102747961 Intergenic 18:57071164 Microhomology 1 failed n/a 1478B ENOC 2 4:14619262 Intergenic
X:11953199 - 3 failed n/a 198A CCOC 0 - - - - - - - 621B CCOC 0 - - - - - - - 734B CCOC 3 12:65149589 GNS 14:59220569 Microhomology 5 failed n/a 1:73169893 Intergenic 18:57070535 Microhomology 4 failed n/a 1:77495841 ST6GALNAC5 3:123590713 Blunt-end joining 2 failed n/a 821A CCOC 35 16:20478545 ACSM2A 22:29065913 - 88 passed passed 55 5:3463205 LOC285577 22:29065884 Microhomology 83 passed passed 12:119929801 CCDC60 22:29065745 Sequence insertion 67 passed passed 1:102706192 Intergenic 22:29065929 Blunt-end joining 54 passed passed 7:70635634 GALNT17 22:29066121 Sequence insertion 48 passed failed 7:70635643* GALNT17 22:29065916 Microhomology 45 passed passed 10:124440824 Intergenic 22:29065657 Microhomology 45 passed passed 13:92952068 GPC5 22:29065664 Microhomology 40 passed passed 16:83312026 CDH13 22:29065883 Microhomology 39 passed passed 17:54227662 Intergenic 22:29065713 Blunt-end joining 33 passed passed 20:22990095 Intergenic 22:29065997 Sequence insertion 29 passed passed 12:66451462 Intergenic 22:29065909 - 25 failed n/a 12:4930804 KCNA6 22:29066350 Microhomology 20 passed passed 8:134972224 Intergenic 18:57070971 Sequence insertion 17 passed failed 20:22826077 Intergenic 22:29065570 Microhomology 14 passed passed 5:49440110 Intergenic 22:29065621 Microhomology 12 passed passed 5:51765508 Intergenic 22:29065641 Sequence insertion 10 passed passed 20:58060022 Intergenic 22:29065385 Microhomology 10 passed passed 13:75367645 Intergenic 22:29065803 Microhomology 9 passed passed 12:30157743 Intergenic 22:29065665 Microhomology 9 passed passed 7:64065824 LOC100128885 22:29065892 Microhomology 8 passed passed 4:171397541 Intergenic 22:29065839 Microhomology 7 passed passed 3:62786164 CADPS 22:29065970 Microhomology 6 passed passed 8:100382851 VPS13B 22:29065698 - 4 failed n/a 7:13833624 Intergenic 22:29065872 Microhomology 4 failed n/a 17:44081940 MAPT 18:57070508 Blunt-end joining 4 failed n/a 12:66451462 Intergenic 22:29066121 - 4 failed n/a 7:57795242 
Intergenic 22:29065453 Blunt-end joining 3 passed passed 7:36437043 ANLN 22:29065852 Sequence insertion 3 failed n/a 7:120623445 ANLN 18:57071099 Microhomology 1 failed n/a 4:52800349 Intergenic 22:29066121 Microhomology 1 failed n/a 4:45203280 Intergenic 22:29065568 Sequence insertion 1 failed n/a 3:187676560 Intergenic 22:29066121 - 1 failed n/a 1:57200590 C1orf168 22:29066060 Microhomology 1 failed n/a 1:102706020 Intergenic 22:29066121 Microhomology 1 failed n/a 867A CCOC 1 4:47512838 ATP10D 22:29065263 Microhomology 19 passed failed 901C CCOC 4 3:27932305 Intergenic 22:29066033 Blunt-end joining 142 passed failed 4:46968006 GABRA4 22:29065456 Blunt-end joining 117 passed failed 1:84682382 PRKACB 3:123590918 - 101 failed n/a 3:180070147 Intergenic 22:29065913 Microhomology 92 passed passed 1006A CCOC 1 1:78083480 ZZZ3 14:59220385 Sequence insertion 1 failed n/a 1217C CCOC 2 10:69406020 CTNNA3 14:59220456 Microhomology 6 failed n/a 4:13016995 Intergenic 18:57071172 Microhomology 5 failed n/a
*This is the example transduction event shown in Figure 7.

Figure 7. An example transduction event validated via PCR and Sanger sequencing in the FFPE sample 821A. The top two panels are visualizations of the NGS sequence alignment (using the Integrative Genomics Viewer [113]) showing the sequences that align to chromosome 7. The colored nucleotides are the mismatched bases of the split reads, which map downstream of TTC28-L1 on chromosome 22. The bottom panel shows the Sanger sequencing results from PCR validation. Although the sequence marked by the red arrow is a known polyA signal within the region (ATTAAA), the lack of a polyA tail indicates that it is more likely the sequence used by the reverse transcriptase (TT/AAA) to integrate into the genome. A 7 bp MH was also found (orange arrow).

3.3.3 Detection of L1 transductions in plasma samples

3' transduction capture was performed on eight of the 13 FFPE cases with available plasma samples.
A bimodal distribution of insert sizes was observed for all plasma samples (Figure A1c), with one peak at 170 bp and another at 310 bp. Of these eight cases, 867A was the only one for which a plasma sample was obtained close to the date of surgery; the remaining seven cases only had plasma samples from follow-up visits. Three of these eight cases (785B, 821A, 901C) had samples from multiple time points. Events detected in the plasma samples were compared to the events detected in FFPE samples; because read support is generally low for events called in plasma, only events detected in both plasma and FFPE tissues were considered true, and these are listed in Table 8. Four cases (605A, 796A, 1478B, 198A) did not have any events called in both plasma and FFPE samples. Only one of these four cases, 796A, had validated events called in the FFPE sample, while no event was validated in the FFPE samples of the remaining three cases (Table 7). Clinical follow-up data were available for 796A and 198A, and both patients were progression-free and alive at the last follow-up visit. Three cases (785B, 867A, 901C) had a single event called in both the plasma and FFPE samples, and one case (821A) had multiple matching events. Two of these cases (785B, 901C) had plasma samples from multiple time points, and the same event was detected across the different time points. However, for 785B, the event (chr18-chr8) failed filtering and validation in the FFPE sample. The event in 901C (chr22-chr4), in contrast, was well supported with >100 reads. Case 821A had plasma samples at five time points, and the number of matching events detected varied between them. The two chr22-chr7 events were the most persistent events, detected across up to four time points, while other events appeared variable (Table 8 and Figure 8). Interestingly, no matching events were detected in Plasma 4667, even though events were detected in visits prior to and after this one.
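The matching rule described above, in which a plasma call is kept only if the same junction was also called in the FFPE tissue, can be sketched as a set intersection. This is a minimal illustration rather than the pipeline actually used: the function names and the exact-coordinate matching are assumptions, and the example junctions are a small subset copied from the 821A rows of Tables 7 and 8.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# (target junction, origin junction), e.g. ('7:70635643', '22:29065916')
Junction = Tuple[str, str]


def matched_events(plasma: Set[Junction], ffpe: Set[Junction]) -> Set[Junction]:
    # Keep only plasma calls that were also called in the FFPE tumor tissue.
    return plasma & ffpe


def persistence(per_visit: Dict[str, Set[Junction]], ffpe: Set[Junction]) -> Counter:
    # For each matched event, count the plasma time points in which it appears.
    counts: Counter = Counter()
    for calls in per_visit.values():
        counts.update(matched_events(calls, ffpe))
    return counts


# Illustrative subset of 821A junctions taken from Tables 7 and 8.
ffpe_821a = {
    ('7:70635643', '22:29065916'),
    ('7:70635634', '22:29066121'),
    ('5:3463205', '22:29065884'),
}
plasma_821a = {
    '2585': {
        ('7:70635643', '22:29065916'),
        ('7:70635634', '22:29066121'),
        ('5:3463205', '22:29065884'),
    },
    '2721': {
        ('7:70635634', '22:29066121'),
        ('7:70635643', '22:29065916'),
    },
    '4667': set(),  # visit with no matching events
}

event_counts = persistence(plasma_821a, ffpe_821a)
```

In practice a small coordinate tolerance might be preferable to exact matching, since breakpoint calls can shift by a few bases at microhomologies; that choice is left open here.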
For three of the cases in which events were detected (785B, 821A, and 901C), both 785B and 821A displayed disease progression but were alive at the last follow-up visit. Although 901C contained events, this patient was progression-free and alive and well at the last follow-up; given the extremely high read support for the detected event (chr22-chr4), it is possible that this was a germline event detected due to the abundance of DNA released from non-malignant cells. Future validation studies are needed to confirm this finding. We had intended to compare the level of CA-125, the gold-standard cell-free tumor marker, to the number of reads matching the transduction events detected; however, the CA-125 data across the different time points were not complete (Table 8). Given the limited data, we did not observe any relationship between CA-125 level and the transduction events detected. Nevertheless, these results provide proof of concept for retrotransposition analysis in cell-free DNA.

Table 8. Summary table for cases with plasma samples. CA-125 levels are listed when available.
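Sample | Cohort | CA-125 (Surgery) | Plasma ID | Visit Date | CA-125 | # Matched (# in FFPE) | Target Junction | Gene (Introns) | Origin Junction | Read Support | Validation in FFPE
605A | ENOC | - | 2860 | 2013-06-12 | 4.3 | 0 (3) | - | - | - | - | -
785B | ENOC | 590 | 2551 | 2013-01-21 | 45 | 1 (5) | 8:134972224 | intergenic | 18:57070971 | 10 | failed filter
785B | ENOC | 590 | 2840 | 2013-06-10 | 26 | 1 (5) | 8:134972224 | intergenic | 18:57070971 | 8 | failed filter
796A | ENOC | 14000 | 2304 | 2012-09-26 | - | 0 (8) | - | - | - | - | -
1478B | ENOC | - | 2325 | 2012-10-31 | - | 0 (2) | - | - | - | - | -
198A | CCOC | 5 | 3774 | 2014-01-29 | 5 | 0 (0) | - | - | - | - | -
821A | CCOC | 320 | 2585 | 2013-02-12 | 101 | 4 (35) | 7:70635643* | GALNT17 | 22:29065916 | 6 | passed
821A | CCOC | 320 | 2585 | 2013-02-12 | 101 | 4 (35) | 8:134972224 | intergenic | 18:57070971 | 5 | failed
821A | CCOC | 320 | 2585 | 2013-02-12 | 101 | 4 (35) | 7:70635634 | GALNT17 | 22:29066121 | 4 | passed
821A | CCOC | 320 | 2585 | 2013-02-12 | 101 | 4 (35) | 5:3463205 | LOC285577 | 22:29065884 | 2 | passed
821A | CCOC | 320 | 2721 | 2013-04-22 | - | 2 (35) | 7:70635634 | GALNT17 | 22:29066121 | 4 | passed
821A | CCOC | 320 | 2721 | 2013-04-22 | - | 2 (35) | 7:70635643* | GALNT17 | 22:29065916 | 3 | passed
821A | CCOC | 320 | 4075 | 2014-03-31 | - | 4 (35) | 7:64065824 | LOC100128885 | 22:29065892 | 5 | passed
821A | CCOC | 320 | 4075 | 2014-03-31 | - | 4 (35) | 16:20478545 | ACSM2A | 22:29065913 | 3 | passed
821A | CCOC | 320 | 4075 | 2014-03-31 | - | 4 (35) | 7:70635643* | GALNT17 | 22:29065916 | 2 | passed
821A | CCOC | 320 | 4075 | 2014-03-31 | - | 4 (35) | 5:3463205 | LOC285577 | 22:29065884 | 1 | passed
821A | CCOC | 320 | 4667 | 2014-08-25 | - | 0 (35) | - | - | - | - | -
821A | CCOC | 320 | 5473 | 2015-02-19 | - | 5 (35) | 7:64065824 | LOC100128885 | 22:29065892 | 12 | passed
821A | CCOC | 320 | 5473 | 2015-02-19 | - | 5 (35) | 8:134972224 | intergenic | 18:57070971 | 7 | failed
821A | CCOC | 320 | 5473 | 2015-02-19 | - | 5 (35) | 5:3463205 | LOC285577 | 22:29065884 | 5 | passed
821A | CCOC | 320 | 5473 | 2015-02-19 | - | 5 (35) | 12:119929801 | CCDC60 | 22:29065745 | 4 | passed
821A | CCOC | 320 | 5473 | 2015-02-19 | - | 5 (35) | 7:70635643* | GALNT17 | 22:29065916 | 3 | passed
867A | CCOC | 1100 | 867 | 2008-04-17 | 1100 | 1 (1) | 4:47512838 | ATP10D | 22:29065263 | 41 | passed
901C | CCOC | 309 | 2609 | 2013-02-25 | - | 1 (4) | 4:46968006 | GABRA4 | 22:29065456 | 143 | passed
901C | CCOC | 309 | 3111 | 2013-08-19 | - | 1 (4) | 4:46968006 | GABRA4 | 22:29065456 | 129 | passed
901C | CCOC | 309 | 5210 | 2014-12-01 | 7 | 1 (4) | 4:46968006 | GABRA4 | 22:29065456 | 224 | passed

Figure 8. The number of events detected in 821A across the different follow-up visits, displayed as months after the date of surgery. The patient was diagnosed with stage IIIC CCOC, relapsed at 6 months after surgery, and was alive but with disease at the last follow-up at 85 months after surgery.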
Blue bars indicate the number of events that passed the bioinformatics filter in FFPE tissue, and red bars indicate the number of events detected in plasma that matched those in the FFPE tissue. Circos plots [114] show the specific transduction events detected at each time point. The L1 transduction events observed were dynamic across time.
[Figure 8 chart: Number of L1 transduction events at different time points in 821A (FFPE and plasma). x-axis: # of months from the date of surgery (time 0), with time points at 0, 60, 63, 74, 79 and 84 months; y-axis: # of events.]

3.4 Discussion

In this study we successfully detected 3' transduction events previously identified in WGS data using our L1 transduction capture sequencing method, and we detected transduction events in cases without WGS using as little as 150 ng of DNA sourced from frozen tumor, buffy coat or FFPE tissue, as well as a minimum of 5 ng of ctDNA from plasma. In cases without WGS, transduction events were validated in 4/11 (36%) cases, which is similar to the frequency of TTC28-L1 transductions detected in WGS (as discussed in Chapter 2). This could indicate that 3' transduction events occur in one-third of ENOC and CCOC; however, additional cases are needed to establish more robust frequencies. We did not find any exonic insertions in our cohort, which is not surprising given our small cohort size and the fact that the vast majority of L1 retrotranspositions occur in noncoding regions [62, 82]. While some of the transductions landed in the introns of genes with links to cancer, such as SNCAIP [115], HDAC9 [116], MDGA2 [117], GPC5 [118], CDH13 [119], ANLN [120] and CTNNA3 [121], the target loci do not appear to have a known functional impact. This further suggests that most L1 retrotranspositions are passenger events that occur during tumor development. Two of the CCOC cases (198A and 621B) had no detected transduction events that passed filters.
This does not mean that no L1 is active in these two cases, simply that we could not confidently detect transduction events from the six L1 loci tested. Transductions from other L1 loci, as well as solo L1 insertions, may occur in such cases. IHC staining for L1 proteins could be a good generic indicator of L1 activation.

One CCOC case, 821A, had a highly active TTC28-L1 that gave rise to at least 20 validated transduction events. These events are likely subclonal; however, our transduction capture assay cannot resolve the clonality of these events. In other words, we do not know whether multiple TTC28-L1 transductions occurred within one tumor genome (of one clone or multiple clones), or whether multiple clones were present, each with one unique TTC28-L1 transduction. Single-cell sequencing may provide a much clearer picture in this case. Regardless, given the number of TTC28-L1 events, they would be effective markers for tracking the clonal evolution of this tumor.

Given that we are capturing the 3' transductions of these L1 retrotranspositions, it is interesting to note that in half of all events that passed filters (34/68), the junction features were microhomologies (MH). This is notable because MH are usually found at the 5' end of the L1 sequence, where microhomology-mediated end joining (MMEJ) uses the MH between the 5' end of the L1 and the target sequence to complete integration [122]. While this may be an alignment error, our validation using PCR and Sanger sequencing does confirm the existence of such junction features in the tumor tissue. This might indicate the presence of an alternative integration mechanism in these cases, and resolving the 5' end of the L1 integration in future studies may help to better elucidate these features.
Sequence insertions at the breakpoints were found in 10/68 events, where most of the inserted sequences were poly A's, consistent with the use of the poly A tract during genomic integration. One caveat, however, is that there may be fragments in which poly A's comprise one entire end of the soft-clipped read; such fragments would have been filtered out as they cannot be uniquely mapped, which may result in additional undetected events. Blunt-end joining without target site duplications (TSD) was the third most common feature we observed (8/68). While TSD is a feature of the TPRT integration mechanism of L1, we would not be able to find such a feature using our method, as we cannot resolve the 5' end of the insertion. However, the absence of TSD has been observed in other studies [75] and may also indicate alternative mechanisms of integration. Overall, the inability to identify solo insertions and to resolve the 5' end sequence is a limitation of transduction capture sequencing; however, it does not undermine the validity and potential use of this technique for tracking L1 retrotranspositions.

For FFPE cases without WGS, we did not analyze matched normal tissue, and thus some of the transductions detected may be germline events. We did, however, compare the events detected in these tumors with the events detected in other patient samples, as a “substitute normal”. While this does not eliminate patient-specific germline transductions, it does eliminate population-specific events.

We were able to robustly detect 3' transduction events in tumor tissue, but it was much more difficult to detect these events in plasma samples. Here we took the available plasma samples at different time points for eight cases and performed capture sequencing.
Given that the number of reads supporting an event is generally extremely low for ctDNA, we only considered events identified in both FFPE and plasma samples to be true. The caveat is that there might be additional events detected in the plasma that arose during tumor progression and were not present in the initial FFPE tumor tissue. However, it would be difficult to distinguish true new events from false-positive events given the low support for these events in the plasma, and thus we decided to consider only events identified in the FFPE samples at this point. This also means that L1 transduction events have to be identified in frozen or FFPE tumor tissues before capturing plasma-derived ctDNA.

The CCOC case 821A was again interesting, as the transduction events detected in plasma samples obtained at various time points differed both in number and in identity (Figure 8). As all plasma samples were obtained after the initial progression, this may reflect the fluctuating activity of the TTC28-L1 locus in the relapsed tumor across time. Certain events persisted across several visits, which could reflect the presence of different tumor clones emerging during tumor development. The time point where no matching events were identified could indicate that the tumor burden was minimal prior to relapse. On the other hand, L1 source activity has been shown to fluctuate across different metastases [62]; thus the absence of these events in the plasma may also indicate a suppression of source L1 activity, in addition to or instead of the clearance of the specific tumor clones. Unfortunately, we did not have CA-125 data for all time points to confidently identify correlations between these transpositions and tumor burden.
However, despite a weak correlation, we did observe that two of three cases with matched L1 events had residual disease at the last follow-up and that cases without matched L1 events had no progression, indicating that capturing L1 events in plasma may be reflective of tumor burden.

Previous studies of L1 activation using ctDNA have used the copy number and/or hypomethylation levels of L1 fragments, as L1 and other retroelements are highly abundant in the genome [89]. The copy number of L1 DNA fragments in sera was found to correlate positively with tumor burden and may be used to detect early-stage breast cancers; however, L1 copies were still detectable in healthy controls, which may increase false-positive rates and limit the application of this methodology [123]. The level of L1 hypomethylation, which may be more tumor-specific, correlated with more advanced disease in colorectal cancer and was higher in tumor patients than in healthy controls [124], and global hypomethylation (as detected using the L1 hypomethylation level as a surrogate) correlated with poor survival in DLBCL [125]. Our study presents a potential alternative method for detecting L1 activation in plasma, which is not a general survey of L1 activation, but a reflection of the potential tumor clonal populations that remain in the patient at different time points. Despite our small sample size, we have shown that transduction capture is capable of detecting L1 transduction events in plasma, given a priori knowledge of the events captured from FFPE or frozen tumor tissue.

In general, using 3' transduced DNA as the molecular tag to detect retrotranspositions has several advantages, as well as several caveats. By probing for these “hot” L1 loci, which are highly active across different cancers and whose activities would vary the most through tumor development [62, 71], we can avoid an aimless survey of retrotransposition insertions across the genome.
Hot L1s also have a much higher chance of being impactful. In addition to partnered transductions, we are also able to track orphaned transductions that originate from these six source loci. Orphaned transductions are downstream sequences transduced without the accompanying L1, and they account for half of the total number of 3' transductions. These events would be missed by techniques that rely on primers binding to internal L1 sequences [62]. The disadvantage is that only a quarter of tumors display 3' transductions [62]. We would also not be able to resolve transduction events that are initiated by other L1s. However, we are not limited to the six L1 loci listed in this study. Probes that cover other L1 loci can be readily designed and included in the assay for the detection of additional transduction events. Nonetheless, our results show that L1s are active in at least one-third of ENOC and CCOC cases. Detection of 3' L1 transduction events using our capture assay is a sensitive, widely applicable and cost-effective alternative to WGS in frozen and FFPE tumor tissues, and transductions can be detected in ctDNA given a priori knowledge of the events from the corresponding tumor tissues.

Chapter 4: L1 protein expression in ovarian and endometrial cancers

4.1 Background

L1ORF1p (L1RE1) expression has been found across different cancers, including ovarian cancers [72, 126]. Given its association with p53 expression in different cancers, as demonstrated by Rodic et al. [72], we aimed to see whether a similar relationship occurs in our cohort. In addition, we surveyed the expression of L1ORF1p in HGSC precursor lesions, which also carry p53 mutations, and we hypothesize that L1ORF1p will also be expressed in these precursor lesions.
Given that clonally related cancers can co-occur in the ovary and the uterus (the latter known as endometrial cancers, or EC) [105], and that endometrioid endometrial cancer (ENEC) shares morphological, histological and genetic similarities with ENOC, we expanded our analysis to a cohort of EC. Here we compared the expression of L1ORF1p within the different categories of the ProMisE classifier for ENEC. The ProMisE classifier is a tool developed by Talhouk et al. for stratifying ENEC cases at diagnosis into different prognostic subgroups based on their molecular profile, such that each subgroup has its own associated risk and potentially different treatment options [127]. The four subgroups of the ProMisE classifier are mismatch repair deficient (MMR-D), exonuclease domain mutations in the POLE gene (POLE EDM), p53 wild type (p53 wt) and p53 abnormal (p53 abn). The MMR-D subgroup comprises cases with loss of expression of MMR proteins (MLH1, MSH2, MSH6, PMS2) on IHC. As previously mentioned, MMR-D is associated with MSI-high status, and given that higher MSI correlated with lower L1 hypomethylation in CRC [96], we expect L1ORF1p expression to be lower in the MMR-D group. POLE EDM refers to the subgroup of EC with mutations in the exonuclease domain of the gene encoding polymerase epsilon; about 10% of EC have POLE EDM, which confers a more favorable prognosis [128]. TP53 mutations are also seen in EC, and we likewise expect to observe higher L1ORF1p expression in the p53 abn group compared to the p53 wt group.

4.2 Methods

4.2.1 Immunohistochemistry

We assessed the expression of L1ORF1p in retrospective cohorts of ovarian and endometrial cancers using tissue microarrays (TMAs) previously constructed with duplicate 0.6-mm cores from formalin-fixed paraffin-embedded (FFPE) tissues. Three ovarian TMAs (n = 1307) and two endometrial TMAs (n = 848) were used. Sections of 4 µm from each TMA were cut onto Superfrost Plus slides (Fisher Scientific) using a microtome.
Processing of the slides was done using the automated Ventana Benchmark and Discovery systems (Ventana Medical Systems, Tucson, AZ, USA). The recommended IHC staining protocol for the Ventana Discovery XT was used, with minor deviations including a primary antibody incubation of 2 hours, a counterstain incubation of 12 minutes and a post-counterstain incubation of 4 minutes. Slides were stained with an antibody against L1ORF1p (L1RE1, mouse monoclonal 4H1, MACB1152; Millipore Sigma) diluted 1:100 in Discovery antibody diluent from Ventana (#760-108). Normal fallopian tube was used as the negative control, and tumors known to have TTC28-L1 were used as positive controls. To assess the expression of L1ORF1p in HGSC precursor lesions, we stained fallopian tube (FT) tissues with STIC lesions (identified and selected by pathologist B.G.) using the same IHC protocol. In addition, IHC for p53 was performed on normal fallopian tube fimbria from 53 patients and was assessed for p53 signature lesions by a pathologist (A.K.). We selected two normal FT with p53 signature lesions and two normal FT without p53 signature lesions to stain for L1ORF1p. All TMAs were scored by an anatomical pathologist (B.T-C.) on a scale of 0-3. A score of 0 indicates negative staining, 1 indicates weak staining, 2 indicates moderate staining, and 3 indicates strong staining. Staining patterns were noted as diffuse if the staining intensity was constant across the core, and variable if spots of cells with strong staining appeared against a background of weaker staining.

4.2.2 Statistical Analysis

For univariable associations, the Chi-square test was used for categorical biomarker data and the Kruskal-Wallis/Wilcoxon rank sum test was used for continuous biomarker data. Kaplan-Meier plots were used to assess survival. Statistical significance was set at alpha = 0.05. All statistical analysis was performed in R.
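As an illustration of the univariable chi-square test described above, the following Python sketch computes the Pearson chi-square statistic for a histotype-by-L1ORF1p-status contingency table without external libraries. It is not the R code used for the analysis: the function names are hypothetical, the counts are the negative and any-positive (weak, moderate or strong) totals per histotype read from Table 9, and the closed-form p-value is valid only for 4 degrees of freedom (five histotypes by two categories).

```python
import math

# Negative vs. any-positive L1ORF1p counts per histotype, read from Table 9.
observed = {
    'HGSC': (53, 532),
    'LGSC': (17, 9),
    'ENOC': (65, 83),
    'CCOC': (54, 78),
    'MC': (17, 15),
}


def chi_square_statistic(table):
    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
    # with expected = row total * column total / grand total.
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat


def p_value_df4(stat):
    # Upper-tail probability of the chi-square distribution for exactly
    # 4 degrees of freedom: exp(-x/2) * (1 + x/2).
    return math.exp(-stat / 2) * (1 + stat / 2)


stat = chi_square_statistic(observed)
p = p_value_df4(stat)
print(f'chi-square = {stat:.1f}, p = {p:.3g}, significant at alpha 0.05: {p < 0.05}')
```

For tables of other dimensions the p-value would need a general chi-square survival function, for example from a statistics library, rather than the 4-degree-of-freedom shortcut used here.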
4.3 Results

4.3.1 Cohort description and the association of L1ORF1p expression with stage, grade, and subtypes of epithelial ovarian cancers

Cases with missing L1ORF1p scores due to incomplete cores were omitted, and a total of 923 patients with a confirmed diagnosis of HGSC, LGSC, ENOC, CCOC, or MC were included in the analysis (Figure 9). The distribution of stage and grade by L1ORF1p expression is listed in Table 9. Briefly, 61.4% of the whole cohort were HGSC, 2.7% were LGSC, 15.5% were ENOC, 13.9% were CCOC, and 3.4% were MC. The majority of patients were diagnosed at stage III (48.1%); stage II (24.7%) and stage I (21.6%) patients were similarly distributed, and the fewest patients were diagnosed at stage IV (5.5%). The distribution of L1ORF1p expression across the five EOC histotypes is shown in Table 9 and Figure 10, and the associations of stage and grade with L1ORF1p expression levels are shown in Table 9. Among the five histotypes analyzed, HGSC had the highest proportion of cases with positive L1ORF1p expression (90.9%), followed by CCOC (59.1%), ENOC (56.1%), MC (46.9%) and LGSC (34.6%); MC and LGSC were the two histotypes with no cases of strong staining intensity. Consequently, HGSC represents the highest proportion of all cases with positive L1ORF1p expression (74.2%) and LGSC represents the smallest proportion (1.3%). Statistical analysis confirmed that L1ORF1p expression was significantly higher in patients diagnosed at stages III/IV (p < 0.0001), with grade 3 disease (p < 0.0001), and with the HGSC subtype (p < 0.0001).

Figure 9. Flowchart for the case selection process of the ovarian TMAs. A total of 923 cases were included, covering 5 of the common EOC subtypes: HGSC, LGSC, ENOC, CCOC, and MC.

Table 9. Distribution of clinical characteristics by L1ORF1p IHC status.
                      Total         Negative      Weak          Moderate      Strong
Age at surgery
  mean (SD)           59.5 (12.8)   55.5 (12.7)   57.3 (13.3)   60.4 (12.5)   63.7 (11.7)
  median              59.0          54.1          56.3          59.8          63.6
  IQR                 50.2 to 69.1  47.1 to 64.4  48.8 to 67.2  51.2 to 70.6  54.9 to 72.5
  range               19.2 to 90.9  23.8 to 88.0  19.2 to 89.0  19.2 to 90.1  34.5 to 90.9
  missing             110           41            24            36            9
FIGO Stage
  I                   233           96 (41.2%)    41 (17.6%)    79 (33.9%)    17 (7.3%)
  II                  235           70 (29.8%)    42 (17.9%)    83 (35.3%)    40 (17.0%)
  III                 448           53 (11.8%)    73 (16.3%)    204 (45.5%)   118 (26.3%)
  IV                  58            4 (6.9%)      6 (10.3%)     27 (46.6%)    21 (36.2%)
  missing             81            34 (42.0%)    19 (23.5%)    21 (25.9%)    7 (8.6%)
Histology
  high-grade serous   585           53 (9.1%)     89 (15.2%)    269 (46.0%)   174 (29.7%)
  low-grade serous    26            17 (65.4%)    8 (30.8%)     1 (3.8%)      0 (0.0%)
  endometrioid        148           65 (43.9%)    41 (27.7%)    37 (25.0%)    5 (3.4%)
  clear cell          132           54 (40.9%)    16 (12.1%)    53 (40.2%)    9 (6.8%)
  mucinous            32            17 (53.1%)    4 (12.5%)     11 (34.4%)    0 (0.0%)
  others              95            45 (47.4%)    14 (14.7%)    25 (26.3%)    11 (11.6%)

Figure 10. Distribution of L1ORF1p expression across the different EOC histotypes. The numbers of cases within each score are shown on the graph, as well as in the table. The corresponding staining intensity for each score category is shown below the graph. L1ORF1p expression is significantly higher in HGSC compared to the other histotypes (p < 0.0001).

[Figure 10 chart: Distribution of L1ORF1p Staining Intensity Across EOC Histotypes; y-axis: % intensity. Reference staining panels: strong, intermediate, weak, negative, normal FT.]

              HGSC  LGSC  ENOC  CCOC  MC
strong        174   0     5     9     0
intermediate  269   1     37    53    11
weak          89    8     41    16    4
negative      53    17    65    54    17

Table 10. Association of L1ORF1p expression (negative vs. any positive) with histotypes. HGSC correlated with more L1 positivity. Chi-square test, alpha = 0.05.
Variable   Levels             Total  Negative     Any Positive  P value
Histotype  high-grade serous  585    53 (9.1%)    532 (90.9%)   <0.0001
           other              338    153 (45.3%)  185 (54.7%)

4.3.2 Evaluating the association of L1ORF1p expression with survival in EOC

To see if LINE-1 expression can be used as a prognostic marker in EOC, we assessed the overall survival (OS), disease-specific survival (DSS), and progression-free survival (PFS) of our cohort in relation to L1ORF1p expression. The optimal cut-off for a significant correlation between L1ORF1p staining and survival was determined to be “negative” versus “any positive staining” for the whole cohort, and was then used for subtype-specific analysis. In the whole-cohort survival analysis, L1ORF1p expression significantly correlated with worse OS (p < 0.001, HR 1.75), DSS (p < 0.001, HR 2.06), and PFS (p < 0.001, HR 1.94) (Figure 11a). However, L1ORF1p expression did not correlate with survival when each histotype was analyzed separately, except in ENOC, where DSS was significantly worse at the 5-year cut-off (p = 0.0478, HR 2.976) (Figure 11b).

Figure 11. Kaplan-Meier analysis of L1ORF1p (L1RE1) expression. (a) Within the whole cohort (n = 923), positive L1ORF1p expression correlated with poorer overall survival (OS), disease-specific survival (DSS), and progression-free survival (PFS). (b) In subtype-specific analysis, the only significant correlation observed was poorer DSS associated with L1ORF1p (L1RE1) positivity in ENOC at the 5-year cut-off (n = 148, p = 0.0478, HR 2.976).

4.3.3 Evaluating the association between L1ORF1p expression and p53 expression

A previous pan-cancer survey of L1ORF1p expression showed that L1ORF1p is associated with p53 expression [72].
To assess whether the same trend occurs in our cohort, we compared L1ORF1p expression with p53 in cases with p53 IHC available (scored in 2016 by pathologist F.K., n = 268). Expression of L1ORF1p was significantly associated with mutated p53 (p < 0.0001) across the whole cohort (Table 11); however, it was not significantly associated with mutated p53 within individual histotypes (Table 12). Given that L1ORF1p expression correlates with mutant p53 expression and that L1 insertions have been seen in precursor lesions in other cancers, we hypothesized that L1ORF1p can be detected in STIC, the HGSC precursor lesion, which also expresses mutant p53. We performed IHC staining of four STIC lesions in the fallopian tube (Figure 12). L1ORF1p expression was seen in neoplastic areas and corresponded to areas of p53 overexpression, as well as to STIC lesions with p53 null expression. p53 signature lesions, which are accumulations of mutant p53 protein in non-neoplastic cells, can be found in the normal fallopian tube. They precede STIC lesions, so we assessed whether L1ORF1p was also expressed in these pre-neoplastic tissues. No L1ORF1p expression was found in tissues with p53 signature lesions (Figure 13).

Table 11. Association between L1ORF1p and p53 IHC status across the whole cohort. Complete absence, overexpression and cytoplasmic phenotype all indicate mutated p53.

Variable  Levels            Total  Negative    Any Positive  P value
p53       Complete absence  63     1 (1.59%)   62 (98.4%)    <0.0001
          Overexpression    136    8 (5.88%)   128 (94.1%)
          Cytoplasmic       6      0 (0.00%)   6 (100.0%)
          Wildtype          76     22 (28.9%)  54 (71.1%)

Table 12. No association was found between L1ORF1p and p53 IHC status in HGSC.

Variable  Levels    Total  Negative   Any Positive  P value
p53       Mutated   183    7 (3.83%)  176 (96.2%)   1.000
          Wildtype  18     1 (5.56%)  17 (94.4%)

Figure 12. Comparison of p53 and L1ORF1p (L1RE1) expression in four cases with STIC lesions. Distinct STIC morphology is observed in the H&E slides.
L1ORF1p was expressed in cells with p53 overexpression as well as null expression. This suggests that L1ORF1p could serve as a surrogate marker for identifying p53 null expression.

Figure 13. IHC, p53, and L1ORF1p (L1RE1) stains for four normal fallopian tubes. Two FTs contain p53 signature lesions. Very faint L1ORF1p staining is observed in the epithelial cells throughout the section, with no preference for areas of p53 lesions, which likely indicates negative L1ORF1p activity.

4.3.4 Evaluating L1ORF1p expression in endometrial cancers and its association with MMR and p53 mutation status

We evaluated the expression of L1ORF1p across two cohorts of endometrial cancer, a local Vancouver (Van) cohort (n = 153) and a German (Ger) cohort (n = 454). Four histotypes of endometrial cancer were included: endometrioid endometrial (ENEC), serous endometrial (SEC), clear cell endometrial (CCEC), and mixed type endometrial cancers. The distribution of histotypes in the Van cohort was as follows: ENEC 80.4% (123/153), SEC 10.5% (16/153), mixed type 3.9% (6/153) and others 5.2% (8/153). In the Ger cohort, the distribution was: ENEC 87.9% (399/454), SEC 7.5% (34/454), mixed type 3.3% (15/454), and CCEC 1.3% (6/454). A high proportion of negative and weak L1ORF1p positivity was observed in the endometrioid histotype within both cohorts, and a high proportion of intermediate and strong L1ORF1p positivity was observed in the serous histotype (Figure 14). The optimal cut-off for separating staining intensity was determined to be negative/weak staining versus intermediate/strong staining, based on the correlation of L1ORF1p with disease-specific survival, and was used for all association analyses. In the whole-cohort analysis for the Vancouver cohort, negative/weak staining was associated with improved PFS (p = 0.00114, HR = 2.614), but no significant association was found for OS or DSS, nor was any association found in histotype-specific analysis.
In the German cohort, negative/weak staining was associated with improved OS (p = 0.00845, HR = 1.741), DSS (p = 0.0185, HR = 1.940), and PFS (p = 0.0125, HR = 1.987) for the whole cohort, but again no association was found within histotypes. We next evaluated the expression of L1ORF1p across the molecular subgroups of the ProMisE classifier [127]. The MMR-D and p53 wt groups had a significantly higher proportion of negative/weak staining than the p53 abn group (p < 0.0001) (Figure 15). Within the p53 wt group, negative/weak L1ORF1p expression was significantly correlated with improved PFS (p < 0.001, HR = 7.404) in the Vancouver cohort; however, in the German cohort, which has a larger sample size, the correlation was weaker and did not reach significance (p = 0.0672, HR = 2.572) (Figure 16). We did not find any significant correlation with survival in the MMR-D and p53 abn groups.

Figure 14. Distribution of L1ORF1p expression across the two endometrial cohorts. The distribution in ENEC and SEC was similar in both cohorts: ENEC tends to have a high proportion of negative/weak expression, and SEC a high proportion of intermediate/strong expression. Mixed type and others have relatively low numbers and are hard to interpret in this case.
[Figure 14 charts: Distribution of L1ORF1p staining intensity in EC (Van) and in EC (Ger); y-axis: % intensity. Case counts shown on the charts:]

EC (Van)      ENEC  SEC  mixed type  others
strong        3     5    0           0
intermediate  17    5    4           1
weak          53    4    0           3
negative      47    0    2           3

EC (Ger)      ENEC  SEC  CCEC  mixed type
strong        11    13   2     3
intermediate  74    11   2     6
weak          171   5    2     5
negative      128   5    0     1

[Figure 15 charts: percentages per subgroup, listed in legend order:]

               negative  weak  moderate  strong
MMR-D (Van)    48%       32%   20%       0%
MMR-D (Ger)    39%       42%   16%       3%
p53 wt (Van)   33%       52%   14%       1%
p53 wt (Ger)   33%       47%   18%       2%
p53 abn (Van)  20%       24%   32%       24%
p53 abn (Ger)  7%        16%   42%       35%

Figure 15. Distribution of L1ORF1p expression within three molecular subgroups of the ProMisE classifier. Negative/weak L1ORF1p expression correlated with MMR-D and p53 wt within both cohorts.

Figure 16. L1ORF1p expression in p53 wildtype cases. A strong correlation was found in the Vancouver cohort (11-010) between negative/weak L1ORF1p (L1RE1) staining and improved PFS (p < 0.001, HR = 7.404). However, the correlation was only weak in the German cohort (16-005) (p = 0.0672, HR = 2.572).

4.4 Discussion

In this study we demonstrated that L1ORF1p expression is associated with higher grade and more advanced stage in ovarian cancer, and that it is expressed most frequently in HGSC. Previous studies have shown similar trends, where L1ORF1p expression was higher in more aggressive cancers (summarized in [89]). This is expected, as L1 activation correlates with a general increase in genomic instability, which occurs more frequently at high grades and stages. We observed that L1ORF1p expression correlated with worse OS, PFS, and DSS in the whole cohort; however, these correlations were no longer significant within individual EOC histotypes.
Given the high proportion of L1ORF1p positivity in HGSC (90.9%) and the high proportion (60.4%) of HGSC in the cohort, the observed association with worse prognosis in the whole cohort likely reflects the worse survival of HGSC compared to the other histotypes rather than an effect of L1ORF1p expression itself. While there was a weak correlation between negative L1ORF1p expression and DSS in ENOC at the 5-year cut-off, it was no longer significant past 5 years. Thus, L1ORF1p expression is not a prognostic marker in EOC. Similar to a previous survey of L1ORF1p expression across different epithelial cancers [72], we found that L1ORF1p expression is associated with mutant p53 across our EOC cohort. This association was not found in subtype-specific analysis, which is likely due to the low number of cases without p53 mutation in HGSC and the low number of cases with p53 mutations in the other histotypes. Studies have shown that p53 functions to restrain retrotransposition and that mutant p53 loses this ability, which likely explains why L1ORF1p expression correlated with loss of p53 [129]. We performed IHC on STIC lesions to confirm the activation of L1 elements in the precursor lesions of HGSC. As expected, we observed L1ORF1p expression in all four STIC lesions, all of which contained mutant p53, including one case with p53 null expression. This indicates that L1ORF1p may be active in neoplastic precursor lesions of HGSC, and that its activation is an early event in HGSC tumorigenesis, similar to ENOC and CCOC as demonstrated in Chapter 2. We further looked at the expression of L1ORF1p in two normal fallopian tubes with p53 signature lesions, and we did not find L1ORF1p expression corresponding to the p53 lesions. This could indicate that L1 elements are not activated in the pre-neoplastic lesions of HGSC. Given p53’s function in restricting L1 activation [129], L1 elements are likely activated after p53 mutations occur.
Because p53 signature lesions in non-neoplastic tissues did not express L1, additional genomic alterations during neoplastic transformation may also be needed for L1 activation. However, we acknowledge the limited number of precursor lesions assessed in our study, and the fact that we studied pre-neoplastic and neoplastic lesions in separate cases. Additional samples, as well as tissues with concurrent non-neoplastic p53 signature lesions, STIC lesions and HGSC, would clarify and strengthen our conclusion. Nonetheless, we observed that L1ORF1p is expressed as early as the neoplastic transformation in HGSC but is not present in pre-neoplastic lesions. Such expression could be a potential surrogate marker for p53 null cases. We further surveyed the expression of L1ORF1p in endometrial cancers and assessed its expression within the subgroups of the ProMisE classifier [127]. We saw that endometrioid endometrial carcinomas had L1ORF1p expression patterns similar to those of endometrioid ovarian carcinomas, as expected, since both cancers are similar in histology and genomic features. The serous histotype of endometrial carcinoma displayed stronger L1ORF1p expression, similar to high-grade serous ovarian carcinoma. As expected, L1ORF1p had weaker expression in the p53 wt subgroup and stronger expression in the p53 abn subgroup. Interestingly, moderate and strong L1ORF1p expression within the p53 wt subgroup in the Vancouver cohort conferred significantly worse survival, while only a trend towards worse survival was seen in the German cohort. This could mean that L1ORF1p expression identifies a small group within the p53 wt category, or that it identifies cases with underlying p53 mutations despite a p53 wt IHC stain. Either way, L1ORF1p has the potential to be an additional marker for classification when constructing future tools.
Studies in colorectal cancers have demonstrated that greater L1 hypomethylation correlated with microsatellite-stable status [96], and we observed a similar trend in our endometrial cancer cohort, where MMR-D cases (MMR-D indicates microsatellite instability) had weaker L1ORF1p expression. In conclusion, we have shown that L1ORF1p expression tends to be much higher in gynecological cancers with unstable genomes than in those with quiescent genomes, and that it has potential utility as an IHC marker in both research and clinical settings.

Chapter 5: Concluding Chapter: Summary and Future Direction

5.1 Summary

The aim of this thesis was to investigate the presence of transductions of unique DNA pieces by active LINE-1 retrotransposable elements within cohorts of epithelial ovarian cancers (EOC). Given the high frequency of events originating from one L1 locus within TTC28 intron 1 on chromosome 22q12.1, which were identified via whole genome sequencing in our endometrioid and clear cell ovarian cohorts, we focused on these events in ENOC and CCOC in Chapter 2. By comparing PCR-validated TTC28-L1 events to SNVs in multiple tumor samplings within one case, we concluded that this L1 at TTC28 chr22q12.1 was activated early in ovarian tumorigenesis. Evidence for the early involvement of L1 in tumorigenesis has been found across various epithelial cancers, and our study confirms ovarian cancer as one of them. To be able to detect this frequent rearrangement event in future cases without performing WGS, we took advantage of a probe-based enrichment assay in Chapter 3. While we cannot predict the transduction target sites a priori, we know the unique sequence that is transduced by the active L1. Using this stretch of sequence (1 kb downstream of L1) as a molecular tag, we can isolate and enrich all fragments containing this sequence, including fragments of the unknown target.
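The tag-based logic described above can be sketched in silico as follows. This is only a toy Python illustration of the idea, not the hybridization capture protocol or the analysis pipeline used in this thesis: the function names, the 10-bp tag, and the example reads are all hypothetical, and a real workflow would align reads to a reference genome rather than rely on exact string matching.

```python
def reverse_complement(seq):
    # Reverse complement of a DNA string (A, C, G, T, N only)
    comp = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}
    return ''.join(comp[base] for base in reversed(seq))

def capture_flanks(reads, tag, min_flank=5):
    # Keep reads that contain the known tag on either strand and
    # return (read, upstream, downstream), where the flanks are the
    # sequences of unknown origin on each side of the tag.
    hits = []
    for read in reads:
        for oriented in (read, reverse_complement(read)):
            pos = oriented.find(tag)
            if pos == -1:
                continue  # tag not found on this strand
            upstream = oriented[:pos]
            downstream = oriented[pos + len(tag):]
            if len(upstream) >= min_flank or len(downstream) >= min_flank:
                hits.append((read, upstream, downstream))
            break  # tag found; do not check the other strand
    return hits

# Hypothetical tag standing in for the unique sequence downstream of the L1
TAG = 'GATTACAGGT'
reads = [
    'CCCCC' + TAG + 'TTTTAAAA',                    # tag on the forward strand
    reverse_complement('AAAAA' + TAG + 'CCCGGG'),  # tag on the reverse strand
    'ACGTACGTACGTACGT',                            # no tag: not captured
]
hits = capture_flanks(reads, TAG)
for read, upstream, downstream in hits:
    print(upstream, downstream)
```

Only the two tag-containing reads are retained, and their flanks are the candidate sequences that would then be mapped to locate an insertion site.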
In this chapter we successfully reproduced WGS results using the capture assay, and we also detected transduction events in FFPE tissues without WGS, making it a feasible tool for evaluating 3’ transductions in cancers. At the same time, we acknowledge the limitations of our assay in detecting L1 insertions without 3’ transductions. We were also not able to resolve the entire L1 insertion, given that we are not probing the L1 sequence itself. This may lead to missing solo L1 insertions that could be potentially tumorigenic.

In Chapter 4, we used IHC to survey the expression of the L1 protein L1ORF1p in a large extended cohort of epithelial ovarian cancers including HGSC, LGSC, ENOC, CCOC, and MC. While L1ORF1p was not prognostic in our cohort, we confirmed the correlation between L1ORF1p expression and p53 in EOC, as demonstrated in previous studies [73]. Interestingly, L1ORF1p was almost uniformly expressed in p53 null (nonsense mutation) HGSC cases. Extending into the HGSC precursor lesion, STIC, we also found L1ORF1p to be highly expressed in p53 mutant cases, including p53 null. As expected, L1ORF1p expression patterns in endometrioid endometrial carcinomas were similar to those in endometrioid ovarian carcinomas. In a recent study, Köbel et al. showed that while p53 IHC status is clinically useful and highly specific in predicting missense mutations, a low percentage of HGSC (4%) had IHC expression patterns indistinguishable from wildtype p53, despite having underlying loss-of-function mutations (nonsense, indels, and splicing) [130]. While these HGSC cases could be misdiagnosed LGSC cases, a distinction could not be made from p53 IHC results alone.
In this thesis we have shown that L1ORF1p is highly expressed in HGSC and p53 mutated cases but not in LGSC; we therefore propose that L1ORF1p expression could be a potential surrogate, or complementary, IHC marker to p53 expression to aid the differential diagnosis of HGSC in these difficult cases. Because of the rarity of these cases, a study to confirm the utility of L1ORF1p and p53 IHC in the differential diagnosis of serous ovarian carcinomas would require multi-center collaboration. Interestingly, although L1ORF1p was more frequently expressed in HGSC than in ENOC or CCOC, our WGS analysis did not find TTC28-L1 transductions to be more frequent in HGSC. This could be because this specific locus is differentially regulated in HGSC. Future transduction capture experiments could be performed on HGSC samples to confirm the presence or absence of TTC28-L1 transductions. Functional studies involving cell culture and retrotransposition-reporter constructs (e.g. [131]) could be designed to observe the difference in retrotransposon activity between cell lines derived from these histotypes.

5.2 Future Directions

L1-mediated 3’ transduction is a feature that is readily detectable in at least one-third of ENOC and CCOC cases using WGS and our novel transduction capture assay. Future studies that could help resolve the dynamics of 3’ transduction during clonal evolution of tumors would involve using the transduction capture assay at the single-cell level. To evaluate the clinical utility of tracking 3’ transductions in ctDNA, a much more rigorous experiment would be needed to evaluate how well the captured 3’ transductions reflect tumor burden. While L1ORF1p was readily detectable in our cohort, studies have shown that L1ORF1p expression does not correlate with insertions [82].
This may be because L1ORF2p is suppressed in HGSC, such that L1ORF1p does not reflect the true level of L1 activity [100]. Detection of L1ORF2p may therefore reflect retrotransposition more directly. However, because our assay was not designed to survey all L1 insertions, we would not be able to fully resolve the insertional landscape of L1 within our cohort. Nonetheless, if L1ORF2p does reflect retrotransposition more closely, an antibody against it, alone or in combination with L1ORF1p, may be a better prognostic marker for our cohort and could be explored as a future direction. In addition, while we concluded that L1 is active early in ENOC and CCOC and investigated L1ORF1p expression in the HGSC precursor lesions, we did not evaluate the presence of L1 transductions in the precursor lesion (endometriosis) due to resource availability. As such, a possible future study could be staining for L1ORF1p in endometriosis tissues should they become available. A recent study in gastrointestinal cancer associated lower L1 activation with higher immune infiltration in colorectal cancers, proposing that immune infiltration acts to suppress L1 activation [132]. An interesting avenue of investigation is whether combining L1ORF1p IHC with the transduction capture assay could predict the level of immune infiltration. As MMR deficiency has been found to be positively correlated with immune infiltration in ovarian cancer [41], negatively correlated with L1 promoter hypomethylation in colorectal cancers [97, 98] and negatively correlated with L1ORF1p expression in our cohort (to be confirmed), we expect to see an inverse relationship. Perhaps the most effective future studies would combine the methods employed in this thesis.
This would involve using L1ORF1p IHC as an initial screen to identify cases with potentially active L1s, followed by the transduction capture assay to identify transduction events in the available tumor tissues. For cases with high levels of transduction events, the capture assay could then be used to assess the evolutionary dynamics of these events in available tissues (frozen, FFPE, or ctDNA) at different time points, retrospectively and/or prospectively. A probe-based assay also allows flexibility in the probe set added to the libraries: we can expand the number of L1 loci targeted, and potentially add probes for hotspot SNVs to resolve L1 status and mutation profile simultaneously in one capture. Ultimately, we hope to continue our research into these genetic “parasites”, and to develop techniques that use them to our advantage in the fight against cancer.

Bibliography

[1] Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA: a cancer journal for clinicians. 2016;66:7-30. [2] Reid BM, Permuth JB, Sellers TA. Epidemiology of ovarian cancer: a review. Cancer biology & medicine. 2017;14:9-32. [3] Köbel M, Kalloger SE, Huntsman DG, Santos JL, Swenerton KD, Seidman JD, et al. Differences in Tumor Type in Low-stage Versus High-stage Ovarian Carcinomas. International Journal of Gynecologic Pathology. 2010;29:203. [4] Cortez AJ, Tudrej P, Kujawa KA, Lisowska KM. Advances in ovarian cancer therapy. Cancer Chemotherapy and Pharmacology. 2018;81:17-38. [5] Gilks CB, Prat J. Ovarian carcinoma pathology and genetics: recent advances. Human pathology. 2009;40:1213-23. [6] Kurman RJ, Shih I-M. The Dualistic Model of Ovarian Carcinogenesis Revisited, Revised, and Expanded. The American Journal of Pathology. 2016;186:733-47. [7] Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011;474:609-15.
[8] Petrillo M, Marchetti C, De Leo R, Musella A, Capoluongo E, Paris I, et al. BRCA mutational status, initial disease presentation, and clinical outcome in high-grade serous advanced ovarian cancer: a multicenter study. American journal of obstetrics and gynecology. 2017;217:3340. [9] Ahmed A, Etemadmoghadam D, Temple J, Lynch AG, Riad M, Sharma R, et al. Driver mutations in TP53 are ubiquitous in high grade serous carcinoma of the ovary. The Journal of pathology. 2010;221:49-56. [10] Cole AJ, Dwight T, Gill AJ, Dickson K-A, Zhu Y, Clarkson A, et al. Assessing mutant p53 in primary high-grade serous ovarian cancer using immunohistochemistry and massively parallel sequencing. Scientific Reports. 2016;6. [11] Brosh R, Rotter V. When mutants gain new powers: news from the mutant p53 field. Nature reviews Cancer. 2009;9:701-13. [12] Gurung A, Hung T, Morin J, Gilks CB. Molecular abnormalities in ovarian carcinoma: clinical, morphological and therapeutic correlates. Histopathology. 2013;62:59-70. [13] Wang YK, Bashashati A, Anglesio MS, Cochrane DR, Grewal DS, Ha G, et al. Genomic consequences of aberrant DNA repair mechanisms stratify ovarian cancer histotypes. Nature genetics. 2017;49:856-65. 97 [14] Piek JM, van Diest PJ, Zweemer RP, Jansen JW, Poort-Keesom RJ, Menko FH, et al. Dysplastic changes in prophylactically removed Fallopian tubes of women predisposed to developing ovarian cancer. The Journal of pathology. 2001;195:451-6. [15] Colgan TJ, Murphy J, Cole DE, Narod S, Rosen B. Occult carcinoma in prophylactic oophorectomy specimens: prevalence and association with BRCA germline mutation status. The American journal of surgical pathology. 2001;25:1283-9. [16] Labidi-Galy SI, Papp E, Hallberg D, Niknafs N, Adleff V, Noe M, et al. High grade serous ovarian carcinomas originate in the fallopian tube. Nature communications. 2017;8:1093. [17] Perets R, Wyant GA, Muto KW, Bijron JG, Poole BB, Chin KT, et al. 
Transformation of the fallopian tube secretory epithelium leads to high-grade serous ovarian cancer in Brca;Tp53;Pten models. Cancer cell. 2013;24:751-65. [18] Yamagami W, Nagase S, Takahashi F, Ino K, Hachisuga T, Aoki D, et al. Clinical statistics of gynecologic cancers in Japan. Journal of gynecologic oncology. 2017;28. [19] Shu CA, Zhou Q, Jotwani AR, Iasonos A, Leitao MM, Konner JA, et al. Ovarian clear cell carcinoma, outcomes by stage: The MSK experience. Gynecologic Oncology. 2015;139:236-41. [20] Chan JK, Teoh D, Hu JM, Shin JY, Osann K, Kapp DS. Do clear cell ovarian carcinomas have poorer prognosis compared to other epithelial cell types? A study of 1411 clear cell ovarian cancers. Gynecologic oncology. 2008;109:370-6. [21] Pectasides D, Fountzilas G, Aravantinos G, Kalofonos C, Efstathiou H, Farmakis D, et al. Advanced stage clear-cell epithelial ovarian cancer: the Hellenic Cooperative Oncology Group experience. Gynecologic oncology. 2006;102:285-91. [22] Gilks BC, Ionescu DN, Kalloger SE, K\u00C3\u00B6bel M, Irving J, Clarke B, et al. Tumor cell type can be reproducibly diagnosed and is of independent prognostic significance in patients with maximally debulked ovarian carcinoma. Human Pathology. 2008;39:1239-51. [23] Jones S, Wang T-LL, Shih I-Me, Mao T-LL, Nakayama K, Roden R, et al. Frequent mutations of chromatin remodeling gene ARID1A in ovarian clear cell carcinoma. Science (New York, NY). 2010;330:228-31. [24] Kuo K-TT, Mao T-LL, Jones S, Veras E, Ayhan A, Wang T-LL, et al. Frequent activating mutations of PIK3CA in ovarian clear cell carcinoma. The American journal of pathology. 2009;174:1597-601. [25] Gounaris I, Charnock-Jones DS, Brenton JD. Ovarian clear cell carcinoma--bad endometriosis or bad endometrium? The Journal of pathology. 2011;225:157-60. [26] Wiegand KC, Shah SP, Al-Agha OM, Zhao Y, Tse K, Zeng T, et al. ARID1A mutations in endometriosis-associated ovarian carcinomas. The New England journal of medicine. 2010;363:1532-43. 
98 [27] Bitler BG, Wu S, Park PH, Hai Y, Aird KM, Wang Y, et al. ARID1A-mutated ovarian cancers depend on HDAC6 activity. Nature cell biology. 2017;19:962-73. [28] Chandler RL, Damrauer JS, Raab JR, Schisler JC, Wilkerson MD, Didion JP, et al. Coexistent ARID1A-PIK3CA mutations promote ovarian clear-cell tumorigenesis through pro-tumorigenic inflammatory cytokine signalling. Nature communications. 2015;6:6118. [29] Bast RC, Jr., Hennessy B, Mills GB. The biology of ovarian cancer: new opportunities for translation. Nature reviews Cancer. 2009;9:415-28. [30] Rojas V, Hirshfield K, Ganesan S, Rodriguez-Rodriguez L. Molecular Characterization of Epithelial Ovarian Cancer: Implications for Diagnosis and Treatment. International Journal of Molecular Sciences. 2016;17:2113. [31] Bouchard-Fortier G, Panzarella T, Rosen B, Chapman W, Gien LT. Endometrioid Carcinoma of the Ovary: Outcomes Compared to Serous Carcinoma After 10 Years of Follow-Up. Journal of Obstetrics and Gynaecology Canada. 2017;39:34-41. [32] Mayr D, Hirschmann A, L\u00C3\u00B6hrs U, Diebold J. KRAS and BRAF mutations in ovarian tumors: a comprehensive study of invasive carcinomas, borderline tumors and extraovarian implants. Gynecologic oncology. 2006;103:883-7. [33] Stewart C, Leung Y, Walsh MD, Walters RJ, Young JP, Buchanan DD. KRAS mutations in ovarian low-grade endometrioid adenocarcinoma: association with concurrent endometriosis. Human Pathology. 2012;43:1177-83. [34] McConechy MK, Ding J, Senz J, Yang W, Melnyk N, Tone AA, et al. Ovarian and endometrial endometrioid carcinomas have distinct CTNNB1 and PTEN mutation profiles. Modern pathology : an official journal of the United States and Canadian Academy of Pathology, Inc. 2014;27:128-34. [35] Cho KR, Shih I-M. Ovarian Cancer. Annual Review of Pathology: Mechanisms of Disease. 2009;4:287-313. [36] Prat J. Ovarian carcinomas: five distinct diseases with different origins, genetic alterations, and clinicopathological features. Virchows Archiv. 
2012;460:237-49. [37] Dinulescu DM, Ince TA, Quade BJ, Shafer SA, Crowley D, Jacks T. Role of K-ras and Pten in the development of mouse models of endometriosis and endometrioid ovarian cancer. Nature medicine. 2005;11:63-70. [38] Wu R, Hendrix-Lucas N, Kuick R, Zhai Y, Schwartz DR, Akyol A, et al. Mouse model of human ovarian endometrioid adenocarcinoma based on somatic defects in the Wnt/beta-catenin and PI3K/Pten signaling pathways. Cancer cell. 2007;11:321-33. 99 [39] Anglesio MS, Papadopoulos N, Ayhan A, Nazeran TM, No\u00C3\u00AB M, Horlings HM, et al. Cancer-Associated Mutations in Endometriosis without Cancer. The New England journal of medicine. 2017;376:1835-48. [40] Boland CR, Thibodeau SN, Hamilton SR, Sidransky D, Eshleman JR, Burt RW, et al. A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: development of international criteria for the determination of microsatellite instability in colorectal cancer. Cancer research. 1998;58:5248-57. [41] Xiao X, Dong D, He W, Song L, Wang Q, Yue J, et al. Mismatch repair deficiency is associated with MSI phenotype, increased tumor-infiltrating lymphocytes and PD-L1 expression in immune cells in ovarian cancer. Gynecologic oncology. 2018. [42] Segev Y, Pal T, Rosen B, McLaughlin JR, Sellers TA, Risch HA, et al. Risk factors for ovarian cancers with and without microsatellite instability. International journal of gynecological cancer : official journal of the International Gynecological Cancer Society. 2014;24:664-9. [43] Segev Y, Zhang S, Akbari MR, Sun P, Sellers TA, McLaughlin J, et al. Survival in women with ovarian cancer with and without microsatellite instability. European journal of gynaecological oncology. 2015;36:681-4. [44] Giudice LC, Kao LC. Endometriosis. Lancet (London, England). 2004;364:1789-99. [45] Pearce CL, Templeman C, Rossing MA, Lee A, Near AM, Webb PM, et al. 
Association between endometriosis and risk of histological subtypes of ovarian cancer: a pooled analysis of case-control studies. The Lancet Oncology. 2012;13:385-94.
[46] Matias-Guiu X, Stewart CJRJR. Endometriosis-associated ovarian neoplasia. Pathology. 2017.
[47] Sato N, Tsunoda H, Nishida M, Morishita Y, Takimoto Y, Kubo T, et al. Loss of heterozygosity on 10q23.3 and mutation of the tumor suppressor gene PTEN in benign endometrial cyst of the ovary: possible sequence progression from benign endometrial cyst to endometrioid carcinoma and clear cell carcinoma of the ovary. Cancer research. 2000;60:7052-6.
[48] Yamamoto S, Tsuda H, Takano M, Iwaya K, Tamai S, Matsubara O. PIK3CA mutation is an early event in the development of endometriosis-associated ovarian clear cell adenocarcinoma. The Journal of pathology. 2011;225:189-94.
[49] Anglesio MS, Bashashati A, Wang YK, Senz J, Ha G, Yang W, et al. Multifocal endometriotic lesions associated with cancer are clonal and carry a high mutation burden. The Journal of pathology. 2015;236:201-9.
[50] Lee J-Y, Kim H, Suh D, Kim M-K, Chung H, Song Y-S. Ovarian Cancer Biomarker Discovery Based on Genomic Approaches. Journal of Cancer Prevention. 2013;18:298-312.
[51] Guo J, Yu J, Song X, Mi H. Serum CA125, CA199 and CEA combined detection for epithelial ovarian cancer diagnosis: A meta-analysis. Open Medicine. 2017;12:131-7.
[52] Montagnana M, Benati M, Danese E. Circulating biomarkers in epithelial ovarian cancer diagnosis: from present to future perspective. Annals of Translational Medicine. 2017;5:276-.
[53] Schummer M, Ng WV, Bumgarner RE, Nelson PS, Schummer B, Bednarski DW, et al. Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas. Gene. 1999;238:375-85.
[54] Cheng X, Zhang L, Chen Y, Qing C. Circulating cell-free DNA and circulating tumor cells, the “liquid biopsies” in ovarian cancer.
Journal of Ovarian Research. 2017;10:75.
[55] Diaz LA, Bardelli A. Liquid biopsies: genotyping circulating tumor DNA. Journal of clinical oncology : official journal of the American Society of Clinical Oncology. 2014;32:579-86.
[56] Stewart CM, Kothari PD, Mouliere F, Mair R, Somnay S, Benayed R, et al. The Value of Cell-free DNA for Molecular Pathology. The Journal of pathology. 2018.
[57] Khakoo S, Georgiou A, Gerlinger M, Cunningham D, Starling N. Circulating tumour DNA, a promising biomarker for the management of colorectal cancer. Critical reviews in oncology/hematology. 2018;122:72-82.
[58] Harris FR, Kovtun IV, Smadbeck J, Multinu F, Jatoi A, Kosari F, et al. Quantification of Somatic Chromosomal Rearrangements in Circulating Cell-Free DNA from Ovarian Cancers. Scientific reports. 2016;6:29831.
[59] Gorbunova V, Boeke JD, Helfand SL, Sedivy JM. Sleeping dogs of the genome. Science. 2014;346:1187-8.
[60] Goodier JL. Restricting retrotransposons: a review. Mobile DNA. 2016;7.
[61] Cordaux R, Batzer MA. The impact of retrotransposons on human genome evolution. Nature Reviews Genetics. 2009;10.
[62] Tubio JM, Li Y, Ju YS, Martincorena I, Cooke SL, Tojo M, et al. Mobile DNA in cancer. Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes. Science. 2014;345:1251343.
[63] Pitkänen E, Cajuso T, Katainen R, Kaasinen E, Välimäki N, Palin K, et al. Frequent L1 retrotranspositions originating from TTC28 in colorectal cancer. Oncotarget. 2014;5:853-9.
[64] Xiao-Jie L, Hui-Ying X, Qi X, Jiang X, Shi-Jie M. LINE-1 in cancer: multifaceted functions and potential clinical implications. Genetics in medicine : official journal of the American College of Medical Genetics. 2016;18:431-9.
[65] Burns KH. Transposable elements in cancer. Nature reviews Cancer. 2017;17:415-24.
[66] Erwin JA, Marchetto MC, Gage FH.
Mobile DNA elements in the generation of diversity and complexity in the brain. Nature Reviews Neuroscience. 2014;15.
[67] Upton KR, Gerhardt DJ, Jesuadian SJ, Richardson SR, Sánchez-Luque FJ, Bodea GO, et al. Ubiquitous L1 Mosaicism in Hippocampal Neurons. Cell. 2015;161:228-39.
[68] Pradhan B, Cajuso T, Katainen R, Sulo P, Tanskanen T, Kilpivaara O, et al. Detection of subclonal L1 transductions in colorectal cancer by long-distance inverse-PCR and Nanopore sequencing. Scientific reports. 2017;7:14521.
[69] Cost GJ, Feng Q, Jacquier A, Boeke JD. Human L1 element target-primed reverse transcription in vitro. The EMBO Journal. 2002;21:5899-910.
[70] Ostertag EM, Kazazian HH. Twin Priming: A Proposed Mechanism for the Creation of Inversions in L1 Retrotransposition. Genome Research. 2001;11:2059-65.
[71] Rodriguez-Martin B, Alvarez EG, Baez-Ortega A, Demeulemeester J, Ju Y, Zamora J, et al. Pan-cancer analysis of whole genomes reveals driver rearrangements promoted by LINE-1 retrotransposition in human tumours. bioRxiv. 2017:179705.
[72] Rodić N, Sharma R, Sharma R, Zampella J, Dai L, Taylor MS, et al. Long interspersed element-1 protein expression is a hallmark of many human cancers. The American journal of pathology. 2014;184:1280-6.
[73] Lee E, Iskow R, Yang L, Gokcumen O, Haseley P, Luquette LJ, et al. Landscape of Somatic Retrotransposition in Human Cancers. Science. 2012;337:967-71.
[74] Scott EC, Devine SE. The Role of Somatic L1 Retrotransposition in Human Cancers. Viruses. 2017;9.
[75] Helman E, Lawrence MS, Stewart C, Sougnez C, Getz G, Meyerson M. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing. Genome research. 2014;24:1053-63.
[76] Miki Y, Nishisho I, Horii A, Miyoshi Y, Utsunomiya J, Kinzler KW, et al. Disruption of the APC Gene by a Retrotransposal Insertion of L1 Sequence in a Colon Cancer. Cancer Research. 1992;52:643.
[77] Scott EC, Gardner EJ, Masood A, Chuang NT, Vertino PM, Devine SE. A hot L1 retrotransposon evades somatic repression and initiates human colorectal cancer. Genome Res. 2016;26:745-55.
[78] Shukla R, Upton KR, Muñoz-Lopez M, Gerhardt DJ, Fisher ME, Nguyen T, et al. Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma. Cell. 2013;153:101-11.
[79] Nguyen THMHM, Carreira PE, Sanchez-Luque FJ, Schauer SN, Fagg AC, Richardson SR, et al. L1 Retrotransposon Heterogeneity in Ovarian Tumor Cell Evolution. Cell reports. 2018;23:3730-40.
[80] Yang AS, Estécio MR, Doshi K, Kondo Y, Tajara EH, Issa J-PJP. A simple method for estimating global DNA methylation using bisulfite PCR of repetitive DNA elements. Nucleic acids research. 2004;32.
[81] Daskalos A, Nikolaidis G, Xinarianos G, Savvari P, Cassidy A, Zakopoulou R, et al. Hypomethylation of retrotransposable elements correlates with genomic instability in non-small cell lung cancer. International journal of cancer. 2009;124:81-7.
[82] Burns KH. Transposable elements in cancer. Nature Reviews Cancer. 2017;17:415-24.
[83] Wolff EM, Byun H-MM, Han HF, Sharma S, Nichols PW, Siegmund KD, et al. Hypomethylation of a LINE-1 promoter activates an alternate transcript of the MET oncogene in bladders with cancer. PLoS genetics. 2010;6.
[84] Hur K, Cejas P, Feliu J, Moreno-Rubio J, Burgos E, Boland RC, et al. Hypomethylation of long interspersed nuclear element-1 (LINE-1) leads to activation of proto-oncogenes in human colorectal cancer metastasis. Gut. 2014;63:635-46.
[85] Weber B, Kimhi S, Howard G, Eden A, Lyko F. Demethylation of a LINE-1 antisense promoter in the cMet locus impairs Met signalling through induction of illegitimate transcription. Oncogene. 2010;29:5775-84.
[86] Senthong A, Kitkumthorn N, Rattanatanyong P, Khemapech N, Triratanachart S, Mutirangura A.
Differences in LINE-1 methylation between endometriotic ovarian cyst and endometriosis-associated ovarian cancer. International journal of gynecological cancer : official journal of the International Gynecological Cancer Society. 2014;24:36-42.
[87] Ewing AD, Gracita A, Wood LD, Ma F, Xing D, Kim M-S, et al. Widespread somatic L1 retrotransposition occurs early during gastrointestinal cancer evolution. Genome Research. 2015;25:1536-45.
[88] Doucet-O'Hare TT, Rodic N, Sharma R, Darbari I, Abril G, Choi JA, et al. LINE-1 expression and retrotransposition in Barrett's esophagus and esophageal carcinoma. Proceedings of the National Academy of Sciences of the United States of America. 2015;112:E4894-E900.
[89] Ardeljan D, Taylor MS, Ting DT, Burns KH. The Human Long Interspersed Element-1 Retrotransposon: An Emerging Biomarker of Neoplasia. Clinical chemistry. 2017;63:816-22.
[90] Akers SN, Moysich K, Zhang W, Lai G, Miller A, Lele S, et al. LINE1 and Alu repetitive element DNA methylation in tumors and white blood cells from epithelial ovarian cancer patients. Gynecologic Oncology. 2014;132:462-7.
[91] Pattamadilok J, Huapai N, Rattanatanyong P, Vasurattana A, Triratanachat S, Tresukosol D, et al. LINE-1 hypomethylation level as a potential prognostic factor for epithelial ovarian cancer. International journal of gynecological cancer : official journal of the International Gynecological Cancer Society. 2008;18:711-7.
[92] Paterson AL, Weaver JM, Eldridge MD, Tavare S, Fitzgerald RC, Edwards PA, et al. Mobile element insertions are frequent in oesophageal adenocarcinomas and can mislead paired-end sequencing analysis. BMC genomics. 2015;16:473.
[93] Tang Z, Steranka JP, Ma S, Grivainis M, Rodic N, Huang CR, et al. Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer. Proceedings of the National Academy of Sciences of the United States of America. 2017;114:E733-E40.
[94] Solyom S, Ewing AD, Rahrmann EP, Doucet T, Nelson HH, Burns MB, et al. Extensive somatic L1 retrotransposition in colorectal tumors. Genome Res. 2012;22:2328-38.
[95] Helman E, Lawrence MS, Stewart C, Sougnez C, Getz G, Meyerson M. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing. Genome Res. 2014;24:1053-63.
[96] Estécio MR, Gharibyan V, Shen L, Ibrahim AE, Doshi K, He R, et al. LINE-1 hypomethylation in cancer is highly variable and inversely correlated with microsatellite instability. PloS one. 2007;2.
[97] Ogino S, Kawasaki T, Nosho K, Ohnishi M, Suemoto Y, Kirkner GJ, et al. LINE-1 hypomethylation is inversely associated with microsatellite instability and CpG island methylator phenotype in colorectal cancer. International journal of cancer. 2008;122:2767-73.
[98] Inamura K, Yamauchi M, Nishihara R, Lochhead P, Qian ZR, Kuchiba A, et al. Tumor LINE-1 methylation level and microsatellite instability in relation to colorectal cancer prognosis. Journal of the National Cancer Institute. 2014;106.
[99] Harris CR, Normart R, Yang Q, Stevenson E, Haffty BG, Ganesan S, et al. Association of nuclear localization of a long interspersed nuclear element-1 protein in breast tumors with poor prognostic outcomes. Genes & cancer. 2010;1:115-24.
[100] Goodier JL, Ostertag EM, Engleka KA, Seleme MC, Kazazian HH. A potential role for the nucleolus in L1 retrotransposition. Human molecular genetics. 2004;13:1041-8.
[101] De Luca C, Guadagni F, Sinibaldi-Vallebona P, Sentinelli S, Gallucci M, Hoffmann A, et al. Enhanced expression of LINE-1-encoded ORF2 protein in early stages of colon and prostate transformation. Oncotarget. 2016;7:4048-61.
[102] Wang YK, Bashashati A, Anglesio MS, Cochrane DR, Grewal DS, Ha G, et al. Genomic consequences of aberrant DNA repair mechanisms stratify ovarian cancer histotypes. Nature genetics. 2017;advance online publication.
[103] McPherson A, Shah SP, Sahinalp SC.
deStruct: Accurate Rearrangement Detection using Breakpoint Specific Realignment. bioRxiv. 2017.
[104] Pavone ME, Lyttle BM. Endometriosis and ovarian cancer: links, risks, and challenges faced. International journal of women's health. 2015;7:663-72.
[105] Anglesio MS, Wang YK, Maassen M, Horlings HM, Bashashati A, Senz J, et al. Synchronous Endometrial and Ovarian Carcinomas: Evidence of Clonality. Journal of the National Cancer Institute. 2016;108:djv428.
[106] Chrystoja CC, Diamandis EP. Whole Genome Sequencing as a Diagnostic Test: Challenges and Opportunities. Clinical Chemistry. 2014;60:724-33.
[107] Badge RM, Alisch RS, Moran JV. ATLAS: a system to selectively identify human-specific L1 insertions. American journal of human genetics. 2003;72:823-38.
[108] Ewing AD, Kazazian HH. High-throughput sequencing reveals extensive variation in human-specific L1 content in individual human genomes. Genome research. 2010;20:1262-70.
[109] Rodić N, Steranka JP, Makohon-Moore A, Moyer A, Shen P, Sharma R, et al. Retrotransposon insertions in the clonal evolution of pancreatic ductal adenocarcinoma. Nature medicine. 2015;21:1060-4.
[110] Sanchez-Luque FJ, Richardson SR, Faulkner GJ. Retrotransposon Capture Sequencing (RC-Seq): A Targeted, High-Throughput Approach to Resolve Somatic L1 Retrotransposition in Humans. Methods in molecular biology (Clifton, NJ). 2016;1400:47-77.
[111] Alcaide M, Yu S, Davidson J, Albuquerque M, Bushell K, Fornika D, et al. Targeted error-suppressed quantification of circulating tumor DNA using semi-degenerate barcoded adapters and biotinylated baits. Scientific reports. 2017;7:10574.
[112] Schröder J, Hsu A, Boyle SE, Macintyre G, Cmero M, Tothill RW, et al. Socrates: identification of genomic rearrangements in tumour genomes by re-aligning soft clipped reads. Bioinformatics (Oxford, England). 2014;30:1064-72.
[113] Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al.
Integrative genomics viewer. Nature Biotechnology. 2011;29:24.
[114] Krzywinski MI, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, et al. Circos: An information aesthetic for comparative genomics. Genome Research. 2009.
[115] Fridley BL, Ghosh TM, Wang A, Raghavan R, Dai J, Goode EL, et al. Genome-Wide Study of Response to Platinum, Taxane, and Combination Therapy in Ovarian Cancer: In vitro Phenotypes, Inherited Variation, and Disease Recurrence. Frontiers in genetics. 2016;7:37.
[116] Jamiruddin MR, Kaitsuka T, Hakim F, Fujimura A, Wei F-YY, Saitoh H, et al. HDAC9 regulates the alternative lengthening of telomere (ALT) pathway via the formation of ALT-associated PML bodies. Biochemical and biophysical research communications. 2016;481:25-30.
[117] Wang K, Liang Q, Li X, Tsoi H, Zhang J, Wang H, et al. MDGA2 is a novel tumour suppressor cooperating with DMAP1 in gastric cancer and is associated with disease outcome. Gut. 2016;65:1619-31.
[118] Chui MH, Have C, Hoang LN, Shaw P, Lee C-HH, Clarke BA. Genomic profiling identifies GPC5 amplification in association with sarcomatous transformation in a subset of uterine carcinosarcomas. The journal of pathology Clinical research. 2018;4:69-78.
[119] Ye M, Huang T, Li J, Zhou C, Yang P, Ni C, et al. Role of CDH13 promoter methylation in the carcinogenesis, progression, and prognosis of colorectal cancer: A systematic meta-analysis under PRISMA guidelines. Medicine. 2017;96.
[120] Magnusson K, Gremel G, Rydén L, Pontén V, Uhlén M, Dimberg A, et al. ANLN is a prognostic biomarker independent of Ki-67 and essential for cell cycle progression in primary breast cancer. BMC cancer. 2016;16:904.
[121] Fanjul-Fernández M, Quesada V, Cabanillas R, Cadiñanos J, Fontanil T, Obaya A, et al. Cell-cell adhesion genes CTNNA2 and CTNNA3 are tumour suppressors frequently mutated in laryngeal carcinomas. Nature communications. 2013;4:2531.
[122] Zingler N, Willhoeft U, Brose H-P, Schoder V, Jahns T, Hanschmann K-MO, et al. Analysis of 5′ junctions of human LINE-1 and Alu retrotransposons suggests an alternative model for 5′-end attachment requiring microhomology-mediated end-joining. Genome Research. 2005;15:780-9.
[123] Sunami E, Vu AT, Nguyen SL, Giuliano AE, Hoon DS. Quantification of LINE1 in circulating DNA as a molecular biomarker of breast cancer. Annals of the New York Academy of Sciences. 2008;1137:171-4.
[124] Nagai Y, Sunami E, Yamamoto Y, Hata K, Okada S, Murono K, et al. LINE-1 hypomethylation status of circulating cell-free DNA in plasma as a biomarker for colorectal cancer. Oncotarget. 2017;8:11906-16.
[125] Wedge E, Hansen JW, Garde C, Asmar F, Tholstrup D, Kristensen SSS, et al. Global hypomethylation is an independent prognostic factor in diffuse large B cell lymphoma. American journal of hematology. 2017;92:689-94.
[126] Doucet-O'Hare TT, Rodić N, Sharma R, Darbari I, Abril G, Choi JA, et al. LINE-1 expression and retrotransposition in Barrett's esophagus and esophageal carcinoma. Proceedings of the National Academy of Sciences of the United States of America. 2015;112:900.
[127] Talhouk A, Hoang LN, McConechy MK, Nakonechny Q, Leo J, Cheng A, et al. Molecular classification of endometrial carcinoma on diagnostic specimens is highly concordant with final hysterectomy: Earlier prognostic information to guide treatment. Gynecologic Oncology. 2016;143:46-53.
[128] McConechy MK, Talhouk A, Leung S, Chiu D, Yang W, Senz J, et al. Endometrial Carcinomas with POLE Exonuclease Domain Mutations Have a Favorable Prognosis. Clinical Cancer Research. 2016;22:2865-73.
[129] Wylie A, Jones AE, D'Brot A, Lu W-JJ, Kurtz P, Moran JV, et al. p53 genes function to restrain mobile elements. Genes & development. 2016;30:64-77.
[130] Köbel M, Piskorz AM, Lee S, Lui S, LePage C, Marass F, et al.
Optimized p53 immunohistochemistry is an accurate predictor of TP53 mutation in ovarian carcinoma. The journal of pathology Clinical research. 2016;2:247-58.
[131] Moran JV, Holmes SE, Naas TP, DeBerardinis RJ, Boeke JD, Kazazian HH. High Frequency Retrotransposition in Cultured Mammalian Cells. Cell. 1996;87:917-27.
[132] Jung H, Choi J, Lee E. Immune signatures correlate with L1 retrotransposition in gastrointestinal cancers. bioRxiv. 2017:216051.

Appendix A Supporting Materials

Table A 1. Validated TTC28-L1 transduction events in Chapter 2.
Sample | Origin breakpoint | Target breakpoint
ENOC 1 | chr22:29066036 | chr5:145279611
ENOC 1 | chr22:29065556 | chr6:97785114
ENOC 2 | chr22:29065619 | chr11:32392341
ENOC 3 | chr22:29065997 | chr18:66643446
ENOC 4 | chr22:29065742 | chr18:27803952
CCOC 1 | chr22:29065852 | chr4:163516454
CCOC 2 | chr22:29065604 | chr18:37426280
CCOC 3 | chr22:29065433 | chr18:38206983

Table A 2. SNV/Frameshift mutation coordinate and allelic frequency for each tumor block surveyed in Chapter 2.
Columns after WGS A.F.: absolute read counts/allelic frequencies (A.F.) at different FFPE sites.

Sample ID: ENOC 1
Corresponding Location: B4 & B6 - left ovary; D13 - left conu; D9 & D10 - right conu; D2 - posterior endocervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | B4 | B6 | D13 | D9 | D10 | D2 (normal)
ARID1A | Q563* | C>T | 1:27057979 | 24% | (2905/13155) 22% | (5469/12022) 45% | (1549/7082) 22% | (2524/5724) 44% | (4262/9672) 44% | (5/6164) 0
CSMD1 | G38C | G>T | 8:4495054 | 50% | (1228/10250) 12% | (4/9816) 0 | (8/8623) 0 | (2/7363) 0 | (9/7590) 0 | (2/5570) 0
CTNNA2 | V91M | G>A | 2:80874884 | 43% | (9857/10912) 91% | (3898/11899) 33% | (56/9919) 1% | (2651/5648) 47% | (5028/9785) 52% | (13/12768) 0
KRAS | G12V | G>T | 12:25398284 | 36% | (18363/40480) 45% | (12166/42854) 28% | (15075/41425) 36% | (23308/33202) 70% | (15132/36324) 42% | (544/33116) 2%
PIK3CA | R88Q | G>A | 3:178916876 | 45% | (57/13159) 0 | (7623/13760) 55% | (1841/11711) 16% | (6333/11353) 56% | (1470/11476) 13% | (253/14308) 2%
PIK3CA | P449L | C>T | 3:178928068 | 39% | (3438/8193) 42% | (1998/5683) 35% | (2741/7442) 37% | (0/107) 0 | (2077/4445) 47% | (2/4858) 0
retrotransposition event validated | | | | | 2 | 2 | 2 | 2 | 2 | 0

Sample ID: ENOC 2 synchronous case
Corresponding Location: A1 & A12 - right ovary; D2 - cervix; D9 & D17 - uterine; D1 - cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | A1 | A12 | D9 | D17 | D2 | D1 (normal)
CTNNB1 | S37F | C>T | 3:41266113 | 88% | (10344/12018) 86% | (10267/13151) 79% | (6457/13571) 48% | (4198/10023) 42% | (4670/9841) 48% | (9/15117) 0
FAM208B | G42S | G>A | 10:5762911 | 24% | (3809/12485) 30% | (2173/8717) 25% | (305/11837) 3% | (22/11896) 0 | (20/9792) 0 | (25/12487) 0
NEB | V4792M | G>A | 2:152410491 | 27% | (27/13934) 0 | (30/15638) 0 | (18/13856) 0 | (11/13167) 0 | (20/9792) 0 | (18/14432) 0
PIGK | S155N | G>A | 1:77632427 | 43% | (3448/16699) 21% | (6528/14544) 45% | (1774/13418) 13% | (8591/13356) 64% | (1511/10653) 14% | (26/17126) 0
PIK3CA | E545K | G>A | 3:178936091 | 89% | (14755/16811) 88% | (16008/16250) 99% | (11222/14536) 77% | (12104/14315) 85% | (11526/11799) 98% | (10/13491) 0
TP53 | E180* | G>T | 17:7578392 | 85% | (6271/7925) 79% | (10198/11471) 90% | (7893/12607) 63% | (8687/11259) 77% | (7788/9295) 84% | (7/10620) 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 1 | 1 | 0

Sample ID: ENOC 3
Corresponding Location: B18 & B19 - right ovarian; B20 & B21 - right ovarian; B22 - ovarian nodule; B3 - anterior cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | B18 | B19 | B20 | B21 | B22 | B3 (normal)
DHX29 | T673I | C>T | 5:54577291 | 66% | (2058/31301) 7% | (27/24252) 0 | (7452/26508) 28% | (22/22006) 0 | (52/30106) 0 | (20/38479) 0
KMT2B | C1435Y | G>A | 19:36218600 | 64% | (7/8109) 0 | (5/8206) 0 | (3605/9461) 38% | (7/12114) 0 | (5/9342) 0 | (4/12048) 0
KMT2B | T172fs | | 19:36210763-65 | | 0 | 0 | 0 | 0 | 40% | 0
NRAS | Q61K | C>A | 1:115256530 | 93% | (13863/16075) 86% | (5928/12947) 46% | (6828/8833) 77% | (9058/13939) 65% | (3821/13639) 28% | (9/18974) 0
PIK3CA | N345K | T>A | 3:178921553 | 54% | (25452/55847) 46% | (16749/33593) 50% | (16267/47411) 34% | (4978/57014) 9% | (16046/42836) 37% | (21/39092) 0
TTN | S2146P | T>C | 2:179640155 | 36% | (106/28470) 0 | (1680/18834) 9% | (9203/19948) 46% | (71/25915) 0 | (47/29284) 0 | (48/32216) 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 1 | 1 | 0

Sample ID: ENOC 4 synchronous case
Corresponding Location: A16 - left ovary; A4 & A5 - left ovary; B22 & B23 - posterior endo.; B6 - posterior cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | A4 | A5 | A16 | B22 | B23 | B6 (normal)
CYTH2 | A168V | C>T | 19:48977230 | 42% | (8096/21371) 38% | (9226/15795) 58% | (8215/20459) 40% | (11/20489) 0 | (20/20011) 0 | (19/14995) 0
DNAH9 | R4205L | G>T | 17:11840793 | 25% | (4950/14244) 35% | (5099/11610) 42% | (5424/14478) 35% | (5/14215) 0 | (5/13734) 0 | (1/10888) 0
DOCK4 | R972C | C>T | 7:111462434 | 27% | (5670/15583) 36% | (1230/11298) 11% | (7275/14390) 51% | (5008/12557) 40% | (3911/10118) 39% | (18/11875) 0
KRAS | G12D | G>A | 12:25398284 | 67% | (29394/63687) 46% | (13101/39951) 33% | (23646/52200) 46% | (22390/49973) 46% | (19738/46662) 42% | (46/41232) 0
PIK3CA | G118D | G>A | 3:178917478 | 81% | (28792/30622) 94% | (17588/20579) 85% | (15779/26143) 60% | (14128/25231) 56% | (11763/16139) 73% | (14/24376) 0
PTEN | L265fs | | 10:89717768-70 | | 98% | 98% | 99% | 97% | 98% | 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 1 | 1 | 1

Sample ID: CCOC 1
Corresponding Location: B6-B13 - right ovary; K2 - posterior cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | B6 | B7 | B8 | B12 | B13 | K2 (normal)
ARID1A | Q548fs | | 1:27057934-36 | | 0 | 0 | 0 | 0 | 34% | 0
FAM98A | E60Q | G>C | 2:33820580 | 33% | (5498/24455) 22% | (2318/31724) 7% | (5038/36362) 14% | (7714/42400) 18% | (11105/32383) 34% | (7/41218) 0
FAT3 | S2855L | C>T | 11:92534629 | 56% | (10/15186) 0 | (8/21868) 0 | (12/22579) 0 | (11/24887) 0 | (11/19376) 0 | (11/26646) 0
GNA11 | S242L | C>T | 19:3119041 | 30% | (2689/5256) 51% | (1150/6722) 17% | (361/3782) 10% | (2173/7767) 28% | (1549/6887) 23% | (2/46) 0
KRAS | G12V | G>T | 12:25398284 | 45% | (2733/32212) 8% | (2667/38939) 7% | (4336/49961) 9% | (11721/53627) 22% | (15245/39076) 41% | (32/56865) 0
PCDHB6 | E630K | G>A | 5:140531726 | 34% | (2295/5180) 44% | (1734/8095) 21% | (2189/5485) 40% | (3213/12690) 25% | (2839/9173) 31% | 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 1 | 1 |

Sample ID: CCOC 2
Corresponding Location: A1-A14 - left ovary; B2 - posterior cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | A1 | A4 | A6 | A11 | A14 | B2 (normal)
ELEFN1 | P501S | C>T | 7:1785733 | 35% | 0 | 0 | 0 | 0 | 0 | 0
HTR1E | R220W | C>T | 6:87725710 | 38% | (12493/33485) 37% | (7994/30042) 27% | (12858/29355) 44% | (475/34788) 1% | (49/48183) 0 | (39/23275) 0
LONP1 | R534C | C>T | 19:5699123 | 57% | (5285/9263) 57% | (8294/12935) 64% | (7608/12891) 59% | (18/19351) 0 | (492/27196) 2% | (16/11148) 0
PDAP1 | D82N | G>A | 7:98998017 | 27% | (8912/23305) 38% | (6100/22180) 28% | (8953/22371) 40% | (37/27474) 0 | (369/35843) 1% | (27/19598) 0
PTGS2 | R424S | C>G | 1:186644514 | 38% | (13323/32092) 42% | (9449/33357) 28% | (11055/30301) 36% | (4/37242) 0 | (12/51601) 0 | (3/29148) 0
SCN5A | E555K | G>A | 3:38645430 | 48% | (6917/17222) 40% | (5001/15456) 32% | (7009/16730) 42% | (10/20177) 0 | (802/27014) 3% | (8/13159) 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 0 | 0 | 0

Sample ID: CCOC 3 synchronous case
Corresponding Location: B3-B8 - right ovary; C14 - posterior endometrium; C4 - anterior cervix
Gene | Mutation | Δ | Position (hg19) | WGS A.F. | B3 | B5 | B6 | B8 | C14 | C4 (normal)
ABCD3 | G617V | G>T | 1:94980706 | 39% | (14353/35382) 41% | (19581/41177) 48% | (12235/28525) 43% | (14983/31805) 47% | (31/20868) 0 | (28/27810) 0
ARID1A | E2047* | G>T | 1:27106528 | 75% | (11704/21408) 55% | (13002/24671) 54% | (9951/17632) 56% | (13189/18894) 70% | (5/12369) 0 | (7/15859) 0
CPT1B | R162W | C>T | 22:51015042 | 21% | (4218/10878) 39% | (5193/12641) 41% | (2361/8053) 29% | (3829/9136) 42% | (14/6978) 0 | (6/8613) 0
ITGB8 | C634Y | G>A | 7:20444464 | 53% | (4078/22334) 18% | (6299/27745) 23% | (5653/19408) 29% | (6269/20245) 31% | (224/15917) 1% | (24/17082) 0
MTOR | A2386V | C>T | 1:11174877 | 27% | (9039/28130) 32% | (11319/32397) 35% | (8395/22873) 37% | (10277/27581) 46% | (15/18263) 0 | (22/21418) 0
PTEN | L108F | C>T | 10:89692838 | 30% | (8002/23401) 34% | (9811/27560) 35% | (7143/19019) 38% | (7205/21088) 34% | (12/14125) 0 | (11/18464) 0
retrotransposition event validated | | | | | 1 | 1 | 1 | 1 | 0 | 0

Figure A 1. Insert size distribution for sequencing libraries made from frozen tumor (a), FFPE (b), and plasma (c)."@en . "Thesis/Dissertation"@en . "2018-11"@en . "10.14288/1.0372373"@en . "eng"@en . "Pathology and Laboratory Medicine"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "Attribution-NonCommercial-NoDerivatives 4.0 International"@* . "http://creativecommons.org/licenses/by-nc-nd/4.0/"@* . "Graduate"@en . "LINE-1 retrotranspositions in epithelial ovarian cancer : can we use DNA “parasites” for good purpose?"@en . "Text"@en . "http://hdl.handle.net/2429/67443"@en .