"Science, Faculty of"@en . "DSpace"@en . "UBCV"@en . "Chun, Hye-Jung Elizabeth"@en . "2020-12-18T18:31:52Z"@en . "2020"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "Rhabdoid tumours (RTs) are highly aggressive paediatric cancers that predominantly affect infants, with an overall 4-year survival rate below 25% and no curative therapy established to date. Nearly all RTs exhibit pathognomonic loss of SMARCB1, a core subunit of the SWI/SNF complex that mobilizes nucleosomes and regulates gene expression and epigenetic reprogramming. RTs are broadly classified into cranial (atypical teratoid RTs / ATRTs) and extra-cranial RTs (malignant RTs / MRTs), yet the extent to which they are different or similar was not fully determined. Previous reports indicated some shared molecular features between the two entities, but there had been no direct comparison between ATRTs and MRTs. Furthermore, previous reports indicated clinical and histological heterogeneity within and across ATRTs and MRTs, yet the extent of molecular heterogeneity was unknown, particularly in MRTs for which genomic, transcriptomic and epigenomic data were lacking. To address these knowledge gaps, I hypothesized that multi-omic data analyses would identify novel mutations, and gene expression and epigenetic features in MRTs and that such analyses could identify potential molecular underpinnings of heterogeneity. To test these hypotheses, I analyzed whole genome, transcriptome, DNA methylome, histone H3K27me3 and H3K27ac modification profiles obtained from 40 MRT cases, and analyzed multi-omics datasets derived from 140 MRT and 161 ATRT cases. My whole genome analyses revealed recurrently altered genes that were previously undescribed in MRTs. 
Integration of gene expression and epigenetic data revealed the convergence in dysregulation of HOX genes, imprinted genes and other development-regulating genes such as those involved in neural crest development, indicating dysregulation of early human developmental processes in MRTs. I also identified five DNA methylation subgroups of RTs across different anatomical sites. Among these, subgroups containing MRTs and ATRTs expressing relatively high levels of MYC exhibited gene expression signatures and epigenetic modifications indicative of increased immunological activities. I analyzed immunohistochemistry data to confirm increased levels of immune cell infiltration and expression of immune checkpoint proteins in these subgroups. My findings implied the potential utility of immune checkpoint blockade treatments for RT patients despite the low prevalence of mutations in these cancers."@en . "https://circle.library.ubc.ca/rest/handle/2429/76852?expand=metadata"@en . " MOLECULAR CHARACTERIZATION OF RHABDOID TUMOURS FROM MULTIPLE ANATOMICAL SITES by Hye-Jung Elizabeth Chun M.Sc., The University of British Columbia, 2010 B.Sc., The University of British Columbia, 2004 B.Sc., The University of British Columbia, 2000 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate and Postdoctoral Studies (Bioinformatics) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) December 2020 © Hye-Jung Elizabeth Chun, 2020 The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: Molecular characterization of rhabdoid tumours across multiple anatomical sites submitted by Hye-Jung Elizabeth Chun in partial fulfillment of the requirements for the degree of Doctor of Philosophy in
Bioinformatics

Examining Committee:
Marco A. Marra, Professor, Department of Medical Genetics, UBC, Supervisor
Steven J.M. Jones, Professor, Department of Medical Genetics, UBC, Supervisory Committee Member
Paul Pavlidis, Professor, Department of Psychiatry, UBC, Supervisory Committee Member
Jörg Gsponer, Associate Professor, Department of Biochemistry and Molecular Biology, UBC, University Examiner
Scott J. Tebbutt, Professor, Department of Medicine, UBC, University Examiner

Additional Supervisory Committee Members:
Poul H.B. Sorensen, Professor, Department of Pathology & Laboratory Medicine, UBC, Supervisory Committee Member

ABSTRACT
Rhabdoid tumours (RTs) are highly aggressive paediatric cancers that predominantly affect infants, with an overall 4-year survival rate below 25% and no curative therapy established to date. Nearly all RTs exhibit pathognomonic loss of SMARCB1, a core subunit of the SWI/SNF complex that mobilizes nucleosomes and regulates gene expression and epigenetic reprogramming. RTs are broadly classified into cranial (atypical teratoid RTs / ATRTs) and extra-cranial RTs (malignant RTs / MRTs), yet the extent to which they are different or similar was not fully determined. Previous reports indicated some shared molecular features between the two entities, but there had been no direct comparison between ATRTs and MRTs. Furthermore, previous reports indicated clinical and histological heterogeneity within and across ATRTs and MRTs, yet the extent of molecular heterogeneity was unknown, particularly in MRTs for which genomic, transcriptomic and epigenomic data were lacking. To address these knowledge gaps, I hypothesized that multi-omic data analyses would identify novel mutations, and gene expression and epigenetic features in MRTs and that such analyses could identify potential molecular underpinnings of heterogeneity.
To test these hypotheses, I analyzed whole genome, transcriptome, DNA methylome, histone H3K27me3 and H3K27ac modification profiles obtained from 40 MRT cases, and analyzed multi-omics datasets derived from 140 MRT and 161 ATRT cases. My whole genome analyses revealed recurrently altered genes that were previously undescribed in MRTs. Integration of gene expression and epigenetic data revealed the convergence in dysregulation of HOX genes, imprinted genes and other development-regulating genes such as those involved in neural crest development, indicating dysregulation of early human developmental processes in MRTs. I also identified five DNA methylation subgroups of RTs across different anatomical sites. Among these, subgroups containing MRTs and ATRTs expressing relatively high levels of MYC exhibited gene expression signatures and epigenetic modifications indicative of increased immunological activities. I analyzed immunohistochemistry data to confirm increased levels of immune cell infiltration and expression of immune checkpoint proteins in these subgroups. My findings implied the potential utility of immune checkpoint blockade treatments for RT patients despite the low prevalence of mutations in these cancers.

LAY SUMMARY
Rhabdoid tumours (RTs) are deadly childhood cancers that resist known cancer therapies and spread quickly to other areas of the body. RTs arise frequently in kidneys and the brain. To identify changes in the genetic code linked to aggressive properties in RTs, I used sophisticated DNA sequencing and computational tools to identify genetic mutations, gene expression patterns and chemical modifications of DNA and of the proteins that bind the DNA. I discovered that parts of the genetic code that control various developmental processes were altered in RTs.
I also found immune cells near the cancer cells in some RTs, and speculated that these immune cells might have the potential to attack the cancer cells if the patient received a type of drug used to help immune cells recognize cancer cells. I hope that my findings will lead to more research into the use of these drugs to improve patient outcomes.

PREFACE
Portions of Chapter 1 have been published: Chun, H-J.E., Khattra, J., Krzywinski, M., Aparicio, S.A. and Marra, M.A. (2014) Second-generation sequencing for cancer genome analysis. Cancer genomics: From Bench to Personalized Medicine. Dellaire, G.D., Berman, J.N., Arceci, R.J. (Editors). 1st Edition. Elsevier. Chapter 2, pages 13-30. https://doi.org/10.1016/B978-0-12-396967-5.00002-5. Copyright by Elsevier Inc. I wrote most of the text for this book chapter review manuscript with J. Khattra, with guidance and input from my supervisor, Dr. M.A. Marra. M. Krzywinski and I conceptualized and designed the figures. M. Krzywinski made the final figures in the review, including the one reproduced in this thesis, Figure 1.1. A version of Chapter 2 has been published: Chun, H-J.E., Lim, E.L., Heravi-Moussavi, A., Saberi, S., Mungall, K.L., Bilenky, M., Carles, A., Tse, K., Shlafman, I., Zhu, K., Qian, J.Q., Palmquist, D.L., He, A., Long, W., Goya, R., Ng, M., LeBlanc, V.G., Pleasance, E., Thiessen, N., Wong, T., Chuah, E., Zhao, Y-J., Schein, J.E., Gerhard, D.S., Taylor, M.D., Mungall, A.J., Moore, R.A., Ma, Y., Jones, S.J.M., Perlman, E.J., Hirst, M. and Marra, M.A. Genome-wide profiles of extra-cranial malignant rhabdoid tumors reveal heterogeneity and dysregulated developmental pathways. Cancer Cell (2016) 29(3): 394-406. https://doi.org/10.1016/j.ccell.2016.02.009. Copyright by Elsevier Inc. M.A. Marra, E.J. Perlman and D.S. Gerhard conceived the study, led the experimental design and provided editorial input on the published manuscript. M.A. Marra supervised the study. E.J.
Perlman provided clinical data. M.D. Taylor provided adult and fetal cerebellum RNA-Seq data. K. Zhu, W. Long and J.Q. Qian ran the pipelines that identified somatic mutations and regions with somatic copy number alterations and LOH. D.L. Palmquist and K.L. Mungall performed de novo assemblies of whole genomes and transcriptomes. A. He and N. Thiessen ran the pipeline that quantified the levels of mRNA transcript abundance using RNA-Seq data. K. Tse and I. Shlafman designed PCR primers and validated putative fusion events using PCR sequencing. E. Pleasance performed the correlation analysis of mutation signatures. M. Ng, V.G. LeBlanc and R. Goya contributed to bioinformatics analyses. T. Wong, E. Chuah, Y. Ma and S.J.M. Jones provided bioinformatics support. J.E. Schein managed sample storage and preparation. Y-J. Zhao, A.J. Mungall and R.A. Moore performed library construction and sequencing. E.L. Lim performed miRNA-Seq analyses. A. Carles, A. Heravi-Moussavi, S. Saberi and M. Bilenky performed the DNA methylation and ChIP-Seq analyses described in Sections 2.2.3 and 2.2.4. A. Heravi-Moussavi made Figures 2.11, 2.12 and 2.17. A. Heravi-Moussavi and S. Saberi made Figure 2.15. E.L. Lim made Figures 2.10 and 2.20 and panel C in Figure 2.22. I performed the whole genome and gene expression analyses described in Sections 2.2.2 and 2.2.5, and made Figures 2.1 – 2.9, 2.13, 2.14, 2.16, 2.18, 2.19, 2.21, 2.22 (panels A, B and D) and 2.23 – 2.25. Together with E.L. Lim, A. Heravi-Moussavi, S. Saberi, M. Hirst and M.A. Marra, I interpreted the results from whole genome, gene expression, DNA methylation and ChIP-Seq data analyses. I, along with M.A. Marra, E.L. Lim, A. Heravi-Moussavi and M. Hirst, wrote the paper.
A version of Chapter 3 has been published: Chun, H-J.E., Johann, P.D., Milne, K., Zapatka, M., Buellesbach, A., Ishaque, N., Iskar, M., Erkek, S., Wei, L., Tessier-Cloutier, B., Lever, J., Titmuss, E., Topham, J.T., Bowlby, R., Chuah, E., Mungall, K.L., Ma, Y., Mungall, A.J., Moore, R.A., Taylor, M.D., Gerhard, D.S., Jones, S.J.M., Korshunov, A., Gessler, M., Kerl, K., Hasselblatt, M., Frühwald, M.C., Perlman, E.J., Nelson, B.H., Pfister, S.M., Marra, M.A. and Kool, M. Identification and analyses of extra-cranial and cranial rhabdoid tumor molecular subgroups reveal tumors with cytotoxic T cell infiltration. Cell Reports (2019) 29(8): 2338-2354. https://doi.org/10.1016/j.celrep.2019.10.013. Copyright by Elsevier Inc. M.A. Marra and M. Kool conceived the study. M.A. Marra and M. Kool, along with myself and P.D. Johann, designed the study. M.A. Marra and M. Kool supervised the study. S.M. Pfister provided editorial input and support for the study. E.J. Perlman and M. Hasselblatt provided clinical data and whole tumor slides for malignant rhabdoid tumour and atypical teratoid rhabdoid tumour (ATRT) samples, respectively. A. Korshunov, M. Gessler, K. Kerl and M.C. Frühwald provided primary ATRT samples. M.D. Taylor provided adult and fetal cerebellum RNA-Seq data. K. Milne performed immunohistochemistry (IHC) experiments, with direction and expertise from B.H. Nelson. B. Tessier-Cloutier provided pathologist expertise. L. Wei performed the transcription factor network analysis. J. Lever generated a list of putative cancer antigens from the published literature using a text-mining approach. J.T. Topham developed the pipeline that quantified transcript abundance levels of endogenous retroviral elements (EREs). R. Bowlby ran the pipeline that quantified the levels of mRNA transcript abundance using RNA-Seq data. M. Zapatka, A. Buellesbach, N. Ishaque, M. Iskar and S. Erkek contributed to DNA methylation and ChIP-Seq analyses. E. Chuah, K.L.
Mungall, Y. Ma and S.J.M. Jones provided bioinformatics support. A.J. Mungall and R.A. Moore performed library construction and sequencing. Together with K. Milne, B. Tessier-Cloutier, E. Titmuss and M.A. Marra, I designed IHC experiments. P.D. Johann performed the t-SNE analysis described in Section 3.2.1, DNA methylation analyses described in Section 3.2.3, and the analyses of enhancers and transcription factor binding sites described in Section 3.2.4. P.D. Johann analyzed H3K27ac enrichment levels at ERE loci described in Section 3.2.7. P.D. Johann made Figure 3.11, and panel A in Figures 3.4 and 3.9, panels A, B, F and G in Figure 3.10, panels A, B and C in Figure 3.12, and panel B in Figure 3.18. K. Milne made panels F and J in Figure 3.17. I performed mutation and DNA methylation analyses described in Sections 3.2.1 and 3.2.2, and performed the DNA methylation analysis of the normal brain described in Section 3.2.3. I performed gene expression analyses described in Sections 3.2.5 and 3.2.6. I validated IHC results, and performed the analyses of IHC data, putative tumour antigens and DNA methylation levels at ERE loci described in Section 3.2.7. I made Figures 3.1 – 3.3, 3.4 (panel B), 3.5 – 3.8, 3.9 (panel B), 3.10 (panels C, D, E), 3.12 (panel D), 3.13 – 3.16, 3.17 (panels A – E), 3.18 (panels A, C, D) and 3.19. Together with P.D. Johann, I interpreted the results from whole genome, gene expression, DNA methylation and ChIP-Seq data analyses. I, along with P.D. Johann, M.A. Marra and M. Kool, wrote the paper. The copyright owner of all published content stated above is Elsevier.
Elsevier permitted the use of published figures, tables and text in this thesis under the following clause of its sharing policy, stated at https://www.elsevier.com/about/policies/sharing: “Theses and dissertations which contain embedded published journal articles as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect.” In compliance with this clause, I have indicated DOI links in the first pages of Chapters 1, 2 and 3, which include content published in the manuscripts listed in this Preface. Extra-cranial rhabdoid tumour samples were provided by E.J. Perlman (Ann and Robert H. Lurie Children’s Hospital in Chicago, USA) through the Children’s Oncology Group (COG), from patients registered on the National Wilms Tumor Study Group 5 or on the COG AREN03B2 protocol. Informed consent was obtained from all patients included in the study. Additional samples included in Chapter 3 were provided by A. Huang (Hospital for Sick Children in Toronto, Canada) through the Rare Brain Tumor Consortium (RBTC), and through the EURHAB study group, with informed consent obtained from all patients included in the study. Research described in Chapters 2 and 3 was performed with the approval of the University of British Columbia - British Columbia Cancer Agency Research Ethics Board (REB number H09-02558).

TABLE OF CONTENTS
ABSTRACT
LAY SUMMARY
PREFACE
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF SUPPLEMENTARY MATERIAL
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
DEDICATION
CHAPTER 1. Understanding rhabdoid tumour biology through genome science
1.1 Introduction
1.2 Cancer is a genetic and epigenetic disease
1.3 Use of sequencing technologies to profile genomic and epigenomic aberrations in cancers
1.4 Genome science has revealed novel insights into tumour biology
1.5 The SWI/SNF complex in cancers
1.6 Rhabdoid Tumours
1.7 Thesis roadmap and chapter summaries
CHAPTER 2. Characterization of genomic, epigenomic and gene expression landscapes of extra-cranial malignant rhabdoid tumours
2.1 Introduction
2.2 Results
2.2.1 Samples and data analyzed for the study
2.2.2 Whole genome landscapes of MRT
2.2.3 MRTs are comprised of subgroups with distinct DNA methylation profiles that correlate with age at diagnosis
2.2.4 MRTs exhibit epigenetic dysregulation of homeobox genes
2.2.5 RNA-Seq analyses support the existence of two gene expression subgroups and indicate dysregulation of immunoglobulins, BMP and WNT signalling, neural crest development and imprinted genes in MRTs
2.3 DISCUSSION
2.4 MATERIALS AND METHODS
2.4.1 Sample details and data availability
2.4.2 WGS library construction, sequencing and sequence read alignment
2.4.3 RNA-Seq library construction, sequencing and sequence read alignment
2.4.4 WGBS library construction, sequencing and sequence read alignment
2.4.5 ChIP-Seq library construction, sequencing and sequence read alignment
2.4.6 Whole genome sequence data analyses
2.4.7 RNA-Seq data analyses
2.4.8 DNA methylation analyses
2.4.9 ChIP-Seq data analyses
2.4.10 Annotation of Tumour Suppressor Genes (TSG) and Oncogenes
CHAPTER 3. Identification and analyses of RT molecular subgroups revealed tumours with increased CD8+ cytotoxic T cell infiltration, and similarities between extra-cranial MRTs and cranial ATRT-MYC
3.1 INTRODUCTION
3.2 RESULTS
3.2.1 ATRT-MYC and MRT exhibit similar DNA methylation profiles compared to ATRT-SHH and ATRT-TYR
3.2.2 ATRT-MYC and MRT cases occupy three DNA methylation subgroups that are associated with anatomical sites of occurrence and SMARCB1 mutation patterns
3.2.3 ATRT-MYC and MRT exhibit DNA methylation profiles that distinguish them from ATRT-SHH and ATRT-TYR cases
3.2.4 ATRT-MYC and MRT share distinctive enhancer landscapes compared to other ATRT subgroups
3.2.5 Immune-related genes, HOX genes and mesoderm development regulators are more highly expressed in ATRT-MYC and MRT compared to ATRT-SHH and ATRT-TYR, which instead expressed neural-like transcriptional profiles
3.2.6 Gene expression analysis shows the potential of increased T cell presence in the tumour microenvironments of cranial ATRT-MYC and extra-cranial MRT cases
3.2.7 Immunohistochemistry confirms increased CD8+ cytotoxic T cell infiltration and PD-L1 immune checkpoint expression in MRTs and ATRT-MYC
3.3 DISCUSSION
3.4 MATERIALS AND METHODS
3.4.1 Sample details and data availability
3.4.2 DNA methylation array data generation and processing
3.4.3 Whole genome sequencing data
3.4.4 RNA-Seq data generation and processing
3.4.5 Whole genome bisulfite sequencing (WGBS) data generation and processing
3.4.6 Chromatin Immunoprecipitation followed by sequencing (ChIP-Seq) data generation
3.4.7 Mutation analyses
3.4.8 RNA-Seq data analysis
3.4.9 DNA methylation array data analyses
3.4.10 ChIP-Seq data analyses
3.4.11 Whole-genome-bisulfite sequence (WGBS) data analyses
3.4.12 Immunohistochemistry (IHC) experiment and data analyses
CHAPTER 4. Conclusions and Future Directions
BIBLIOGRAPHY

LIST OF FIGURES
Figure 1.1 Profiling molecular alterations using sequencing approaches.
Figure 2.1 Mutation profiles of 40 MRT cases.
Figure 2.2 Correlation between somatic SNV rates and age at diagnosis in MRTs
Figure 2.3 Two MRT cases, PADYZI and PAUEKW, exhibited distinct somatic substitution mutation patterns of increased proportions of T>G transversions and C>T transitions, respectively, which were correlated with mutation signatures 17 and 9 (PADYZI) and signatures 1A/B (PAUEKW).
Figure 2.4 Analysis of inter-mutational distances showed that MRT genomes did not undergo kataegis.
Figure 2.5 An axon-guidance regulating gene, ROBO2, exhibits a significantly higher frequency of somatic mutations in introns compared to the background mutation rate in MRT genomes, a feature not observed in paediatric AML genomes. Cases with somatic intronic mutations exhibit decreased ROBO2 expression.
Figure 2.6 SPECC1L and KCNJ3 harbour somatic non-coding mutations at frequencies that are significantly higher compared to the background mutation rate.
Figure 2.7 Somatic non-coding mutations in SPECC1L and KCNJ3 are associated with altered gene expression levels in MRTs.
Figure 2.8 A genome-wide view of regions of recurrent copy number alterations across the 40 MRT cases.
Figure 2.9 Copy number profile of chromosome 22 in MRTs.
Figure 2.10 Structural rearrangements in MRT genomes.
Figure 2.11 Unsupervised clustering of promoter DNA methylation profiles in MRTs.
Figure 2.12 Distributions of CpG methylation levels in MRT genomes.
Figure 2.13 Functional terms related to homeobox genes were significantly enriched for genes with hypermethylated promoter CGIs in Group A compared to Group B.
Figure 2.14 Promoters of known TSGs show lower DNA methylation levels in MRTs compared to hESC.
Figure 2.15 Profiles of H3K27me3 promoter density in MRTs are distinct from those in normal cell types.
Figure 2.16 Functional terms related to homeobox genes were enriched for genes whose promoters exhibited lower H3K27me3 levels in MRTs compared to normal samples.
Figure 2.17 Numbers of typical enhancers and super-enhancers identified in MRTs.
Figure 2.18 Homeobox-related functional terms were significantly enriched for MRT-specific super-enhancer-associated genes.
Figure 2.19 Super-enhancer at the HOXC locus in MRTs.
Figure 2.20 Expression levels of HOTAIR and HOXC gene family members in MRTs compared to normal cell types.
Figure 2.21 Immune-related and embryonic development-related functions were enriched for over-expressed genes in MRTs, while neuron development-related functions were enriched for under-expressed genes in MRTs compared to normal cell types.
Figure 2.22 Two gene expression subgroups were identified in MRTs.
Figure 2.23 Gene expression subgroups are consistently observed using different analytical approaches.
Figure 2.24 Differentially expressed genes between gene expression subgroups include immune- and development-regulating genes.
Figure 2.25 MRT cases in gene expression Group 1 frequently harbour broad homozygous deletions at the SMARCB1 locus.
Figure 3.1 Comparison of DNA methylation data from H9 hESC generated using sequencing and array platforms.
Figure 3.2 Comparison of DNA methylation data of primary tumour samples generated using sequencing and array platforms.
Figure 3.3 Comparison of DNA methylation profiles from MRTs and ATRTs to other paediatric and adult cancer types revealed that RTs are distinct from other brain and kidney cancer types and show similarities to cancers originating from neural-crest-derived cell types, brain cancers and normal brain tissues.
Figure 3.4 Unsupervised clustering and dimension reduction analyses of DNA methylation data from 140 MRTs (92 renal, 48 extra-renal) and 161 ATRTs reveal similarity between MRTs and ATRT-MYC.
Figure 3.5 Five DNA methylation subgroups of RTs from cranial and extra-cranial sites correlate with previously known ATRT and MRT subgroups and anatomical sites.
Figure 3.6 NMF analyses of gene expression data revealed subgroups consistent with DNA-methylation subgroups.
Figure 3.7 NMF analyses of DNA methylation profiles from 432 RT cases confirmed ATRT subgroups characterized in multiple previous studies.
Figure 3.8 SMARCB1 mutation types are associated with DNA-methylation subgroups in RTs.
Figure 3.9 RT Group 1 cases have broad deletions at the SMARCB1 locus, affecting expression of nearby genes.
Figure 3.10 MRT and ATRT-MYC show similar DNA methylation profiles compared to ATRT-SHH and ATRT-TYR.
Figure 3.11 Pathways enriched for genes that are within subgroup-specific differentially methylated regions (DMRs).
Figure 3.12 ATRT-MYC and MRT exhibit enhancer profiles distinct from other ATRT subgroups.
Figure 3.13 Dysregulation of mesenchymal development genes is observed in ATRT-MYC and MRT, while neural gene dysregulation is observed in ATRT-SHH and ATRT-TYR cases.
Figure 3.14 Subgroup-classifying genes in RTs.
Figure 3.15 Tumour purity levels across RT subgroups.
Figure 3.16 Gene expression analyses indicate increased T cell presence in RT subgroups containing MRT and ATRT-MYC cases.
Figure 3.17 Increased immune cell infiltration in RT subgroups is validated by immunohistochemistry (IHC).
Figure 3.18 Potential tumour antigens that could contribute to RT immunogenicity.
Figure 3.19 Summary of RT molecular subgroups.

LIST OF SUPPLEMENTARY MATERIAL
Supplemental_Tables_Ch2: Excel file containing tables for Chapter 2. The tables are referred to as “Supplemental Table 2.X” in the thesis.
Supplemental_Tables_Ch3: Excel file containing tables for Chapter 3. The tables are referred to as “Supplemental Table 3.X” in the thesis.
LIST OF ABBREVIATIONS
AML  Acute Myeloid Leukemia
ATRT  Atypical Teratoid Rhabdoid Tumour
ATRT-MYC  MYC subgroup of ATRTs
ATRT-SHH  SHH subgroup of ATRTs
ATRT-TYR  TYR subgroup of ATRTs
BCGSC  Canada’s Michael Smith Genome Sciences Centre at BC Cancer
CAR-T  Chimeric Antigen Receptor T-cell
CGI  CpG Island
ChIP-Seq  Chromatin ImmunoPrecipitation followed by Sequencing
CNA  Copy Number Alteration
CNS  Central Nervous System
COG  Children’s Oncology Group
DE  Differentially expressed
DMR  Differentially methylated regions
DKFZ  Deutsches Krebsforschungszentrum / German Cancer Research Center
FDR  False Discovery Rate
ERE  Endogenous Retroviral Elements
ERV  Endogenous Retrovirus
hESC  Human Embryonic Stem Cell
HGSC  Ovarian high-grade serous carcinoma
ICGC  International Cancer Genome Consortium
IHC  Immunohistochemistry
InDel  Small Insertions and Deletions
MB  Medulloblastoma
MMR  MisMatch Repair
MRT  Malignant Rhabdoid Tumour
NMF  Non-negative Matrix Factorization
PCA  Principal Component Analysis
PRC  Polycomb Repressive Complex
RPKM  Reads Per Kilobase of transcript per Million mapped reads
RT  Rhabdoid Tumour
SCCOHT  Small Cell Carcinoma of the Ovary, Hypercalcemic Type
SNV  Single Nucleotide Variant
SWI/SNF  SWItch/Sucrose Non-Fermenting chromatin-remodelling protein complex
TARGET  Therapeutically Applicable Research to Generate Effective Treatments
TCGA  The Cancer Genome Atlas
TCR  T cell receptor
TSG  Tumour Suppressor Genes
TF  Transcription Factor
TFBS  Transcription Factor Binding Site
t-SNE  t-distributed Stochastic Neighbor Embedding
UMAP  Uniform Manifold Approximation and Projection
UTR  Untranslated region
WGS  Whole Genome Sequencing
WGBS  Whole Genome Bisulfite Sequencing

ACKNOWLEDGEMENTS
Throughout my PhD journey, I have been blessed and privileged to have worked with many scientists, professionals and students from whom I have learned so much about science and humanity.
Without their help and support, my achievement would not have been possible. I owe to them my continued enthusiasm for science and learning. The words and actions of many people from this journey have inspired me to turn my endeavours into occasions to better myself and serve others. Thanks to them, my PhD experiences have enriched me in ways that I could not have imagined; indeed I had dreamt, and my dreams fell short. Deo gratias.
My deepest gratitude goes to my PhD thesis advisor, Dr. Marco Marra, who has shaped my thinking as a scientist and taught me the ways of excellence, leadership, and personal and professional integrity. He has been an extraordinary mentor who generously gave his time and advice for my development. I cannot thank him enough for all the mentorship and support that he has provided for me throughout my PhD training. His principles in life and his work ethic have made an indelible imprint on mine. I have truly been blessed and honoured to be his student.
I sincerely thank my thesis supervisory committee members, Drs. Steven Jones, Paul Pavlidis and Poul Sorensen, for encouraging my progress in PhD training and for providing helpful feedback, which enhanced the scientific calibre of my thesis research. I especially thank Dr. Steven Jones, also my former Master’s thesis advisor, for providing his unwavering support for me throughout these years. I also thank Dr. Paul Pavlidis for his detailed reading of my thesis and thoughtful comments, which enhanced the final document. I thank Dr. Poul Sorensen for his support. I also wish to extend my gratitude to Ms. Sharon Ruschkowski for her administrative help as my graduate program coordinator.
I am very fortunate to have engaged in international collaborative projects that allowed me to work with many top-tier scientists from around the world. In particular, I thank Drs.
Daniela Gerhard and Elizabeth Perlman for the opportunity to be involved in the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) consortium. I especially thank Dr. Perlman for her remarkable generosity in providing rare tumour tissue slides so promptly, which enabled me to carry out validation experiments that led to one of the key discoveries in my thesis research. I am also deeply grateful to Dr. Martin Hirst and the members of his laboratory, specifically Drs. Alireza Heravi-Moussavi, Mikhail Bilenky and Saeed Saberi, and Ms. Annaïck Carles, who worked hard with me in publishing the paper described in Chapter 2 of this thesis. Also, I sincerely thank Drs. Marcel Kool and Pascal Johann from the German Cancer Research Center (DKFZ) for being such supportive collaborators, with whom I had the great pleasure of working in publishing the paper described in Chapter 3. I also thank Dr. Basile Tessier-Cloutier from UBC and Ms. Katy Milne at the Deeley Research Centre at BC Cancer, who generously lent their expertise and support for the key validation experiments described in Chapter 3. I also wish to extend my thanks to Dr. Karen Novik for her tremendous help with the project management of my thesis research.
I thank all current and past members of the Marra lab, whom I have affectionately called Marra labbies, for their help and support throughout my time with them. I want to especially thank Alessia Gagliardi, Farah Zahir, Emilia Lim, Rodrigo Goya, Jung-Eun Song, Isabel Serrano, Susanna Chan and Stephen Lee, as well as Svetlana Bortnik, Courtney Choutka and Chandra Lebovitz from the Gorski lab, for their friendship and team spirit, which have been immensely uplifting during my PhD. The many moments of fun and laughter that I shared with them will always be remembered with great fondness in my heart. I also owe much gratitude to Ms.
Lulu Crisostomo, who consistently provided answers to all my administrative questions with a smile.
I am blessed to have worked with many talented colleagues at Canada’s Michael Smith Genome Sciences Centre (BCGSC), who impress me not only with their scientific and professional merits, but also with their steadfast kindness, humility and collegiality. I especially thank Andy and Karen Mungall, Misha Bilenky, Tina Wong, Yaoqing Shen, Martin Krzywinski, Erin Pleasance, Nina Thiessen, Reanne Bowlby and Eric Chua for all their support over the course of my PhD.
I have been privileged to receive salary, scholarship and travel funds from the Canadian Institutes of Health Research, University of British Columbia, Roman M. Babicki Fellowship in Medical Research, Canadian Cancer Society Research Institute, BCGSC, Canadian Epigenetics, Environment and Health Research Consortium Network, and the John Bosdet Memorial Funds.
Finally, I thank my family and friends who have been there for me through thick and thin. My PhD is the product of their love, support and prayers. I am eternally grateful for them in my life, and give a special shout-out to Roseanna Chu, Marie Melgarejo, Janet Chin Fatt, Sandra Chua, Ka-Young Park, Jung-Bin Nam, Ji-In Oh, Monica Rumpel, Lisa Rumpel, Julia Rumpel, Maria Anggowo, Christine D’cruz, Patrizia Avendano, Mimi Otzuka, Catherine Uy, Peter Lee, Jamie Sun, Jonathan Lao, Mayleen Bugarin, Theresa Lo, Vida Lopez, Caesar Chow, Jennifer Woo, Dr. Elizabeth Henry, Irene Walls, Crishna Delleva, Miriam Helmer, Helen Ryan and all my sistas from Crestwell. Thank you so much for your cheers throughout my PhD journey!

DEDICATION
To my family - my parents, James and Susanna Chun, and Helena, Danny and Damien Lee. Your love and faith in me sustain me and lift me up.
And to my aunt, Margaret Park, and to my grand uncle, Joseph Chun, who fought bravely against cancer and left a legacy that encourages me to keep going in cancer research with persistence and hope.

CHAPTER 1. Understanding rhabdoid tumour biology through genome science¹

1.1 Introduction
This thesis is a genome science study of rhabdoid tumours (RTs), rare and aggressive paediatric cancers driven by loss of a single protein, SMARCB1, a subunit of the SWI/SNF chromatin-remodelling complex. This thesis research fundamentally aimed to enhance the current understanding of inter-tumour heterogeneity and oncogenic processes driven by dysregulation of SWI/SNF, the most frequently disrupted protein complex in human cancers (Shain and Pollack, 2013). To achieve this aim, the overarching objectives of this thesis research were to comprehensively survey and characterize genetic, epigenetic and transcriptional features, determine the extent of inter-tumour heterogeneity in RTs across multiple anatomical sites, and identify molecular features that underpin RT pathobiology. To fulfill these objectives, I analyzed and integrated whole-genome, transcriptome and epigenome data from the largest RT cohort analyzed in this way to date. This analysis revealed dysregulation of developmental pathways, particularly those of neural crest and neural cells, along with previously undescribed DNA-methylation subgroups. Evidence for immune cell infiltration in subsets of RTs was also obtained, an unexpected observation that may help stimulate more research investigating a potential role for immunotherapy in the context of RTs, for which there is no effective treatment to date.
As part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) consortium, data generated for this thesis work constitute the largest and most comprehensive sequence dataset of extra-cranial RTs to date, contributing an important resource for wider research communities. In this introductory Chapter, I will first provide an overview of cancer biology and known molecular aberrations that drive oncogenic transformation and progression. Next, I will describe the use of sequencing technologies as a key measurement tool that detects genetic, gene expression and epigenetic alterations genome-wide, thus enabling genome science studies such as those described in this thesis. I will then provide an overview of RT biology, and conclude this Chapter with the thesis roadmap and summary.

¹ Portions of this Chapter have been published, and the author contributions are provided in the Preface as per the University of British Columbia PhD thesis guideline: H-J. E. Chun et al. (2014) Second-generation sequencing for cancer genome analysis. Cancer genomics: From Bench to Personalized Medicine. Dellaire GD, Berman JN, Arceci RJ. (Eds). 1st Ed. Elsevier. 2013. Chapter 2, pp13-30. https://doi.org/10.1016/B978-0-12-396967-5.00002-5. Copyright by Elsevier Inc.

1.2 Cancer is a genetic and epigenetic disease
In 1902, Theodor Boveri observed abnormal chromosome segregation during mitosis in malignant cells in sea urchins, and made a prescient hypothesis that “a malignant tumour cell is a cell with a specific abnormal chromosome constitution” (Boveri, 1902). Since then, cancer has come to be understood as a disease that arises from dynamic changes in genes and genomes through the accumulation of multiple genetic and epigenetic alterations, predominantly of somatic origin.
These alterations, when they affect cancer-associated biological processes, lead to transformation of normal cells into malignant ones with characteristics of uncontrolled cellular proliferation, replicative immortality, invasion and metastasis. Like any biological process, cancer-associated processes are regulated by genes, which have been traditionally and broadly categorized as either oncogenes or tumour suppressor genes (TSGs). The discovery of the first proto-oncogene from chickens, c-src, which shares sequence homology with the sarcoma-causing retroviral v-src gene (Stehelin et al., 1976), revealed that the cancer-promoting functions of an oncogene are activated by gain-of-function mutations in an otherwise normal cellular gene (proto-oncogene). The notion of cancer-causing mutations was subsequently proven by the discovery of the first cancer-specific non-synonymous SNV (p.Gly12Val) in the HRAS oncogene in the T24 bladder cancer cell line (Reddy et al., 1982). For TSGs, Knudson’s statistical analysis of human retinoblastoma cases generated the “two-hit hypothesis”, which holds that loss-of-function mutations (“hits”) in both alleles of a gene, later identified as RB1, are required for tumour development (Knudson, 1971). This work provided the concept of mutation accumulation in cancer, and of cancer predispositions, often involving germline mutations in cancer genes, that allow malignancy to be reached with fewer somatic mutations. Emerging from these seminal works were efforts to identify mutations, oncogenes and TSGs, as well as to characterize their functional roles in oncogenic processes, which have become a major focus in cancer research.
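The intuition behind the two-hit hypothesis can be illustrated numerically. The following is a minimal sketch and not Knudson’s original analysis: it assumes that inactivating hits arrive independently at a constant rate per allele, and it compares a cell that must acquire two somatic hits with a cell that has already inherited one. The function name and the rate value are hypothetical and chosen only for illustration.

```python
import math


def prob_both_alleles_hit(rate_per_allele, years, inherited_first_hit=False):
    '''Probability that both alleles of a gene are inactivated in one cell.

    Illustrative assumption: hits on each allele arrive independently as a
    Poisson process with a constant rate per allele per year.
    '''
    p_one_allele = 1.0 - math.exp(-rate_per_allele * years)
    if inherited_first_hit:
        # Germline carrier: one allele is already inactive,
        # so a single somatic hit is sufficient.
        return p_one_allele
    # Sporadic case: both alleles must be hit independently.
    return p_one_allele ** 2


rate = 1e-6  # hypothetical hits per allele per cell per year
for years in (1, 5):
    sporadic = prob_both_alleles_hit(rate, years)
    hereditary = prob_both_alleles_hit(rate, years, inherited_first_hit=True)
    print(f'{years} y: sporadic={sporadic:.2e}, hereditary={hereditary:.2e}')
```

Under these assumptions the per-cell probability grows roughly linearly with time when one hit is inherited and roughly quadratically when both hits must be somatic, which is consistent with the idea that predisposed individuals reach malignancy with fewer somatic mutations.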
These efforts led to a compilation of genes causally implicated in human cancers, which showed significant over-representation of genes encoding transcriptional regulators, DNA maintenance and repair proteins and protein kinases involved in signal transduction, the cell cycle, motility and metabolism (Futreal et al., 2004). Functional analyses of such cancer genes revealed biological processes and mechanisms that underpinned oncogenesis, a majority of which were summarized as the “hallmarks” of cancer (Hanahan and Weinberg, 2000). The hallmarks consisted of sustained proliferative signalling, evasion of growth suppressors, activation of invasion and metastasis, replicative immortality, induction of angiogenesis and resistance to cell death. Characterization of these oncogenic processes and the molecular aberrations that enabled them was key for developing cancer therapies (Lord and Ashworth, 2010; Rubin and Gilliland, 2012), including conventional chemotherapy that inhibited cellular proliferative signalling en masse, and targeted therapies that use small-molecule inhibitors or monoclonal antibodies, e.g. trastuzumab for HER2+ breast cancers.
While heritable DNA sequence changes, i.e. mutations, can alter gene expression and activity and lead to cancer predisposition and progression, heritable changes in gene expression without altering DNA sequences, known as epigenetic changes (Holliday, 1987), can provide alternative mechanisms by which oncogenic processes can be sustained. The term epigenetics was first coined by Conrad Waddington in the early 1940s (Waddington, 1942) to describe the plasticity in “cybernetics of development” in embryos (Waddington, 1957), which gives rise to dynamic and highly complex developmental physiology from relatively static genetics (Stricker, Koferle and Beck, 2017).
The modes of epigenetic changes include cytosine methylation at CpG sites (commonly referred to as “DNA methylation”), covalent histone modifications, and nucleosome positioning and remodelling. The first cancer-specific epigenetic alterations, detected using methylation-sensitive restriction enzymes and Southern blotting (Andrew P Feinberg and Vogelstein, 1983), were observed in the promoters of human growth hormone genes in colon cancer tissues, which were hypomethylated compared to promoters in adjacent normal tissues. Since then, aberrant promoter methylation has been established as one of the major epigenetic changes involved in oncogenic processes, leading to transcriptional activation of oncogenes or silencing of TSGs. Classic examples include hypomethylation of HRAS (Andrew P. Feinberg and Vogelstein, 1983) and hypermethylation of RB1 (Greger et al., 1989; Ohtani-Fujita et al., 1993) and MLH1 (Veigl et al., 1998).
Histone modification, involving the covalent modification (e.g. acetylation, methylation, phosphorylation) of certain histone protein residues, is the most recently discovered epigenetic mechanism found to alter gene expression in cancer (Feinberg and Tycko, 2004). It was first revealed by elegant studies that showed transcriptional silencing of methylated genes only after they were packaged into condensed chromatin (Keshet, Lieman-Hurwitz and Cedar, 1986; Buschhausen et al., 1987). Further pioneering studies delineated mechanistic models of epigenetic gene silencing, which showed that methylated-CpG-binding proteins, such as MECP2, and DNA methyltransferases (DNMTs) physically interacted with histone deacetylases (HDACs; Meehan, Lewis and Bird, 1992; Jones et al., 1998).
The crucial role of histone methylation at lysine residues in chromatin modification and transcriptional regulation was shown by David Allis’s group (Strahl et al., 1999; Rea et al., 2000), further elucidating cooperative epigenetic mechanisms involving DNA methylation, histone methylation and chromatin remodelling to regulate gene expression. The first cancer-specific histone modification was observed in the T24 bladder cancer cell line, in which transcriptional repression of CDKN2A was associated with increased H3K9 methylation and decreased H3K4 methylation levels (Nguyen et al., 2002). The notion of dysregulated histone modification in cancers was further supported by discoveries of recurrently mutated histone-modifier genes, such as nonsense mutations in KDM6A/UTX (which encodes a histone demethylase) across multiple types of leukemias and solid cancers (Van Haaften et al., 2009), and non-synonymous SNVs in EZH2 (which encodes a histone methyltransferase) in lymphomas (Morin et al., 2010). Since these discoveries, many other epigenetic regulators have been found to be mutated in multiple cancer types (reviewed in Shen and Laird, 2013), including SMARCB1, the first epigenetic modifier gene found to be implicated in cancer (Versteege et al., 1998). Mutations in epigenetic modifier genes can disrupt epigenetic control, which, in turn, can dysregulate the transcription of genes, including those that maintain genome stability and DNA repair, and of other oncogenes and TSGs. Together, these discoveries inform us that genomes and epigenomes influence each other, and that the interplay between genetic and epigenetic alterations can lead to malignant transformation and sustain the “hallmark” phenotypes of cancers.

1.3 Use of sequencing technologies to profile genomic and epigenomic aberrations in cancers
After the first discovery of cancer-specific HRAS mutations was made using a sequencing approach (Reddy et al., 1982), discoveries of mutations in oncogenes and tumour suppressor genes continued to deepen our understanding of cancer biology. However, the pivotal point came with the arrival of sequencing-by-synthesis technologies (also known as next- or second-generation sequencing methods, which followed the dideoxy chain-termination method developed by Dr. Frederick Sanger’s group and known as the “Sanger method”; Sanger, Nicklen and Coulson, 1977). The sequencing-by-synthesis technologies offered massively parallel capacity with dramatically increased sensitivity and cost-effectiveness to interrogate over 98% of the whole genome with 99.99% accuracy (at 50X sequence coverage; Ajay et al., 2011), allowing genome-wide detection of genetic, transcriptomic and epigenetic alterations in cancers.
Mutations that could be detected through next-generation sequencing approaches included substitution mutations (also known as single nucleotide variants, or SNVs); insertions and deletions of small numbers of nucleotides (collectively referred to as InDels); chromosomal copy number alterations; and genome structural alterations such as inter- and intra-chromosomal translocations, inversions and duplications (Figure 1.1). Detection of somatic and germline mutations was enabled through comparisons of sequence data from cancer cells to sequence data from normal cells. In addition to mutations, sequencing approaches can also detect the presence of microbial nucleic acids in samples.
Figure 1.1 Profiling molecular alterations using sequencing approaches.
Second-generation sequencing can simultaneously profile multiple modalities of genetic alterations in cancer genomes, including SNVs, InDels, copy number alterations and structural alterations in genomes such as deletions, inversions and translocations. Sheared DNA fragments are sequenced as “reads”, which are then aligned against the reference human genome sequence. Mismatches in alignments, relative differences in sequence depths (“coverage”), and unusual alignment patterns are analyzed to identify putative genetic alterations.
Since their introduction, sequencing-by-synthesis technologies have also been applied to targeted re-sequencing of sub-genomic regions (using approaches such as hybridization capture, e.g. exome-seq, or PCR-based amplicon sequencing) and used for profiling transcriptomes, specifically by sequencing cDNAs that are reverse transcribed from RNA transcripts and by quantifying transcript abundance at the levels of genes and transcript isoforms (RNA-Seq). Also, applications of various biochemical methodologies followed by sequencing allow for profiling of chromatin, including nucleosome occupancy levels and topological interactions within the genome (e.g. DNase-Seq, MNase-Seq, ATAC-Seq, Hi-C-Seq), interactions between DNA and proteins such as histones and transcription factors (ChIP-Seq), and CpG methylation genome-wide (whole-genome-bisulfite-Seq, or WGBS).

1.4 Genome science has revealed novel insights into tumour biology
The massively parallel capacity, increased sensitivity and cost-effectiveness that the sequencing-by-synthesis technologies offered helped spawn the field of genome science, fueling multi-omics research that explored whole genomes, transcriptomes and epigenomes at unprecedented scales.
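To make the alignment-based analysis summarized in Figure 1.1 concrete, the following minimal sketch counts the bases observed at each reference position in a toy set of aligned reads and reports positions where a non-reference base is well supported. It assumes gap-free alignments given as (start, sequence) pairs; the function names, thresholds and toy data are hypothetical, and production variant callers additionally model base quality, mapping quality and matched normal samples.

```python
from collections import Counter


def pileup(reference, reads):
    '''Count the bases observed at each reference position.

    reads: list of (0-based start, sequence) pairs, assumed to be
    gap-free alignments (an illustrative simplification).
    '''
    columns = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1
    return columns


def candidate_snvs(reference, reads, min_depth=3, min_fraction=0.3):
    '''Return (position, ref_base, alt_base, alt_fraction, depth) tuples.'''
    calls = []
    for pos, counts in enumerate(pileup(reference, reads)):
        depth = sum(counts.values())  # sequence coverage at this position
        if depth < min_depth:
            continue  # too few reads to support a call
        for base, n in counts.items():
            if base != reference[pos] and n / depth >= min_fraction:
                calls.append((pos, reference[pos], base, n / depth, depth))
    return calls


# Toy data: three of the four reads covering position 4 carry T instead of A.
reference = 'ACGTACGTAC'
reads = [(0, 'ACGTAC'), (2, 'GTTCGT'), (3, 'TTCGTAC'), (4, 'TCGTAC')]
print(candidate_snvs(reference, reads))
```

The same per-position depth values are the raw material for copy number analysis, where relative differences in coverage between tumour and normal samples are compared across larger genomic windows.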
Bioinformatics expertise that enabled analyses and integration of large datasets became an integral part of cancer research, and became paramount in carrying out large-scale genomics studies for adult and paediatric cancer types, as exemplified in The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC), the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) consortium. These studies aimed to comprehensively characterize genome-wide molecular alterations and provide a reference molecular landscape for each cancer type by performing integrative analyses of large cohorts of primary tumour samples. These “landscape” studies further allowed comparisons of molecular alterations amongst different cancer types, enabling “pan-cancer” analyses that identified molecular similarities and differences across cancers to further enhance our understanding of cancer biology overall (Kahles et al., 2018; Campbell et al., 2020). These consortium-based efforts, such as those by TCGA, also helped establish model infrastructures for large-scale, international cancer genomics research projects that required standards for data generation, data curation and quality metrics, secure data sharing and privacy, standard file formats and effective modes of communication amongst collaborators, including data visualization and the use of cloud platforms for common data analysis environments and storage.
Affordable sequencing of genomes at highly redundant coverage, and of large cohorts of cancer samples, has driven the discovery of variations between cancer samples, often referred to as inter-tumour heterogeneity, as well as variations within a single tumour, i.e. intra-tumour heterogeneity.
Multi-region sequencing, single-cell sequencing and longitudinal sequencing of multiple biopsies (reviewed in Dagogo-Jack and Shaw, 2018) have shown that all cancers are dynamic, capable of adapting and evolving over the course of the disease, and are heterogeneous entities that vary between tumour subclonal populations and across tumour tissue regions and time as the disease progresses. Selective pressures acting on such variations can serve as fuel for the emergence of resistance to therapy and subsequent disease progression.
One approach to improve the understanding of inter-tumour heterogeneity is through identification and characterization of molecular subgroups, which can help refine how cancers are classified. Previous approaches were primarily based on microscopic assessments of anatomic and morphological features, whereas molecular subgrouping groups cancers based on shared driver alterations or dysregulated pathways, which can be more therapeutically informative. For example, classification of breast cancers based on the status of estrogen and progesterone receptor expression revealed a subgroup of cancers with high expression of these hormone receptors, which showed positive responses to tamoxifen and chemotherapy treatments (Fisher et al., 1983). Another example is the molecular classification of medulloblastoma (MB) into four subgroups (i.e. WNT, SHH, Groups 3 and 4), which enabled development of SMO inhibitor treatments that targeted the SHH pathway for patients in the SHH subgroup (Rudin et al., 2009), and also enabled development of de-escalating treatment plans for patients whose tumours were in the WNT subgroup, which tended to be non-metastatic and more chemotherapy-sensitive compared to those in other MB subgroups (reviewed in Juraschka and Taylor, 2019).
Genome science, through the use of next-generation sequencing technologies, led to discoveries of cancer-associated genes and oncogenic processes that had not been previously implicated in cancer. The significance of these studies, with regard to expanding our fundamental knowledge of cancer biology, is reflected in a revised list of “the next generation” hallmarks of cancers, which carried the following additions: genome instability and mutations, avoiding immune destruction, tumour-promoting inflammation and deregulation of cellular energetics (Hanahan and Weinberg, 2011).
The use of next-generation sequencing technologies has been particularly helpful in revealing extensive epigenetic dysregulation in cancers. Aberrant gene expression and genome instability associated with global hypomethylation, enhancer dysregulation, chromatin modification or loss of imprinting were identified in virtually all cancer types (reviewed in Feinberg and Tycko, 2004). In paediatric cancers, epigenetic dysregulation is particularly pronounced (Schwartzentruber et al., 2012; Mack et al., 2014). In contrast to adult cancers that accumulate somatic mutations over time (Lawrence et al., 2013), paediatric cancers harbour substantially fewer mutations and reveal an alternative path to malignancy through dysregulation of gene expression by aberrant epigenetic changes. Distinct mutation profiles reflect this important difference in underlying oncogenic mechanisms between paediatric and adult cancers. For example, analyses of whole genomes and exomes of 24 paediatric cancer types (n = 879 cases) revealed that among the 77 most significantly mutated genes identified, 25% of them (in 23 out of 24 cancer types) were involved in epigenetic modulation (ICGC, 2018). In contrast, the top three functional categories of significantly mutated genes in TCGA adult cancers were MAPK, PI(3)K and TGF-β signalling pathways (Kandoth et al., 2013).
Another striking example of epigenetic dysregulation driving paediatric cancers is the observation of non-synonymous SNVs in H3F3A, almost exclusively in paediatric glioblastomas, that result in disruption of post-translational modification of the K27 or G34 residue in the histone H3 tail, both of which are involved in epigenetic regulation of transcription (Schwartzentruber et al., 2012). In contrast, mutations in classic oncogenes and TSGs, such as CDKN2A deletion and EGFR amplification, as well as frequent copy number alterations, are mainly observed in adult glioblastomas (Sturm et al., 2012).
In addition to epigenetic dysregulation, paediatric cancers also exhibit dysregulated developmental processes (Li et al., 2005; Han et al., 2016; Selvadurai et al., 2020). The notion of a cancer cell of origin arising from stem or progenitor cell types during disrupted developmental processes is further supported by a study of paediatric rhabdoid tumours, which reported RTs arising from oncogenic perturbation during specific developmental time windows. The study made use of a conditional Smarcb1 and Nf2 knockout mouse model, and showed that Smarcb1 ablation in the early neural crest lineage resulted in development of tumours that resembled human RTs, while Smarcb1 loss was not tumorigenic in Schwann cells that differentiated from neural crest cells (Vitte et al., 2017). The study also reported that Nf2-inactivated tumours did not exhibit increased cell proliferation with additional loss of Smarcb1. These observations indicate that the consequences of mutations are context-dependent, and that driver alterations may require certain cellular or epigenetic contexts in order for oncogenic transformation to occur.
Genome science studies that profile whole genomes, transcriptomes and epigenomes of cancers using second-generation sequencing technologies can reveal aspects of genome-wide molecular characteristics that arise as consequences of driver alterations, and may also allude to genetic, gene expression or epigenetic contexts associated with oncogenic transformation.
1.5 The SWI/SNF complex in cancers
Among epigenetic regulators, the SWItch/Sucrose Non-Fermenting (SWI/SNF) complex is of particular importance in cancer biology, as it is the most frequently mutated protein complex in human cancers (19%; Kadoch et al., 2013; Shain and Pollack, 2013). The SWI/SNF complex is an evolutionarily conserved chromatin-remodelling complex that uses the energy of ATP hydrolysis to mobilize nucleosomes and remodel chromatin (Phelan et al., 1999). Based on the constituent subunits, the SWI/SNF complex is categorized into three classes, namely the canonical BRG1/BRM-Associated Factor (BAF), Polybromo-associated BAF (PBAF) and non-canonical BAF (ncBAF) complexes (Mashtalir et al., 2018). In humans, SWI/SNF is composed of 10-15 subunits out of the 29 subunits that have been identified thus far (Arnaud, Le Loarer and Tirode, 2018), and consists of one ATPase (SMARCA2/BRM or SMARCA4/BRG1), three core subunits (SMARCB1/BAF47/INI1/SNF5, SMARCC1/BAF155 and SMARCC2/BAF170) and class-specific subunits (ARID1A/BAF250a or ARID1B/BAF250b for BAF complexes; PBRM1/BAF180 and ARID2/BAF200 for PBAF complexes). In addition to these core subunits, there are up to 15 auxiliary subunits characterized thus far, e.g. SMARCE1/BAF57, BRD9, BCL7, ACTL6/BAF53 and SYT/SS18. Biochemical studies using subunit-specific antibodies showed that SWI/SNF subunits freely combine with one another in cultured cells, predicting nearly 300 possible assemblies of the complex (Wang et al., 1996; Wu, Lessard and Crabtree, 2009).
Diverse forms of the SWI/SNF complex with different combinations of auxiliary subunits are thought to be associated with diverse functions of the complex, involved in various cellular and developmental contexts. For example, BAF complexes were shown to change auxiliary subunits during mouse development, resulting in different compositions of BAF complexes in pluripotent embryonic stem cells (also known as esBAF), multipotent neural stem cells (npBAF), postmitotic neurons (nBAF), cardiac progenitor cells (cBAF) and hematopoietic stem cells (Wu, Lessard and Crabtree, 2009; Kadoch and Crabtree, 2015). These observations further imply that auxiliary subunits are combined to affect specificity of target genes and to regulate specific biological functions (Mohrmann and Verrijzer, 2005; Son and Crabtree, 2014). SWI/SNF complexes are involved in a broad range of biological processes. As a chromatin remodeler, the complex plays a key role in regulating gene expression by altering nucleosome occupancy at gene promoters, leading to either transcriptional activation or repression (Kia et al., 2008). Recently, SWI/SNF has been shown to play important roles in modulation of enhancers and other epigenetic regulators to provide a proper balance between cellular differentiation and the maintenance of stem cell-like properties such as self-renewal (Kadoch, Copeland and Keilhack, 2016; Lu and Allis, 2017). The loss of ARID1A and SMARCB1 resulted in altered enhancer targeting (Mathur et al., 2017). In particular, SMARCB1 loss caused preferential disruption in the formation of typical enhancers compared to super-enhancers, where the former are regulatory elements typically characterized by enrichments of active histone marks, particularly H3K27ac, in regions less than 1 kb in length, whereas the latter are characterized by unusually high densities and enrichments of H3K27ac marks in regions over 10 kb in length (Whyte et al., 2013; Heinz et al., 2015).
This suggested a mechanism by which differentiation regulated by typical enhancers was disrupted, while stemness regulated by super-enhancers was maintained in cancer (Wang et al., 2017; Langer, Ward and Archer, 2019). Another mechanism used by SWI/SNF to balance between differentiation and stemness appears to be through modulating repressive chromatin states mediated by the Polycomb Repressive Complexes (PRC). SWI/SNF has been shown to antagonize activities of the PRC that mediate chromatin compaction and methylation at the lysine 27 residue of histone 3 proteins (H3K27), which lead to transcriptional repression. SWI/SNF was shown to directly evict PRC1 (Kia et al., 2008; Kadoch et al., 2017; Stanton et al., 2017). In particular, loss of SMARCA4 resulted in disruption of PRC1 eviction, leading to genome-wide increases in the abundance of PRC1 and PRC2, and increased genome-wide abundance of H3K27me3 marks (Stanton et al., 2017). Furthermore, SMARCB1 antagonizes activities of EZH2, a catalytic subunit of PRC2 (Wilson et al., 2010). Corroborating the observations of an antagonistic relationship between SWI/SNF and PRC complexes, in vitro assays showed that SWI/SNF-defective cancer cell lines with mutations in ARID1A, PBRM1 or SMARCA4 were sensitive to EZH2 inhibition (Kim et al., 2015). RT cell lines and mouse xenografts with SMARCB1 loss also showed increased sensitivity to treatments of an EZH2 inhibitor that blocks the binding of the substrate required for methylation, S-adenosylmethionine (SAM; Knutson et al., 2012, 2013). Crucial roles of SWI/SNF in regulating differentiation have been further demonstrated in multiple cellular contexts, including development of the neural crest (Chandler and Magnuson, 2016), post-mitotic neurons (Lessard et al., 2007), mesenchymal cells (Sinha et al., 2020) and T cells (Gebuhr et al., 2003). 
Diverse functions of the SWI/SNF complex extend to DNA repair and maintenance of genome stability, shown by the recruitment of SWI/SNF to sites of double-stranded breaks, disruption of p53 acetylation and defects in sister chromatid cohesion upon PBRM1 loss, and interaction of ARID1A with MMR proteins and increased mutability upon ARID1A loss (Brownlee, Meisenberg and Downs, 2015; Shen et al., 2015, 2018; Cai et al., 2019). Dysregulation of SWI/SNF in cancers was first discovered from identifying biallelic inactivation of the SMARCB1 gene in MRT cell lines (Versteege et al., 1998). It is now known that nearly 20% of human cancers harbour mutations in genes that encode SWI/SNF subunits, a pan-cancer frequency approaching that of TP53 mutations, which is 26% (Kadoch et al., 2013; Shain and Pollack, 2013). Among SWI/SNF subunits, a subset is particularly frequently mutated in specific cancer types. SMARCB1 loss is observed in nearly 100% of RTs and epithelioid sarcomas (Sullivan et al., 2013), renal medullary carcinomas (Msaouel et al., 2020) and undifferentiated atypical chordomas with notochordal differentiation (Mobley et al., 2010; Hasselblatt et al., 2016). Synovial sarcomas, for which a SS18-SSX translocation is pathognomonic, are also driven by SMARCB1-deficient SWI/SNF, as the SS18-SSX fusion protein has been shown to mediate the expulsion of SMARCB1 from the SWI/SNF complex (Kadoch and Crabtree, 2013). Other highly recurrently mutated genes that encode SWI/SNF subunits in cancers include SMARCA2 and SMARCA4, which harbour loss-of-function mutations in nearly 100% of small cell carcinomas of the ovary, hypercalcemic type (SCCOHT; Jelinic et al., 2014; Ramos et al., 2014); ARID1A, which is mutated in nearly 50% of ovarian clear cell carcinomas (Jones et al., 2010; Wiegand et al., 2017); and PBRM1, which is mutated in approximately 40% of clear cell renal cell carcinomas (Varela et al., 2011).
ARID1A mutations are also frequently found across other adult solid cancer types such as endometrioid, bladder, gastric, colorectal and lung carcinomas (Arnaud, Le Loarer and Tirode, 2018). These subunits have been shown to play important roles in chromatin remodelling activities (e.g. SMARCA2 and SMARCA4 have the ATPase and helicase domains, enabling nucleosome mobilization from ATP hydrolysis; SMARCB1 is involved in target specificity, remodelling efficiency and PRC antagonism), or to be involved in biological processes whose dysregulation contributes to the hallmarks of cancers such as DNA repair (e.g. ARID1A facilitates DNA double-strand break end processing by sustaining the activation of ATR, which controls the DNA damage response; Brownlee, Meisenberg and Downs, 2015; Shen et al., 2015) and maintenance of genome stability (e.g. PBRM1 regulates p53; Cai et al., 2019).
1.6 Rhabdoid Tumours
Rhabdoid tumours (RTs) are highly malignant paediatric solid cancers with an overall 4-year survival rate of 23% (Tomlinson et al., 2005) and with no standard therapy yet available (Brennan, Stiller and Bourdeaut, 2013). RTs can occur throughout the body, and are commonly found in the central nervous system (also known as atypical teratoid RTs, or “ATRTs”) and extra-cranial sites (malignant rhabdoid tumours, or “MRTs”), including kidneys (RT of the kidneys, or “RTKs”) and soft tissues. RTs are rare cancers, with an annual incidence rate of approximately 0.5 per million worldwide (Brennan, Stiller and Bourdeaut, 2013; Heck et al., 2013), predominantly affecting infants under 1 year of age (median = 11 months; Tomlinson et al., 2005).
Within the paediatric population, clinical burdens of RTs are significant, with cranial ATRTs accounting for 10-15% of CNS cancers (Packer et al., 2002), and extra-cranial MRTs accounting for 18% of renal cancers, 14% of soft tissue cancers and 9% of liver cancers (Brennan, Stiller and Bourdeaut, 2013). Previous studies have indicated some degrees of clinical and demographic variation, demonstrated by a few remarkable long-term survivors (Ravindra et al., 2002; Hirth et al., 2003; Ammerlaan et al., 2008). Age and tumour stage at diagnosis have both been associated with outcomes (Tekautz et al., 2005; Tomlinson et al., 2005). RTs result primarily from loss-of-function driver alterations in SMARCB1 (>95% of cases), and, rarely, from loss-of-function alterations in SMARCA4 (Versteege et al., 1998; Biegel et al., 2002). Previous studies have shown that RTs exhibit largely diploid genomes with very few alterations in DNA sequences, chromosomal copy numbers, or structural alterations, such as translocations (McKenna et al., 2008; Jackson et al., 2009; Lee et al., 2012; Hasselblatt et al., 2013). To understand oncogenic processes driven by SMARCB1 loss, previous studies investigated the effects of SMARCB1 re-introduction in SMARCB1-deficient RT cell lines. In these studies, SMARCB1 re-introduction resulted in the arrest of the cell cycle at the G0-G1 phase and was associated with transcriptional repression of Cyclin D1, which was mediated by the recruitment of histone deacetylases (HDACs) at the CCND1 gene promoter. Also, SMARCB1 re-introduction resulted in the activation of P16INK4A by evicting PRC at the CDKN2A gene promoter, leading to hypophosphorylation of RB1 (Betz et al., 2002; Versteege et al., 2002; Kia et al., 2008; Kim and Roberts, 2014). SMARCB1 has been shown to regulate other known cancer genes and pathways.
For example, SMARCB1 directly binds to the promoters of the GLI and PTCH1 genes and represses their expression, and SMARCB1 loss was shown to aberrantly activate SHH signalling pathways in MRT cell lines and tumour tissues (Jagani et al., 2010). SMARCB1 also directly binds to the promoters of the AURKA (encodes Aurora kinase A) and PLK1 (encodes Polo-like kinase 1) oncogenes and represses their expression (Lee et al., 2011). Studies of primary RT samples also reported on other biological processes in SMARCB1-deficient MRT cell lines and tumour tissues, including dysregulation of early developmental processes such as myogenesis (Pomeroy et al., 2002) and aberrant expression of stem cell-associated transcription factors (Venneti et al., 2011). While studies such as these reported molecular aberrations associated with SMARCB1 loss that is common among RTs, other studies have reported evidence of inter-tumour heterogeneity across RTs. For example, a spectrum of histological characteristics of RTs even within a single anatomical site was observed (n = 33 MRT and 11 ATRT cases; Kohashi et al., 2016). Also, gene expression (n = 20 ATRT cases, Birks et al., 2013; n = 10 RTK and 13 ATRT cases, Grupenmacher et al., 2013) and DNA methylation profiles (n = 259 ATRT cases, Torchia et al., 2015) that distinguished extra-cranial MRTs from cranial ATRTs were reported, making it unclear whether findings from MRTs can be applied to ATRTs, and vice versa. Furthermore, the studies that compared MRTs and ATRTs analyzed small sample cohorts, and all previous RT studies used targeted approaches that profiled sub-genomic regions (i.e. microarrays or exome-seq), and often focused on ATRTs only. There was a lack of knowledge of genome-wide mutation, gene expression and epigenetic characteristics of RTs from different anatomic regions, especially those of extra-cranial MRTs.
I was motivated to address this knowledge gap in the field, and carried out this thesis research to characterize multi-omic landscapes of RTs.
1.7 Thesis roadmap and chapter summaries
The objective of my thesis research is to determine the extent to which genetic, gene expression and epigenetic features are similar or different between extra-cranial MRTs and cranial ATRTs, and to determine the existence of RT subgroups that are characterized by subgroup-specific molecular features. To achieve this objective, I aimed to comprehensively characterize molecular landscapes of RTs by performing integrative analyses of multi-omic datasets from primary RT samples from different anatomical sites. The multi-omic datasets consist of whole genome, transcriptome, methylome and genome-wide histone modification ChIP-sequencing data generated using the Illumina sequencing technology. The overarching hypotheses of my thesis research are the following: First, I hypothesized that the use of second-generation sequencing technologies would identify mutations, aberrant gene expression and epigenetic features that were previously unknown in RTs. Second, I hypothesized that despite uniformly harbouring SMARCB1 loss, RTs are molecularly heterogeneous, consisting of molecular subgroups with distinct biological characteristics regardless of anatomical sites of occurrence. Third, I hypothesized that characterization of RT subgroups would enable identification of molecular features that have subgroup-specific therapeutic implications. Chapter 2 describes multi-omic data analyses of 40 primary extra-cranial MRT cases. The goal of the research described in Chapter 2 was to determine the extent of molecular heterogeneity in extra-cranial MRTs, and to characterize representative molecular landscapes of MRTs.
Key findings included identification of two gene expression subgroups, one of which exhibited characteristics similar to ATRTs while the other resembled RTKs, and the identification of dysregulated developmental pathways, particularly those involving homeobox genes and regulators of neural crest development. Chapter 3 describes comparative multi-omic analyses of cranial ATRTs and extra-cranial MRTs using the largest RT cohort analyzed to date (n = 301 cases). The goal of the research described in Chapter 3 was to determine the extent of molecular heterogeneity across RTs from multiple anatomical sites, to identify molecular subgroups, and to characterize distinct subgroup features including those that may have clinical implications. Key findings included identification of consensus DNA-methylation subgroups regardless of anatomical sites of occurrence, molecular similarities between MRTs and the MYC subgroup of ATRTs (ATRT-MYC), and increased levels of immune cell infiltration and immune checkpoint protein expression in subgroups consisting of ATRT-MYC and extra-cranial MRTs, supporting the potential application of immunotherapy for a subset of RT patients. Finally, Chapter 4 provides the overall conclusion of this thesis research and describes limitations of this work and suggestions for future directions and experiments to further our understanding of this clinically devastating, yet scientifically tantalizing cancer.
CHAPTER 2. Characterization of genomic, epigenomic and gene expression landscapes of extra-cranial malignant rhabdoid tumours²
2.1 Introduction
Extra-cranial MRTs and cranial ATRTs are predominantly driven by SMARCB1 loss.
However, previous studies have alluded to clinical, histological and gene expression heterogeneity within and across MRTs and ATRTs, reporting on the existence of a few long-term survivors (Ravindra et al., 2002; Hirth et al., 2003; Ammerlaan et al., 2008), the correlation of patient outcome with age at diagnosis or with tumour stage (Tekautz et al., 2005; Tomlinson et al., 2005) and a spectrum of histological features of RTs even within a single anatomical site (Kohashi et al., 2016). Specifically in ATRTs, Torchia et al. (2015) reported the existence of two gene expression subgroups and correlated these with survival characteristics. However, the existence of molecularly distinguishable subgroups had not been reported in MRTs. Moreover, the extent of overlap in molecular signatures between ATRTs and MRTs had not been fully elucidated, with inconsistent reports in the literature (Pomeroy et al., 2002; Grupenmacher et al., 2013), which reinforced the need to comprehensively analyze genome-wide mutation, gene expression and epigenetic profiles of MRTs and determine the extent of inter-tumoural heterogeneity within MRTs. Motivated by this need, I aimed to characterize the whole genome, transcriptome and epigenome profiles of MRTs through the application of second-generation sequencing technologies. A set of 40 MRT cases was collected and characterized as part of the Therapeutically Applicable Research To Generate Effective Treatments (TARGET) paediatric cancer research initiative (https://ocg.cancer.gov/programs/target). The research goals of the TARGET initiative are equivalent to those of The Cancer Genome Atlas (TCGA) for adult cancers, which are to apply genomics approaches to identify and characterize molecular alterations in cancers in the hope of developing better therapeutics. All data generated for this study are deposited in and accessible through NCBI dbGaP (accession number phs000470). Additional data are available at https://ocg.cancer.gov/programs/target/data-matrix. Complete sample information with clinical details for the 40 samples characterized in this study is available in the Supplemental Table 2.1.
² A version of this Chapter has been published, and the author contributions are provided in the Preface as per the University of British Columbia PhD thesis guideline: H-J. E. Chun et al. Genome-wide profiles of extra-cranial malignant rhabdoid tumors reveal heterogeneity and dysregulated developmental pathways. Cancer Cell (2016) 29(3): 394-406. https://doi.org/10.1016/j.ccell.2016.02.009. Copyright by Elsevier Inc.
2.2 Results
2.2.1 Samples and data analyzed for the study
To facilitate comparisons across extra-cranial MRTs, my colleagues at Canada’s Michael Smith Genome Sciences Centre (BCGSC) applied whole genome sequencing (WGS), RNA-Seq, ChIP-Seq targeting H3K27ac and H3K27me3 and whole genome bisulfite-Seq (WGBS) on 40 treatment-naïve primary MRT cases, three MRT cell lines (G401, KP-MRT-RY, KP-MRT-NS; Peebles, Trisch and Papageorge, 1978; Garvin et al., 1993) and three human embryonic stem cell (hESC) lines (H7, H9, H14; Thomson et al., 1998). The median tumour purity was estimated to be 88% based on heterozygous variant allele frequencies using APOLLOH (Ha et al., 2012; range 42.8% to 95.0%).
2.2.2 Whole genome landscapes of MRT
To produce a reference somatic mutation landscape for MRTs, I analyzed whole genome sequence data from 40 pairs of primary tumour and matched normal samples to identify somatic single nucleotide variants (SNVs), small insertion and deletion mutations (InDels), regions with loss of heterozygosity (LOH) and structural alterations including copy number alterations (CNA) and chromosomal rearrangements.
To identify high-confidence putative somatic SNVs and InDels, I selected somatic variants identified by both of two algorithms, Strelka (Saunders et al., 2012) and SAMtools mpileup (Li et al., 2009). This approach identified a total of 26,471 high-confidence SNVs genome-wide. To assess the level of true positive variants using my approach, I compared SNVs identified using WGS data to SNVs identified in matching RNA-Seq datasets using SNVMix2 (Goya et al., 2010). A total of 255 SNVs were common between the RNA-Seq and the whole genome datasets. Of these, 243 (95.3%) were identified by both Strelka and mpileup, and I took this proportion as an estimate of the accuracy of using mpileup and Strelka jointly to identify SNVs genome-wide. I next analyzed mutations in SMARCB1, the known driver gene in RTs, across 40 tumour samples. As expected, my analysis identified loss-of-function mutations, copy number losses and somatic LOH affecting SMARCB1 in 39 of 40 cases. The single case lacking a SMARCB1 alteration harboured a germline deletion of one base and somatic LOH in SMARCA4 (Figure 2.1).
Figure 2.1 Mutation profiles of 40 MRT cases. The figure shows selected genetic alterations identified from whole genome analyses (top), the number and types of somatic SNVs genome-wide (middle; median of 612.5 mutations indicated by a dashed line) and clinical characteristics of 40 MRT cases (bottom). Each column represents a case.
Previous studies, using exome-sequencing data, reported low mutation prevalence in RTs (Lee et al., 2012; Lawrence et al., 2013; Vogelstein et al., 2013). To determine whether these observations were consistent when entire genomes were analyzed, I calculated somatic mutation prevalence based on high-confidence somatic sequence alterations identified using whole genome sequence data (n = 28,918 substitutions and InDels in total) from 40 MRT tumour-normal pairs.
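The caller-concordance estimate reported above (243 of 255 RNA-Seq-supported SNVs called by both algorithms) can be illustrated with a small set-based sketch. This is not the pipeline used in this study: the function name, the tuple representation of a variant and the toy variants are hypothetical, and the sketch assumes that the "whole genome dataset" is the union of the two callers' outputs.

```python
def joint_call_concordance(strelka, mpileup, rna_seq):
    '''Fraction of RNA-Seq-supported WGS calls that both callers report.

    Each argument is an iterable of (chromosome, position, ref, alt) tuples.
    Assumption: the WGS call set is the union of the two callers' calls.
    '''
    strelka, mpileup, rna_seq = set(strelka), set(mpileup), set(rna_seq)
    supported = rna_seq & (strelka | mpileup)  # WGS calls also seen in RNA-Seq
    if not supported:
        raise ValueError('no WGS calls are supported by RNA-Seq')
    joint = supported & strelka & mpileup      # subset reported by both callers
    return len(joint), len(supported), len(joint) / len(supported)


# Toy variants (invented for illustration only).
strelka = {('chr1', 100, 'C', 'T'), ('chr2', 200, 'G', 'A'), ('chr3', 300, 'T', 'G')}
mpileup = {('chr1', 100, 'C', 'T'), ('chr2', 200, 'G', 'A'), ('chr4', 400, 'A', 'C')}
rna_seq = {('chr1', 100, 'C', 'T'), ('chr2', 200, 'G', 'A'),
           ('chr3', 300, 'T', 'G'), ('chr9', 900, 'C', 'G')}

n_joint, n_supported, fraction = joint_call_concordance(strelka, mpileup, rna_seq)
print(f'{n_joint} of {n_supported} supported calls are joint calls ({fraction:.1%})')

# The proportion reported in the text: 243 of 255.
print(f'{243 / 255:.1%}')
```

Requiring agreement between two callers trades sensitivity for specificity, so a proportion of this kind says nothing about variants that both callers miss.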
The median somatic mutation rate was 0.254 mutations per Mb (a median of 612.5 mutations per case), which was in agreement with the previous reports. I thus confirmed that MRTs exhibited one of the lowest tumour mutation burdens among human cancer types (in comparison to median somatic mutation rates that range between 1.5 and 13.2 mutations per Mb in other paediatric and adult cancer types (n = 27); Lawrence et al., 2013). Somatic SNV rates showed a significant positive correlation with chronological age at diagnosis (Spearman rho = 0.74, p-value = 2.78e-07), a similar observation also made in paediatric medulloblastoma genomes (Jones et al., 2012; Figure 2.2).
Figure 2.2 Correlation between somatic SNV rates and age at diagnosis in MRTs. Age at diagnosis (x-axis) and the number of somatic SNVs (y-axis) in MRT cases are shown. The Spearman correlation coefficient between the age and the number of SNVs was 0.74 (p-value = 2.78e-07).
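For readers unfamiliar with the statistic, the Spearman coefficient quoted above is the Pearson correlation of the ranks of the two variables. The following is a minimal sketch using only the Python standard library; the helper names and the five toy data points are hypothetical and are not data from this cohort, and no p-value is computed.

```python
def average_ranks(values):
    '''1-based ranks, with tied values sharing the mean of their ranks.'''
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    '''Spearman correlation: Pearson correlation of the rank vectors.'''
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (invented): age at diagnosis in days, somatic SNV count.
age_days = [30, 120, 330, 900, 5400]
snv_counts = [410, 520, 480, 900, 1400]
rho = spearman_rho(age_days, snv_counts)
print(round(rho, 3))
```

Because only ranks are used, a single very old patient with many mutations cannot dominate the coefficient the way it could with a Pearson correlation on the raw values.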
To determine whether there was any mutation signature (Nik-Zainal et al., 2012; Alexandrov et al., 2013) present in RT genomes, I analyzed somatic SNVs together with the bases immediately flanking them on the 5’ and 3’ sides, and compared the proportions of the 96 substitution types defined in this trinucleotide context. One case, TARGET-52-PADYZI, exhibited an increase in the proportion of T>G transversions, and, along with another case, TARGET-52-PAUEKW, exhibited the highest number of somatic mutations (Figure 2.1). Alterations in DNA repair genes were not observed in either case. The mutation spectrum of TARGET-52-PADYZI was most strongly correlated with mutation signatures 17 and 9 described in Alexandrov et al., both of which are characterized by predominant T>G mutations (Pearson correlation coefficients = 0.962 and 0.607, respectively; Figure 2.3A). The mutation spectrum of TARGET-52-PAUEKW was most strongly correlated with signatures 1A and 1B (Pearson correlation coefficients = 0.877 and 0.925, respectively), characterized by C>T mutations in the context of NpCpG trinucleotides that are presumed to arise from spontaneous deamination of 5-methyl-cytosine (Alexandrov et al., 2013; Figure 2.3B). TARGET-52-PAUEKW was the oldest patient in the cohort (14.8 years old; median age of the cohort = 11.1 months), which is consistent with the potential association of signatures 1A and 1B with age. Apart from these two cases, there was no evidence for the existence of a specific mutational process in RT genomes, as similar spectra of proportions of substitution types were observed across the cohort. Having assessed the presence of dominant mutational signatures, I next investigated whether RT genomes harboured localized somatic hypermutation, a phenomenon referred to as kataegis, which was first reported in breast cancers (Nik-Zainal et al., 2012).
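The 96-class tabulation used in the signature analysis above can be sketched as follows. Each SNV is reported with the pyrimidine (C or T) as the reference base, so a purine-reference SNV is converted to its reverse-complement strand, giving 6 substitutions x 4 five-prime bases x 4 three-prime bases = 96 classes. This is an illustration under stated assumptions, not the code used in this study: the function names and toy SNVs are hypothetical, and the comparison against published signature vectors is not shown.

```python
from collections import Counter

COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}

# Labels follow the 5'(ref>alt)3' style, e.g. 'A(C>T)G'.
CLASSES = [
    f'{five}({ref}>{alt}){three}'
    for ref, alts in (('C', 'AGT'), ('T', 'ACG'))
    for alt in alts
    for five in 'ACGT'
    for three in 'ACGT'
]


def substitution_class(five, ref, alt, three):
    '''Return the pyrimidine-centred class label for one SNV.'''
    if ref in 'AG':
        # Purine reference: describe the same event on the opposite strand.
        five, ref, alt, three = (COMPLEMENT[three], COMPLEMENT[ref],
                                 COMPLEMENT[alt], COMPLEMENT[five])
    return f'{five}({ref}>{alt}){three}'


def spectrum(snvs):
    '''Fractions of SNVs in each of the 96 classes, in CLASSES order.'''
    counts = Counter(substitution_class(*snv) for snv in snvs)
    total = sum(counts.values())
    return [counts[label] / total for label in CLASSES]


def substitution_fraction(spec, change):
    '''Summed fraction for one of the six changes, e.g. 'T>G'.'''
    return sum(f for label, f in zip(CLASSES, spec) if f'({change})' in label)


# Toy SNVs (invented): (5' base, reference base, alternate base, 3' base).
toy_snvs = [('A', 'T', 'G', 'A'), ('C', 'A', 'C', 'G'),
            ('G', 'C', 'T', 'T'), ('T', 'T', 'G', 'C')]
toy_spectrum = spectrum(toy_snvs)
print(substitution_fraction(toy_spectrum, 'T>G'))
```

A spectrum vector of this kind is what would be correlated against each reference signature to obtain coefficients such as those quoted in the text.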
To identify regions of localized somatic hypermutation, I analyzed the distances between adjacent somatic SNVs. Evaluation of inter-mutational distances did not reveal obvious genomic regions with clusters of mutations (Figure 2.4), indicating that RT genomes did not undergo kataegis, and further supporting the notion that RT genomes did not undergo any specific mutational processes.
Figure 2.3 Two MRT cases, PADYZI and PAUEKW, exhibited distinct somatic substitution mutation patterns of increased proportions of T>G transversions and C>T transitions, respectively, which were correlated with mutation signatures 17 and 9 (PADYZI) and signatures 1A/B (PAUEKW). Bar plots show fractions of somatic substitutions in 96 trinucleotide classes in PADYZI (A) and PAUEKW (B). The fraction of T>G in PADYZI is 0.45 (median of all cases = 0.094). A mutated base is indicated within round brackets (“()”), flanked by the bases immediately adjacent, i.e. 5’ and 3’, to the mutated base. Colours of the bars indicate six classes of single base substitution mutations.
Figure 2.4 Analysis of inter-mutational distances showed that MRT genomes did not undergo kataegis. A rainfall plot shows distances (in bp) between two somatic SNVs in a genome of an MRT case, PASADZ, which exhibits a representative profile of inter-mutational distances in RT genomes. Genomes affected by kataegis would contain regions of somatic hypermutations, which would be indicated by the existence of clusters of dots with shorter inter-mutational distances (< 100 bp, as shown in breast cancers (Nik-Zainal et al., 2012)), resembling the appearance of a “rainfall” in the plot. Mutations were ordered linearly on the x-axis from chromosome 1 to 22. Colours indicate six substitution classes.
Previous studies applied targeted approaches to identify mutations in the coding regions of RT genomes (Lee et al., 2012; Torchia et al., 2015), and reported that SMARCB1 was the only recurrently mutated gene in RTs. To test my hypothesis that there might be unrecognized recurrent alterations in both coding and non-coding regions in the RT genomes, I analyzed whole genome sequence data and identified 204 somatic non-synonymous amino acid substitution mutations genome-wide (median of five mutations per case, ranging from 0 to 18).
Aside from SMARCB1, DSCAM was the only recurrently mutated gene, with two cases exhibiting heterozygous somatic non-synonymous substitution mutations (p.Val424Ile and p.Ser1354Thr). These residues are conserved across 92% and 100% of vertebrate species (n = 100; UCSC genome browser) within the Ig-like C2-type 5 and 10 domains, respectively. DSCAM encodes a neural cell adhesion molecule involved in nervous system development (Yamakawa et al., 1998). Somatic DSCAM alterations were detected at low frequencies (approximately 10% of cases) in prostate, uterine, colorectal and lung cancers in TCGA datasets (reported via cBioPortal; Cerami et al., 2012; Gao et al., 2014). However, the role of DSCAM alterations in malignancy remains unclear. Among genes that harboured non-synonymous SNVs in single cases, six were involved in epigenetic regulation (KMT2D/MLL2, MBD1, ZBTB39, BRD1, PHF21B and SUZ12; Figure 2.1), indicating that epigenetic dysregulation may occur in addition to SMARCB1 loss in some MRT cases. Out of 28,918 somatic alterations found across RT genomes, the majority (97.1%) were found in non-coding regions. Previous studies reported significantly mutated non-coding regions in multiple cancer types, such as recurrently mutated introns, UTRs and intergenic regions in multiple myelomas (Chapman et al., 2011), and hotspot mutations in the promoter regions of TERT, SDHD, PLEKHS1 or WDR74 that have been shown to affect gene expression in multiple cancer types (Horn et al., 2013; Vinagre et al., 2013; Weinhold et al., 2014). To identify such non-coding mutations that may have functional impact, I sought candidate regulatory loci by identifying mutation clusters in the genome, which exhibited significantly higher frequencies of somatic alterations compared to random chance based on the background mutation rate.
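The kind of test described above, asking whether a region carries more somatic mutations than expected from a background rate and then correcting across regions, can be sketched with a one-sided binomial test followed by Benjamini-Hochberg (BH) adjustment. This is a simplified illustration and not the exact procedure of this study: it assumes each base is an independent trial with a single uniform background rate, and the region names, lengths, counts and the rate itself are invented.

```python
import math


def binom_log_pmf(k, n, p):
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k, n, p):
    '''P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.'''
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        term = math.exp(binom_log_pmf(i, n, p))
        total += term
        # Past the mean the terms shrink; stop once they no longer matter.
        if i > n * p and term < total * 1e-16:
            break
    return min(total, 1.0)


def benjamini_hochberg(p_values):
    '''BH-adjusted p-values, returned in the input order.'''
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: region -> (length in bp, observed somatic mutations).
background_rate = 1e-6  # assumed mutations per base, for illustration only
introns = {'GENE_A': (500_000, 12), 'GENE_B': (200_000, 0), 'GENE_C': (1_000_000, 2)}

genes = list(introns)
p_values = [binom_upper_tail(introns[g][1], introns[g][0], background_rate) for g in genes]
adjusted = dict(zip(genes, benjamini_hochberg(p_values)))
significant = [g for g in genes if adjusted[g] < 0.05]
print(significant)
```

In practice the background rate would need to account for cohort size and for regional variation in mutability, neither of which this sketch models.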
This analysis revealed clusters of somatic single nucleotide alterations in the introns of 97 genes (BH-corrected binomial p-value < 0.05), including LRRC4C, ROBO2 and EPHA5, which are involved in axon guidance. ROBO2 and EPHA5 were reported to be recurrently altered in pancreatic cancers (Biankin et al., 2012); ROBO2 was altered by LOH, copy number alteration (predominantly deletion) and non-synonymous substitution mutations, while EPHA5 copy number variants (predominantly amplifications) were observed. When I compared all 19 cases with ROBO2 intronic mutations to the remaining cases that lacked ROBO2 mutations, I observed significant decreases in ROBO2 gene expression levels in cases with mutations (Figure 2.5; BH-corrected Wilcoxon Mann-Whitney U p-value = 0.00052). Neither ROBO2 mutations nor axon guidance-related genes have previously been linked to extra-cranial MRT development. To further identify potentially functional non-coding mutations, I focused my analysis on non-coding mutations within introns and regions 2 kb upstream of transcription start sites to identify genes with somatic alterations in these regions that correlated with gene expression changes. I identified SPECC1L intron mutations in six cases and KCNJ3 intron mutations in eight cases (Figures 2.6A and 2.6B). There were significant differences in mean gene expression levels between mutated and non-mutated cases for SPECC1L and KCNJ3 (BH-corrected Wilcoxon p values = 1.82e-10 and 2.30e-08, respectively), with SPECC1L mutations associated with decreased gene expression (log2 fold change = -1.45; Figure 2.7A) and KCNJ3 mutations associated with increased gene expression (log2 fold change = 1.20; Figure 2.7B). SPECC1L encodes cytospin-A, a key protein that stabilizes microtubules required for cell migration-related functions, and is specifically involved in migration of cranial neural crest cells during craniofacial development in embryos (Wilson et al., 2016).
SPECC1L alterations have been linked to phenotypes that are consistent with disruption of cranial neural crest cell migration, impacting facial structures in human patients (Saadi et al., 2011) and in animal models (Gfrerer et al., 2014). KCNJ3 encodes a membrane protein that forms potassium channels. Over-expression of KCNJ3 was associated with increased self-renewal capacity of the MCF-7 breast cancer cell line, and was also associated with poor outcomes in breast cancer patients (Kammerer et al., 2016).

Figure 2.5 An axon-guidance regulating gene, ROBO2, exhibits a significantly higher frequency of somatic mutations in introns compared to the background mutation rate in MRT genomes, a feature not observed in paediatric AML genomes. Cases with somatic intronic mutations exhibit decreased ROBO2 expression. Somatic mutations in ROBO2 were identified from WGS analysis using Strelka and mpileup in MRT (A; n = 40 pairs of tumour and matched normal samples) and paediatric AML cases (B; n = 30 pairs). Gene models indicate exonic and intronic regions with vertical and horizontal lines, respectively, and are drawn from 5’ (left) to 3’ (right). The scale bar represents 10% of the gene length. Dot colours represent six classes of substitutions plus InDels. (C) Boxplot shows ROBO2 gene expression levels in MRT cases with mutations in ROBO2 introns (labeled as “Mutants”) and in MRT cases without mutations in the introns (labeled as “Wild-type”; BH-adjusted Wilcoxon p-value = 0.00052). A box in the plot indicates the interquartile range (IQR) with the median represented by a thick line within the box. The lines above and below the box represent ±1.5 IQR.

Figure 2.6 SPECC1L and KCNJ3 harbour somatic non-coding mutations at frequencies that are significantly higher compared to the background mutation rate.
(A) The SPECC1L gene model is represented by horizontal and vertical lines that indicate introns and exons, respectively, and is drawn from 5’ (left) to 3’ (right). The scale bar represents 10% of the gene length. Dot colours represent six classes of substitutions plus InDels. (B) The KCNJ3 gene model is represented as in (A).

Figure 2.7 Somatic non-coding mutations in SPECC1L and KCNJ3 are associated with altered gene expression levels in MRTs. The box plots show (A) SPECC1L and (B) KCNJ3 gene expression distribution in MRT cases with intronic mutations (“Mutants”) and cases without mutations (“Wild-type”). BH-adjusted Wilcoxon p-values for SPECC1L and KCNJ3 are 1.82e-10 and 2.30e-08, respectively.

Previous studies have indicated a role for SMARCB1 in genome integrity (Isakoff et al., 2005; Vries et al., 2005). To determine the extent of genome instability in RTs, I analyzed structural alterations using whole genome sequence data. I identified 11 genomic loci affected by recurrent somatic copy number deletions, and 4 loci affected by somatic copy number gains (Figure 2.8). Of the 15 loci affected by copy number alterations, nine loci were on chromosome 22, on which SMARCB1 resides. While genes with copy number gains did not exhibit significant expression increases, 111 genes with copy number deletions exhibited significant expression decreases compared to cases without copy number deletions in those regions (BH-adjusted p-value < 0.05; Supplemental Table 2.2). These genes included those altered in other cancer types such as BCR, CRKL and CABIN1, or those involved in cell differentiation and development (NF2, LIF, SPECC1L and PRAME). Almost all chromosome 22 deletions, except those that affected NF2 and LIF, affected SMARCB1 as expected, given that loss of SMARCB1 is the driver alteration in RTs (Figures 2.1 and 2.9).
In two separate cases, NF2 and LIF deletions were focal and independent of SMARCB1 deletion, indicating that these deletions are not passenger events. NF2 encodes merlin/schwannomin, a protein specifically expressed in Schwann cells in the nervous system. Merlin plays an important role in regulating cell shape, cell growth and cell adhesion, and mutations in NF2 cause neurofibromatosis type 2, a disorder characterized by vestibular schwannomas. NF2 is also frequently mutated in several cancer types including mesotheliomas, hepatic cancers, melanomas and clear cell renal cell carcinomas (Petrilli and Fernandez-Valle, 2016). LIF is involved in the induction of hematopoietic and neuronal cell differentiation, and regulation of mesenchymal to epithelial conversion during kidney development (Plisov et al., 2001). It is notable that chromatin accessibility at binding sites of STAT3, a downstream target of LIF, is coordinated by SMARCA4-containing BAF complexes (Ho et al., 2011). The convergence of LIF and SWI/SNF target genes supports the idea that alterations in LIF may impact biological processes that are also regulated by the SWI/SNF complex, and may thus plausibly be relevant to RT pathobiology. In addition to copy number alterations on chromosome 22, regions of chromosome 5q34 (in five cases), 5q31.3–5q32 (in two cases) and 7q35 (in two cases) exhibited recurrent copy number losses that have not previously been linked to MRTs (Figure 2.8). Within the 5q31.3–5q32 region, I observed significant expression decreases of TCERG1, LARS, YIPF5 and RBM27 as expected (BH-adjusted Mann-Whitney U p-value < 0.05).

Figure 2.8 A genome-wide view of regions of recurrent copy number alterations across the 40 MRT cases. Number of recurrent copy number alterations (CNAs) is represented by the height of red (amplification) and blue (deletion) bars. Genes and other features within recurrently altered regions are labeled.
Coloured text labels indicate genes with significant increases (red), decreases (blue) or no change (black) in expression levels, when cases with CNAs were compared to cases with no alteration (BH-corrected Wilcoxon p-value < 0.05).

Figure 2.9 Copy number profile of chromosome 22 in MRTs. Regions with recurrent CNAs are shown in greater detail for chromosome 22. Frequency of CNAs and gene labels are represented as in Figure 2.8.

To further determine the extent of structural alterations in RT genomes, I analyzed chromosomal rearrangements that arose from inter-chromosomal translocations, deletions or inversions, and focused my analyses on those that resulted in gene fusions involving protein-coding genes. I identified 18 somatic and 8 germline gene fusion events across the 40 RT cases (Supplemental Table 2.3). Of the 26 gene fusions, 22 involved chromosome 22 (Figure 2.10). Of these, seven fusions in six cases arose as a consequence of SMARCB1 deletions, and had either BCR, CABIN1 or SPECC1L as fusion partners. The remaining 14 fusions were derived from inter-chromosomal translocations, inversions or deletions that did not directly affect SMARCB1. Other somatic fusions involved ACVR1 (one case; derived from an inter-chromosomal translocation) and PTPRD (one case; derived from a duplication), which were not associated with significant changes in gene expression. In contrast, cases with somatic fusions in BCR (one case; from an inter-chromosomal translocation), MMP11 (one case; from a duplication) and SPECC1L (three cases, and also one additional case with germline fusions derived from deletions) exhibited loss of gene expression (RPKM < 1; median of the cohort = 5.35 RPKM).
Fusions generally involved genes that were previously linked to other cancer types or developmental processes, such as BCR (found in leukemia; Sawyers, 1999), ACVR1 (involved in BMP-mediated bone development; Shore et al., 2006) and SPECC1L (involved in craniofacial development; Wilson et al., 2016). My analyses of whole genome sequence data revealed novel structural alterations in MRTs, involving genes that were implicated in cancer biology.

Figure 2.10 Structural rearrangements in MRT genomes. Circos plot shows structural rearrangements, identified using Trans-ABySS, which involved at least one protein-coding gene in MRTs. Each line indicates genomic coordinates that are connected by a structural rearrangement event (i.e. fusions), with colours representing the potential mechanism of the event, e.g. translocations, duplications, inversions or deletions.

2.2.3 MRTs are comprised of subgroups with distinct DNA methylation profiles that correlate with age at diagnosis

Motivated by previous reports that suggested a role for the SWI/SNF complex in modifying CpG methylation (e.g. Kia et al., 2008), my collaborators from Dr. Hirst’s laboratory and I analyzed whole genome bisulfite sequence data from 40 MRT cases to characterize DNA methylation landscapes of MRTs (Methods). My collaborator and I also analyzed data from three MRT cell lines to determine the extent of DNA methylation similarities between the cell lines and primary RT samples. In addition, we obtained data from three human embryonic stem cell (hESC) lines and four neural progenitor cell samples (NPC; Roadmap Epigenomics Consortium et al., 2015) to use as normal comparators. To investigate the extent of heterogeneity across MRT methylomes, my collaborators performed unsupervised clustering of promoter CpG island (CGI) methylation, which revealed two distinct subgroups (Groups A and B; Figure 2.11).
Group A exhibited higher average methylation levels both within and outside promoter CGIs compared to Group B (Welch’s t-test p-values = 4.629e-16 and 2.2e-16, respectively; Figures 2.11 and 2.12). MRT cell lines demonstrated the highest degree of CGI methylation and clustered with Group A (Figure 2.11). In contrast, MRTs in Group B clustered with hESC and demonstrated lower overall methylation levels both within and outside CGIs. Compared to the “normal” cell types, i.e. NPCs and hESCs, promoter-associated CGIs from both MRT subgroups were found to be hypermethylated, while non-CGI regions were found to be hypomethylated (Figure 2.12), consistent with patterns generally observed in cancer epigenomes (Feinberg and Vogelstein, 1983; Toyota et al., 1999). To investigate potential clinical and biological implications of DNA methylation subgroups, I compared the subgroups against patient data, and found that Group A was significantly enriched for patients beyond the infant stage, i.e. > 1 year old, compared to Group B (Figure 2.11; one-sided Fisher’s exact p-value = 0.0013). Pathway analyses of methylated promoters unique to Group A showed significant enrichments of homeobox terms (BH-adjusted Welch’s t-test p-value < 0.05; Figure 2.13). Compared to hESCs, both MRT subgroups exhibited a significantly greater number of tumour suppressor gene (TSG) promoters (annotated in the TSGene database; Zhao, Sun and Zhao, 2013) that gained methylation (n = 859) than TSG promoters that lost methylation (n = 453; one-sided Fisher’s exact p-value = 0.02041; Figure 2.14).
The three most significant biological processes enriched for TSGs with methylated promoters in MRTs were embryonic pattern specification process (GO:0007389; BH-corrected p-value = 7.77e-10), regulation of RNA metabolic process (GO:0051252; BH-corrected p-value = 6.22e-10) and transcription regulation (GO:0006355; BH-corrected p-value = 5.60e-10). Furthermore, these TSGs included those that were previously reported to be epigenetically silenced in other cancer types, including DLEC1 in colon and gastric cancers, and lymphomas (Ying et al., 2009; Wang et al., 2012); RASSF1 in breast, lung and brain cancers (Keishi et al., 2003; Palakurthy et al., 2009); IRX1 in gastric cancers (Guo et al., 2010); TWIST2 in leukemias (Thathia et al., 2012) and TBX5 in colon cancers (Yu et al., 2010). Overall, these observations are compatible with the notion of a role for epigenetically repressed TSGs in MRT development.

Figure 2.11 Unsupervised clustering of promoter DNA methylation profiles in MRTs. Unsupervised clustering of promoter CpG island (CGI) methylation levels revealed correlation with patient age at diagnosis (Fisher’s exact test p-value = 0.0013). The heat map displays promoter CGIs with at least 15% difference in methylation (row means) between Groups A and B.

[Figure 2.11 heat map: columns are samples; annotation tracks show sex (female, male), anatomical site (kidney, soft tissue, liver, hESC), group (MRT cell line, MRT Group A, MRT Group B, hESC) and age (≥ 1 year, < 1 year); the colour scale shows % methylation from 0 to 100.]

Figure 2.12 Distributions of CpG methylation levels in MRT genomes.
Box plots show the distribution of CpG methylation levels outside (left) and within (right) CGIs for MRT cases in Group A (n = 10) and B (n = 30), MRT cell lines (n = 3), neural progenitor cell cultures (NPC; n = 4) and human embryonic stem cell (hESC) lines (n = 3).

Figure 2.13 Functional terms related to homeobox genes were significantly enriched for genes with hypermethylated promoter CGIs in Group A compared to Group B. The bar plot shows adjusted p-values and number of genes (in brackets) mapped to the most highly enriched terms from the Swiss-Prot Protein Information Resource (“SP”), InterPro (“I”), protein annotation from the Simple Modular Architecture Research Tool (SMART; “SM”), UniProt sequence annotation (“UP”), and biological processes and molecular functions from Gene Ontology (“GO-B” and “GO-M”, respectively; gray dashed line at p-value = 0.05).

Figure 2.14 Promoters of known TSGs show higher DNA methylation levels in MRTs compared to hESC. Heat map shows TSG promoter CGIs with methylation gain in MRT subgroups compared to hESC (BH-adjusted Welch’s t-test p-value < 0.05).

2.2.4 MRTs exhibit epigenetic dysregulation of homeobox genes

In addition to chromatin remodelling functions of the SWI/SNF complex, several studies have described roles for SWI/SNF in modulating histone modification profiles and thus altering chromatin states, particularly those associated with PRC2-mediated repression (Kadoch, Copeland and Keilhack, 2016) and active enhancers (Alver et al., 2017). PRC2 (Polycomb Repressive Complex 2) is involved in regulation of transcriptional repression and chromatin compaction.
The core enzymatic subunit of PRC2, namely EZH1 or EZH2, catalyzes the methylation of lysine 27 in the H3 N-terminal tail, establishing the repressive H3K27me3 histone modification mark that is associated with facultative heterochromatin and transcriptional repression. PRC2 is an important epigenetic regulator of pluripotency and cellular differentiation during development (Margueron and Reinberg, 2011). SMARCB1 has been shown to antagonize PRC2 activities by evicting PRC2 from its target gene loci (Kia et al., 2008), or by repressing EZH2 expression (Wilson et al., 2010). In primary RT samples and cell lines, SMARCB1 loss has been associated with an increase in global levels of H3K27me3 (Wilson et al., 2010). My collaborators and I thus investigated whether promoter H3K27me3 density could discriminate MRT samples from normal cell types. My collaborators performed analyses of H3K27me3 ChIP-Seq data from 35 primary MRT samples, three MRT cell lines, two hESC lines and one normal kidney sample from the TARGET cohort. For additional normal comparators, my collaborators obtained H3K27me3 ChIP-Seq data from two normal fetal brain samples and two neuronal progenitor cell (NPC) cultures from the Roadmap Epigenomics Consortium (Roadmap Epigenomics Consortium et al., 2015). Unsupervised hierarchical clustering of the 200 gene promoters (defined as regions +/- 2 kb of transcription start sites) with the most variable H3K27me3 densities distinguished MRTs from normal samples. Contrary to an expectation of increased H3K27me3 levels in the absence of SMARCB1, a majority of the most variably methylated promoters (158 out of 200; 79%) exhibited lower H3K27me3 densities within the gene promoter regions in MRTs compared to normal cell types (Figure 2.15).
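The feature-selection step that precedes such a clustering can be sketched as follows. This is a minimal illustration, assuming that variability is measured as the variance of promoter density across samples and that a promoter counts as lower in tumours when its mean tumour density falls below its mean density in normal samples; the thesis text does not state either definition, and the function names and toy matrix are hypothetical. The hierarchical clustering itself is not reproduced here.

```python
from statistics import mean, pvariance


def most_variable_features(matrix, top_n=200):
    '''matrix: dict mapping a promoter ID to its per-sample density values.'''
    ranked = sorted(matrix, key=lambda feature: pvariance(matrix[feature]),
                    reverse=True)
    return ranked[:top_n]


def fraction_lower_in_tumours(matrix, tumour_idx, normal_idx, features):
    '''Fraction of features whose mean tumour density is below the normal mean.'''
    lower = 0
    for feature in features:
        values = matrix[feature]
        tumour_mean = mean(values[i] for i in tumour_idx)
        normal_mean = mean(values[i] for i in normal_idx)
        if tumour_mean < normal_mean:
            lower += 1
    return lower / len(features)


# Toy example: four made-up promoters, two tumour columns then two normal columns.
toy = {
    'p1': [1, 1, 1, 1],
    'p2': [0, 0, 8, 8],
    'p3': [2, 3, 2, 3],
    'p4': [10, 10, 0, 0],
}
selected = most_variable_features(toy, top_n=2)
share_lower = fraction_lower_in_tumours(toy, [0, 1], [2, 3], selected)
```

With the counts reported in the text, the same ratio is 158 / 200 = 0.79.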
Functional enrichment analyses of genes with promoters that exhibited lower H3K27me3 levels compared to normal samples revealed significant enrichments (BH-corrected p-values between 1.67e-37 and 4.17e-44) in homeobox terms, pattern formation, morphogenesis and organogenesis, which were terms that appeared to be driven primarily by homeobox genes (Figure 2.16). Significant enrichments of terms related to DNA binding, transcription and transcription regulation were also noted. In contrast, analyses of genes associated with increased H3K27me3 density did not reveal a statistically significant enrichment of any functional category. Consistent with the observation of aberrant HOX gene expression in ATRTs compared to other paediatric brain cancers (Chakravadhanula et al., 2014), H3K27me3 analyses supported the notion that epigenetic dysregulation of HOX genes may be characteristic of RTs.

Figure 2.15 Profiles of H3K27me3 promoter density in MRTs are distinct from those in normal cell types. Unsupervised hierarchical clustering of H3K27me3 density at gene promoters was performed for MRT (n = 35), fetal brain (n = 2), NPC (n = 2), hESC (n = 2) and normal kidney samples (n = 1). Lower H3K27me3 densities were observed at the majority of gene promoters in MRTs compared to normal samples. Notably, an MRT case, PAKLYZ, which expressed both SMARCB1 and SMARCA4 (mRNA transcript abundance = 48.8 and 37.3 RPKM, respectively), clustered with normal cell types.
[Figure 2.15 heat map: columns are samples; annotation tracks show sex (female, male), anatomical site of MRT (kidney, soft tissue), sample type (MRT, fetal brain, NPC, hESC, normal kidney), age (≥ 1 year, < 1 year) and DNA methylation group (MRT Group A, MRT Group B); rows are grouped into Gene Clusters 1 to 3; the colour scale shows H3K27me3 promoter density from 2 to 8.]

Figure 2.16 Functional terms related to homeobox genes were enriched for genes whose promoters exhibited lower H3K27me3 levels in MRTs compared to normal samples. The bar plot shows the adjusted p-values and number of genes (in brackets) mapped to top enriched terms from InterPro (“I”) and biological processes from Gene Ontology (“GO-B”; gray dashed line at p-value = 0.05).

In addition to modulating PRC2 activities and H3K27me3 profiles, the SWI/SNF complex has been shown to modulate enhancer functions. SMARCB1 loss was associated with global reduction in H3K27ac levels, particularly impacting enhancers that maintain lineage specificity in cells (Alver et al., 2017; Wang et al., 2017). While typical enhancers are usually less than 1 kb in length (Whyte et al., 2013), super-enhancers span over 10 kb on average, and exhibit disproportionately high densities of the active H3K27ac mark and the binding of transcription activators, including the Mediator protein (MED1) and master transcription factors such as OCT4 and SOX2. Super-enhancers are thought to regulate key genes that define cell-type specificity and have emerged as important regulatory entities in cancer (Hnisz et al., 2013; Sharifnia et al., 2019).
To explore the active regulatory landscape in MRTs, my collaborators and I analyzed H3K27ac ChIP-Seq data generated from ten primary MRT samples, three MRT cell lines and three hESC lines. In addition, my collaborators obtained H3K27ac ChIP-Seq data from one normal fetal brain sample from the Roadmap Epigenomics Consortium to use as a normal comparator. To identify super-enhancers, my collaborators analyzed H3K27ac-enriched regions (referred to as “peaks”) to define enhancers and ranked these based on the H3K27ac enrichment signals to identify the regions that exhibited disproportionately high enrichment signals compared to typical enhancers, as described in previous studies (Figure 2.17; Hnisz et al., 2013; Whyte et al., 2013). To reveal candidate super-enhancers potentially unique to MRTs, my collaborators identified super-enhancers that were present in > 50% of MRT samples and absent in normal fetal brain and hESC samples. This yielded 136 MRT-specific super-enhancers that were associated with 197 genes located within 20 kb upstream or downstream of the enhancer regions (Supplemental Tables 2.4 and 2.5). Consistent with the H3K27me3 results, my collaborators and I observed significant enrichments of homeobox genes, DNA binding and chromatin-related terms among the annotations for these genes (Figure 2.18; Supplemental Table 2.6; FDR < 0.0001). These enrichments appeared to be driven by members of the HOXA, HOXB, HOXC, HIST1 and HIST2 gene clusters. Strikingly, MRT-specific super-enhancers blanketed the histone H4 gene cluster and the HOXC cluster, which also includes the regulatory non-coding RNA HOTAIR involved in PRC2 functions (Figure 2.19; Gupta et al., 2010).
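The two steps described above, separating super-enhancers from typical enhancers by ranked H3K27ac signal and then keeping those recurrent in tumours but absent in normal samples, can be sketched as follows. This is a simplified stand-in for the ranking approach of the cited studies, not the pipeline used in this work: the cutoff is taken where a line of slope 1 is tangent to the ranked-signal curve after both axes are scaled to [0, 1], and regions from different samples are assumed to have been merged into shared region IDs beforehand. All function names, sample names and values are hypothetical.

```python
def super_enhancer_cutoff(signals):
    '''Return the signal at the slope-1 tangent point of the ranked-signal curve.

    Assumes at least two enhancers and a positive maximum signal.
    '''
    ranked = sorted(signals)
    n = len(ranked)
    top = ranked[-1]
    # With both axes scaled to [0, 1], the slope-1 tangent touches the curve
    # where (scaled signal - scaled rank) is smallest.
    best = min(range(n), key=lambda i: ranked[i] / top - i / (n - 1))
    return ranked[best]


def tumour_specific(calls_by_sample, tumour_samples, normal_samples,
                    min_fraction=0.5):
    '''Regions called in more than min_fraction of tumours and in no normal.'''
    in_normals = set().union(*(calls_by_sample[s] for s in normal_samples))
    counts = {}
    for sample in tumour_samples:
        for region in calls_by_sample[sample]:
            counts[region] = counts.get(region, 0) + 1
    threshold = min_fraction * len(tumour_samples)
    return sorted(region for region, count in counts.items()
                  if count > threshold and region not in in_normals)


# Toy example with made-up enhancer signals and made-up region calls.
toy_signals = [1, 1, 1, 2, 2, 3, 4, 50, 100]
cutoff = super_enhancer_cutoff(toy_signals)
toy_super = [s for s in toy_signals if s > cutoff]

toy_calls = {
    'T1': {'a', 'b'},
    'T2': {'a', 'c'},
    'T3': {'a', 'b', 'd'},
    'N1': {'b'},
}
specific = tumour_specific(toy_calls, ['T1', 'T2', 'T3'], ['N1'])
```

In the toy input, region b is recurrent in tumours but is excluded because it is also called in the normal sample, which mirrors the "absent in normal fetal brain and hESC samples" criterion.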
These observations, along with the observation of loss of repressive H3K27me3 marks at these loci (described in Section 2.2.4), are consistent with the previous reports of aberrant over-expression of HOX gene members and HOTAIR in ATRTs (Chakravadhanula et al., 2014) and in my RNA-Seq data from MRTs (Figure 2.20). Candidate MRT-specific super-enhancers were also associated with Fms-related tyrosine kinase 3 ligand (FLT3LG; a tumour suppressor gene) and STAT3 (an oncogene), both of which are involved in the regulation of cellular differentiation (Laouar et al., 2003) and self-renewal capacity in embryonic stem and neural crest cells (Ying et al., 2003).

Figure 2.17 Numbers of typical enhancers and super-enhancers identified in MRTs. Genome-wide H3K27ac profiles were used to define typical enhancer (right bars in lighter shade) and super-enhancer states (left bars in darker shades) in MRT, fetal brain and hESC samples. Bar colours indicate different sample types, i.e. fetal brain (blue; “BrainHu04”), hESC (orange; “NAAEMA”, “NAAEMB”, “NAAEMC”) and MRTs (red). Candidate super-enhancers unique to MRTs were identified as those that were in > 50% of MRT samples and absent in normal fetal brain and hESC samples.

Figure 2.18 Homeobox-related functional terms were significantly enriched for MRT-specific super-enhancer-associated genes. Bar graph shows BH-adjusted p-values and numbers of genes (in brackets) that mapped to top enriched terms from InterPro using DAVID (gray dashed line at p-value = 0.05).

Figure 2.19 Super-enhancer at the HOXC locus in MRTs. UCSC genome view shows H3K27ac peaks at the HOXC locus in MRT (black), hESC (orange) and fetal brain samples (light blue). The y-axis scale represents read density.
Figure 2.20 Expression levels of HOTAIR and HOXC gene family members in MRTs compared to normal cell types. The distribution of expression levels of the non-coding HOTAIR gene (A) and all HOXC gene family members (B) in MRTs, hESC and normal cerebellum tissues from adult (“Adult Cere”) and fetus (“Fetal Cere”).

2.2.5 RNA-Seq analyses support the existence of two gene expression subgroups and indicate dysregulation of immunoglobulins, BMP and WNT signalling, neural crest development and imprinted genes in MRTs

I used RNA-Seq data to characterize MRT gene expression profiles in 40 samples. Informed by previous studies that indicated similarities between MRTs, hESC and neural precursor cells in the expression of stem cell-associated genes and embryonic stem cell markers (Gadd et al., 2010; Wilson et al., 2010; Venneti et al., 2011; Deisch, Raisanen and Rakheja, 2013), I analyzed RNA-Seq data from three hESC lines and four fetal cerebellum samples (Northcott et al., 2012) as normal comparators. To identify genes differentially expressed in MRTs, I compared MRT samples to normal samples, and found 2,713 differentially expressed genes between MRTs and hESC, and 4,505 differentially expressed genes between MRTs and fetal cerebellum samples (BH-adjusted p-values < 0.05, log2 FC > 1; Supplemental Tables 2.7 and 2.8). Identification of genes common to both comparisons resulted in 398 over-expressed genes and 615 under-expressed genes in MRTs compared to both normal cell types, i.e. hESC and fetal cerebellum. Among the most significantly over-expressed genes in MRTs were genes that were involved in immune functions (e.g. IGKC, IGKJ5) and in embryonic development (e.g. MGP, LUM; Figure 2.21; Supplemental Table 2.9). Conversely, the most significantly under-expressed genes included those involved in embryonic development (e.g. ZIC3, SOX3) and neuron functions (e.g.
GABRA3, CADPS; Figure 2.21; Supplemental Table 2.10). This result is consistent with previous associations of SWI/SNF loss with immune (Agalioti et al., 2000; Cui et al., 2004) and development dysregulation (Li, 2002).

Figure 2.21 Immune-related and embryonic development-related functions were enriched for over-expressed genes in MRTs, while neuron development-related functions were enriched for under-expressed genes in MRTs compared to normal cell types. Bar plots show BH-adjusted p-values and number of genes (in brackets) mapped to top enriched Gene Ontology (GO) terms for over-expressed genes (A) and under-expressed genes (B) in MRTs compared to fetal cerebellum and hESC (gray dashed line at p-value = 0.05).

To determine the extent of gene expression heterogeneity across MRTs, I performed non-negative matrix factorization (NMF; Gaujoux and Seoighe, 2010) analysis to reveal gene expression subgroups using the top 25% most variably expressed protein-coding genes (n = 3,179) across 40 MRT samples. I identified two gene expression subgroups (Figure 2.22A; n = 22 cases in Group 1; n = 18 cases in Group 2), which were also observed using hierarchical clustering and principal component analysis (PCA; Figures 2.23A and 2.23B), demonstrating the reproducibility of the two-subgroup solution identified using the NMF method. Among the clinical correlates, anatomical sites of occurrence were significantly associated with Group 1 (Fisher’s exact test p-value = 0.04), which included all six cases from extra-renal sites, i.e. liver and soft tissues (Figure 2.22B).

Figure 2.22 Two gene expression subgroups were identified in MRTs. (A) Consensus heat map from the NMF analysis of gene expression data shows two subgroups and associated clinical features (shown in B).
[Figure 2.22 panels A to D: sample annotation tracks show gene expression subgroup (1 or 2), age (< 1 year, 1–5 years, > 5 years), anatomical site (kidney, soft tissue, liver), tumour stage (I/II, III, IV) and sex (female, male); the panel C colour scale shows expression Z-scores from -3 to 3 alongside the enriched pathways; panel D marks, for each sample, over-expression in a gene expression subgroup, methylation gain in a DNA methylation subgroup, decreased gene expression due to somatic copy number loss, somatic missense mutations and somatic fusions in neural crest genes grouped by developmental stage (BMP, WNT and FGF signalling regulators, neural plate border regulators, neural crest specification, migration and diversification).]

(C) Heat map shows differentially expressed genes between Groups 1 and 2, and pathways that were significantly enriched for differentially expressed genes (black/gray bars on the right; * indicates statistical significance of BH-corrected p-value < 0.05). (D) Mutations, differential expression and differential CGI promoter methylation in genes that regulate various stages of neural crest development are shown in each sample (column). “Myo” indicates myoblasts, “SN” for sympathetic neurons and “Card” for cardiac cells derived from neural crest cells.

Figure 2.23 Gene expression subgroups are consistently observed using different analytical approaches.
Gene expression subgroups identified using NMF (Figure 2.22A) are observed using hierarchical clustering (A) and PCA (B) approaches. The top 25% most variably expressed genes were analyzed. Blue and red colours indicate samples in NMF-derived Group 1 and 2, respectively.

To address gene expression differences between the two gene expression subgroups, I identified 880 genes that were significantly differentially expressed between Group 1 and Group 2 (BH-adjusted DESeq p-value < 0.05; Figure 2.24; Supplemental Table 2.11). The most significantly over-expressed genes in Group 1 compared to Group 2 (log2 FC > 2.5) included immunoglobulins (e.g. IGKC, IGKJ5, IGLC1), and genes involved in BMP signalling (e.g. BMP4, SOSTDC1) and cell differentiation regulation (e.g. DLK1, MEOX2, ID3, BEX1). Functional terms that were enriched for over-expressed genes in Group 1 included immune response, hepatic fibrosis/hepatic stellate cell activation, axon guidance and liver X receptor/retinoid X receptor (LXR/RXR) activation (Figure 2.22C). In Group 2, the most significantly over-expressed genes (log2 FC > 6) were involved in cell adhesion and migration (e.g. PCDH18, SMOC2, THBS2, EMILIN1), WNT signalling (e.g. WNT5A, DACT1, HIC1) and cell differentiation (e.g. TCF21, MEIS1). Also in Group 2, I observed over-expression of the E-cadherin repressors ZEB1 and TWIST2 (BH-adjusted p-values = 0.03 and 0.04, respectively) and, as expected, under-expression of E-cadherin (CDH1; BH-adjusted p-value = 6.31e-19), suggesting that MRTs in Group 2 may exhibit more mesenchymal phenotypes compared to Group 1. The functional terms most enriched for over-expressed genes in Group 2 included embryonic stem cell pluripotency regulation, epithelial-to-mesenchymal transition (EMT) and retinoic acid receptor activation, although none of these enrichments reached significance (BH-corrected p-value > 0.08; Figure 2.22C).
Notably, I observed significant over-representation of imprinted genes in the differentially expressed genes between Group 1 and Group 2 (hypergeometric enrichment p-value = 6.45e-11). Of the 123 known human imprinted genes amassed from the literature (Glaser, Ramsay and Morison, 2006; Pozharny et al., 2010) and the curated imprinted gene database (http://www.geneimprint.com/site/genes-by-species), 15 genes were found to be differentially expressed between the two subgroups. Among these, 13 were over-expressed in Group 1 compared to Group 2 (Supplemental Table 2.12). I also observed significant over-representation of annotated oncogenes and tumour suppressor genes (n = 100) among the differentially expressed genes between Group 1 and Group 2 (hypergeometric p-value = 1.03e-39). Oncogenes that were relatively over-expressed in Group 1 compared to Group 2 (n = 16) included HMGA2, ETV1, IGF1R, CD74 and CCND3. In Group 2, over-expressed oncogenes (n = 7) included PCDH18, PBX1, AQR, AURKA and FEV. Under-expressed tumour suppressor genes in Group 1 compared to Group 2 (n = 12) included HIC1, LIMA1, RARB, IGFBP5 and PEG3, whereas under-expressed tumour suppressor genes in Group 2 compared to Group 1 (n = 59) included H19, PLAGL1, FN1, SPINT2 and GJA1.

Figure 2.24 Differentially expressed genes between gene expression subgroups include immune- and development-regulating genes. 880 genes were significantly differentially expressed between Group 1 and 2 (FDR < 0.05). Blue indicates genes that are over-expressed in Group 1 compared to Group 2, while red indicates genes over-expressed in Group 2 compared to Group 1. Lighter shades indicate genes detected at an FDR threshold of 0.05, while the darker shades indicate genes detected at an FDR threshold of 0.01.

Among genes involved in early development regulation, neural crest genes appeared to be particularly dysregulated in MRTs.
Gene Ontology (GO) enrichment analyses of the 880 differentially expressed genes revealed significant enrichments of neural crest development and neural crest differentiation terms (BH-adjusted p-value = 0.049; Supplemental Table 2.13). The enriched categories spanned several stages of neural crest development such as neural crest induction, migration and differentiation (Figure 2.22D; Simoes-Costa and Bronner, 2015). Key neural crest-regulating genes include BMP4/5/6, ID3, SMAD6/9 and FGF1/2 (involved in neural crest induction); ZIC2/5 (regulate neural plate border formation); HAND2, SEMA3C and MSC (regulate neural crest diversification). These observations support the potential involvement of dysregulated neural crest development processes in MRT biology, and are consistent with the notion that MRT cell of origin may be derived from the neural crest lineage (Fischer et al., 1989; Ota et al., 1993; Sugimoto et al., 1999).

My study was the first to describe the presence of gene expression subgroups in extra-cranial MRTs including RTs of the kidney (RTKs), whereas gene expression subgroups in cranial ATRTs had already been described (Torchia et al., 2015). To assess whether gene expression subgroups of extra-cranial MRTs are similar to ATRTs or RTKs, I used published data from Grupenmacher et al. (2013), which analyzed gene expression differences between ATRTs and RTKs. I compared differentially expressed genes between MRT subgroups to the 29 over-expressed genes in ATRTs compared to RTKs from Grupenmacher et al. (2013) (Supplemental Table 2.11). Among the 11 genes that overlapped, ten (LIPG, MFGE8, DKK3, BEX1, FAM5C, ID3, TPM1, ENC1, SLC1A3 and TTYH1) were found to be over-expressed in Group 1. These included genes involved in neuronal differentiation (BEX1, FAM5C, SLC1A3), neural crest differentiation (ENC1) or other regulators in developmental processes such as axial patterning in embryos (DKK3) and myogenesis (ID3). 
Conversely, of the 92 genes reported to be over-expressed in RTKs compared to ATRTs (Grupenmacher et al., 2013), 21 genes were found to be differentially expressed between MRT subgroups (TCF21, H2AFJ, IGFBP5, NR4A2, RERG, PCDH18, TMEM155, EMILIN1, AQR, MEIS1, PHLDB1, AURKA, TARBP1, MAP3K8, SH3KBP1, WNT5A, CXCR7, COPS8, CTSL1, COL4A5 and GJB2), and all were over-expressed in Group 2 (one-sided Fisher’s exact p-value = 1.193e-13; Supplemental Table 2.11). These included genes involved in cell adhesion (PCDH18, EMILIN1, SH3KBP1), cell cycle-related kinase pathways (AURKA, MAP3K8), or development and cell differentiation (MEIS1, TCF21, NR4A2, WNT5A, IGFBP5). These observations suggest that gene expression Group 1 may be more similar to ATRTs, while Group 2 may be more similar to RTKs.

Next, I explored relationships between gene expression subgroups and genomic alterations. Group 1 had more cases with mutations in epigenetic modifier genes (KMT2D/MLL2, MBD1, ZBTB39, SUZ12 and BRD1) compared to Group 2 (MBD2 and PHF21B; Figure 2.1), although this difference was not significant (Fisher’s exact p-value = 0.30). However, Group 1 exhibited a significant enrichment of cases with broad (> 10 kb) homozygous deletions at the SMARCB1 locus (14 out of 22 cases in Group 1; Fisher’s exact p-value = 0.025; Figure 2.25). In addition, Group 1 showed significant association with mutations in NF2, EWSR1, and BMP-interacting genes, namely LIF, ACVR1 and BAMBI. For NF2 (encodes Merlin, a Schwann cell-specific protein involved in regulation of cell growth and shape), five of six cases had deletions associated with significant decreases in NF2 gene expression, and one case harboured a germline missense SNV and somatic copy number neutral LOH. For EWSR1, four out of five cases had deletions associated with significant decreases in gene expression, and one case harboured a somatic missense SNV. 
EWSR1 is involved in regulation of gene expression, cell signalling, and RNA processing and transport. A t(11;22)(q24;q12) translocation in EWSR1 is a known driver mutation in Ewing sarcomas, and neuroectodermal and various other cancer types. All five cases with homozygous deletions in LIF, a BMP interactor, were in Group 1. A case with a somatic missense SNV in ACVR1 (Activin A receptor, type 1), which encodes a BMP signal transducer, and a case with a somatic fusion involving BAMBI (BMP and Activin Membrane Bound Inhibitor) were among those in Group 1. The association of mutations in BMP-related genes, together with significant enrichments of BMP signalling pathway genes that are over-expressed in Group 1, suggests a role for the BMP signalling pathway in Group 1.

Group 1 also contained 18 out of 22 recurrent fusions identified across MRTs. Of these, 11 fusions were exclusive to Group 1, all involving genes on chromosome 22: TTC28 (five cases; four germline events and one somatic event; fusion partners GABRA4 (on chr4), CRABP1 (on chr15), non-coding regions in chr8, and itself through duplication), UPB1 (three cases; all somatic events; fusion partners COMPT (on chr22), GGT1 (on chr22) and UPP2 (on chr2)) and CABIN1 (two cases; both somatic events; fusion partners AK095700/RHOF (on chr12), NAPA (on chr9) and a non-coding region in chr22). Out of the 11 fusions, nine were due to inter-chromosomal translocations, inversions or duplications that did not directly involve SMARCB1. The observation of recurrent fusions independent of SMARCB1 deletions is compatible with a hypothesis that these fusions may play a role relevant to MRT development or progression.

Figure 2.25 MRT cases in gene expression Group 1 frequently harbour broad homozygous deletions at the SMARCB1 locus. Copy number states are plotted across the SMARCB1 locus on chromosome 22. Dark blue indicates regions with two-copy loss, i.e. 
homozygous deletions (“HOMD”), and light blue indicates regions with one-copy loss, i.e. heterozygous deletions (“HETD”). Regions with copy number gains are shown in shades of red, with darker shades indicating increased copy number amplifications. Blue and red colour bars on the left indicate samples in the NMF-derived gene expression Group 1 or 2, respectively. Vertical dotted lines indicate the locations of SMARCB1 and nearby genes of interest, i.e. EWSR1 and NF2.

2.3 DISCUSSION

This chapter describes the first multi-omic study of extra-cranial MRTs, contributing reference genomic, epigenomic and transcriptional landscapes based on 40 primary tumour cases. As a demonstration of their utility, I used these datasets to determine the extent of inter-tumoural heterogeneity in MRTs, and to identify alterations in genes and pathways that were previously unlinked to MRTs as well as those that confirmed previous findings.

Whole genome analyses of 40 primary MRT cases revealed several mutations, and structural and copy-number alterations in genes that were previously undescribed in MRTs. As expected, my whole genome analysis identified loss-of-function alterations in SMARCB1 (and SMARCA4 in a single case) in all but one of the MRT cases, confirming that SMARCB1 loss is a pathognomonic characteristic of this disease. Although MRT genomes indeed appeared to generally harbour fewer mutations (median somatic mutation rate = 0.254 mutations per Mb) compared to other paediatric and adult cancer types (median somatic mutation rates = 0.376 and 2.00, respectively; Lawrence et al., 2013), my whole genome analysis revealed recurrently mutated genes that were previously unknown in MRTs. Among the genes with somatic mutations, notable ones included genes involved in early development processes e.g. 
cell differentiation (LIF, ACVR1, PTPRD), cell adhesion (MMP11, DSCAM), axon guidance (ROBO2) and neural crest cell functions (SPECC1L, NF2). My whole genome analysis also revealed that introns of the ROBO2 and KCNJ3 genes were mutated significantly more frequently than expected from the background mutation rate. This observation appeared unique to MRTs, as such biased frequency of mutations in introns was not observed in paediatric AML genomes, which I analyzed to determine if intronic mutations observed in MRTs would be observed in another paediatric cancer type. Intriguingly, non-coding mutations in both genes were associated with significant differences in gene expression levels compared to cases lacking mutations, potentially indicating a role for non-coding mutations in transcriptional dysregulation.

My analysis, as well as those by others, confirmed that the dominant driver in MRTs was SMARCB1 loss. However, the effects of SMARCB1 loss on transcriptional and epigenetic regulation were not uniform across the cases I studied, as indicated by the presence of candidate molecular subgroups identified in my work. Based on gene expression analyses, two subgroups were characterized, the ATRT-like Group 1 and the RTK-like Group 2. The ATRT-like Group 1 exclusively contained all non-kidney MRTs, while the RTK-like Group 2 contained all kidney MRTs. It is notable that differentially expressed genes between the two subgroups were significantly enriched in many important pathways involved in cell differentiation and development. Specifically, Group 1 exhibited over-expression of imprinted genes and genes involved in BMP signalling and axon guidance (which is also found to regulate mesenchymal differentiation and organogenesis of the lung and pancreas (Hinck, 2004; Cozzitorto et al., 2020)), while the RTK-like Group 2 exhibited over-expression of WNT signalling and EMT-promoting genes. 
These observations are consistent with the notion that transcriptional consequences of SMARCB1 loss are heterogeneous, and may point to different cell types of origin underlying MRT heterogeneity.

Analyses of gene expression, DNA methylation, and H3K27me3 and H3K27ac ChIP-Seq data consistently presented evidence for extensive dysregulation in pathways particularly involved in early human development. For example, dysregulation of HOX genes emerged as a characteristic of MRTs. Significant enrichments of homeobox gene promoters with differential DNA methylation levels and reduced H3K27me3 densities, as well as the presence of a super-enhancer at the HOXC locus in MRTs are consistent with the notion that epigenetic dysregulation may promote the expression of homeobox genes and the involvement of dysregulated embryonic developmental processes in MRT initiation and progression. RNA-Seq analyses further revealed that MRTs exhibited dysregulated expression of genes involved in nearly all stages of neural crest development, including neural crest induction, neural plate border regulation, and neural crest specification, migration and diversification. Recent mouse model studies demonstrated that SMARCB1 inactivation at specific embryonic days (i.e. embryonic day (E) 6–E10 or E9.5–E12.5; Han et al., 2016; Vitte et al., 2017; Carugo et al., 2019) or in Schwann cell precursors resulted in development of tumours with features of human RTs (Vitte et al., 2017). These studies, together with my results, are consistent with the notion that MRTs may be derived from neural crest cells or their derivatives that arise during early developmental stages.

Clinical implications of the molecular subgroups of MRTs, such as subgroup-specific therapeutic vulnerabilities and differences in disease outcome, were not addressed in this study. Whether there is a clinical difference between these subgroups or not (e.g. 
survival differences, different responses to treatments) remains to be revealed. Identification of subgroup-specific therapeutic targets, and testing them in in vitro or in vivo systems using cell lines or mouse models, may also be explored in the future.

Within ATRTs, different molecular subgroups have been reported by multiple studies, and a consensus around the number of ATRT subgroups has not been reached. Previous studies described two gene expression and DNA methylation subgroups (n = 43 cases; Torchia et al., 2015), three gene expression subgroups (n = 30 cases; Han et al., 2016) or three DNA methylation subgroups (n = 150 and 162 cases, respectively; Johann et al., 2016; Torchia et al., 2016) within ATRTs. How these ATRT subgroups relate to MRT subgroups and clinical implications of multiple molecular subgroups within RTs remain to be investigated.

2.4 MATERIALS AND METHODS

2.4.1 Sample details and data availability

40 primary pre-therapy extra-cranial MRT (34 from kidneys, 4 from soft tissues and 2 from liver) and matched normal (16 adjacent kidney and 24 peripheral blood) samples were provided by Dr. Elizabeth Perlman (Ann and Robert H. Lurie Children’s Hospital in Chicago). The patients were registered on the Children’s Oncology Group (COG) protocol on the National Wilms Tumor Study Group 5 or on COG AREN03B2 banked by the COG Biopathology Center with parental informed consent. My study was performed with the approval of the University of British Columbia - British Columbia Cancer Agency Research Ethics Board (REB number H09-02558). For six of these samples, matched normal kidney tissues were available and used as normal comparators for RNA-Seq analyses. Three MRT cell lines (G401, KP-MRT-RY and KP-MRT-NS) and three human embryonic stem (hES) cell lines (H7, H9 and H14) were used in my study. Complete sample names and other details are provided in Supplemental Table 2.1. 
Nationwide Children’s Hospital prepared cells and nucleic acids, and shipped these materials to Canada’s Michael Smith Genome Sciences Centre at BC Cancer (BCGSC) for sequencing and analyses. Tumour purity was estimated based on WGS analyses using APOLLOH software (Ha et al., 2012), which measures variant allele frequencies of germline heterozygous variants in tumour samples, and calculates the deviation from expected allele frequencies to identify regions with loss of heterozygosity (LOH) and estimate tumour purity levels. The median tumour purity based on WGS analyses was 88.0% (ranging from 42.8% to 95.0%). The MRT cell lines used were G401 (derived from a renal MRT and obtained from ATCC®; Garvin et al., 1993), KP-MRT-RY (derived from a renal MRT; Katsumi et al., 2008) and KP-MRT-NS (derived from the ascitic fluid taken from a renal MRT patient; Sugimoto et al., 1999). The human embryonic stem (hES) cell lines used were H7, H9 and H14 (Thomson et al., 1998).

This dataset was generated as part of the TARGET (Therapeutically Applicable Research To Generate Effective Treatments) initiative in the National Cancer Institute’s Office of Cancer Genomics. Sequencing reads and analyzed data files have been deposited to NCBI dbGaP. The dbGaP accession number for this study is phs000470. The data are available to approved investigators through dbGaP. Additional data are available at the TARGET data portal at http://target.nci.nih.gov/dataMatrix/TARGET_DataMatrix.html.

2.4.2 WGS library construction, sequencing and sequence read alignment

Methods for paired-end whole genome sequencing (WGS) library preparation are published in Chun et al. (2016). In brief, WGS libraries were prepared using Illumina’s PCR-free protocol, based on the TruSeq DNA Sample prep kit (Illumina Catalogue Number FC-121-1002). 
Following this protocol, 100 bp paired-end reads were generated on the Illumina HiSeq 2500 platform using v3 chemistry. A minimum of three lanes of sequencing was performed for each sample to reach the desired haploid sequence coverage (minimum, median and maximum coverage of 33.3X, 39.8X and 44.2X, respectively). All cell lines and the 40 pairs of primary tumour and matched normal samples were subjected to plate-based PCR-free whole genome sequencing as described above, with the exception of TARGET-52-PAJNFZ-10A-01D (normal), which was subjected to four cycles of PCR amplification prior to sequencing due to low DNA quantity. All samples were sequenced to a minimum coverage of 30X. Following the selection of reads that passed Chastity filtering (Kircher, Heyn and Kelso, 2011), reads were aligned to the reference genome (GRCh37-lite/hg19; http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome) using the Burrows-Wheeler Aligner (BWA) (version 0.5.7; Li and Durbin, 2010). BAM files were sorted using SAMtools (version 0.1.13; Li et al., 2009), and reads were merged and marked for duplicates using Picard MarkDuplicates.jar (version 1.71; http://sourceforge.net/projects/picard). The merged BAM files and files obtained through subsequent analyses were submitted to the TARGET data depository mentioned above.

2.4.3 RNA-Seq library construction, sequencing and sequence read alignment

Methods for paired-end whole transcriptome sequencing (RNA-Seq) library preparation are published in Chun et al. (2016). In brief, polyadenylated (polyA+) RNA was purified from 2 µg of RNA from each sample and used to synthesize complementary DNA (cDNA) for strand-specific sequencing. The first strand of cDNA was synthesized from the purified polyA+ RNA using the Superscript cDNA Synthesis kit (Life Technologies, USA) and 5 µM of random hexamer primers with 1 µg/µL Actinomycin D. 
The second strand cDNA was synthesized following the Superscript cDNA Synthesis protocol, replacing dTTP with dUTP in the dNTP mix and digesting the second strand using UNG (Uracil-N-Glycosylase, Life Technologies, USA) in the post-adapter ligation reaction, thereby achieving the construction of libraries preserving Watson / Crick strand specificity. The libraries were sequenced, 2 per 75 paired-end lane, on the Illumina HiSeq 2500 platform using v3 chemistry and the HiSeq Control Software (version 2.0.10). RNA-Seq reads that passed Illumina Chastity filtering were aligned to the human reference genome (version GRCh37-lite/hg19) and to exon junction sequences using BWA (version 0.5.7; using default parameters for “aln” and “sampe” and the -s option to disable the Smith-Waterman alignment), as described previously (Morin et al., 2008). The exon junction sequences were constructed from all known transcript models in EnsEMBL (version 69), RefSeq and known genes from the UCSC databases (downloaded from the UCSC FTP site in November 2011). Sequences with exon junctions were repositioned back onto the genome as gapped alignments using JAGuaR (version 2.0.3; Butterfield et al., 2014). Duplicate reads were flagged using the Picard MarkDuplicates.jar program (http://sourceforge.net/projects/picard). Resultant BAM files were then split into positive strand and negative strand BAM files based on the orientation of the paired-end reads.

2.4.4 WGBS library construction, sequencing and sequence read alignment

Methods for paired-end whole genome bisulfite sequencing (WGBS) library preparation are published in Chun et al. (2016). In brief, 1-5 µg of genomic DNA was used for bisulfite conversion and library construction as described (Gascard et al., 2015). 
To track the efficiency of bisulfite conversion, 1 ng of unmethylated lambda DNA (Promega) was spiked into 1 µg of quantified genomic DNA. DNA was sheared to a target size of 300 bp using Covaris sonication and the fragments were end repaired using DNA ligase and dNTPs at 30°C for 30 min. Repaired DNA was purified and prepared for A-tailing. Cytosine methylated paired-end adapters were ligated to the DNA and adapter flanked DNA fragments were obtained. Prior to bisulfite conversion, an aliquot of library fragments was amplified with 10 cycles of PCR and sized on an Agilent Bioanalyzer High Sensitivity DNA chip. Amplicons were between 200–700 bp in length. Bisulfite conversion of the methylated adapter-ligated DNA fragments was achieved using the EZ Methylation-Gold kit (Zymo Research) following the manufacturer’s protocol. Five cycles of PCR were used to enrich the bisulfite-converted DNA. Post-PCR purification and size-selection of bisulfite-converted DNA was performed, extracting the 350–500 bp fraction, or the 275–425 bp fraction if the former was of weak intensity. Libraries were sequenced using paired-end 100 / 125 nt V3/4 sequencing chemistry on an Illumina HiSeq2000 / 2500 following the manufacturer's protocols (Illumina, Hayward, CA). Sequences were examined for quality, sample swap, reagent contamination and bisulfite conversion rate using custom in-house scripts. Sequences were aligned to the human genome reference (version GRCh37-lite / hg19a) using Bismark (version 0.7.6; Krueger and Andrews, 2011) and Bowtie (version 0.12.5; Langmead et al., 2009) allowing up to 2 mismatches in the 50 bp seed region (using -n 2 -l 50 parameters). Data from each lane were then merged using Picard (http://sourceforge.net/projects/picard). 
Data quality assessment was performed, first based on human bisulfite conversion rate and mapping efficiency computed using Bismark from its mapping report. GC bias was assessed in one million bins of 100 bp-long regions that were randomly selected across the genome using BEDTools (Quinlan and Hall, 2010) and Spearman rank correlation between GC content and sequence coverage was computed.

2.4.5 ChIP-Seq library construction, sequencing and sequence read alignment

Methods for paired-end chromatin immunoprecipitation followed by sequencing (ChIP-Seq) library preparation have been published in Chun et al. (2016). In brief, samples for chromatin immunoprecipitation (ChIP) were prepared from cross-linked tissues or cells using 1% formaldehyde, from which chromatin was extracted. Chromatin DNA was then fragmented and the chromatin pre-cleared with 40 µL of blocked Protein A/G sepharose beads (GE Healthcare, USA). 0.1 µg of pre-cleared chromatin was reserved as the input control. Sepharose beads were used for ChIPs and each immunoprecipitation was carried out with pre-cleared chromatin and antibody. Antibodies used in this study were subjected to rigorous quality assessment to meet the International Human Epigenome Reference Standards (http://ihec-epigenomes.org) including western blot of whole cell extracts, 384 peptide dot blot (Active Motif MODified Histone Peptide Array) and ChIP-seq using control cell pellets (HL60). ChIP DNA was run and size separated using PAGE (8%) gel to select for DNA fraction of 200-500 bp. Using purified DNA, the library was prepared following a modified paired-end (PE) library protocol (Illumina Inc., USA). Purified DNA was subjected to end-repair and Illumina PE adapter ligation. The adapter-ligated products were purified, then PCR-amplified in 10 cycles. PCR products of the desired size range were purified using 8% PAGE or an in-house size selection robot, and the DNA quality was assessed and quantified. 
Libraries were normalized and pooled, with the final concentration of the pooled library double-checked to ensure optimal sequencing. Libraries were sequenced using paired-end 76 nt sequencing V3/4 chemistry on a HiSeq2000 / 2500 following the manufacturer’s protocols (Illumina) using multiplex custom index adapters added during library construction to distinguish pooled samples. Raw sequences were examined for quality, sample swap and reagent contamination using custom in-house scripts. Sequence reads were aligned to the NCBI GRCh37-lite human reference using BWA (version 0.5.7; Li and Durbin, 2010) and default parameters, and assessed for overall quality using Findpeaks (Fejes et al., 2008).

2.4.6 Whole genome sequence data analyses

2.4.6.1 Sequence variant identification

To identify putative SNVs and small insertions and deletions (InDels), WGS data from tumour and matched normal samples were analyzed using SAMtools mpileup (Li et al., 2009), followed by MutationSeq (Ding et al., 2012) and Strelka (Saunders et al., 2012). SAMtools mpileup (version 0.1.17) was run with -C50 -DSEuf parameters. MutationSeq (version 1.0.2) was used to assign scores to SNVs based on the likelihood of somatic calls. The putative variants were then filtered against known polymorphisms (from the dbSNP database, version 132) to identify somatic variants. Somatic SNV identification using Strelka (version 0.4.7) involved analyses of paired tumour-normal WGS data with the default settings. Low-quality SNVs were filtered from the resulting VCF files by applying SAMtools varFilter (with default parameters) and removing SNVs with a quality score lower than 20. Finally, snpEff and snpSift (Cingolani et al., 2012) were used to annotate SNVs using the EnsEMBL (version 69), dbSNP (version 132; Sherry, Ward and Sirotkin, 1999) and COSMIC (version 64) databases. 
SNVs identified by mpileup and Strelka were also annotated using snpEff and snpSift and the same annotation source. To analyze germline variants, WGS data were analyzed to identify SNVs and InDels that were found in both tumour and normal genomes using the “single library” mode in SAMtools mpileup. To filter out polymorphisms and other common variants that were unlikely to be pathogenic, putative germline variants in the dbSNP database were removed. Among the putative germline variants, I took particular note of variants that were heterozygous in the normal genome and homozygous in the tumour genome, as these germline variants may have undergone somatic alterations in the tumour (e.g. copy number neutral LOH), and were thus judged as possibly of heightened relevance to disease biology.

2.4.6.2 Identification of significantly mutated genes and their association with gene expression changes

To identify significantly mutated loci genome-wide, the method used by Chapman et al. (2011) was adopted for this study. Briefly, a binomial test was performed to calculate the probability of seeing at least the observed number of mutations in a locus given random chance. For each locus, the probability was calculated thus:

$$P(X \geq n) = \sum_{k=n}^{N} \binom{N}{k} \mu^{k} (1 - \mu)^{N-k}$$

where N = length of a locus, n = total number of mutations in a locus across all cases, and µ = average somatic mutation rate across the genome, i.e. the total number of mutations in all MRT cases divided by the number of mutation-callable bases (read depth > 3) across the genome. P-values were corrected for multiple hypothesis testing using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995), and the statistical significance threshold was set at an adjusted p-value of 0.05. To identify genes with significant non-coding mutations in introns and UTRs, the total length of the gene was used to calculate a probability. 
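The binomial test and Benjamini-Hochberg correction described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names are hypothetical, the upper tail is summed in log space for numerical stability (an implementation choice, not something stated in the text), and it assumes 0 < µ < 1.

```python
from math import exp, lgamma, log, log1p

def log_binom_pmf(k, n_trials, p):
    """Log of the Binomial(n_trials, p) probability mass at k (requires 0 < p < 1)."""
    return (lgamma(n_trials + 1) - lgamma(k + 1) - lgamma(n_trials - k + 1)
            + k * log(p) + (n_trials - k) * log1p(-p))

def binomial_upper_tail(n_mut, locus_len, mu):
    """P(X >= n_mut) for X ~ Binomial(locus_len, mu)."""
    if n_mut <= 0:
        return 1.0
    total = 0.0
    mean = locus_len * mu
    for k in range(n_mut, locus_len + 1):
        term = exp(log_binom_pmf(k, locus_len, mu))
        total += term
        # Past the mean the terms only shrink; stop once they are negligible.
        if k > mean and term <= total * 1e-16:
            break
    return min(total, 1.0)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `binomial_upper_tail` would be called once per locus with that locus's length and mutation count, and the resulting list of p-values passed to `benjamini_hochberg` before applying the 0.05 threshold.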
To explore the association of intronic mutations with gene expression changes, the fold change of gene expression between cases with significant intronic mutations (mutants) and cases without mutations (wild type) was compared using the non-parametric Wilcoxon Mann-Whitney U test. To evaluate the likelihood of observing a particular fold change magnitude between mutant and wild type groups by random chance, the observed fold changes were compared against null distributions of fold changes of randomly selected genes between two randomly selected groups of MRT cases, generated over 1,000 iterations. The significance of the difference between an observed fold change and the null distribution was determined using Z-scores and multiple hypothesis-adjusted p-values (using the Benjamini-Hochberg method; adjusted p-value threshold at 0.05).

2.4.6.3 Verification of putative SNVs and InDels

Selected samples with putative SNVs and InDels of interest underwent targeted deep amplicon sequencing to verify the variants. The methods of verification have been published in Chun et al. (2016). In brief, primers were designed using Primer3 software (Rozen and Skaletsky, 2000) with a GC clamp, an optimal Tm of 64°C and an optimal oligo length of 28 bp to ensure specificity. Primers were tested using a combination of UCSC’s in-silico PCR tool based on the alignment against the reference human genome (GRCh37) and custom in-house software to verify a unique hit and to check that the variant was located within 250 bp of the nearest end of the amplicon to ensure coverage in an Illumina MiSeq 250 bp paired-end read. The primers were tagged with Illumina adapters to enable a direct sequencing approach during sample preparation. gDNA templates were used as starting material to generate PCR products, with amplicon sizes ranging between 184 and 667 bp. Amplicons were then pooled by template and underwent a second round of amplification with 6 PCR cycles. 
Following DNA quality assessment and quantification, the indexed libraries were pooled together and sequenced on the Illumina MiSeq platform using paired-end 250 bp reads and v2 reagents. An in-house generated PhiX control library was spiked into the samples as a sequencing control for cluster generation and calibration of sequencing quality metrics used in base calling and quality score calculation.

2.4.6.4 Copy number alteration and Loss of Heterozygosity (LOH) analyses

The copy number state was estimated using the Hidden Markov Model-based method developed by Shah et al. (2006), which identifies genomic segments with a uniform copy number. The method was modified to correct for sequence coverage bias across GC-rich regions. Chromosomal regions with unequal sequence coverage within the tumour and normal genomes were identified as regions harbouring somatic copy number alterations. To determine the effects of copy number alterations on gene expression, the means of gene expression levels (RPKM) were compared between cases with copy number alterations and cases without copy number alterations using the Wilcoxon Mann-Whitney U test. P-values were adjusted for multiple hypotheses testing using the Benjamini-Hochberg method. Regions with LOH were identified using APOLLOH (Ha et al., 2012), which identifies heterozygous SNPs in a normal genome and then identifies chromosomal regions in which these SNPs exhibit variant allele frequencies that deviate from the expected value of 0.5 (indicating heterozygosity) in tumour samples.

2.4.6.5 Identification of putative chromosomal rearrangements based on whole genome and transcriptome assemblies

Chromosomal rearrangements were identified using a de novo short read assembler and annotator, ABySS (v1.3.2; Simpson et al., 2009) and Trans-ABySS (v1.4.6; Robertson et al., 2010). Somatic chromosomal rearrangements were identified based on whole genome assemblies using WGS data from tumour-normal pairs. 
Among these, expressed gene fusions were identified based on transcriptome assemblies using RNA-Seq data. For whole genome assemblies, BWA was used to align assembled contigs and reads to the genome (GRCh37-lite). First, WGS libraries were assembled using 24-mer and 44-mer reads in single-end mode. Contigs and reads were then re-assembled using 64-mer reads, first in single-end mode and then in paired-end mode. Meta-assemblies of contigs were then used as input for the Trans-ABySS analysis to identify structural variants against the reference genome and annotated gene models. For strand-specific transcriptome assemblies, a range of k-mers from k=50 to k=96 were generated using reads that mapped to the positive and negative strands of the genome, as well as reads with ambiguous strand information. The positive and negative strand assemblies were extended where possible, and then merged together to produce a meta-assembly contig dataset. GMAP (version 2012-12-20) was used to align transcriptome contigs and reads to the genome to identify long-range chromosomal rearrangements and gene fusions (Wu and Watanabe, 2005). Supporting evidence for assembled contigs was based on the number of aligned reads. For chromosomal rearrangements, supporting evidence was based on the number of reads that spanned an event breakpoint within a contig and the number of read pairs flanking the breakpoint. My colleagues at BCGSC used custom in-house software (the “Genome Validator”; developed by Readman Chiu, Richard Mar, Ka-Min Nip, Dustin Bleile and Caleb Choo at BCGSC), to further filter high-confidence events, which required > 2 reads that flanked or spanned a breakpoint. The Genome Validator also annotated somatic or germline events by comparing a tumour genome to its matched normal genome, and flagged potential somatic events whose confidence was low due to a lack of flanking reads either in the tumour or normal genome. 
To identify high-confidence germline events, I further reduced false positive events identified in normal genomes by removing events with identical breakpoints found in assemblies of 27 normal peripheral blood genomes previously sequenced at the BCGSC.

2.4.6.6 Verification of putative somatic chromosomal rearrangements

Chromosomal rearrangement events were verified using targeted deep amplicon sequencing on an AB capillary sequencing platform. The contigs from each fusion were extended to contain at least 390 nucleotides flanking the event’s genomic position. PCR primers were designed against genomic DNA at the breakpoint position using Primer3 at an optimal Tm of 64°C with a desired size range of 400–600 bp. PCR primer pairs were then computationally determined to match the contig sequence provided as an input. Forward and reverse primers were tailed with T7 and M13Reverse 5’ priming sites. PCR amplification was performed using Phusion polymerase (Fisher Scientific, catalogue # F-540L) according to the manufacturer’s specifications. PCR products were sequenced using T7 and M13R primers on the AB 3730XL DNA Sequencing platform.

2.4.7 RNA-Seq data analyses

To quantify exon- and gene-level expression, aligned and filtered sequences were analyzed using in-house-developed gene coverage analysis software, which converted the BAM files into WIG files in order to calculate the sequenced base coverage across exons and convert these values into read counts. My colleagues at BCGSC used collapsed exon models to represent a gene and quantify gene expression. RPKM was then calculated using read counts that were normalized against exon lengths (per kilobase; as annotated by the EnsEMBL gene model version 69) and against the sequencing depth for each sample (per million aligned reads).
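The RPKM normalization described above can be illustrated with a minimal sketch. It is not the in-house BCGSC software; the function name `rpkm` and its parameter names are hypothetical, and the collapsed exon length is assumed to be supplied in base pairs.

```python
def rpkm(read_count, exon_length_bp, total_aligned_reads):
    """Reads per kilobase of exon model per million aligned reads.

    Assumption: exon_length_bp is the summed length of the collapsed
    exon model for the gene, in base pairs.
    """
    if exon_length_bp <= 0 or total_aligned_reads <= 0:
        raise ValueError("length and library size must be positive")
    # Divide by length in kb (bp / 1e3) and by depth in millions (reads / 1e6),
    # which is the same as multiplying the raw ratio by 1e9.
    return read_count * 1e9 / (exon_length_bp * total_aligned_reads)


# Hypothetical gene: 500 reads over a 2 kb exon model, 25 million aligned reads.
example_value = rpkm(500, 2000, 25_000_000)
```

In this hypothetical case the gene receives 250 reads per kilobase, which, divided by 25 million aligned reads, gives an RPKM of 10.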
To filter out genes with expression levels considered to be noise, I filtered out genes that were expressed below 1 RPKM in all 52 cases, which are primary MRTs (n = 40), cell lines (MRT and hES; n = 6) and normal kidneys (n = 6), resulting in the exclusion of 23,329 genes (out of 58,450 genes; EnsEMBL annotation, version 69) for all subsequent gene expression analyses. To identify differentially expressed (DE) genes, I used the DESeq R package (version 1.14.0; R version 3.0.3; Anders and Huber, 2010) and a multiple-hypotheses adjusted p-value of 0.05 as the significance threshold. For subsequent analyses, I removed poorly expressed genes that were identified as over-expressed in one group compared to the other but had a median expression level less than 1 RPKM. To identify pathways enriched for DE genes, Ingenuity Pathway Analysis® and DAVID (Dennis Jr et al., 2003) pathway enrichment analysis tools were used. Enrichment p-values were corrected for multiple hypotheses testing using the Benjamini-Hochberg method.

To perform clustering analyses of RNA-Seq data, the following filtering steps were applied. First, genes expressed below a noise threshold level of 1 RPKM in more than 75% of samples were removed. Then, the remaining genes were ranked by their coefficients of variation to identify the top 25% most variably expressed protein-coding genes with a mean RPKM > 1 (n = 3,179). These were used for subsequent analyses. I first made use of the non-negative matrix factorization (NMF) method, implemented in the NMF R package (version 0.20.2; Gaujoux and Seoighe, 2010). The NMF analysis was run using the default Brunet algorithm over 50 and 200 iterations for the rank survey and the clustering runs, respectively. A preferred cluster solution of two groups was selected based on evaluating cophenetic coefficients, silhouette widths and consensus matrices generated from clustering solutions with 2 to 10 groups.
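The gene filtering steps that precede the NMF analysis can be sketched as follows. This is an illustrative sketch rather than the code used in the thesis: the function `select_variable_genes` and its parameters are hypothetical, the input is assumed to contain only protein-coding genes, and the population standard deviation is assumed for the coefficient of variation.

```python
import math
from statistics import mean, pstdev


def select_variable_genes(expr, noise=1.0, max_low_frac=0.75, top_frac=0.25):
    """Return the most variably expressed genes, most variable first.

    expr maps a gene name to a list of RPKM values, one per sample.
    Assumption: expr has already been restricted to protein-coding genes.
    """
    cv_by_gene = {}
    for gene, values in expr.items():
        # Step 1: drop genes below the noise threshold in more than 75% of samples.
        low_frac = sum(v < noise for v in values) / len(values)
        if low_frac > max_low_frac:
            continue
        # Step 2: require a mean RPKM above the noise threshold.
        mu = mean(values)
        if mu <= noise:
            continue
        # Coefficient of variation (population standard deviation assumed).
        cv_by_gene[gene] = pstdev(values) / mu
    ranked = sorted(cv_by_gene, key=cv_by_gene.get, reverse=True)
    n_top = math.ceil(len(ranked) * top_frac)
    return ranked[:n_top]


# Hypothetical expression table with five samples per gene.
toy_expr = {
    "low": [0.0, 0.0, 0.0, 0.0, 9.0],
    "flat": [10, 10, 10, 10, 10],
    "mild": [8, 10, 12, 10, 10],
    "mid": [5, 10, 15, 10, 10],
    "wild": [1, 2, 30, 4, 13],
}
top_genes = select_variable_genes(toy_expr)
```

Here the gene named "low" is removed by the noise filter, and of the four remaining genes only the most variable one is retained by the 25% cut.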
To evaluate consistency of the two-group result obtained using NMF, the same genes were used to perform unsupervised hierarchical clustering analysis and Principal Component Analysis (PCA). Unsupervised hierarchical clustering was performed based on z-scores of log2-transformed gene expression values (RPKM) using complete linkage and the Spearman correlation coefficient as the distance metric (the hclust function in R; version 3.2.0). For visualization, the heatmap.2 function in the gplots R package (version 2.16.0) was used. PCA was performed using log2-transformed RPKM values and the FactoMineR R package (version 1.29). To determine if the two-group clustering solution was affected by differential presence of normal cells in bulk tissues, the mean tumour purity levels between the two groups were compared. No significant difference was observed (Welch’s t-test p-value = 0.097; median = 87.05% and 89.12% between Group 1 and Group 2 cases, respectively). To identify clinical covariates that were significantly associated with a gene expression subgroup, Fisher’s exact test was performed using the fisher.test function in R. Multiple hypotheses testing correction was applied using the Benjamini-Hochberg method implemented in the p.adjust function in R.

2.4.8 DNA methylation analyses

Methylation status for each aligned CpG was calculated using Bismark Methylation Extractor (v. 0.10.1; Krueger and Andrews, 2011) at a minimum of 3x coverage per site in a strand-specific manner (run-time parameters: -p, --no_overlap, --comprehensive, --bedGraph, --counts). Overlapping methylation calls from read_1 and read_2 were scored once. Cytosine coverage was computed using an in-house BAM2WIG Java program, and custom Perl scripts from the Martin Hirst laboratory at UBC (https://hirstlab.msl.ubc.ca) were used to calculate fractional methylation levels at CpGs with sequence coverage greater than 3.
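The fractional methylation calculation described above can be illustrated with a short sketch. It does not reproduce the Hirst laboratory Perl scripts; the function name, the input layout and the use of a strict "coverage greater than 3" cut-off are assumptions made for illustration.

```python
def fractional_methylation(calls, coverage_cutoff=3):
    """Compute per-CpG fractional methylation.

    calls maps a (chromosome, position) key to a pair of read counts:
    (methylated, unmethylated). Assumption: sites are retained only when
    their total coverage is strictly greater than coverage_cutoff.
    """
    levels = {}
    for site, (methylated, unmethylated) in calls.items():
        coverage = methylated + unmethylated
        if coverage > coverage_cutoff:
            levels[site] = methylated / coverage
    return levels


# Hypothetical CpG calls: the second site has coverage 3 and is dropped.
toy_calls = {
    ("chr1", 100): (3, 1),
    ("chr1", 200): (2, 1),
    ("chr1", 300): (0, 8),
}
toy_levels = fractional_methylation(toy_calls)
```

Each retained site receives a value between 0 (no methylated reads) and 1 (all reads methylated), which is the quantity averaged over regions in the subsequent clustering analyses.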
Unsupervised clustering was performed with the hclust function in R, using average regional methylation levels as well as variable methylation levels greater than 1.5 IQR. Annotations and coordinates for CpG islands were obtained from the UCSC database (downloaded in March 2013). To identify differentially methylated CpG island-associated gene promoters between subgroups, average CpG methylation levels in gene promoters (transcription start sites [TSS] + 2 kb) were compared using Welch’s t-test followed by multiple hypotheses testing correction using the Benjamini-Hochberg method (significance threshold at 0.05). Pathway enrichment analyses using genes with differentially methylated promoters were performed using DAVID (Dennis Jr et al., 2003).

2.4.9 ChIP-Seq data analyses

For both H3K27me3 and H3K27ac, peak enrichment was calculated using MACS2 (version 2.1.0; Zhang et al., 2008), with default parameters and a p-value cut-off of 0.01 to evaluate the significance of enriched ChIP regions (Roadmap Epigenomics Consortium et al., 2015). The ‘--broad’ option was used to analyze H3K27me3 datasets. To increase the specificity of peak calling, input DNA control samples were included, which were processed identically except for the immunoprecipitation step, which was excluded in the controls. One DNA input control per sample was generated and used to analyze H3K27me3 and H3K27ac data. Read densities were calculated by taking the number of reads within the estimated sequenced DNA fragment length determined by MACS2 for each ChIP-Seq sample (median estimated fragment length = 199 bp), and normalizing against the read densities profiled in a respective input DNA control sample. Fold enrichment values from MACS2 were used to represent normalized ChIP-Seq signals.
To identify gene promoters with differential H3K27me3 enrichment in MRT samples compared to normal samples, peaks that overlapped with promoter regions (TSS + 2 kb) were considered. For each gene promoter region, the sum of fold enrichment values of all H3K27me3 peaks within the promoter region was taken to represent the H3K27me3 signal for the promoter. The average H3K27me3 signal values between MRT (n = 35) and normal samples (comprising 1 normal kidney, 2 hESC lines, 2 NPC and 2 fetal brain samples) were compared. The top 200 gene promoters with the largest differences in H3K27me3 signals between MRT and normal samples were then selected for downstream analyses. Hierarchical clustering analysis was performed with complete linkage and Pearson correlation coefficients as the distance metric using the hclust function in R. Pathway enrichment analysis was carried out using the DAVID enrichment tool.

To identify large regions with high H3K27ac signals, i.e. regions often termed “super-enhancers”, H3K27ac peaks identified using MACS2 were used to represent constituent enhancers (“enhancers”) and identify super-enhancers as described by Hnisz et al. (2013). In brief, enhancers within 12.5 kb of each other were combined (“stitched”) into a single region. The sum of fold enrichment values of all constituent enhancers within a stitched region represented the fold enrichment level of a stitched enhancer. All enhancers were then ranked by fold enrichment values, and the H3K27ac enrichment values (y-axis) were plotted against the ranks (x-axis). This plot revealed an inflection point in the distribution of H3K27ac enrichment values, beyond which the enrichment values rapidly increased. The exact inflection point was determined as the point at which the difference of fold enrichment between two enhancers was greater than 1.
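The stitching and ranking procedure described above can be sketched in a few lines. This is a simplified illustration and not the implementation of Hnisz et al. (2013) or of this thesis. The function names are hypothetical, the 12.5 kb distance is assumed to be measured from the end of one peak to the start of the next, and the inflection point is assumed to be the first position in the ascending ranking at which consecutive enhancers differ by more than 1 in fold enrichment.

```python
def stitch_enhancers(peaks, max_gap=12_500):
    """Merge peaks on the same chromosome that lie within max_gap bp.

    peaks is a list of (chrom, start, end, fold_enrichment) tuples.
    The fold enrichment of a stitched region is the sum over its peaks.
    """
    stitched = []
    for chrom, start, end, fold in sorted(peaks):
        if stitched and stitched[-1][0] == chrom and start - stitched[-1][2] <= max_gap:
            _, prev_start, prev_end, prev_fold = stitched[-1]
            stitched[-1] = (chrom, prev_start, max(prev_end, end), prev_fold + fold)
        else:
            stitched.append((chrom, start, end, fold))
    return stitched


def beyond_inflection(stitched, min_step=1.0):
    """Return stitched regions ranked above the assumed inflection point."""
    ranked = sorted(stitched, key=lambda region: region[3])
    for i in range(1, len(ranked)):
        # Assumption: the first jump larger than min_step marks the inflection.
        if ranked[i][3] - ranked[i - 1][3] > min_step:
            return ranked[i:]
    return []


# Hypothetical H3K27ac peaks; the first two lie 8 kb apart and are stitched.
toy_peaks = [
    ("chr1", 1000, 2000, 2.0),
    ("chr1", 10000, 11000, 3.0),
    ("chr1", 50000, 51000, 1.5),
    ("chr2", 500, 900, 1.2),
    ("chr2", 100000, 101000, 1.9),
]
toy_stitched = stitch_enhancers(toy_peaks)
toy_candidates = beyond_inflection(toy_stitched)
```

In this toy input only the stitched chr1 region, with a summed fold enrichment of 5.0, lies beyond the jump in the ranked values.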
Enhancers beyond the inflection point were considered as super-enhancers. MRT-specific super-enhancers were identified as those that were present in five or more MRT samples (out of 10 samples with H3K27ac ChIP-Seq data) and were absent in all normal comparators (3 hESC lines and 1 fetal brain sample). The intersectBed tool (Quinlan and Hall, 2010) was used to determine overlapping peak regions. To identify functional enrichments of genes associated with enhancers, genes within 20 kb upstream and downstream of an enhancer region were considered as enhancer-associated genes, and used for pathway enrichment analyses using DAVID. Results were visualized using the Enrichment Map plugin (version 2.01; Merico et al., 2010) in Cytoscape 3 (Shannon et al., 2003), with an FDR threshold of 0.001 and an overlap coefficient of 0.6.

2.4.10 Annotation of Tumour Suppressor Genes (TSG) and Oncogenes

To identify cancer-relevant genes in this study, I annotated my gene sets with oncogene and tumour suppressor gene designations based on a curated list of bona fide and putative oncogenes and tumour suppressor genes from the following public resources and published literature:
1. Annotated genes that were causally implicated in cancer from the COSMIC Cancer Gene Census database (Futreal et al., 2004)
2. Tumour suppressor genes, oncogenes and translocated cancer genes in the Molecular Signature Database (MSigDB) from the Gene Set Enrichment Analysis website (Subramanian et al., 2005)
3. Tumour suppressor genes from the TSGene database (Zhao, Sun and Zhao, 2013)
4. Oncogenes and tumour suppressor genes compiled in a review article by Vogelstein et al. (2013)
5. Putative oncogenes, putative tumour suppressor genes and recurrently mutated genes from the Cancer Drivers Database based on the Cancer Genome Atlas (TCGA) datasets (Rubio-Perez et al., 2015)
6.
Manually curated list of bona fide oncogenes from the CancerQuest Oncogene table (https://www.cancerquest.org/cancer-biology/cancer-genes)

Gene lists were concatenated and manually summarized to produce unique entries for each gene. Individual genes can have evidence for both tumour-suppressing and oncogenic functions. Gene annotation types included “true” or “putative” oncogenes and tumour suppressor genes, or “recurrently somatically mutated genes” based on the level of evidence available. In brief, genes that were manually curated and identified in more than one source were considered as true oncogenes and tumour suppressor genes, while genes that were listed in only one source based on a computational analysis were considered as putative ones. Genes in the Cancer Drivers Database without strong support for either an oncogenic or tumour suppressor role were classified as recurrently somatically mutated genes.

CHAPTER 3. Identification and analyses of RT molecular subgroups revealed tumours with increased CD8+ cytotoxic T cell infiltration, and similarities between extra-cranial MRTs and cranial ATRT-MYC³

3.1 INTRODUCTION

Previous studies of extra-cranial MRTs and cranial ATRTs revealed that both entities exhibited stable and diploid genomes that harboured very few somatic mutations (0.2 mutations per Mb; Lee et al., 2012; Chun et al., 2016), and were primarily driven by biallelic inactivation of SMARCB1. Gene expression comparisons between MRTs and ATRTs or other tumours revealed several pathways and genes that appeared to be commonly dysregulated in MRTs and ATRTs. In particular, dysregulation of certain early developmental pathways was consistently observed in MRTs and ATRTs, such as pathways involved in neural crest and neuron differentiation (Torchia et al., 2015), or embryonic stem cell development (Gadd et al., 2010; Chun et al., 2016).
Despite being uniformly driven by SMARCB1/SMARCA4 loss, both MRTs and ATRTs exhibited extensive levels of molecular heterogeneity, with multiple subgroups reported across several previous studies (Birks et al., 2011; Chun et al., 2016; Johann et al., 2016; Torchia et al., 2016; Nemes and Frühwald, 2018; Pinto et al., 2018). From these studies, some genes and pathways have emerged as being commonly dysregulated in extra-cranial MRT and ATRT subgroups, such as HOX genes and other homeobox-containing genes and genes involved in neural or neural-crest development. While findings from these studies suggested potential similarities among some subgroups of RTs from different anatomical sites, direct comparisons of genomic, gene expression and epigenetic landscapes amongst RTs from different anatomical sites had been lacking, limiting our understanding of the extent to which they are similar or different. I was motivated to perform such comparative analyses to address this knowledge gap, and hypothesized that RTs consist of molecular subgroups with distinct biological characteristics regardless of anatomical sites of occurrence. I also hypothesized that characterization of RT subgroups would enable identification of molecular features that have subgroup-specific therapeutic implications.

³ A version of this Chapter has been published, and the author contributions are provided in the Preface as per the University of British Columbia PhD thesis guidelines: H-J. E. Chun et al. Identification and analyses of extra-cranial and cranial rhabdoid tumor molecular subgroups reveal tumors with cytotoxic T cell infiltration. Cell Reports (2019) 29(8): 2338-2354. https://doi.org/10.1016/j.celrep.2019.10.013. Copyright by Elsevier Inc.
3.2 RESULTS

To facilitate direct comparisons across RTs from multiple anatomical sites, I amassed multi-omic datasets from 301 RTs from the brain (n = 161), kidney (n = 92) and other soft tissues (n = 44; 4 cases with unknown tissue types). I combined previously published MRT and ATRT datasets generated from 40 MRT cases at BCGSC (data described in Chapter 2; Chun et al., 2016) and 150 ATRT cases from the German Cancer Research Center (DKFZ; Johann et al., 2016). In addition, my colleagues at BCGSC and collaborators at DKFZ generated new data from an additional 100 MRT and 11 ATRT cases. In total, I analyzed 74 cases with whole genome sequence data (WGS; n = 56 MRT and 18 ATRT tumour-normal pairs), 90 cases with transcriptome sequence data (RNA-Seq; n = 65 MRT and 25 ATRT), 86 cases with whole genome bisulfite sequence data (WGBS; n = 69 MRT and 17 ATRT), 49 cases with H3K27me3 ChIP-Seq data (n = 35 MRT and 14 ATRT), 48 cases with H3K27ac ChIP-Seq data (n = 34 MRT and 14 ATRT) and 301 cases with DNA methylation array data (n = 140 MRT and 161 ATRT; generated on Infinium Human Methylation 450K or Infinium MethylationEPIC 850K platform; Supplemental Table 3.1).

3.2.1 ATRT-MYC and MRT exhibit similar DNA methylation profiles compared to ATRT-SHH and ATRT-TYR

DNA methylation data have been analyzed to robustly distinguish different entities of cancers within the same organ site, e.g. multiple cancer types from the central nervous system (Capper et al., 2018), as well as to identify novel subgroups within the same cancer type (Sturm et al., 2012; Cancer Genome Atlas Research Network, 2014). DNA methylation profiling was also shown to yield robust subgroup clustering results using DNA from formalin-fixed paraffin-embedded materials, results that were reproduced using DNA from fresh-frozen materials (Hovestadt et al., 2013). Since my combined DNA methylation datasets were derived using multiple profiling methods (i.e.
450K and 850K arrays, and WGBS) from multiple data source sites, I first evaluated the robustness of DNA methylation profiles against batch effects, which may arise from the use of different technologies and data source sites, by comparing DNA methylation profiles from the same cell line generated by two independent centres. I observed a highly linear correlation between them (Pearson rho = 0.958; Figure 3.1). Second, to further determine the extent of reproducibility of DNA methylation profiles generated using different technologies, I compared DNA methylation profiles generated using WGBS and Infinium 450K Human Methylation array platforms from 17 TCGA tumour cases and 14 ATRT cases, and performed unsupervised clustering analyses. The analyses consistently showed clustering by cases, and also resulted in robust identification of known molecular subgroups (Figures 3.2A and 3.2B), indicating that DNA methylation profiles from WGBS and arrays are directly comparable.

Figure 3.1. Comparison of DNA methylation data from H9 hESC generated using sequencing and array platforms. (A) Significant linear correlation was observed between DNA methylation profiles generated using 450K Illumina Infinium arrays at two independent research centres, i.e. Korea Advanced Institute of Science and Technology (KAIST) and University of California San Diego (UCSD; regression rho = 0.958). (B) Significant linear correlation was observed between DNA methylation profiles generated using WGBS at BCGSC and using 450K methylation arrays at UCSD (regression rho = 0.846). Red lines indicate y = x.

Figure 3.2. Comparison of DNA methylation data of primary tumour samples generated using sequencing and array platforms. (A) Unsupervised hierarchical clustering of CpG methylation profiles of 17 TCGA samples that underwent both WGBS and 450K Illumina Infinium array experiments.
(B) Unsupervised hierarchical clustering of CpG methylation profiles of 14 ATRT samples that underwent both WGBS and 450K Illumina Infinium array experiments.

Upon determining the robustness of signals of DNA methylation profiles, I performed comparative analyses of DNA methylation data from RTs and other paediatric and adult cancer types to identify common methylation profiles that may allude to aspects of shared biology between RTs and other cancer types. By combining the data generated using multiple DNA methylation profiling technologies (i.e. WGBS, Illumina 450K and 850K arrays), I analyzed 301 RT samples and the samples of other cancer types from publicly available DNA methylation data. I compared methylation profiles of RTs to 33 adult and 4 paediatric cancer types, 23 normal tissue types from the TCGA and TARGET consortia (n = 10,232 cases) and two SMARCB1-deficient paediatric chordoma cases (from the BCGSC Personalized Oncogenomics (POG) program; Pleasance et al., 2020). First, I performed unsupervised hierarchical clustering of RT cases and cases of other cancer types and normal cell types represented by the median methylation values at the CpG sites (Figure 3.3A). Next, I clustered 10,234 cases using t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) dimension reduction analyses based on the top 10,000 most variably methylated CpG sites (Figures 3.3B and 3.3C). The clustering results indicated that RTs were more similar to cancers of neural crest origin (neuroblastomas, uveal melanomas, pheochromocytomas and paragangliomas), brain cancers (glioblastomas, low-grade gliomas) and normal brain tissues, consistent with our previous result reported for miRNA profiles (Chun et al., 2016). Furthermore, MRTs and ATRTs consistently clustered together in a separate cluster from other cancer types, indicating that they are most similar to one another.
To determine the extent of molecular heterogeneity in RTs from different anatomical sites (e.g. the brain, kidneys, soft-tissues, spine, liver), my DKFZ collaborator and I analyzed DNA methylation profiles generated from 301 RT cases to reveal underlying clusters, using t-SNE and unsupervised hierarchical clustering, respectively. Results from multiple algorithms consistently yielded clusters that substantiated the previous observation of three distinct clusters within ATRTs (Johann et al., 2016). ATRT-SHH and ATRT-TYR subgroups formed separate clusters distinct from ATRT-MYC, which clustered together with extra-cranial MRTs, indicating molecular similarities between ATRT-MYC and MRTs relative to the other ATRT subgroups (Figure 3.4).

Figure 3.3. Comparison of DNA methylation profiles from MRTs and ATRTs to other paediatric and adult cancer types revealed that RTs are distinct from other brain and kidney cancer types and show similarities to cancers originating from neural-crest-derived cell types, brain cancers and normal brain tissues. (A) Unsupervised hierarchical clustering of 301 MRT and ATRT cases, 9,757 TCGA adult cancer cases and 475 TARGET paediatric cancer cases, altogether representing 33 different tumour types and 23 normal tissue types. Clustering analysis was performed using DNA methylation array data. Median methylation levels were used to represent each tumour and normal tissue type. The top 8,000 most variable median values were used for the analysis. A suffix, “_TUM”, indicates an adult tumour case, whereas “_Tum” denotes a paediatric tumour case. “_NORM” indicates a matched adult normal case, whereas “_Norm” indicates a paediatric normal case.
Unsupervised UMAP (B) and t-SNE (C) dimension reduction analyses were performed on MRTs, SMARCB1-deficient chordomas, normal cell types, and TCGA and TARGET cancer cases (n = 10,234 cases) using the top 10,000 most variably methylated CpG sites. Disease abbreviations are listed in the Methods.

Figure 3.4. Unsupervised clustering and dimension reduction analyses of DNA methylation data from 140 MRTs (92 renal, 48 extra-renal) and 161 ATRTs reveal similarity between MRTs and ATRT-MYC. (A) t-SNE analysis was performed using the top 2,000 most variably methylated CpG sites and revealed three separate clusters that consisted primarily of ATRT-SHH, of ATRT-TYR, and of ATRT-MYC and MRT, respectively. (B) Unsupervised hierarchical clustering was performed using the top 1% most variably methylated CpG sites (n = 3,958) and yielded a clustering result consistent with (A).

3.2.2 ATRT-MYC and MRT cases occupy three DNA methylation subgroups that are associated with anatomical sites of occurrence and SMARCB1 mutation patterns

DNA methylation analyses of 301 RT cases produced a cluster of ATRT-MYC and extra-cranial MRTs distinct from ATRT-SHH and ATRT-TYR clusters (Figure 3.4). I performed NMF analysis (Gaujoux and Seoighe, 2010), which further revealed separation of the group of ATRT-MYC and MRT cases into three subgroups (Groups 1, 3 and 4; Figures 3.5A and 3.5B), consistent with hierarchical clustering, UMAP and t-SNE results (Figures 3.5C and 3.5D). NMF cophenetic coefficients and silhouette widths indicated that Group 1 could be further separated into two subgroups (Figures 3.5E-G). However, I could not detect any molecular or clinical correlate associated with the two additional subgroups, except increased DNA methylation age associated with one of the two groups (Horvath, 2013). Lack of obvious molecular or clinical correlates that support the existence of two subgroups within Group 1 led me to focus my analyses on five DNA-methylation subgroups, i.e.
the previously defined ATRT-SHH and -TYR (Johann et al., 2016), and three novel subgroups (Groups 1, 3 and 4) containing ATRT-MYC and MRT cases (Figure 3.5A). Group 1 (“ATRT-MYC-like”; n = 67) contained a mixture of extra-cranial MRT and ATRT cases that were predominantly part of the ATRT-MYC subgroup. Group 1 consisted of 32 cranial ATRT cases (31 of them were previously classified as ATRT-MYC (Johann et al., 2016)) and 35 extra-cranial MRT cases (19 from kidneys, 12 from non-kidney tissues, 4 from unknown tissue types). Group 2 (“ATRT-TYR-like”; n = 58) mostly consisted of ATRT-TYR cases (51 ATRT-TYR cases, 5 ATRT-MYC cases and 2 ATRT-SHH cases). Group 3 (“RTK-like”; n = 59) was dominated by RT of the kidneys (RTK), consisting of 2 ATRT and 57 MRT cases, of which 53 cases were RTKs. Group 4 (“extra renal MRT-like”; n = 59) mostly contained extra-renal MRTs (48 MRT cases, which included 28 cases from non-kidney tissues) but also contained 11 ATRT cases (6 ATRT-MYC, 4 ATRT-SHH and 1 ATRT-TYR). Group 5 (“ATRT-SHH-like”; n = 58) mostly consisted of ATRT-SHH cases (57 ATRT-SHH cases and 1 ATRT-TYR case). Group 1 and ATRT-SHH exhibited increased DNA methylation age compared to other subgroups (Wilcoxon p-values = 6.30e-10 and 1.62e-05, respectively). Subgroups were not significantly associated with sex (Fisher’s exact p-values between 0.16 and 0.86) or chronological age (Kruskal-Wallis p-value = 0.25).

Figure 3.5. Five DNA methylation subgroups of RTs from cranial and extra-cranial sites correlate with previously known ATRT and MRT subgroups and anatomical sites. (A) Unsupervised NMF analysis was performed using the top 10,000 most variably methylated CpG sites, which revealed five subgroups (top).
Clinical factors and gene expression subgroups of MRTs and previously characterized ATRT subgroups are shown in coloured tracks (middle). Chronological age and predicted DNA methylation age are shown in bar plots (bottom). (B) Cophenetic coefficients and silhouette widths for NMF clusters from k = 2 to k = 15 are shown. The highest cophenetic coefficients and silhouette widths were from the NMF solutions with 5 and 6 clusters. Unsupervised t-SNE (C) and UMAP (D) analyses showed results consistent with (A). Unsupervised NMF analysis was performed using the top 10,000 most variably methylated CpG sites from 140 MRT and 44 ATRT-MYC samples, which revealed subgroups consistent with the five RT subgroups, with the most robust clustering solutions at k = 3 (E) and k = 4 (G). Clinical information and previously described ATRT subgroups and MRT gene expression subgroups are shown in coloured tracks (as in A). Cophenetic coefficients and silhouette widths indicating the robustness of NMF clustering from k = 2 to k = 15 are shown in (F).

To explore how the novel DNA-methylation subgroups might be related to previously characterized MRT subgroups, I compared MRT cases in Groups 1, 3 and 4 against cases in the MRT gene expression subgroups described in Chun et al. (2016), and observed a significant overlap between Group 3 and the RTK-like gene expression subgroup from Chun et al. (2016) (16 out of 18 cases (89%); Fisher’s exact p-value = 0.0070). I analyzed RNA-Seq data generated from 65 MRT cases (40 cases from Chun et al. (2016) and an additional 25 MRT cases) and performed an NMF analysis, which yielded two gene expression subgroups (Figures 3.6A and 3.6B). I again observed a significant association between Group 3 and the gene expression subgroup that exclusively consisted of RTKs (24 out of 25 cases (96%); Fisher’s exact p-value = 1.07e-05; Figure 3.6A).
To further compare the five subgroups against the previously described ATRT subgroups reported in Torchia et al. (2016), i.e. Group 1, Group 2A and Group 2B, I obtained DNA methylation array data generated from 131 ATRT cases published by Torchia et al. (2016). I then performed unsupervised hierarchical clustering and NMF analyses using 10,000 of the most variably methylated CpGs across the combined sample sets (n = 432 cases; Figure 3.7). Both NMF and hierarchical clustering consistently supported the existence of five DNA-methylation subgroups, revealing that Group 1, Group 2A and Group 2B in Torchia et al. (2016) were identical to ATRT-SHH, ATRT-TYR and ATRT-MYC in Johann et al. (2016), respectively. In addition, my analysis further revealed two distinct groups within ATRT-SHH, which were separated based on the regions of the brain, i.e. supratentorial and infratentorial regions. Taken together, my analyses support the existence of five DNA methylation subgroups of RTs across different anatomical sites, which include known subgroups consistent with previous studies, as well as the subgroups that have not been previously identified.

Figure 3.6. NMF analyses of gene expression data revealed subgroups consistent with DNA-methylation subgroups. (A) NMF consensus matrix from clustering RNA-Seq data of 65 MRT cases (top). Clinical correlates are shown in the bottom panel. Gene expression subgroup 1 contained all extra-renal MRT cases, while gene expression subgroup 2 exclusively contained RTK cases. Metrics indicating the robustness of clustering solutions from k = 2 to k = 15 are shown in (B).

Figure 3.7. NMF analyses of DNA methylation profiles from 432 RT cases confirmed ATRT subgroups characterized in multiple previous studies. Unsupervised NMF analysis was performed using the 10,000 most variably methylated CpG sites from 140 MRT cases (92 renal, 48 extra-renal cases; Chun et al. (2016) and Chun et al. (2019)), 161 ATRT cases from Johann et al.
(2016) and 131 ATRT cases from Torchia et al. (2015). Coloured tracks (bottom) show clinical factors and previously characterized subgroups of MRTs and ATRTs, with “BCCA-MRT GExp Subgroup” indicating MRT gene expression subgroups previously characterized by Chun et al. (2016) and “DKFZ-ATRT DNAm Subgroup” indicating ATRT DNA methylation subgroups by Johann et al. (2016), respectively. “SickKids-ATRT DNAm Subgroup” and “SickKids-ATRT GExp Subgroup” indicate ATRT DNA methylation and gene expression subgroups by Torchia et al. (2015), respectively.

To investigate whether the five DNA methylation subgroups were associated with genetic alterations, I analyzed somatic mutations using whole genome sequencing (WGS) data from tumour and matched normal pairs (56 MRTs and 18 ATRT cases). I observed that Group 3 and ATRT-SHH almost exclusively contained cases with somatic nonsense mutations (10 out of 12 cases) or focal deletions of SMARCB1 (11 out of 13 cases; Figure 3.8A). Group 1 was significantly associated with broad homozygous deletions at the SMARCB1 locus (Figure 3.8B). Of 26 cases with SMARCB1 deletions larger than 10 kilobases (kb), 14 cases were in Group 1 (Fisher’s exact p-value = 2.12e-08). To corroborate this observation based on WGS data, my DKFZ collaborator extended SMARCB1 copy number analyses to 301 cases, making use of the DNA methylation array data to infer copy number status, and detected regions with copy number alterations by using the sum of methylated and unmethylated signals across chromosome 22 (Sturm et al., 2012). This analysis consistently showed more Group 1 cases harbouring large deletions at the SMARCB1 locus compared to cases in other subgroups (Figure 3.8B).
As expected, genes co-deleted with SMARCB1 (74 genes; Supplemental Table 3.2) were significantly under-expressed in Group 1 compared to the other subgroups that did not harbour deletions (Wilcoxon p-value = 2.59e-07; Figure 3.9). These genes included CABIN1 (a regulator of p53 and T cell receptor signalling; Sun et al., 1998; Jang et al., 2009), SUSD2 (a tumour suppressor gene involved in G1 cell cycle arrest; Cheng et al., 2016), SPECC1L (a regulator of craniofacial morphogenesis and cranial neural crest cell delamination; Wilson et al., 2016), PIWIL3 (encodes a member of the PIWI subfamily of Argonaute proteins that mediate the repression of transposable elements during meiosis; Sasaki et al., 2003) and MIF (macrophage migration inhibitory factor, involved in cell-mediated immunity and inflammation; Lue et al., 2002). The association between SMARCB1 deletions and DNA methylation Group 1 is compatible with the notion that dysregulation of multiple genes in addition to SMARCB1 may contribute to molecular subgroup formation.

[Figure 3.8A panel labels. Subgroups: Group 1, Group 3, Group 4, ATRT-SHH, ATRT-TYR, ATRT-MYC. Genes: UPB1, TCERG1L, SPECC1L, SLC25A24, RNGTT, PDE4D, MINPP1, DMD, COL4A3BP, CDCA4, BCR, CABIN1, VAV3, RELN, PCDH15, KCNJ12, KALRN, CNTN5, ABHD2, PTPRD, HSF5, AVL9, SMARCB1, XPR1, WDR70, OSBPL8, MGLL, SLC25A21, MIPOL1, PTPRM. Mutation types: stop gained, frame shift, missense, start gained, stop lost, splice site donor/acceptor, codon insertion/deletion, fusion-duplication, fusion-deletion, fusion-translocation, fusion-inversion.]

Figure 3.8 SMARCB1 mutation types are associated with DNA-methylation subgroups in RTs. (A) Oncoprints show recurrently mutated genes across RT subgroups. Different mutation types are indicated as coloured boxes for MRT and ATRT cases. (B) Copy number states at the SMARCB1 locus are shown in the plot. Samples are organized by the five RT subgroups, as indicated by coloured bars on the left. Dark blue indicates regions with two-copy loss, while light blue indicates one-copy loss.
Regions with copy number gains are shown in shades of red, with darker shades indicating greater copy number amplification. The dotted vertical line indicates the location of the SMARCB1 gene.

Figure 3.9 RT Group 1 cases have broad deletions at the SMARCB1 locus, affecting expression of nearby genes. (A) Heat map indicates chromosomal copy gain or loss, estimated using DNA methylation data, centered at the SMARCB1 locus across the five DNA methylation subgroups. (B) Boxplots show the mean expression levels of 55 genes (left) and expression levels of MIF (right), which were co-deleted with SMARCB1. * indicates significant under-expression of genes in Group 1 compared to other RT subgroups (Wilcoxon p-value < 0.05).

[Figure 3.9 panel labels. (A) Subgroup tracks with the SMARCB1 locus marked; colour scale from -1.0 to +1.0 (median absolute deviation). (B) “Expression of MIF that was co-deleted with SMARCB1” (y-axis: RPKM) and “Expression of all genes co-deleted with SMARCB1” (y-axis: mean RPKM), each shown for Group 1, Group 3, Group 4, ATRT-SHH and ATRT-TYR.]

3.2.3 ATRT-MYC and MRT exhibit DNA methylation profiles that distinguish them from ATRT-SHH and ATRT-TYR cases

Previous studies highlighted genome-wide hypermethylation as one of the most prominent epigenetic features of ATRT-SHH and ATRT-TYR, while ATRT-MYC exhibited global hypomethylation compared to other ATRT subgroups (Johann et al., 2016). In contrast, no evidence for global hypermethylation was found in extra-cranial MRTs (Chapter 2; Chun et al., 2016). I investigated whether the molecular similarities observed between ATRT-MYC and extra-cranial MRTs would be further supported by comparison of genome-wide DNA methylation levels.
To directly compare global DNA methylation levels across extra-cranial MRTs and ATRTs, my DKFZ collaborator and I analyzed whole genome bisulfite sequencing (WGBS) data from 69 MRT and 15 ATRT cases and DNA methylation array data from 301 cases, respectively. MRT and ATRT-MYC cases exhibited global DNA methylation levels that were significantly lower than those of ATRT-SHH and ATRT-TYR (Wilcoxon p-value = 2.2e-07; Figure 3.10A), but were comparable to those of normal brain tissues (from 8 adult and 2 fetal brain samples; Wilcoxon p-values = 0.050 and 0.524, respectively; Figure 3.10C). MRTs further exhibited significantly lower methylation levels in intronic and non-genic regions compared to normal brain samples (Wilcoxon p-values = 0.011 and 8.26e-05, respectively; Figures 3.10D and 3.10E). A previous study by Johann et al. (2016) showed that global hypomethylation in ATRT-MYC compared to other ATRT subgroups was linked to the prevalence of partially methylated domains (PMDs), which are associated with repressed chromatin and heterochromatin (Schroeder et al., 2013). My collaborator found that PMDs were significantly more abundant in MRTs compared to ATRT-SHH and ATRT-TYR (Wilcoxon p-value = 0.00014; Figure 3.10B). In particular, MRTs in Groups 1 and 3 exhibited global hypomethylation associated with higher PMD fractions compared to ATRT-SHH and ATRT-TYR (Wilcoxon p-value = 1.65e-07), whereas MRTs in Group 4 exhibited PMD fractions that were comparable to those of ATRT-SHH and ATRT-TYR (Figures 3.10F and 3.10G). This observation indicated that while global hypomethylation is an epigenetic feature that is characteristic of most MRTs, Group 4 appears to have a distinct DNA methylation landscape.

Figure 3.10 MRT and ATRT-MYC show similar DNA methylation profiles compared to ATRT-SHH and ATRT-TYR. (A) Boxplot shows the distribution of mean genome-wide WGBS DNA methylation levels in MRTs and ATRT subgroups (* Wilcoxon p-value < 0.05; n.s. = not significant).
(B) Boxplot shows the distribution of fractions of a genome covered by partially methylated domains (PMDs) in MRT and ATRT using WGBS data, which revealed significantly more PMDs in MRTs than in ATRT-SHH and ATRT-TYR. Boxplots show distributions of median methylation levels of all CpG sites targeted by the arrays across the genome (C), CpG sites in annotated intronic regions (D) and non-genic regions (E). DNA methylation levels were profiled for 301 RT cases and 10 normal brain cases using Illumina 450K and 850K arrays. (F) Boxplot shows mean global DNA methylation levels based on WGBS data across five RT subgroups. (G) Boxplot shows the fraction of a genome covered by PMDs across the five RT subgroups.

To characterize biological processes that are likely dysregulated by differential DNA methylation across the subgroups, my collaborator identified differentially methylated regions (DMRs, average length = 1 kb) for each subgroup, and performed gene set enrichment analyses using genes within subgroup-specific DMRs. Genes in Group-1-specific DMRs were enriched for immune-related pathways that were specifically related to interleukin-1-associated proinflammatory activities. These genes, which included IRAK, Toll-like receptors (TLRs), TRAF6 and JNK (Figure 3.11A), are critical for initiating innate immune responses against foreign pathogens and for IRF7-associated pathways known to be activated upon viral infection (Vanpouille-Box et al., 2018). My collaborator and I also observed a significant enrichment of up-regulated genes in Group-1-specific DMRs that are involved in retinoic acid signalling (e.g. NCOR2, a transcriptional repressor implicated in hematological malignancies; Lin et al., 1998), a pathway that had not previously been linked to MRT or ATRT biology (Figure 3.11A; Supplemental Table 3.3).
For Group-3-specific DMRs, significant enrichments were observed for up-regulated genes involved in DNA excision repair, BMP signalling and pathways implicated in renal cell carcinoma development, consistent with RTK-like characteristics observed in this subgroup (Figure 3.11B). Genes associated with Group-4-specific DMRs were enriched for focal adhesion, FGFR signalling and NF-κB signalling, which is a key regulatory pathway for immune and inflammatory responses (Figure 3.11C; DiDonato, Mercurio, & Karin, 2012). These observations pointed to subgroup-specific epigenetic modulation of genes involved in known signalling pathways linked to cancers, including those that have been therapeutically exploited (e.g. FGFR and retinoic acid signalling), and of genes involved in inflammatory responses and immune functions, which was an unexpected observation for RTs.

Figure 3.11 Pathways enriched for genes that are within subgroup-specific differentially methylated regions (DMRs). Gene set enrichment of Group 1- (A), Group 3- (B) and Group 4-specific (C) DMRs. The x-axis indicates the significance of the enrichment test.

3.2.4 ATRT-MYC and MRT share distinctive enhancer landscapes compared to other ATRT subgroups

Previous studies have shown that the SWI/SNF complex plays key roles in enhancer maintenance and targeting, and that loss of SMARCB1 leads to aberrant SWI/SNF complex assembly, which, in turn, alters enhancer functions and target genes (Alver et al., 2017; Wang et al., 2017). Also, our previous studies (Chun et al., 2016; Johann et al., 2016) reported distinct super-enhancers that were associated with over-expression of key genes that characterized MRT and ATRT subgroups, further supporting the importance of enhancer dysregulation in shaping molecular landscapes of RTs. These findings led me to assess active enhancer states in ATRTs and MRTs using H3K27ac ChIP-Seq data from a total of 35 MRT and 14 ATRT cases.
To identify cases with similar H3K27ac profiles, my collaborator performed unsupervised hierarchical clustering of enhancer elements defined by H3K27ac signal densities (“peaks”) determined using MACS2 (Zhang et al., 2008). ATRT-MYC and MRT cases clustered together (Figure 3.12A), supporting the notion that ATRT-MYC and MRT share similar global enhancer profiles compared to the other ATRT subgroups. H3K27ac levels were also increased in subgroup-specific DMRs (Figure 3.12B), consistent with the notion of up-regulation of genes in these regions (as observed in NCOR2, mentioned in Section 3.2.3). To further characterize similarities in the enhancer landscapes between MRT and ATRT-MYC, my collaborator identified 25 high-density clusters of H3K27ac signals, indicative of super-enhancers that were common between ATRT-MYC and MRTs. The most prominent super-enhancer was identified at the HOXC locus (Figure 3.12D, Supplemental Table 3.4). The enhancer activity at this locus was further supported by significant over-expression of HOXC genes and the HOTAIR lncRNA gene in ATRT-MYC and MRT compared to ATRT-SHH and ATRT-TYR (Figure 3.12D; Wilcoxon p-values < 2.4e-15 for HOXC genes and DESeq adjusted p-value = 3.43e-05 for HOTAIR).
My collaborator also identified 61 regular enhancer elements that were specific to and common between ATRT-MYC and MRT (Supplemental Table 3.5), in the proximity of genes involved in epigenome modification and regulation of early development, such as CREBBP (encodes a histone acetyltransferase involved in embryonic development and growth control; Merk et al., 2018), PRDM6 (histone methyltransferase and transcriptional repressor involved in smooth muscle differentiation; Davis et al., 2006), TRERF1 (transcriptional regulator that interacts with CBP/p300) and TINAGL1 (encodes an antigen associated with tubulointerstitial nephritis; also involved in proliferation and migration of cranial neural crest cells; Neiswender, Navarre, Kozlowski, & LeMosy, 2017). To further explore enhancer-mediated transcriptional dysregulation, my collaborator calculated the enrichment of transcription factor binding sites (TFBS) within enhancer regions across MRTs, and compared these to the previously published TFBS enrichment data for ATRT-MYC, ATRT-SHH and ATRT-TYR to identify the transcription factors that could bind to enhancer regions common between ATRT-MYC and MRT. My collaborator then analyzed enrichment of TFBS within enhancer regions that were unique to MRT, ATRT-MYC, ATRT-SHH or ATRT-TYR by calculating enrichment scores based on observed and expected numbers of TF motifs found in combined enhancer regions for each subgroup. Unsupervised hierarchical clustering of TF motif enrichment scores showed clustering of ATRT-MYC and MRT, implying that common TFs could act on enhancers in ATRT-MYC and MRT (Figure 3.12C). TFs presumed to bind to such enhancer sites included those known to regulate mesoderm and neural crest development, e.g. HES7 and REST, which suppress neuronal transcription programs (Bessho et al., 2001; Bruce et al., 2004). Particularly interesting was the finding of immune-related TFs enriched within enhancer regions unique to ATRT-MYC and MRT.
Such immune-related TFs included XBP-1, a TLR-activated TF required for production of pro-inflammatory cytokines (Martinon et al., 2010), corroborating the DMR analysis results (Section 3.2.3), which indicated epigenetic dysregulation of genes involved in interleukin-1-mediated signalling. Moreover, in ATRT-MYC, I observed TFBS enrichments for TFs involved in apoptosis and immune regulation, such as GMEB1/2, RAD21, IRF5/8/9 and STAT1. IRF5/8/9 are involved in the induction of type I interferons (IFNs), inflammatory cytokines and MHC class I genes and hence promote immune responses involving CD8+ cytotoxic T cells, natural killer (NK) cells and other immune cells (Langlais, Barreiro and Gros, 2016; Nan et al., 2018). Likewise, STAT1 plays key roles in regulating the expression of multiple IFN target genes (Ivashkiv and Donlin, 2014). Overall, these observations indicated the unexpected possibility of immune modulation through epigenetic dysregulation in RTs.

[Figure 3.12 panel labels. (A) Samples from ATRT-TYR, ATRT-MYC, ATRT-SHH and MRT; colour scale from 0.0 to 8.0 RPKM. (B) H3K27ac signal (y-axis, 0.0 to 0.4) from -1000 bp to +1000 bp around the centre of Group 1-, Group 3- and Group 4-specific DMRs, shown for Group 1, Group 3, Group 4, ATRT-TYR and ATRT-SHH.]

Figure 3.12 ATRT-MYC and MRT exhibit enhancer profiles distinct from other ATRT subgroups. (A) Unsupervised clustering of H3K27ac ChIP-Seq read densities resulted in a cluster of ATRT-MYC and MRTs, indicated by green and purple bars, respectively. (B) Line plots show the H3K27ac signal densities of the five RT subgroups at Group 1-, Group 3- and Group 4-specific DMRs, respectively. Subgroup-specific DMRs showed the highest H3K27ac signal density levels in the respective subgroups. (C) Unsupervised hierarchical clustering using enrichment scores of TFBS at enhancers that are enriched in ATRT-MYC and MRT.
Heat map colours represent the log2 enrichment scores of TFs in ATRT-MYC and MRT-specific enhancers. Colours next to gene names indicate known biological processes associated with TFs. (D) Boxplot (top panel) shows expression levels of genes within (regions shaded in light purple) and adjacent to a super-enhancer at the HOXC locus, which includes the non-coding HOTAIR gene (bottom).

3.2.5 Immune-related genes, HOX genes and mesoderm development regulators are more highly expressed in ATRT-MYC and MRT compared to ATRT-SHH and ATRT-TYR, which instead expressed neural-like transcriptional profiles

Our DNA methylation and H3K27ac ChIP-Seq data indicated epigenetic dysregulation of TFs that could potentially modulate transcriptional networks and contribute to similarities observed between ATRT-MYC and MRT. To determine such transcriptional similarities, I compared gene expression profiles and identified 584 over-expressed genes and 2,500 under-expressed genes in ATRT-MYC and MRT compared to ATRT-SHH and ATRT-TYR (DESeq adjusted p-value < 0.05; Figure 3.13A). The most significantly over-expressed genes in ATRT-MYC and MRT were tissue-type-specific genes such as TCF21 (a mesoderm-specific TF), GCG and KERA (expressed specifically in pancreas and cornea, respectively), and developmental regulators of mesoderm-derived tissue types such as DMP1, EPYC and MEOX2 (involved in bone differentiation and vascular smooth muscle development, respectively). Notably, I observed significant over-expression of 26 HOX genes, with members from all HOX gene families, in ATRT-MYC and MRT (Figure 3.13A). In contrast, ATRT-SHH and ATRT-TYR exhibited over-expression of genes involved in development of the brain, sensory organs and neurons, e.g. SOX1, GPR98/ADGRV1, OTX2.
Pathway enrichment analyses of differentially expressed genes consistently revealed significant enrichments of developmental pathways for mesenchymal cell types and mesoderm-derived organs, such as the skeletal system, muscle structure and connective tissues for ATRT-MYC and MRT (Figure 3.13B), while the most significantly enriched pathways for ATRT-SHH and ATRT-TYR predominantly involved neuronal development (Figure 3.13C). Notably, in addition to developmental regulation-related pathways, ATRT-MYC and MRT exhibited significantly enriched immune-related pathways including regulation of immune system process (GO:0002682; BH-adjusted p-value = 1.40e-04) and innate immune response (GO:0045087; adjusted p-value = 0.050). In contrast, there was no immune-related process that was significantly enriched for over-expressed genes in the ATRT-SHH and ATRT-TYR subgroups. Together, these observations supported the notion that dysregulated transcriptional programs involved in early development, particularly those involved in mesodermal development, may be characteristic of ATRT-MYC and MRT, while the transcriptional programs involved in neuronal development may be characteristic of ATRT-SHH and ATRT-TYR biology. In particular, significant enrichments of immune-related pathways for over-expressed genes in ATRT-MYC and MRT corroborate the TFBS results that indicate an enrichment of immune-related TFBS in enhancers unique to ATRT-MYC and MRT, which is in turn consistent with the notion that ATRT-MYC and MRT may share more pronounced immune-related phenotypes compared to ATRT-SHH and ATRT-TYR.

To determine transcriptional characteristics that distinguish Groups 1, 3 and 4 from each other, I identified functional categories enriched for subgroup-specific differentially expressed genes and constructed Gene Ontology enrichment networks.
Networks of the most significantly enriched pathways for Group 1 included early developmental processes as well as ERK/MAPK signalling (Figure 3.13D, Supplemental Table 3.6). Group 3 networks also included early developmental processes, in addition to cell migration, cell adhesion and extra-cellular matrix organization (Figure 3.13E), while Group 4 networks predominantly consisted of immune-related categories including T cell activation, inflammatory responses, wound healing and NF-κB pathways (Figure 3.13F).

To further explore the association between RT subgroups and early developmental processes, I investigated which early progenitor cell types would most likely resemble RTs by correlating gene expression profiles of the RT subgroups to those of various progenitor or embryonic stem cell types profiled in published studies (Prescott et al., 2015; Roadmap Epigenomics Consortium et al., 2015; Chun et al., 2016). Group 1 showed higher correlations to CD56+ mesodermal progenitor cells compared to other subgroups, while Group 3 exhibited the highest correlation to embryonic stem cell lines (Figure 3.13G). In contrast, ATRT-SHH showed the highest correlation to cranial neural crest cells, neuronal progenitors and brain tissues, indicating that ATRT-SHH exhibited the most neural-like characteristics among the RT subgroups.

Figure 3.13 Dysregulation of mesenchymal development genes is observed in ATRT-MYC and MRT, while neural gene dysregulation is observed in ATRT-SHH and ATRT-TYR cases. (A) Volcano plot shows differentially expressed (DE) genes with multiple-hypotheses-adjusted p-values on the y-axis, and the fold change (FC) of gene expression in ATRT-MYC and MRT compared to other ATRT subgroups on the x-axis. The top 20 significantly DE genes, as well as significantly DE HOX genes and genes involved in neural or mesenchymal development, are labeled in colours as shown.
Bar plots show the most significantly enriched pathways for 584 over-expressed genes (B), and 2,500 under-expressed genes (C) in MRTs and ATRT-MYC compared to ATRT-SHH and ATRT-TYR. Enrichment map networks of Gene Ontology (GO) terms that are significantly enriched for Group 1- (D), Group 3- (E) and Group 4-specific (F) DE genes. Node size is proportional to the number of genes in the GO category and node colour indicates the BH-adjusted enrichment p-value. Edge thickness is proportional to the fraction of genes shared between GO terms. (G) Unsupervised hierarchical clustering that compared gene expression profiles of RTs against various progenitor cell types, i.e. hESC lines from the TARGET consortium, mesodermal progenitors (hESC-derived CD56+ mesoderm cultured cells), cranial neural crest cell cultures, neural progenitors (neurospheres derived from ganglionic eminence and cortex, germinal matrix), fetal brains, adult brains (brain cortex) and paediatric kidneys from the TARGET consortium. Pearson correlation coefficients were transformed to Z-scores to show the deviation from the mean for each cell type.

3.2.6 Gene expression analysis shows the potential of increased T cell presence in the tumour microenvironments of cranial ATRT-MYC and extra-cranial MRT cases

Our DNA methylation and enhancer analyses revealed the unexpected possibility of immune modulation through epigenetic dysregulation in ATRT-MYC and MRT. Gene expression analyses, which showed enrichment of over-expressed genes involved in immune and inflammatory responses, further substantiated distinct immune profiles in ATRT-MYC and MRT cases, especially those belonging to Group 4. To further identify gene expression signatures representative of each subgroup, I performed binary Random Forest classification on significantly differentially expressed genes for each subgroup.
Notably, high expression levels of AIM2 and UBD/FAT10 appeared to be unique to Group 4, and were identified as Group 4-classifying features (Figure 3.14). AIM2 is involved in innate immune responses through recognition of cytosolic double-stranded DNA, activation of the NF-κB pathway and formation of caspase-1-activating inflammasomes (Hornung et al., 2009; Vanpouille-Box et al., 2018), while UBD/FAT10 is also involved in promoting inflammation through activation of NF-κB-mediated pathways (Gong et al., 2010; Choi, Kim and Yoo, 2014). These results indicated the potential for increased immune modulation in tumours and immune activities in their microenvironments. However, these observations might have resulted from higher proportions of normal cells in the bulk tumour tissue samples. To determine whether the immune-related gene expression signals were genuinely derived from malignant cells, I performed differential gene expression and pathway enrichment analyses after removing samples with tumour purity levels below 75% (median of the cohort = 88%; Figure 3.15). I again observed the same immune-related pathway enrichments in Group 4, suggesting that the enrichment of immune-related signals did not simply reflect lower tumour purity. Taken together, my data indicated that RT subgroups might have an increased level of immune cell presence.

Figure 3.14 Subgroup-classifying genes in RTs. Heat map shows expression levels of the top 10 subgroup-classifying genes from a binary Random Forest classifier. The genes were ranked based on Gini scores that represent the overall discriminative values of genes to distinguish groups.

Figure 3.15 Tumour purity levels across RT subgroups. Boxplot shows the tumour purity levels estimated from whole genome sequence data using APOLLOH software.
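To make the Gini-based ranking mentioned in the caption of Figure 3.14 more concrete, the sketch below computes the decrease in Gini impurity achieved by the best single expression threshold for one gene in a binary (in-subgroup versus not) setting. This is a deliberate simplification and not the classifier used in this work: a Random Forest accumulates such decreases over many splits and many trees, whereas this example scores one split per gene. The gene names and expression values are hypothetical.

```python
def gini_impurity(labels):
    # Gini impurity of a list of 0/1 labels: 1 - p^2 - (1 - p)^2 = 2p(1 - p).
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_gini_decrease(expression, labels):
    # Largest impurity decrease over all thresholds between sorted
    # expression values; larger means the gene separates the groups better.
    n = len(labels)
    parent = gini_impurity(labels)
    pairs = sorted(zip(expression, labels))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini_impurity(left)
                    + len(right) * gini_impurity(right)) / n
        best = max(best, parent - weighted)
    return best


# HYPOTHETICAL data: 1 = sample belongs to the subgroup, 0 = it does not.
labels = [1, 1, 1, 0, 0, 0]
expression_by_gene = {
    'gene_a': [50.0, 60.0, 55.0, 1.0, 2.0, 3.0],   # separates the groups
    'gene_b': [1.0, 50.0, 2.0, 55.0, 3.0, 60.0],   # largely uninformative
}
ranking = sorted(
    expression_by_gene,
    key=lambda g: best_split_gini_decrease(expression_by_gene[g], labels),
    reverse=True,
)
print(ranking)
```

Ranking genes by this quantity favours genes whose expression cleanly splits one subgroup from the rest, which is the property that makes a gene useful as a classifying feature.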
To further investigate this possibility, I used CIBERSORT (Newman et al., 2015) to analyze RNA-Seq data, deconvoluting immune cell gene expression signatures and estimating the extent of immune cell presence. I observed a wide range of proportions of effector T cell types, and derived a quantitative measure that represented the overall immune-promoting T cell presence by calculating the sum of effector T cell proportions (Figures 3.16A and 3.16H). In particular, proportions of CD8+ cytotoxic T cells and tumour-associated M2 macrophages (Sica et al., 2006) were the highest among those of the 22 immune cell types profiled using CIBERSORT (Figure 3.16C), suggesting the involvement of both pro- and anti-tumoural immune functions in tumour microenvironments. Among cases that were predicted to have a high presence of CD8+ cytotoxic T cells in their microenvironment (i.e. those within the top 25th percentile of CD8+ cytotoxic T cell proportions estimated by CIBERSORT), I observed a significant over-representation of cases in Groups 1 and 4 (Fisher's exact p-values = 0.018 and 5.13e-03, respectively), while a significant under-representation was observed for cases in Group 3 and ATRT-SHH (Fisher's exact p-values = 2.13e-04 and 0.031, respectively; Figure 3.16B). It was notable that the two ATRT-TYR cases with the highest CD8+ cytotoxic T cell proportions exhibited high expression of TBXT (196.4 and 35.3 RPKM, median of the cohort = 0.0021 RPKM; Figure 3.16D), which encodes T-brachyury, an embryonic transcription factor that is required for mesoderm formation and differentiation and that has been linked to immune responses in chordomas that over-express this protein (Palena et al., 2007).

To gain insight into biological processes that might underlie increased immune cell presence in RT subgroups, I analyzed genes involved in aspects of T cell-mediated immune responses.
I found that nearly all HLA genes that encoded MHC class I and II (18 out of 19 genes) were significantly over-expressed in cases with CD8+ T cell proportions greater than the median (BH-adjusted DESeq p-values < 0.05; Figure 3.16E). Notably, the master transcription factors of MHC class I and II genes, NLRC5 and CIITA, were significantly over-expressed in these cases (adjusted DESeq p-values = 0.0001 and 0.0018, respectively), consistent with the increased expression of HLA genes. RNA-Seq analyses also showed that these cases exhibited significantly higher Shannon-Wiener index scores that represented increased T cell receptor (TCR) diversity (Welch's t-test p-value = 0.012; Figure 3.16I; Bolotin et al., 2015; Shugay et al., 2015), indicative of inflammatory microenvironments in these cases. The cases with increased CD8+ T cell proportions further exhibited significantly higher expression levels of several key genes involved in antigen presentation pathways (Figure 3.16E), including antigen degradation, processing and transportation. Such genes included PSMB8/9/10 (which encode components of the immunoproteasome), TAP1 (encodes a component of the transporter-associated-with-antigen-processing (TAP) complex) and B2M (encodes beta-2-microglobulin, the light chain of MHC class I). I also observed that genes involved in T cell activation, maintenance, homing and infiltration were significantly over-expressed in these cases (Figure 3.16F). Such genes included TNF and IFNG (involved in T cell activation), CXCL9/10 (encode chemokines that support and attract the influx of CD8+ T cells), and genes encoding perforins (PRF1) or granzymes (GZMA, GZMB), which are secreted by activated cytotoxic T cells.
I also observed significant over-expression of CLEC9A/DNGR-1 (DESeq adjusted p-value = 0.0062), which is known to be expressed in the CD8α+ antigen-presenting dendritic cells that are associated with tumour microenvironments infiltrated with T cells (Gajewski, Schreiber and Fu, 2013). Overall, these results indicated that RTs exhibiting high CD8+ T cell proportions might have tumour microenvironments with functionally active CD8+ cytotoxic T cells.

Next, I investigated how tumour cells in potentially inflamed microenvironments might elude elimination by the immune system, by analyzing genes involved in T cell inhibitory functions. I observed significant over-expression of IL10 and IL10RA, which encode a T cell inhibitory cytokine and its receptor, respectively. Also, several key immune checkpoint genes (e.g. PDCD1/PD1, CD274/PD-L1, HAVCR2/TIM3, LAG3) were significantly over-expressed in cases with high CD8+ T cell proportions (DESeq adjusted p-values < 0.05; Figure 3.16G). Furthermore, the Ras/ERK/MAP kinase pathway was significantly enriched for over-expressed genes in these cases (BH-adjusted p-value = 1.6e-04). This pathway is known to maintain clonal anergy, an immune tolerance mechanism by which lymphocytes become functionally inactivated following an antigen encounter (Schwartz, 2003). Taken together, these observations suggested that RTs with immune-cell-mediated pro-inflammatory microenvironments might evade the immune system via two different mechanisms: (1) by increasing the expression of immunosuppressive programs, or (2) by reducing the expression of antigen-presenting MHC complex components.

To understand whether the level of immune cell presence observed in RTs is comparable to that of other paediatric cancers that occur in similar anatomical sites, I analyzed gene expression data and compared T cell scores obtained using CIBERSORT in medulloblastomas (n = 105 cases) and Wilms tumours (n = 130 cases; Gadd et al., 2017) to those in RTs.
T cell scores in RTs were significantly higher in Groups 1, 4 and ATRT-TYR compared to medulloblastomas and Wilms tumours (Wilcoxon p-values < 0.05; Figure 3.16J), consistent with the hypothesis that effector T cell presence is higher in RTs than in these other paediatric cancers of the brain and kidney.

Figure 3.16 Gene expression analyses indicate increased T cell presence in RT subgroups containing MRT and ATRT-MYC cases. (A) Stacked bar plot shows CD8+ cytotoxic T cell proportions (yellow) and cumulative T cell scores (blue), based on the sum of absolute proportions of effector T cells (i.e. all T cell types except regulatory T cells (Treg) profiled using CIBERSORT). The samples (n = 90) are ordered based on CD8+ cytotoxic T cell proportions (and in all subsequent panels of Figure 3.16). The subgroup of each sample is indicated in (B). (C) Heat map shows absolute proportions of 22 immune cell types predicted using CIBERSORT. (D) Bar plot shows expression levels of the TBXT gene, which encodes T-brachyury. Heat maps indicate expression levels of genes involved in antigen presentation and processing (E), T cell activation and homing (F) and immunosuppressive signalling (G). All genes were significantly over-expressed in cases with CD8+ T cell proportions greater than the median of the cohort (FDR < 0.05, except for CTLA4 (FDR = 0.10)). (H) Stacked bar plot shows relative proportions of 22 immune cell types predicted using CIBERSORT. (I) Boxplot shows distributions of Shannon-Wiener index scores representing T cell receptor (TCR) diversity (higher Shannon-Wiener index scores indicate greater TCR diversity). (J) Boxplot shows T cell scores across the five RT subgroups, and paediatric medulloblastoma and Wilms tumour cases (* indicates Wilcoxon p-value < 0.05).
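The two summary measures used in Figure 3.16, the cumulative effector T cell score and the Shannon-Wiener index of TCR diversity, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the analysis code used in this work. The cell-type labels are assumed to follow the naming of the CIBERSORT LM22 signature matrix, the natural logarithm is assumed for the index, and the sample values and clonotype counts are hypothetical.

```python
from math import log

# ASSUMPTION: labels follow the CIBERSORT LM22 naming; regulatory T cells
# are excluded, as described in the caption of Figure 3.16A.
EFFECTOR_T_CELL_TYPES = (
    'T cells CD8',
    'T cells CD4 naive',
    'T cells CD4 memory resting',
    'T cells CD4 memory activated',
    'T cells follicular helper',
    'T cells gamma delta',
)


def effector_t_cell_score(proportions):
    # Sum of estimated proportions over all non-regulatory T cell types.
    return sum(proportions.get(t, 0.0) for t in EFFECTOR_T_CELL_TYPES)


def shannon_wiener_index(clonotype_counts):
    # H = -sum(p_i * ln(p_i)) over clonotypes with non-zero counts.
    total = sum(clonotype_counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log(c / total)
                for c in clonotype_counts if c > 0)


# HYPOTHETICAL deconvolution output for one sample.
sample = {
    'T cells CD8': 0.12,
    'T cells CD4 memory resting': 0.05,
    'T cells regulatory (Tregs)': 0.03,  # not counted towards the score
    'Macrophages M2': 0.20,
}
print('Effector T cell score:', effector_t_cell_score(sample))

# HYPOTHETICAL clonotype read counts: an even repertoire is more diverse
# than one dominated by a single clone.
print('Even repertoire:', shannon_wiener_index([10, 10, 10, 10]))
print('Skewed repertoire:', shannon_wiener_index([37, 1, 1, 1]))
```

With these definitions, a higher score reflects a larger estimated effector T cell fraction, and a higher index reflects a repertoire in which reads are spread more evenly over more clonotypes.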
3.2.7 Immunohistochemistry confirms increased CD8+ cytotoxic T cell infiltration and PD-L1 immune checkpoint expression in MRTs and ATRT-MYC

To validate immune cell presence predicted from my gene expression analyses and to orthogonally assess the extent of immune cell infiltration in RT tissues, I analyzed data from a multiplex immunohistochemistry (IHC) experiment on 185 tumour samples from 62 patients (n = 35 MRT cases [nine cases in Group 1, 20 cases in Group 3, six cases in Group 4] and 27 ATRT cases [ten cases in ATRT-MYC, ten cases in ATRT-SHH, seven cases in ATRT-TYR]) using an antibody panel to identify CD8+ cytotoxic T cells (CD3+CD8+), CD4+ helper T cells (CD3+CD8-) and myeloid cells including macrophages and microglia (CD68+). The multiplex IHC experiment also used an antibody panel to determine the expression of the immune checkpoint proteins PD1 and PD-L1. MRT samples were selected from among the cases that were profiled using RNA-Seq or DNA methylation array data, or both, while ATRT samples were from a separate cohort because tissue slides from profiled cases were not available. To comprehensively assess immune cell presence across tissues, I assessed immune infiltrates in three types of regions in tumour microenvironments (total number of regions profiled = 2,979; Supplemental Tables 3.7 and 3.8), i.e. tumour-rich regions away from necrosis (TT; n = 1,803 regions), peri-vascular regions (PV; n = 591 regions) and peri-stromal regions at the interface of benign or normal tissues (PS; n = 585 regions). IHC data showed significantly more tumour-infiltrating CD3+ lymphocytes and CD3+CD8+ cytotoxic T cells in MRTs and ATRT-MYC compared to ATRT-SHH and ATRT-TYR in all regions of the tumour microenvironment (Wilcoxon p-value < 2.2e-16; Figures 3.17A and 3.17B).
Overall tumour-infiltrating CD3+ lymphocyte densities determined using IHC were consistent with predicted effector T cell scores, exhibiting a significant linear correlation between them (linear regression p-value = 0.0025, Pearson rho = 0.540; Figure 3.17C). CD3+CD8+ cytotoxic T cell infiltration levels were also consistent with predicted CD8+ T cell proportions (linear regression p-value = 0.0019, Pearson rho = 0.569; Figure 3.17D). However, CD68+ myeloid cell densities did not significantly correlate with predicted macrophage proportions (linear regression p-value = 0.251; Pearson rho = 0.015; Figure 3.17E). Consistent with gene-expression-based predictions, the majority (88.6%) of tumour-infiltrating CD3+ lymphocytes in MRT and ATRT-MYC were CD8+ cytotoxic T cells, which have been positively associated with survival and response to immune checkpoint blockade in other cancer types (Tumeh et al., 2014; Barnes and Amir, 2017). ATRT-SHH exhibited the lowest CD3+ lymphocyte and CD8+ cytotoxic T cell infiltration (Figures 3.17A and 3.17B), consistent with the prediction that ATRT-SHH might be the most immunologically “cold” group among the RT subgroups. CD4+ helper T cell levels were comparable across RTs (Figure 3.17G). IHC further revealed significantly increased expression of PD-L1 in MRTs compared to ATRTs (Wilcoxon p-value < 2.2e-16; Figure 3.17H). Moreover, a significant increase in PD-L1-expressing CD68+ myeloid cells was observed in MRTs compared to ATRTs (Wilcoxon p-value < 2.2e-16; Figures 3.17I and 3.17J). MRT cases in Group 4 further exhibited the highest mean density of PD1-expressing lymphocytes among the five subgroups (Wilcoxon p-value = 0.0002; Figure 3.17K).
Among RT subgroups, ATRT-SHH exhibited the highest mean densities of PD-L1-negative CD68+ myeloid cells (Kruskal-Wallis p-value = 9.60e-12, Dunn’s adjusted p-values against ATRT-SHH < 9.46e-03; Figure 3.17L), the presence of which has been associated with poor response to immune checkpoint blockade (Herbst et al., 2014). This is consistent with the immunologically cold characteristics indicated by the low CD8+ T cell infiltration observed in this subgroup.

Figure 3.17 Increased immune cell infiltration in RT subgroups is validated by immunohistochemistry (IHC). IHC profiling was performed on 2,979 regions selected from 175 tumour tissue slides of 35 extra-cranial MRT cases (9 from Group 1, 20 from Group 3, 6 from Group 4) and 27 ATRT cases (10 from ATRT-MYC, 10 from ATRT-SHH and 7 from ATRT-TYR). CD68+ myeloid cells were profiled from 915 tumour-enriched (TT), 304 peri-vascular (PV) and 297 peri-stromal (PS) regions. CD3+ lymphoid cells were profiled from 888 TT, 287 PV and 288 PS regions. (A) Boxplots show distributions of CD3+ leukocyte density (y-axis in log10 scale) in TT, PS and PV regions. MRT cases in Groups 1, 3, 4 and ATRT-MYC cases showed significantly higher CD3+ cell densities compared to ATRT-SHH and ATRT-TYR in all regions (Wilcoxon p-values = 1.041e-11, 2.932e-11 and 1.026e-09, respectively). CD3+ T cell density is the sum of CD3+CD8+ and CD3+CD8- cell counts per mm². (B) Boxplots show distributions of CD8+ cytotoxic T cell densities in TT, PS and PV regions (y-axis in log10 scale). MRT cases in Groups 1, 3, 4 and ATRT-MYC cases showed significantly higher CD8+ T cell densities compared to ATRT-SHH and ATRT-TYR in all regional types (Wilcoxon p-values = 2.2e-16, 6.94e-15 and 3.84e-12, respectively).
Scatter plots show comparisons between T cell scores predicted using CIBERSORT and the median CD3+ leukocyte densities determined using IHC (C), and between predicted CD8+ T cell proportions and the median CD3+CD8+ cytotoxic T cell densities determined using IHC (D; x- and y-axes in log10 scale). Dashed lines indicate positive linear correlations (Pearson rho = 0.540 and 0.569, linear regression p-values = 0.0025 and 0.0019 for CD3+ and CD3+CD8+ cells, respectively). For macrophage validation, total fractions of macrophages M0, M1 and M2 predicted from CIBERSORT analyses were compared against CD68+ myeloid cell densities determined using IHC (E; Pearson correlation rho = -2.335, linear regression p-value = 0.2509). (F) Examples of cases with high (top) and low (bottom) T cell infiltration revealed by multiplex IHC staining (CD3+ green; CD8+ brown). Images are at 30X magnification. Scale bars indicate 100 µm. Boxplots show distribution of densities of CD4+ helper T cells, i.e. CD3+CD8- cells (G), PD-L1+ cells (H) and PD-L1 positive CD68+ immune cells (I) across MRTs in Groups 1, 3, 4 and ATRT-MYC, ATRT-SHH and ATRT-TYR (x- and y-axes in log10 scale). * indicates statistical significance p-value < 0.05. (J) Example of a case with high PD-L1+CD68+ immune cell infiltration, revealed by multiplex IHC staining of PD-L1 (red) and CD68 (green). Overlapping colours from PD-L1 positive CD68+ cells are shown in yellow. The image is at 30X magnification. Scale bars indicate 100 µm. (K) Boxplots show distributions of PD1-expressing leukocyte densities, with Group 4 exhibiting the highest median level among the RT subgroups (* Wilcoxon p-value < 0.05). (L) Boxplot shows distributions of PD-L1 negative CD68+ immune cell densities, which are significantly higher in ATRT-SHH compared to other RT subgroups (* Dunn’s adjusted p-value < 0.05).
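The comparisons in panels C to E of Figure 3.17 rest on a Pearson correlation between predicted values and IHC-derived densities. The sketch below shows one way such a coefficient could be computed in plain Python. It is an illustration rather than the analysis code of the thesis: whether the correlation was computed on log10-transformed values is an assumption, and the pseudocount guarding against log10(0) is hypothetical.

```python
import math


def pearson_r(x, y):
    # Pearson product-moment correlation of two equally long sequences.
    # By construction the result always lies in the interval [-1, 1].
    if len(x) != len(y) or len(x) < 2:
        raise ValueError('x and y must have the same length (at least 2)')
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError('correlation is undefined for constant input')
    return cov / math.sqrt(var_x * var_y)


def log10_pearson(predicted, observed, pseudocount=1e-3):
    # Correlate values after a log10 transform, mirroring the log10 axes of
    # the scatter plots. The pseudocount is a hypothetical guard for zeros.
    log_pred = [math.log10(v + pseudocount) for v in predicted]
    log_obs = [math.log10(v + pseudocount) for v in observed]
    return pearson_r(log_pred, log_obs)
```

Here `predicted` would hold per-case CIBERSORT scores and `observed` the per-case median IHC densities, in the same case order.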
Given the very low mutation (and presumably related neoantigen) load in RTs, I explored other potential sources of tumour antigens that might lead to increased immunogenicity in some RT subgroups. Motivated by previous studies showing that epigenetic de-repression of endogenous retroviral elements (EREs) mimicked viral infection, which in turn induced anti-tumour immune responses (Chiappinelli et al., 2015; Roulois et al., 2015), I analyzed RNA-Seq, CpG DNA methylation levels and H3K27ac levels within annotated ERE regions, which included long / short interspersed nuclear elements (LINE / SINE), long terminal repeat (LTR) retrotransposons and endogenous retroviruses (ERVs; from RepeatMasker; n = 3,877,818 loci). Although a significant increase in H3K27ac read coverage was observed in Groups 1, 3, 4 compared to ATRT-SHH and ATRT-TYR (Welch’s t-test p-value = 3.17e-05; Figure 3.18B), DNA methylation and expression levels of EREs were comparable across RT subgroups, inconsistent with the notion of epigenetic de-repression of EREs in RT subgroups (Welch’s t-test p-values > 0.05; Figures 3.18A, 3.18C). In addition to mutations or ERE de-repression, aberrantly expressed or over-expressed genes have been shown to elicit immune responses (Simpson et al., 2005). To identify such genes that might play a role in increased immunogenicity in RTs, I performed linear regression analysis to identify genes whose expression levels significantly correlated with effector T cell scores (linear regression p-values < 0.05), and filtered them against tumour antigens reported in the literature (n = 104; Lever and JM Jones, 2017). The analysis revealed nine known tumour antigen genes: ABCC3, CDR2, CEACAM21, CEACAM4, DSE, EPS8, ISG15, MUC1, TBXT. Of these, ISG15 and TBXT were aberrantly expressed in RTs compared to 56 normal cell/tissue types profiled by the GTEx consortium (Figure 3.18D).
These observations indicate that over-expression of tumour antigen genes may play a role in the increased immunogenicity in some RT subgroups.

Figure 3.18 Potential tumour antigens that could contribute to RT immunogenicity. Boxplots show distributions of median DNA methylation levels at known endogenous retroviral element (ERE) loci (A), H3K27ac signal densities (B), and number of EREs that were expressed at a level greater than the median of the RT cohort (C). (D) Boxplot shows distributions of expression levels of ISG15 (top) and TBXT (bottom) across MRTs, ATRT subgroups, adult and fetal cerebellum samples (n = 5 and 4; labeled “Adult_Cere” and “Fetal_Cere”, respectively), normal kidney samples (n = 6; “TARGET_Kidney”) and samples from 52 normal tissue types from the GTEx consortium (n = 2,500; with a suffix, “_N”).

Figure 3.19 Summary of RT molecular subgroups.

3.3 DISCUSSION

Our integrative analyses of multi-omic datasets enabled identification of five molecular subgroups of RTs, including previously identified cranial ATRT subgroups and novel subgroups of extra-cranial MRTs (Figure 3.19), which were associated with distinct anatomical sites, SMARCB1 mutation patterns, and gene expression and epigenetic alterations. Among ATRTs, ATRT-MYC appeared most similar to MRTs (particularly those in Group 1) at genetic, epigenetic and transcriptional levels, sharing molecular alterations with MRTs at both global and local levels. Among the most pronounced global epigenetic alterations common in both ATRT-MYC and MRT compared to other ATRT subgroups was genome-wide hypomethylation, which is most likely linked to PMDs. These PMDs covered substantial fractions of the genomes in ATRT-MYC and MRT, but were absent in both ATRT-SHH and ATRT-TYR.
The hypomethylated regions were associated with increased H3K27ac levels, leading to up-regulation of genes that were significantly enriched for homeobox genes involved in early human development. Furthermore, based on gene expression analyses, MRT and ATRT-MYC appeared to be more mesenchymal-like relative to other ATRTs, as suggested by over-expression of genes involved in development of mesenchymal cell types and mesoderm-derived organs and under-expression of genes involved in neural development. Relative to ATRT-TYR, ATRT-SHH exhibited higher levels of expression of neuro-developmental genes, in agreement with previously reported data (Han et al., 2016; Torchia et al., 2016). Comparisons with transcriptome profiles derived from progenitor cells, including mesodermal and neural progenitors and cranial neural crest cell cultures, further showed that ATRT-SHH exhibited the highest degree of similarity to differentiated brain tissues. This is consistent with the finding that this subgroup shows neural-like gene expression patterns and the enrichment of transcription factors that are expected in differentiated neurons. In contrast, Groups 1 and 3 were the most similar to mesoderm progenitor cells and embryonic stem cells, respectively, suggesting that these subgroups may be linked to less differentiated cell types from different cell lineages. These observations may point to potentially different cells of origin for RT subgroups, and allude to the potential role of multiple developmental states in contributing to disease heterogeneity.

Several lines of evidence described in my study supported immune modulation and increased immunogenicity in RTs. ATRT-MYC and MRTs showed an unexpected enrichment of TFBS in enhancers of genes involved in immune modulation, including IRF5/8/9, STAT1, RFX1/5 and XBP-1. IRF5 induces type I interferons, which play key roles in mediating innate immune responses against viruses and other infectious agents.
In response to type I interferons, IRF8, IRF9 and STAT1 form the IFN-stimulated gene factor 3 complex, which in turn up-regulates IFN-induced genes that drive a cellular antiviral state (Langlais, Barreiro and Gros, 2016). RFX1, RFX5 and XBP-1 promote expression of MHC class II, which presents antigens to CD4+ T cells. A previous study reported increased global chromatin accessibility after IFN-γ treatment in Pbrm1-deficient mouse melanoma cell lines compared to wild-type cell lines (Pan et al., 2018). In particular, the open chromatin regions (presumably enriched for enhancers and gene promoters) were reported to be enriched for IRF motifs and associated with IFN-regulated genes. My observations are consistent with a hypothesis that mutant SWI/SNF exerts epigenetic effects on immune functions in SWI/SNF-defective tumours such as RTs. Among the RT subgroups, Groups 1 and 4 exhibited differentially methylated or expressed genes that were enriched for immune signalling pathways, including several TFs and effector genes that play important roles in type I interferon-mediated signalling. For example, UBD and AIM2, whose over-expression was specific to Group 4, play key roles in innate immunity by mediating NF-κB activation (Hornung et al., 2009; Gong et al., 2010) and cytosolic DNA sensing, which is involved in the maturation and antigen presentation of dendritic cells and in defense against viruses (Vanpouille-Box et al., 2018). It is notable that renal medullary carcinomas, another cancer type with pathognomonic SMARCB1 loss, also exhibited activation of cGAS-STING cytosolic DNA sensing pathways (Msaouel et al., 2020), providing additional support for a linkage between SMARCB1-deficient cancers and activation of innate immune signalling pathways.
My gene expression analyses also showed an association between the levels of cytotoxic T cell presence and increased expression of genes involved in antigen presentation and processing, potentially indicating a role for these pathways in the increased immunogenicity of RTs. Over-expression of genes involved in T cell activation and homing in RTs with higher antigen presentation activities may also align with the hypothesis of increased inflammatory tumour microenvironments, which are infiltrated with active T cells that express perforin and granzymes, in subgroups that are composed of ATRT-MYC and MRT cases, i.e. Groups 1, 3 and 4. Increased infiltration of CD8+ cytotoxic T cells in the tumour-rich, peri-vascular and peri-stromal regions of ATRT-MYC and MRTs was confirmed using IHC. I observed significant linear correlations between IHC-validated cell densities and cellular fractions predicted based on gene expression profiles for both CD3+ lymphocytes and CD8+ cytotoxic T cells. The macrophage fractions estimated using gene expression data were not validated by IHC. It is likely that validation using a single surface marker, i.e. CD68, does not suffice to accurately profile macrophages, as macrophages have multiple cell surface markers, many of which are also expressed in other immune cell types, e.g. CD68 expression in microglia (Zhao et al., 2018).

MRTs further exhibited increased infiltration of PD-L1+CD68+ myeloid cells, a feature that has been associated with favourable responses to immune checkpoint inhibition (ICI) therapy in multiple cancer types including non-small cell lung cancers, gastric cancers, head and neck squamous cell carcinomas, melanomas and urothelial cancers (Herbst et al., 2014; Mariathasan et al., 2018). MRTs in Group 4 further exhibited increased PD-1-positive lymphoid cells.
In contrast, ATRT-SHH consistently exhibited characteristics of relatively immunologically “cold” tumours, including low immune-related gene expression levels and the lowest levels of CD8+ cytotoxic T cell infiltration. Furthermore, ATRT-SHH exhibited the highest levels of PD-L1-negative CD68+ myeloid cells, which have been associated with pro-tumoural microenvironments (Noy and Pollard, 2014) and with poor responses to PD-L1 blockade therapy in multiple cancer types (Herbst et al., 2014).

One of the potential causes of increased immune infiltration in RTs may be viral protein expression from reactivation of EREs. Recent work has shown increased ERV expression associated with increased H3K27ac levels or decreased DNA methylation levels of ERV loci in gliomas (Krug et al., 2019) and in other solid cancers, such as colorectal cancers (Chiappinelli et al., 2015; Roulois et al., 2015). Another recent study showed increased ERE expression in ICI-responding RT tissues, and increased presence of double-stranded RNAs upon SMARCB1 re-introduction in RT cell lines (Leruste et al., 2019), indicating that ERE re-activation may indeed be one of the mechanisms underpinning immune activation in RTs. Although H3K27ac levels were increased at ERE loci in Groups 1, 3 and 4, I did not observe significantly higher ERE expression levels in these subgroups compared to ATRT-SHH and ATRT-TYR, so my study cannot conclude whether EREs are reactivated in RTs.

Another potential cause of increased immune infiltration in RTs may be tumour antigen expression from developmentally silenced genes that are normally restricted in their expression, for example to early embryonic stages or to specific tissue types.
It is notable that TBXT, which encodes the embryonic transcription factor T-brachyury (Yamaguchi et al., 1999), was expressed exclusively in two ATRT-TYR cases that exhibited the highest estimated levels of T cell presence based on gene expression data. An HLA epitope for human T-brachyury peptides has been identified and was shown to elicit anti-tumour responses against T-brachyury-expressing tumour cells (Palena et al., 2007), perhaps indicating that T-brachyury-expressing RTs may also be primed to boost anti-tumour responses from cytotoxic T cells. In addition, my data indicate that mutations in immune modulating genes may contribute to increased CD8+ cytotoxic T cell infiltration. For example, due to homozygous co-deletion with SMARCB1 in Group 1 cases, significant under-expression of MIF (which encodes macrophage migration inhibitory factor) may contribute to the increased immunogenicity observed in this subgroup, as suggested previously by a study that demonstrated decreased levels of immune-suppressive regulatory T cells and increased levels of CD8-induced tumour cytotoxicity in MIF double knock-out mice compared to wild-type mice (Choi et al., 2012).

Although ICI has emerged as a promising cancer therapy, multiple studies have reported that cancers with high mutational burdens are the most likely to be sensitive to ICI, presumably because mutation-derived neoantigens provide a substrate for T cell recognition (Couzin-Frankel, 2013; Schumacher and Schreiber, 2015; Hellmann, Nathanson, et al., 2018). As such, in cancer types including melanomas and lung cancers, a high mutational burden has been considered one of the biomarkers for determining the effectiveness of ICI treatment options (Schumacher and Schreiber, 2015; Hellmann, Callahan, et al., 2018). However, several recent lines of evidence indicate that mutations in SWI/SNF subunits can also increase tumour immunogenicity.
For example, patients with PBRM1-deficient clear cell renal cell carcinomas (ccRCC) showed increased survival in response to anti-PD1 immunotherapy compared to patients with ccRCC that did not harbour PBRM1 mutations (Miao et al., 2018). Small cell carcinomas of the ovary, hypercalcemic type (SCCOHT), which harbour a pathognomonic SMARCA4 loss, showed increased cytotoxic T cell infiltration (Jelinic et al., 2018). Furthermore, murine melanoma cell lines with siRNA-mediated repression of Pbrm1, Arid2 and Brd7 exhibited increased sensitivity to T cell killing and increased response to PD-1 and CTLA4 antibody treatments compared to the control cell lines and cell lines treated with siRNAs that targeted other genes (Pan et al., 2018). Notably, ccRCC and SCCOHT are known to have lower mutation burdens than other kidney and ovarian tumours, indicating that high mutational burden may not be the key biomarker for ICI responses across all cancer types. Thus, my findings are consistent with the notion that SWI/SNF mutations can contribute to tumour immunogenicity in ways that may increase their vulnerability to ICI. The surprising immunologically “hot” characteristics in MRT and ATRT-MYC lay the groundwork for further studies to assess whether increased immune-cell infiltration phenotypes and molecular features shared in MRT and ATRT-MYC can be usefully deployed in the clinic.

3.4 MATERIALS AND METHODS

Methods and procedures pertaining to library preparation, sequencing and generation of previously published data have been described in Chapter 2 (Chun et al., 2016) and Johann et al. (2016).

3.4.1 Sample details and data availability

Data from 141 primary extra-cranial MRT and 161 primary cranial ATRT samples were analyzed for this Chapter. Data for 40 out of 141 MRT samples were generated as part of a previous report described in Chapter 2 (Chun et al., 2016).
In addition to these data, 29 MRT samples (26 from kidneys, 3 from soft tissues) were provided by Dr. Elizabeth Perlman (Ann and Robert H. Lurie Children’s Hospital in Chicago, USA) through the Children’s Oncology Group (COG) in the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) consortium. From COG, my colleagues at BCGSC received DNA aliquots from pre-therapy tumour and normal samples from peripheral blood or kidney from rhabdoid tumours (RTs), registered on the National Wilms Tumor Study Group 5 or on COG AREN03B2 banked by the COG Biopathology Center with parental informed consent. Studies were performed with the approval of the University of British Columbia - British Columbia Cancer Agency Research Ethics Board (REB number H09-02558). Nine MRT samples (three from the spine, two from kidneys, remainder from various non-renal tissues e.g., pelvis, face) were provided by Dr. Annie Huang (Hospital for Sick Children in Toronto, Canada) through the Rare Brain Tumour Consortium (RBTC). An additional 63 MRT samples (31 from kidneys, 8 from the liver, remainder from various non-renal tissues e.g., retroperitoneum, intra-abdomen, face) were provided via the EURHAB study group, with informed consent obtained from all patients included in the study. Data for 150 out of 161 ATRT samples were generated as part of a previous report (Johann et al., 2016). An additional 11 ATRT-MYC samples were provided by Dr. Martin Hasselblatt (Institute of Neuropathology, University Hospital Muenster, Germany). To validate ATRT subgroups, DNA methylation data from 131 primary ATRT samples, published in a previous report (Torchia et al., 2016), were provided by Dr. Annie Huang (Hospital for Sick Children in Toronto, Canada) through the Rare Brain Tumour Consortium (RBTC).
To enable as comprehensive a study as possible, my collaborators and I aggregated all obtainable RT samples that passed quality criteria from COG and EURHAB studies. For samples provided through COG, Nationwide Children’s Hospital prepared cells and nucleic acids, and shipped these materials to DKFZ for DNA methylation array experiments and to Canada’s Michael Smith Genome Sciences Centre at BC Cancer (BCGSC) for whole-genome-, whole-genome-bisulfite-, RNA- and ChIP-Seq experiments. For samples provided through DKFZ, cells and nucleic acids were prepared at various sample providers’ institutions, and underwent DNA methylation profiling at DKFZ. Complete sample information, including clinical data (e.g. age, sex of patient subjects), is provided in Supplemental Table 3.1. Tumour content was estimated for 74 cases (56 MRT and 18 ATRT cases) using whole genome sequence data generated from tumour and matched normal pairs, based on allele frequencies of heterozygous SNPs calculated using APOLLOH software (Ha et al., 2012), as described in Chapter 2. The median tumour purity of the cohort was 88.31% (minimum = 42.78%; maximum = 95.04%). Raw DNA methylation array data generated from ATRT samples by DKFZ have been deposited in the Gene Expression Omnibus (GEO) under the accession number GSE123601. The accession number for raw DNA methylation array data and sequencing data generated from MRT samples by TARGET is NCBI dbGaP: phs000470, with additional data available at http://target.nci.nih.gov/dataMatrix/TARGET_DataMatrix.html.

3.4.2 DNA methylation array data generation and processing

DNA methylation array data from 150 primary ATRT samples were previously published (Johann et al., 2016). DNA methylation array data from nine MRT samples from Dr. Annie Huang were generated using Illumina’s Infinium HumanMethylation450 BeadChip (450K) platform.
Using Illumina’s Infinium MethylationEPIC (850K) platform, DNA methylation array data were generated for 40 primary MRT samples that were previously analyzed using whole-genome-bisulfite-sequencing (Chun et al., 2016), and also for an additional 11 ATRT and 91 MRT cases. Raw 450K and 850K data were generated and processed as previously described (Johann et al., 2016; Capper et al., 2018). In brief, I processed the raw IDAT files using the minfi R package (version 1.20.2; Aryee et al., 2014). Multiple batches of array datasets were combined using the combineArrays function in the minfi package. Background noise and batch effects were corrected using the single-sample normal-exponential out-of-band (Noob) method.

3.4.3 Whole genome sequencing data

Whole genome sequencing (WGS) data from pairs of 40 primary MRT and 18 primary ATRT cases, and their corresponding matched normal samples, were previously published (Chun et al., 2016; Johann et al., 2016). My colleagues at BCGSC generated WGS data for an additional 16 pairs of MRT and matched normal samples. WGS library construction, sequencing and read alignment were performed as previously described in Chapter 2 and in Chun et al. (2016). In brief, all primary tumour and matched normal samples underwent plate-based PCR-free WGS on the Illumina HiSeq 2500 platform to achieve the desired sequence coverage (> 30X). Sequences were aligned to the human reference genome GRCh37-lite/hg19a using the Burrows-Wheeler Aligner (BWA; version 0.5.7; Li & Durbin, 2010). Merged BAM files were marked for duplicates using Picard MarkDuplicates.jar (version 1.71).

3.4.4 RNA-Seq data generation and processing

Whole-transcriptome sequencing (RNA-Seq) data from 40 primary MRT and 25 primary ATRT cases were previously published (Chun et al., 2016; Johann et al., 2016). My colleagues at BCGSC generated RNA-Seq data for an additional 25 primary MRT cases.
RNA-Seq library construction and sequencing were performed as previously described in Chapter 2 and in Chun et al. (2016). Briefly, paired-end polyA+ RNA sequencing was performed, preserving strand specificity, on the Illumina HiSeq 2500 platform. Sequenced reads were aligned to the human reference genome (GRCh37-lite / hg19 version) and to exon-exon junction sequences using BWA (bwa-aln version 0.5.7). JAGuaR (version 2.0.3) was used to reposition sequences mapped to exon junctions back onto the genome as gapped alignments. My colleagues at BCGSC calculated the sequenced base coverage across collapsed exon models to quantify gene-level expression using the gene coverage analysis pipeline as previously described (Chun et al., 2016). All external RNA-Seq data (i.e. neuron progenitor data from the Roadmap Epigenomics Consortium (Roadmap Epigenomics Consortium et al., 2015), cranial neural crest data (Prescott et al., 2015)) were processed using the same software pipeline and gene annotation version as MRT RNA-Seq data, as previously described (Chun et al., 2016).

3.4.5 Whole genome bisulfite sequencing (WGBS) data generation and processing

Whole-genome bisulfite sequencing (WGBS) data from 40 primary MRT and 17 primary ATRT cases were previously published (Chun et al., 2016; Johann et al., 2016) and described in Chapter 2. My colleagues at BCGSC generated WGBS data for an additional 29 primary MRT cases. WGBS library construction and sequencing were performed as previously described in Chapter 2 (Chun et al., 2016). Briefly, fragmented converted DNA was sequenced using paired-end 100/125 nt V3/4 sequencing chemistry on the Illumina HiSeq 2500 platform. Alignment of ATRT and MRT WGBS data was performed using Bismark (Krueger and Andrews, 2011) as described in Chapter 2.
3.4.6 Chromatin Immunoprecipitation followed by sequencing (ChIP-Seq) data generation

H3K27ac and H3K27me3 ChIP-Seq data from 10 primary MRT and 14 primary ATRT cases were previously published (Chun et al., 2016; Johann et al., 2016). My colleagues at BCGSC generated H3K27ac and H3K27me3 ChIP-Seq data for an additional 24 and 25 primary MRT cases, respectively. ChIP-Seq library construction and sequencing were performed as previously described in Chapter 2. Briefly, samples were prepared from cross-linked tissues, from which ChIP was performed using the extracted chromatin. Fragmented chromatin DNA was sequenced using paired-end sequencing chemistry on the HiSeq 2000/2500 platforms.

3.4.7 Mutation analyses

In addition to the previously described data, I analyzed WGS data from additional samples, consisting of 18 pairs of ATRTs and 16 pairs of MRTs, and identified somatic mutations, namely copy number alterations (CNAs), single nucleotide variants (SNVs), short insertions and deletions (InDels) and structural variants such as inversions, duplications and translocations that may lead to gene fusions. To allow data comparability, I used the same suite of software tools described in Chapter 2, published in Chun et al. (2016). To identify SNVs and InDels, I used Strelka (version 2.0.7; Saunders et al., 2012), SAMtools mpileup (version 0.1.17; Li et al., 2009) and MutationSeq (Ding et al., 2012). Regions with loss of heterozygosity were identified using APOLLOH (version 012.2014a; Ha et al., 2012). Copy number estimation and segmentation were performed using a Hidden Markov model-based method (Shah et al., 2006). Structural variants such as chromosomal translocations and inversions were detected using Trans-ABySS (version 1.4.10; Robertson et al., 2010).
3.4.8 RNA-Seq data analysis

For differential gene expression analyses, I selected genes that were expressed above a noise threshold of 1 RPKM in all 90 samples to remove genes that were likely expressed at a noise level, or sparsely expressed across my cohort. 27,790 out of 58,450 genes (EnsEMBL version 69) were removed using this filter. To identify differentially expressed genes, I used the DESeq R package (version 1.14.0; R version 3.3.2; Anders and Huber, 2010) and an adjusted p-value threshold of 0.05. For subsequent analyses, I further filtered out transcripts with low abundance, which were identified as over-expressed in one group compared to another group, but had a median expression level less than 1 RPKM. To identify pathways, biological processes or terms enriched for differentially expressed genes, I performed pathway enrichment analyses using DAVID (version 6.8; Huang, Sherman and Lempicki, 2009), g:Profiler (Reimand et al., 2007), Metascape (Tripathi et al., 2015) and the Ingenuity Pathway Analysis© tool, with a multiple-hypothesis-adjusted p-value threshold of 0.05. To visualize networks of significantly enriched pathways, I constructed connectivity maps using the EnrichmentMap plug-in (version 3.2.0) and Cytoscape (version 3.7.1) based on significantly enriched Gene Ontology (GO) biological processes from g:Profiler queries. The node colour in a map represents a Benjamini-Hochberg (BH)-adjusted p-value from enrichment tests. The size of a node is proportional to the number of genes associated with a biological process. The thickness of an edge is proportional to a similarity coefficient based on the fraction of shared genes between two biological processes. To identify subgroup-classifying genes, I performed binary Random Forest classification analyses on significantly differentially expressed genes for each subgroup using the randomForest R package (version 4.6-14).
To train a random forest, I used 10,001 trees (an odd number of trees was chosen to avoid random tie breaking) on the training set. The training set consisted of 2/3 “in-bag” samples that were randomly selected with replacement for each subgroup, and was tested against 1/3 of samples, a bootstrapped selection of “out-of-bag” samples. Levels of the overall accuracy of the binary classifier were 86.7%, 88.9%, 91.1%, 91.1% and 88.9%; and the error rates were 13.3%, 11.1%, 8.9%, 8.9%, and 11.1% for Groups 1, 3, 4, ATRT-SHH and ATRT-TYR, respectively. To cluster samples based on gene expression, I first removed genes expressed below 1 RPKM in > 75% of samples, and then ranked the remaining genes based on the coefficient of variation. I performed unsupervised non-negative matrix factorization (NMF) using the top 25% most variably expressed genes (n = 3,984), and selected the k value (i.e. the clustering solution) at which the highest cophenetic coefficient and silhouette width were observed. I used the NMF R package (version 0.20.2; Gaujoux and Seoighe, 2010), with the default Brunet algorithm and 50 and 500 iterations for the rank survey and the clustering runs, respectively. To deconvolute gene expression signals originating from various immune cell types, I applied CIBERSORT analysis using the CIBERSORT R script (version 1.04; Newman et al., 2015) to gene-level RPKM data with 5,000 permutations using the absolute signature score mode. To detect T cell receptor (TCR) sequences, I used MiXCR (version 2.1.9; Bolotin et al., 2015) on FASTQ data generated from paired-end RNA-Seq of 25 ATRT and 65 MRT cases, and identified reads that aligned to reference germline V, D, J and C gene sequences from GenBank. The reads were then assembled for clonotype mapping, i.e. construction of full-length CDR3 regions of the TCR.
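Once clonotypes have been assembled, the diversity of a repertoire can be summarized with the Shannon-Wiener index, where higher values indicate greater diversity. The following Python sketch shows the standard definition computed from per-clonotype read counts. It is an illustration, not the implementation used in the thesis; the function names and the input format (a plain list of counts) are hypothetical, and some tools report the exponentiated form exp(H) instead of H.

```python
import math


def shannon_wiener(clonotype_counts):
    # H = -sum(p_i * ln(p_i)), where p_i is the fraction of reads assigned
    # to clonotype i. Clonotypes with zero counts contribute nothing.
    counts = [c for c in clonotype_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError('at least one clonotype needs a positive count')
    return -sum((c / total) * math.log(c / total) for c in counts)


def effective_clonotypes(clonotype_counts):
    # exp(H): the number of equally abundant clonotypes that would give the
    # same entropy. Some diversity tools report this form of the index.
    return math.exp(shannon_wiener(clonotype_counts))
```

A repertoire dominated by a single clonotype yields a value near 0, whereas four equally abundant clonotypes yield ln(4), which corresponds to an effective number of 4.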
I then analyzed TCR β clonotypes generated from MiXCR using VDJTools (Shugay et al., 2015) to calculate Shannon-Wiener index scores, which quantify the diversity of the TCR repertoire in each sample.

To compare gene expression levels in RTs against various normal tissue types, I used RPKM values from the Genotype-Tissue Expression dataset (GTEx version 9; 2,500 samples; 52 normal tissue types), and from normal cerebellum (n = 9) and normal kidney (n = 6) datasets.

To quantify endogenous retroviral element (ERE) transcript abundance levels, I used reads that mapped to loci containing short or long interspersed nuclear elements (SINEs or LINEs, respectively) and long terminal repeat (LTR) retrotransposons including endogenous retroviruses (ERVs), as annotated by UCSC RepeatMasker. The annotated list of 5,467,457 repeat elements and their genomic coordinates was obtained from RepeatMasker (hg19 version; Feb 2009 - RepeatMasker open-4.0.5-Repeat Library 20140131; Smit, Hubley and Green, 2013). The list was then filtered to remove EREs with uncertain status (i.e., those annotated with “?”) and EREs on the Y and non-canonical chromosomes. I further removed EREs that overlapped with known gene promoters (defined as 1,500 bp upstream of a transcription start site) or with gene bodies, in order to remove EREs whose transcript abundance levels were likely confounded by “host” gene transcription. After filtering, the final number of EREs used for analyses was 3,877,818. Raw transcript abundance levels were quantified for each ERE by counting the number of unambiguously mapped reads whose mate reads mapped within 10 kb of the read centre. ERE transcript abundance levels were normalized for library depth, and represented in reads per million reads mapped (RPM).
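The two quantities that close this section, the Shannon-Wiener index of a clonotype repertoire and the RPM normalization of ERE read counts, can be illustrated with a minimal Python sketch. This is not the code used in the thesis (VDJTools and in-house scripts were used there), and VDJTools may apply its own transformation or resampling to the index. The function names and the toy inputs are hypothetical.

```python
import math


def shannon_wiener_index(clonotype_counts):
    '''Shannon-Wiener diversity H = -sum(p_i * ln(p_i)) over clonotype frequencies.'''
    total = sum(clonotype_counts)
    if total == 0:
        return 0.0
    h = 0.0
    for count in clonotype_counts:
        if count > 0:
            p = count / total  # relative frequency of this clonotype
            h -= p * math.log(p)
    return h


def reads_per_million(raw_count, total_mapped_reads):
    '''Scale a raw read count by library depth (reads per million mapped reads).'''
    return raw_count * 1_000_000 / total_mapped_reads


# Toy example: four equally abundant clonotypes versus a single dominant clone.
even_repertoire = shannon_wiener_index([5, 5, 5, 5])
clonal_repertoire = shannon_wiener_index([20])
ere_rpm = reads_per_million(50, 25_000_000)
```

A repertoire dominated by one clonotype scores 0, and the score rises as reads are spread more evenly over more clonotypes, which is why the index is used as a measure of repertoire diversity.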
3.4.9 DNA methylation array data analyses

To enable comparisons between 450K and 850K/EPIC arrays, only the probes represented on both arrays were used for all downstream analyses. In addition, I applied the following filtering criteria to obtain probes that were likely to target unique CpGs. I removed probes targeting the X and Y chromosomes, probes containing a single nucleotide polymorphism (SNP; based on the dbSNP132Common database) within five base pairs of and including the targeted CpG site (n = 24,536), and probes not mapping uniquely to the human reference genome (GRCh37/hg19), allowing for a maximum of one mismatch (Zhou, Laird and Shen, 2017). After filtering, 395,786 probes were used for all subsequent analyses.

To estimate the chromosomal copy number state for more cases, my collaborator at DKFZ analyzed DNA methylation array data from the 450K and 850K/EPIC platforms using the conumee R package (http://bioconductor.org/packages/conumee). Regions with values > 0.3 were considered to have putative copy number amplifications, while regions with values < -0.3 were considered to have putative copy number deletions.

To compare DNA methylation profiles generated using WGBS and DNA methylation array platforms, I downloaded from the NCBI Gene Expression Omnibus (GEO) publicly available 450K IDAT files from the H9 human embryonic stem cell line, generated at two different source sites, i.e. University of California San Diego (UCSD; GSM853421) and Korea Advanced Institute of Science and Technology (KAIST; GSM936834). The raw IDAT files were analyzed using the minfi R package (version 1.20.2; Aryee et al., 2014), and were then subjected to the filtering processes described above. To determine the linearity between the two datasets, linear regression was performed using the R lm function.
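The ±0.3 thresholds applied above to the array-based copy number estimates amount to a simple three-way rule, sketched below in Python. This is an illustration only, not the conumee workflow itself: the function name, the 'neutral' label and the example segment values are assumptions.

```python
def call_copy_number_state(segment_value, threshold=0.3):
    '''Classify a segment value using symmetric cut-offs (strict inequalities).'''
    if segment_value > threshold:
        return 'amplification'
    if segment_value < -threshold:
        return 'deletion'
    return 'neutral'  # label assumed here; the text only defines gains and losses


# Hypothetical segment values for one sample.
segment_values = [0.45, -0.02, -0.61, 0.30]
calls = [call_copy_number_state(value) for value in segment_values]
```

A value of exactly 0.3 is left uncalled here because the text specifies strict inequalities.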
To determine the comparability between WGBS and methylation array data, I analyzed two sets of data from TCGA and DKFZ, which were generated from samples that underwent both WGBS and Illumina 450K DNA methylation array experiments. I used a third-party R program, methyLiftover (Titus et al., 2016), to analyze publicly available TCGA level-3 datasets from 17 samples: LUAD (n = 4; Lung adenocarcinoma), BRCA (n = 1; Breast invasive carcinoma), UCEC (n = 2; Uterine corpus endometrial carcinoma), BLCA (n = 6; Bladder urothelial carcinoma) and STAD (n = 4; Stomach adenocarcinoma). To analyze DKFZ datasets that consisted of 14 ATRT cases with known subgroups (n = 2 cases in ATRT-MYC, 6 in ATRT-SHH and 6 in ATRT-TYR), I applied the filtering processes described above, and analyzed WGBS data using the in-house NovoAlign and Novo5mC software to calculate methylation levels. CpG sites with sequence coverage > 10 were used for analyses. Array data were analyzed using the minfi R package (version 1.20.2; Aryee et al., 2014), and were normalized and background-corrected using the preprocess-quantile method.

To identify molecular subgroups of RTs, I combined DNA methylation array data generated from 161 ATRT and 140 MRT samples and performed unsupervised NMF analyses. To restrict the analysis to CpG sites that were considered to be positively methylated, I removed CpG sites with 0% methylation across all 301 samples, and then selected those with beta-values greater than 0.3 in at least 10% of samples (Cancer Genome Atlas Research Network, 2014, 2015). The selected CpG sites were then ranked by standard deviation, such that the most variably methylated CpG sites could be selected for the analysis. I performed unsupervised NMF using the top 10,000 most variably methylated CpG sites with the default Brunet algorithm, and 50 and 500 iterations for the rank survey and the clustering runs, respectively.

To test for dependencies between a subgroup and a clinical or molecular covariate (e.g.
age, mutation status), I applied Fisher’s exact test using the fisher.test function in R, and applied BH multiple-hypotheses-testing correction to obtain adjusted p-values using the p.adjust function in R.

To compare RTs against other adult and paediatric cancer types and normal tissue types, I performed a hierarchical clustering analysis on 10,222 DNA methylation profiles that represented 33 adult tumour types (n = 9,012 cases) and 23 adult normal tissue types (n = 746 cases) from TCGA, and four paediatric tumour types (n = 452 cases) and one paediatric normal tissue type (brain; n = 12 cases) from TARGET. Also included in this analysis were DNA methylation profiles from eight normal adult brain samples and two normal fetal brain samples from DKFZ. Adult tumour types from TCGA (disease abbreviations as shown in Figure 3.3) are: BRCA (n = 796; Breast invasive carcinoma), LGG (n = 534; Low-grade glioma), HNSC (n = 530; Head and neck squamous cell carcinoma), THCA (n = 515; Thyroid carcinoma), PRAD (n = 503; Prostate adenocarcinoma), LUAD (n = 475; Lung adenocarcinoma), SKCM (n = 473; Skin cutaneous melanoma), UCEC (n = 439; Uterine corpus endometrial carcinoma), BLCA (n = 419; Bladder urothelial carcinoma), STAD (n = 396; Stomach adenocarcinoma), LIHC (n = 380; Liver hepatocellular carcinoma), LUSC (n = 370; Lung squamous cell carcinoma), KIRC (n = 325; Kidney renal clear cell carcinoma), COAD (n = 316; Colon adenocarcinoma), CESC (n = 309; Cervical squamous cell carcinoma and endocervical adenocarcinoma), KIRP (n = 276; Kidney renal papillary cell carcinoma), SARC (n = 265; Sarcoma), ESCA (n = 186; Esophageal carcinoma), PAAD (n = 185; Pancreatic adenocarcinoma), PCPG (n = 184; Pheochromocytoma and paraganglioma), TGCT (n = 156; Testicular germ cell tumors), GBM (n = 153; Glioblastoma multiforme), LAML (n = 140; Acute myeloid leukemia), THYM (n = 124; Thymoma), READ (n = 99; Rectum adenocarcinoma), MESO (n = 87; Mesothelioma), UVM (n = 80; Uveal
melanoma), ACC (n = 80; Adrenocortical carcinoma), KICH (n = 66; Kidney chromophobe), UCS (n = 57; Uterine carcinosarcoma), DLBC (n = 48; Diffuse large B cell lymphoma), CHOL (n = 36; Cholangiocarcinoma) and OV (n = 10; Ovarian serous cystadenocarcinoma). Paediatric tumour types from TARGET included CCSK (n = 11; Clear cell sarcoma of the kidneys), NBL (n = 224; Neuroblastoma), OS (n = 86; Osteosarcoma) and WT (n = 131; Wilms tumor). The level 3 TCGA and TARGET data were generated using the Illumina Human Methylation 450K platform, and were obtained through the TCGA GDC Data Portal at https://portal.gdc.cancer.gov/ and the TARGET Data Portal at ftp://caftpd.nci.nih.gov/pub/OCG-DCC/TARGET/. WGBS data of SMARCB1-deficient paediatric chordomas (n = 2) were obtained from the BCGSC Personalized Oncogenomics Program (POG; Pleasance et al., 2020), and were analyzed as described above.

For pan-cancer DNA methylation analyses, two approaches were used. In the first approach, median beta-values of filtered CpG probes were determined to represent each cancer and normal tissue type. Unsupervised hierarchical clustering was applied to these median values, together with DNA methylation profiles from primary MRT and ATRT cases, using the hclust function in R (version 3.3.2). The clustering was performed on the top 10,000 and the top 1% (n = 3,958) most variably methylated probes, using complete linkage and the Spearman and Pearson correlation coefficients as the distance metrics. To visualize the clustering results, I used the heatmap.2 function in the gplots R package (version 2.16.0). The second approach considered beta-values of filtered CpG probes for all 10,224 cases individually: CpG sites with beta-values > 0.3 in > 10% of the cases were retained, and the top 32,000 most variably methylated CpGs among them were used to perform t-SNE and UMAP dimension reduction analyses (Capper et al., 2018).
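The probe pre-selection used before dimension reduction (keep CpG sites with beta-values > 0.3 in > 10% of cases, then take the most variable sites) can be sketched in Python as follows. This is a toy illustration rather than the thesis code: the function and probe names are hypothetical, and ranking by standard deviation is an assumption carried over from the NMF analysis, since the variability measure is not restated for this step.

```python
import statistics


def select_variable_sites(beta_by_site, min_beta=0.3, min_fraction=0.10, top_n=32_000):
    '''Keep sites methylated above min_beta in more than min_fraction of cases,
    then return the top_n site names ranked by standard deviation (descending).'''
    variability = {}
    for site, betas in beta_by_site.items():
        n_above = sum(1 for beta in betas if beta > min_beta)
        if n_above / len(betas) > min_fraction:
            variability[site] = statistics.pstdev(betas)
    ranked = sorted(variability, key=lambda site: variability[site], reverse=True)
    return ranked[:top_n]


# Four hypothetical probes measured in four cases.
toy_betas = {
    'cg_a': [0.1, 0.9, 0.1, 0.9],  # passes the filter, highly variable
    'cg_b': [0.5, 0.5, 0.5, 0.6],  # passes the filter, nearly constant
    'cg_c': [0.0, 0.1, 0.2, 0.1],  # never above 0.3, removed
    'cg_d': [0.2, 0.4, 0.6, 0.8],  # passes the filter, moderately variable
}
selected = select_variable_sites(toy_betas, top_n=2)
```

The methylation filter is applied before ranking, so a site that is unmethylated in nearly all cases cannot enter the top-ranked set, whatever its variability.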
For t-SNE analysis, 3,000 iterations and a perplexity of 40 were applied using the Rtsne R package (version 0.15). The UMAP dimension reduction analysis was performed using the umap R package (version 0.2.6.0) with default parameter settings. Both t-SNE and UMAP results were visualized using the basic plot function in R.

3.4.10 ChIP-Seq data analyses

Alignment of sequenced reads was performed as described in Johann et al. (2016). Briefly, BWA was used to align sequence reads, and duplicate reads were marked using Picard MarkDuplicates. For enhancer and peak-centred analyses of H3K27ac and H3K27me3 ChIP-Seq data, my collaborator at DKFZ used MACS2 (Zhang et al., 2008) with default settings to call peaks. Peak calling was performed for each sample in the cohort, and peaks that were present in two or more samples were retained, merged and used for further analyses. Signals at peaks were calculated using the “countsForRegions” function, followed by scaling of the counts to library size (Hisano et al., 2013). My collaborator at DKFZ applied the same method to calculate H3K27ac signals at gene promoters, which are defined as regions ± 500 bp around the transcription start site (TSS). Peaks (putative enhancers) with the top 1,500 most variable signals across all samples were used for unsupervised hierarchical clustering.

For transcription factor (TF) enrichment analyses, enhancers enriched in MRTs were identified by comparing the mean signal of enhancers in MRTs against ATRT-MYC, ATRT-SHH and ATRT-TYR using ANOVA with an FDR threshold of 0.05, and requiring a minimum log2 fold change of 1.5 between MRT and ATRT enhancer signals. Nucleosome-free regions (NFRs) of these specific enhancers were identified using the HOMER software (http://homer.ucsd.edu/homer/; version 4.10; Heinz et al., 2010). For TF enrichment, the ENCODE motifs were downloaded from http://compbio.mit.edu/encode-motifs/.
Each motif was compared to the NFRs from MRT-specific enhancers. Chi-square tests were applied to identify significantly enriched TF motifs (FDR < 0.01) at the enhancer regions. Enrichment values for ATRT-subgroup-specific enhancers were taken from previously published data (Johann et al., 2016).

Super-enhancers were identified using the HOMER software (http://homer.ucsd.edu/homer/ngs/index.html) and the findPeaks command with the “-style super” option. My collaborator at DKFZ used ATRT-subgroup-specific super-enhancers that were identified in a previously published study (Johann et al., 2016). For MRT-specific super-enhancers, H3K27ac data were combined for all MRT samples and compared to samples in each ATRT subgroup. To identify super-enhancers that were common between MRT and ATRT-MYC, my collaborator compared the coordinates of super-enhancers and selected those that overlapped by at least 25% between MRT and ATRT-MYC-specific enhancers, but not between MRT and other ATRT-subgroup-specific enhancers.

3.4.11 Whole-genome-bisulfite sequence (WGBS) data analyses

To identify partially methylated domains (PMDs) in MRT, my collaborator at DKFZ analyzed WGBS data using the method that Johann et al. (2016) applied to identify PMDs in ATRTs. In brief, average methylation levels within windows of 10 kb were calculated in steps of 1 bp. Overlapping 10 kb windows with an average methylation level less than 0.6 were merged, and resulting regions larger than 100 kb were defined as PMDs.

To identify differentially methylated regions (DMRs), my collaborator at DKFZ used the bsseq R package (version 1.18.0; Hansen, Langmead and Irizarry, 2012) to create the data frames for methylated reads and to calculate the total coverage per sample based on aligned reads. CpG sites with a minimum coverage of five reads were selected for downstream analyses.
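The PMD definition given above (sliding 10 kb windows, an average methylation below 0.6, merging of overlapping windows and a 100 kb minimum size) can be written as a short Python sketch. It is a simplified illustration for a single chromosome, not the published pipeline. The function name is hypothetical, averaging per-CpG methylation values within a window and skipping windows without CpGs are assumptions, and the toy example uses much smaller window and length parameters so that the result is easy to follow.

```python
def find_pmds(cpg_positions, cpg_methylation, window=10_000, step=1,
              max_mean=0.6, min_length=100_000):
    '''Return merged (start, end) regions whose sliding-window mean methylation
    is below max_mean and whose merged length exceeds min_length.'''
    sites = sorted(zip(cpg_positions, cpg_methylation))
    if not sites:
        return []
    first, last = sites[0][0], sites[-1][0]
    merged = []
    for start in range(first, last - window + 2, step):
        end = start + window  # half-open window [start, end)
        values = [level for position, level in sites if start <= position < end]
        if not values or sum(values) / len(values) >= max_mean:
            continue
        if merged and start < merged[-1][1]:
            merged[-1][1] = end  # overlaps the previous low-methylation window
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged if e - s > min_length]


# Toy chromosome: 30 lowly methylated CpGs followed by 30 highly methylated CpGs.
positions = list(range(60))
levels = [0.2] * 30 + [0.9] * 30
toy_pmds = find_pmds(positions, levels, window=10, step=1, min_length=20)
```

In this example the merged region extends slightly past the last lowly methylated CpG, because windows straddling the boundary still average below 0.6. The nested scan over all sites keeps the sketch short; genome-scale data would need a running sum.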
For each sample, a bsseq object was generated, and then analyzed to identify DMRs specific for each of the five subgroups, using the callDMR function in the DSS R package (version 2.12.0; parameters used: minlen = 50, minCG = 5). To identify a gene that overlapped with DMRs, the longest transcript of a protein-coding gene was considered. To visualize methylation levels in a genomic region, the Gviz R package was used (version 1.26.4; Hahne and Ivanek, 2016).

3.4.12 Immunohistochemistry (IHC) experiment and data analyses

Detailed protocols for the IHC experiment were published in Chun et al. (2019). The IHC experimental design was done in collaboration with Ms. Katy Milne and Dr. Brad Nelson at the Deeley Research Centre at BC Cancer, and Dr. Basile Tessier-Cloutier at the Department of Pathology and Laboratory Medicine at UBC. Ms. Milne performed the IHC experiment. Dr. Tessier-Cloutier examined the slides and selected tumour regions to profile. I performed the validation and data analyses.

In brief, two multi-colour immunohistochemical panels were stained on whole tissue slides using two staining schemes. Slides were deparaffinized using xylene and graded alcohols. The slides were then subjected to antigen retrieval using a decloaking solution, and loaded onto a Biocare Intellipath FLX® autostainer. The staining schemes required blocking of endogenous peroxidase activity followed by blocking of non-specific binding using a blocking reagent. For multiplex IHC targeting of CD3 and CD8, my collaborator at the Deeley Research Centre at BC Cancer used the following staining scheme: Primary antibodies of CD8 and CD3 were combined into a cocktail, and added to the slides. These were then treated to put CD8 on IP DAB chromogen and CD3 on IP Warp Red chromogen. The slides were then counterstained with CAT Hematoxylin, then washed and air-dried prior to coverslipping.
For multiplex IHC targeting of PD1, PD-L1 and CD68, my collaborator used the following staining scheme: In the first round of staining, primary antibodies of PD-L1 and CD68 were combined into a cocktail, and added to the slides. These were treated to put CD68 on IP Ferengi Blue chromogen and PD-L1 on IP DAB chromogen. The slides then underwent a denaturation step, followed by the second round of staining. In the second round, primary PD1 antibody was applied to the slides, which were then treated to put PD1 on IP Warp Red chromogen. Slides were then counterstained with CAT Hematoxylin, and then washed and air-dried, followed by coverslipping.

To generate IHC data, the following steps were performed. In brief, IHC-stained slides were scanned at 10X to create whole slide scans using the Vectra 3 multispectral imaging system (Perkin Elmer, Waltham, MA). The files generated were then passed to Dr. Tessier-Cloutier for selection of 15 tumour-rich (TT), 5 peri-vascular (PV) and 5 peri-stromal (PS) regions per sample based on whole slide scans of H&E-stained slides using the Pannoramic Midi system (3D Histech). Slides were then re-scanned to generate multispectral images of the selected regions at 20X magnification. Multispectral imaging enabled spectral separation between different chromogens for better visualization of images and spectral superimposition of different chromogens to identify co-expression of proteins. Multispectral images were analyzed using the inForm Image Analysis software (Perkin Elmer) to automatically identify cell phenotypes and perform cell counts. Five phenotyping algorithms were created using a training set of images (10 per algorithm) selected to recognize diverse cell phenotypes. All resulting cell counts were compared and visually validated in all cases by me.
To enhance visibility and discrimination between IHC colours for validation, IHC images were adjusted to reduce the blue hematoxylin signals by 50% and were re-coloured with the following pseudo-colours: CD3+ signals in green and CD8+ in brown. To visualize PD-L1 and CD68 expression, IHC images were adjusted to reduce the blue hematoxylin signals by 50%, and were further modified into pseudo-immunofluorescence images with the following pseudo-colours: CD68+ in green, PD-L1+ in red, PD1+ in cyan and PD-L1+CD68+ in yellow. Validated cell counts were normalized into cell densities by dividing them by the scanned area in mm², which was calculated by multiplying the number of pixels of the scanned image by 2.5 × 10⁻⁷. Cell densities were used for downstream analyses and for plotting using R. Statistical significance of cell density differences across subgroups was evaluated using Wilcoxon Mann-Whitney U and Kruskal-Wallis tests.

CHAPTER 4. Conclusions and Future Directions

Cancer has long been considered a genetic disease caused by mutations in oncogenes and TSGs (Futreal et al., 2004; Croce, 2008). Evolving sequencing technologies and genomic analyses, reviewed in Chapter 1, enabled cost-effective interrogation of the human genome, leading to discoveries that deepened our understanding of the molecular underpinnings of cancer. These discoveries included mutations in epigenetic modifier genes and alterations in DNA methylation and histone modification, which implied important roles for epigenetic dysregulation in cancer initiation and progression (Feinberg and Tycko, 2004; Timp and Feinberg, 2013). Among epigenetic modifier genes, the genes encoding subunits of the SWI/SNF chromatin-remodelling complex were the most frequently mutated across multiple adult and paediatric cancer types (Shain and Pollack, 2013).
Extra-cranial and cranial RTs (MRTs and ATRTs, respectively) are highly aggressive paediatric cancer types driven by loss of SMARCB1 or, in rare cases, by loss of SMARCA4, both of which are core subunits of the SWI/SNF complex. As RTs harbour virtually no other recognized driver alteration and also possess relatively stable diploid genomes, they provide a uniquely clean model system of human primary cancer tissues in which to study how defective SWI/SNF contributes to cancer development and progression. Despite having a uniform driver alteration, RTs exhibit clinical, histological and molecular heterogeneity, providing a useful cancer model system in which to investigate the driving forces behind tumour heterogeneity.

The aim of my thesis study was to comprehensively characterize the genomic, transcriptomic and epigenomic landscapes of RTs from multiple anatomical sites, and to reveal biological relationships among them. I hypothesized that the use of second-generation sequencing technologies to identify molecular alterations genome-wide would reveal previously unknown mutations or gene expression and epigenetic dysregulation that play important roles in RT pathobiology. Furthermore, I expected that heterogeneous molecular features of RTs could be robustly identified and categorized into molecular subgroups across multiple anatomical sites of occurrence. Finally, I proposed that characterization of RT subgroups would enable identification of molecular features that are representative of each subgroup, including those with therapeutic implications. To test these hypotheses, I carried out two main research projects to characterize the multi-omic landscapes of extra-cranial MRTs and of RTs from multiple anatomical sites, as described in Chapters 2 and 3, respectively.

Chapter 2 describes the first genome-wide molecular characterization study of extra-cranial MRTs.
This work shows that despite the uniform driver alteration of SMARCB1 loss, molecular landscapes of MRTs are not uniform, as shown by gene expression subgroups that distinguish between ATRT-like and RTK-like MRTs. This work also identified that HOX genes and other homeobox-domain-containing genes are epigenetically dysregulated, and are associated with differential expression of genes involved in early human development pathways. In particular, nearly all processes involved in neural-crest development include genes that are dysregulated in MRTs, whether by mutations, altered gene expression or DNA methylation, implicating a link between dysregulation of neural crest cells and MRT development.

Expanding the investigation to RTs from other anatomical sites, Chapter 3 describes the largest integrative analysis of extra-cranial and cranial RTs to date. In addition, it describes for the first time that five molecular subgroups exist irrespective of the anatomical sites of occurrence, which highlights similarities between extra-cranial MRTs and ATRT-MYC at the genetic, epigenetic and transcriptomic levels. My work also reveals that the RT subgroups composed of MRT and ATRT-MYC cases exhibit increased levels of cytotoxic T-cell infiltration and immune checkpoint protein expression. These are biological features that are associated with positive responses to immune checkpoint inhibitor treatments (Yan et al., 2018), an unexpected therapeutic avenue for a cancer type with a paucity of mutations (Hellmann, Nathanson et al., 2018). My observations add to increasing lines of evidence indicating that defective SWI/SNF leads to increased immunogenicity in tumours, and may thus be a biological factor predictive of responses to immunotherapy treatments.
Supporting this notion, other studies reported increased survival in anti-PD-1-treated patients with clear cell renal cell carcinomas with PBRM1 loss-of-function mutations (Miao et al., 2018) or with tumours that harboured ARID1A mutations (Okamura et al., 2020). Furthermore, analyses of TCGA cohorts (n = 43,728 cases) revealed a significant increase in overall survival of patients treated with immune checkpoint inhibitors (n = 1,661), who had tumours with loss-of-function mutations in genes encoding SWI/SNF subunits (Courtet et al., 2020). In addition to conventional immune checkpoint blockade, chimeric antigen receptor (CAR) T-cells targeting B7-H3/CD276 have recently been shown to mediate potent anti-tumour effects in mouse xenograft models of ATRTs (Theruvath et al., 2020), indicating that multiple modalities of immunotherapy may be efficacious for RT patients.

My work shows that MRT and ATRT-MYC have greater extents of hypomethylated regions in their genomes, and that genes differentially expressed in MRT and ATRT-MYC compared to ATRT-SHH and ATRT-TYR are significantly enriched for cytosolic DNA and RNA sensing pathways. Following up on these findings, I hypothesized that hypomethylated regions in RT genomes led to epigenetic de-repression of ERVs, and that activation of ERV expression would correlate with increased immunogenicity in RTs. To test this hypothesis, I compared gene expression, DNA methylation and H3K27ac enrichment profiles at ERV loci. Although DNA methylation levels at the ERV loci were comparable across RTs, I observed higher H3K27ac enrichments at ERV loci in cases that exhibited increased T-cell infiltration compared to the cases that did not. Notably, paediatric high-grade gliomas similarly showed increased H3K27ac levels at ERV loci in H3K27M mutant tumours, which led to the hypothesis that H3K27M mutations induce aberrant deposition of H3K27ac, leading to increased ERV expression in these gliomas (Krug et al., 2019).
The convergence of increased H3K27ac levels at ERV loci in H3K27M-mutant gliomas and SMARCB1-deficient RTs may suggest a role of SWI/SNF in direct silencing of ERVs through enhancer modulation, or indirectly through regulating the activities of PRC2. However, my RNA-Seq analysis did not show compelling evidence that supported the notion of ERV re-activation in RTs, as ERV expression levels were not significantly different in the RTs with increased H3K27ac levels at ERV loci. This observation is inconsistent with the findings of another RT study, which showed that ERV expression was SMARCB1-dependent using total RNA-Seq data derived from RT cell lines (Leruste et al., 2019). A possible explanation for these inconsistent findings may be the use of a poly-A-selection protocol for RNA-Seq in my study, which limits the ability to comprehensively profile non-coding transcripts without poly-A tails, including those transcribed from the repeat elements such as ERVs. A study has shown significant differences in repeat element expression using poly-A versus total RNA-Seq data, with the total RNA-Seq data providing more comprehensive profiles of repeat element transcript abundance with a greater dynamic range of the abundance levels compared to poly-A RNA-Seq data, demonstrating that the use of total RNA-Seq protocol is more suitable to quantify ERV expression (Solovyov et al., 2018). Solovyov and colleagues also showed that ERV expression status was more predictive of immunotherapy responses than conventional immune signatures, providing a rationale for future studies to determine ERV expression in primary RT samples. Such studies may involve total RNA-Seq analyses and the use of wet-lab methods that stain and visualize double-stranded RNA in RT tissue slides, or confirm SMARCB1-dependent activities of cytosolic DNA/RNA sensing pathways using RT cell lines with SMARCB1 re-introduction.
The observation of increased immunogenicity made in this thesis further suggests the importance of determining immune cell repertoires in RT microenvironments. The use of data generated from bulk tumour tissues has limitations in determining the extent of intra-tumour heterogeneity and in delineating different cell types in the tumour microenvironment. Gene-expression profiles generated from bulk samples are an amalgamation of signals coming from an admixture of different tumour clones, non-cancer normal cells, necrotic cells, and immune and stromal cells in a tumour microenvironment, making it challenging to accurately deconvolute cell types and measure their fractions. As an example of such limitations, I show that a significant correlation exists between IHC-validated cell counts and predicted cellular fractions of cytotoxic T-cells, but that this correlation is not observed for macrophages. This reflects limitations of current deconvolution methods, which are based on pre-determined gene expression signatures limited to certain cell types and produce only crude estimates of cellular profiles in the tumour microenvironment at best. The application of single-cell RNA-Seq technologies, including those that preserve spatial information, to directly measure transcript abundance levels at the single-cell level would enable more comprehensive profiling of different cell types, including immune cells, and their spatial distributions in microenvironments of immunologically “hot” RTs compared to those of immunologically “cold” RTs.

There are lines of evidence supporting the notion of increased immunogenicity in SWI/SNF-defective cancers, and it has been proposed to consider the status of SWI/SNF mutations as one of the biological factors predictive of immunotherapy responses (Yan et al., 2018).
However, further investigations are required to determine the link between immunogenicity and SWI/SNF dysregulation, as underlying mechanisms and the role of SWI/SNF in modulating immune systems remain unclear. Furthermore, the observation of immunologically “cold” RTs indicates that the presence of SWI/SNF mutations alone may not be sufficient to identify immunologically “hot” tumours, and that additional biological (e.g. activation of certain pathways, interactions with tumour microenvironments) or molecular features (e.g. mutations, gene expression and epigenetic alterations) are likely playing a role. To explore these additional features of SWI/SNF-defective tumours, future studies may involve analyses of pan-cancer multi-omics data to identify molecular features that are common across various cancer types with SWI/SNF mutations. Analyses of clinical data, such as survival data and treatment outcome (especially the outcome of treatments using immune checkpoint inhibitors), would be helpful to determine features that have clinical implications. A study that compares a cohort of multiple cancer types with SWI/SNF subunit mutations against a cohort of cancers that do not have SWI/SNF mutations may identify mutational, transcriptional and epigenetic characteristics that are distinct in cancers with SWI/SNF mutations. Future comparative analysis will further require a careful selection of SWI/SNF mutant cancer groups and appropriate control groups, as well as modalities of data selected to identify molecular features that are shared among SWI/SNF mutant cancers. A gene expression study that compared multiple cancer types with SMARCB1 and SMARCA4 loss against those with intact SWI/SNF (Le Loarer et al., 2015) showed that cases with similar cellular histology clustered, irrespective of SMARCB1 or SMARCA4 mutation status.
These results suggest that the cellular or developmental context may substantially influence the molecular landscape shaped by SWI/SNF dysregulation, and highlight the importance of careful experimental design and selection of control cohorts to properly investigate molecular features common to SWI/SNF-mutant cancers.

In this work, I observed gene expression similarities between RT subgroups and different cell types with varying degrees of differentiation, which ranged from mesenchymal progenitor cells to differentiated neurons. Such similarities of RTs to a spectrum of cell types may reflect persistent shadows of multiple cell types of origin, setting various epigenetic contexts that affect gene expression landscapes in RT subgroups. Evidence for the existence of multiple cell types of origin comes from mouse model studies in which Smarcb1 ablation in early neural progenitor cells (Han et al., 2016) and liver progenitor cells (Carugo et al., 2019) led to development of tumours that resembled human ATRTs and MRTs, respectively. Other studies also reported that RT-like tumours developed when Smarcb1 was ablated between embryonic day 6 and day 10 (Han et al., 2016; Vitte et al., 2017), whereas Smarcb1 ablation outside of this temporal window resulted in embryonic lethality, hepatic toxicity or development of T-cell lymphomas in mice (Klochendler-Yeivin et al., 2000; Roberts et al., 2002; Han et al., 2016).

Building on observations made from my work and other studies, there are still unanswered questions. For instance, it is still unclear which characteristics of epigenetic states lead to oncogenesis versus cell death following SMARCB1 loss, and whether these characteristics can be therapeutically exploited.
Furthermore, it remains of interest to profile the cell types that are present before, during and after the period from embryonic day 6 to day 10 of mouse development to identify the cell types that give rise to each RT subgroup upon SMARCB1 loss. To answer these questions, future efforts may involve lineage tracing and longitudinal analyses of gene expression and epigenetic profiles of individual cells in mouse embryos using single-cell RNA-Seq and single-cell ATAC-Seq. An inspiring study was recently published for medulloblastomas (MBs), in which the authors performed single-cell RNA-Seq to map a lineage trajectory of developing neural progenitor cells using a conditional Ptch knockout mouse model, and identified Olig2+ progenitors as MB tumour-initiating cells upon Ptch loss (Zhang et al., 2019). Similarly, single-cell RNA-Seq analyses of paediatric ependymomas showed evidence for impaired neural development processes and identified subpopulations of undifferentiated cells that were sensitive to HDAC inhibitor treatment (Gojo et al., 2020). For RTs, profiling transcriptomes and open-chromatin states at single-cell resolution using the conditional Smarcb1 knockout mouse models published by Han et al. (2016) and by Carugo et al. (2019) for ATRT-like and MRT-like tumours, respectively, can be useful in identifying and characterizing RT-initiating cells, and in understanding the gene expression and epigenetic landscapes primed for oncogenic transformation upon Smarcb1 loss. To further investigate the clinical implications and impact of molecular heterogeneity in RTs, the establishment and use of model systems that faithfully recapitulate RT biology, such as mouse models, patient-derived cell cultures and organoids, will be important for understanding drug responses and disease progression. 
For example, a large panel of patient-derived cell cultures for glioblastomas (n = 100 cases) was established, which enabled pharmacogenomic analyses of responses to approximately 1,500 drugs (Johansson et al., 2020). For RTs, an organoid biobank has recently been established (Calandrini et al., 2020). Single-cell RNA-Seq and 3D imaging analyses provided evidence that the organoids retained the cellular heterogeneity and properties of the original tumour tissues, including similar cellular morphology and common driver genetic alterations. The use of these model systems, together with genomic analyses of them, will be helpful in developing more effective treatment regimens for RTs. Nobel laureate Dr. Sidney Brenner once remarked that “progress in science depends on new techniques, new discoveries and new ideas, probably in that order” (Robertson, 1980). Through the use of second-generation sequencing technologies, I was able to carry out genome science studies of RTs, profiling whole genomes, transcriptomes and epigenomes, and uncovered recurrently altered genes and pathways that were previously uncharacterized in RTs. My work also revealed novel molecular subgroups across RTs from multiple anatomical sites, and further identified subgroups with increased immunogenicity. These observations provoke the hypothesis that SWI/SNF mutations may contribute to immune modulation in tumours and their microenvironments in ways that increase their vulnerability to immunotherapy treatments. My findings, together with the data generated from my study, have contributed to an enhanced understanding of RT biology, which I hope will ultimately result in the development of better therapies for patients with this deadly disease.
BIBLIOGRAPHY
Agalioti, T. et al. (2000) Ordered recruitment of chromatin modifying and general transcription factors to the IFN-β promoter, Cell, 103(4), 667–678. Ajay, S. S. et al. 
(2011) Accurate and comprehensive sequencing of personal genomes, Genome Research, 21(9), 1498–1505. Alexandrov, L. B. et al. (2013) Signatures of mutational processes in human cancer, Nature, 500(7463), 415–421. Alver, B. H. et al. (2017) The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers, Nature Communications, 8, 14648. Ammerlaan, A. C. J. et al. (2008) Long-term survival and transmission of INI1-mutation via nonpenetrant males in a family with rhabdoid tumour predisposition syndrome, British Journal of Cancer, 98(2), 474–479. Anders, S. and Huber, W. (2010) Differential expression analysis for sequence count data, Genome Biology, 11(R106), 1–12. Arnaud, O., Le Loarer, F. and Tirode, F. (2018) BAFfling pathologies: Alterations of BAF complexes in cancer, Cancer Letters, 419, 266–279. Aryee, M. J. et al. (2014) Minfi: A flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays, Bioinformatics, 30(10), 1363–1369. Barnes, T. A. and Amir, E. (2017) HYPE or HOPE: the prognostic value of infiltrating immune cells in cancer, British Journal of Cancer, 117(4), 451–460. Benjamini, Y. and Hochberg, Y. (1995) Controlling the False Discovery Rate: A practical and powerful approach to multiple testing, Journal of the Royal Statistical Society, 57(1), 289–300. Bessho, Y. et al. (2001) Hes7: A bHLH-type repressor gene regulated by Notch and expressed in the presomitic mesoderm, Genes to Cells, 6(2), 175–185. Betz, B. L. et al. (2002) Re-expression of hSNF5/INI1/BAF47 in pediatric tumor cells leads to G1 arrest associated with induction of p16ink4a and activation of RB, Oncogene, 21(34), 5193–5203. Biankin, A. V. et al. 
(2012) Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes, Nature, 491(7424), 399–405. Biegel, J. A. et al. (2002) The role of INI1 and the SWI/SNF complex in the development of rhabdoid tumors: Meeting summary from the workshop on childhood atypical teratoid/rhabdoid tumors, Cancer Research, 62(1), 323–328. Birks, D. K. et al. (2011) High expression of BMP pathway genes distinguishes a subset of atypical teratoid/rhabdoid tumors associated with shorter survival, Neuro-Oncology, 13(12), 1296–1307. Birks, D. K. et al. (2013) Pediatric rhabdoid tumors of kidney and brain show many differences in gene expression but share dysregulation of cell cycle and epigenetic effector genes, Pediatric Blood & Cancer, 60, 1095–1102. Bolotin, D. A. et al. (2015) MiXCR: Software for comprehensive adaptive immunity profiling, Nature Methods, 12(5), 380–381. Boveri, T. (1902) Über mehrpolige Mitosen als Mittel zur Analyse des Zellkerns. [Concerning multipolar mitoses as a means of analysing the cell nucleus.] C. Kabitzch, Würzburg and Verh. d. phys. med. Ges. zu Würzburg. N.F., Bd. 35. Brennan, B., Stiller, C. and Bourdeaut, F. (2013) Extracranial rhabdoid tumours: What we have learned so far and future directions, The Lancet Oncology, 14(8), e329–e336. Brownlee, P. M., Meisenberg, C. and Downs, J. A. (2015) The SWI/SNF chromatin remodelling complex: Its role in maintaining genome stability and preventing tumourigenesis, DNA Repair, 32, 127–133. Bruce, A. W. et al. (2004) Genome-wide analysis of repressor element 1 silencing transcription factor/neuron-restrictive silencing factor (REST/NRSF) target genes, Proceedings of the National Academy of Sciences, 101(28), 10458–10463. Buschhausen, G. et al. 
(1987) Chromatin structure is required to block transcription of the methylated herpes simplex virus thymidine kinase gene, Proceedings of the National Academy of Sciences, 84(5), 1177–1181. Butterfield, Y. S. et al. (2014) JAGuaR: Junction alignments to genome for RNA-seq reads, PLoS ONE, 9(7), e102398. Cai, W. et al. (2019) PBRM1 acts as a p53 lysine-acetylation reader to suppress renal tumor growth, Nature Communications, 10(1), 1–15. Calandrini, C. et al. (2020) An organoid biobank for childhood kidney cancers that captures disease and tissue heterogeneity, Nature Communications, 11(1), 1–14. Campbell, P. J. et al. (2020) Pan-cancer analysis of whole genomes, Nature, 578(7793), 82–93. Cancer Genome Atlas Research Network (2014) Comprehensive molecular characterization of gastric adenocarcinoma, Nature, 513(7517), 202–209. Cancer Genome Atlas Research Network (2015) Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas, New England Journal of Medicine, 372(26), 2481–2498. Capper, D. et al. (2018) DNA methylation-based classification of central nervous system tumours, Nature, 555(7697), 469–474. Carugo, A. et al. (2019) p53 Is a Master Regulator of Proteostasis in SMARCB1-Deficient Malignant Rhabdoid Tumors, Cancer Cell, 35, 204–220. Cerami, E. et al. (2012) The cBio Cancer Genomics Portal: An open platform for exploring multidimensional cancer genomics data, Cancer Discovery, 2(5), 401–404. Chakravadhanula, M. et al. (2014) Expression of the HOX genes and HOTAIR in atypical teratoid rhabdoid tumors and other pediatric brain tumors, Cancer Genetics, 207(9), 425–428. Chandler, R. L. and Magnuson, T. (2016) The SWI/SNF BAF-A complex is essential for neural crest development, Developmental Biology, 411(1), 15–24. Chapman, M. A. 
et al. (2011) Initial genome sequencing and analysis of multiple myeloma, Nature, 471(7339), 467–472. Cheng, Y. et al. (2016) SUSD2 is frequently downregulated and functions as a tumor suppressor in RCC and lung cancer, Tumor Biology, 37, 9919–9930. Chiappinelli, K. B. et al. (2015) Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses, Cell, 162(5), 974–986. Choi, S. et al. (2012) Role of Macrophage Migration Inhibitory Factor in the Regulatory T Cell Response of Tumor-Bearing Mice, The Journal of Immunology, 189(8), 3905–3913. Choi, Y., Kim, J. K. and Yoo, J. Y. (2014) NFκB and STAT3 synergistically activate the expression of FAT10, a gene counteracting the tumor suppressor p53, Molecular Oncology, 8(3), 642–655. Chun, H.-J. E. et al. (2016) Genome-Wide Profiles of Extra-cranial Malignant Rhabdoid Tumors Reveal Heterogeneity and Dysregulated Developmental Pathways, Cancer Cell, 29(3), 394–406. Cingolani, P. et al. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff, Fly, 6(2), 80–92. Courtet, K. et al. (2020) Inactivating mutations in genes encoding for components of the BAF/PBAF complex and immune-checkpoint inhibitor outcome, Biomarker Research, 8, 26. Couzin-Frankel, J. (2013) Breakthrough of the Year 2013: Cancer Immunotherapy, Science, 342(6165), 1432–1433. Cozzitorto, C. et al. (2020) A Specialized Niche in the Pancreatic Microenvironment Promotes Endocrine Differentiation, Developmental Cell, 55, 1–13. Croce, C. M. (2008) Oncogenes and cancer, New England Journal of Medicine, 358(5), 502–511. Cui, K. et al. 
(2004) The Chromatin-Remodeling BAF Complex Mediates Cellular Antiviral Activities by Promoter Priming, Molecular and Cellular Biology, 24(10), 4476–4486. Dagogo-Jack, I. and Shaw, A. T. (2018) Tumour heterogeneity and resistance to cancer therapies, Nature Reviews Clinical Oncology, 15(2), 81–94. Davis, C. A. et al. (2006) PRISM/PRDM6, a transcriptional repressor that promotes the proliferative gene program in smooth muscle cells, Molecular and Cellular Biology, 26(7), 2626–2636. Deisch, J., Raisanen, J. and Rakheja, D. (2013) Immunohistochemical Expression of Embryonic Stem Cell Markers in Malignant Rhabdoid Tumors, Pediatric and Developmental Pathology, 14(5), 353–359. Dennis Jr, G. et al. (2003) DAVID: Database for Annotation, Visualization, and Integrated Discovery, Genome Biology, 4(9), R60.1–R60.11. Didonato, J. A., Mercurio, F. and Karin, M. (2012) NF-κB and the link between inflammation and cancer, Immunological Reviews, 246(1), 379–400. Ding, J. et al. (2012) Feature-based classifiers for somatic mutation detection in tumour-normal paired sequencing data, Bioinformatics, 28(2), 167–175. Feinberg, A. P. and Tycko, B. (2004) The history of cancer epigenetics, Nature Reviews Cancer, 4, 143–153. Feinberg, A. P. and Vogelstein, B. (1983) Hypomethylation distinguishes genes of some human cancers from their normal counterparts, Nature, 301, 89–92. Feinberg, A. P. and Vogelstein, B. (1983) Hypomethylation of ras oncogenes in primary human cancers, Biochemical and Biophysical Research Communications, 111(1), 47–54. Fejes, A. P. et al. (2008) FindPeaks 3.1: A tool for identifying areas of enrichment from massively parallel short-read sequencing technology, Bioinformatics, 24(15), 1729–1730. Fischer, H. et al. 
(1989) Malignant rhabdoid tumour of the kidney expressing neurofilament proteins: Immunohistochemical findings and histogenetic aspects, Pathology-Research and Practice, 184, 541–547. Fisher, B. et al. (1983) Influence of tumor estrogen and progesterone receptor levels on the response to tamoxifen and chemotherapy in primary breast cancer, Journal of Clinical Oncology, 1(4), 227–241. Futreal, P. A. et al. (2004) A census of human cancer genes, Nature Reviews Cancer, 4(3), 177–183. Gadd, S. et al. (2010) Rhabdoid tumor: gene expression clues to pathogenesis and potential therapeutic targets, Laboratory Investigation, 90(5), 724–738. Gadd, S. et al. (2017) A Children’s Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor, Nature Genetics, 49(10), 1487–1494. Gajewski, T. F., Schreiber, H. and Fu, Y. X. (2013) Innate and adaptive immune cells in the tumor microenvironment, Nature Immunology, 14(10), 1014–1022. Gao, J. et al. (2014) Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal, Science Signaling, 6(269), 1–34. Garvin, A. J. et al. (1993) The G401 cell line, utilized for studies of chromosomal changes in Wilms’ tumor, is derived from a rhabdoid tumor of the kidney, American Journal of Pathology, 142(2), 375–380. Gascard, P. et al. (2015) Epigenetic and transcriptional determinants of the human breast, Nature Communications, 6(6351), 1–10. Gaujoux, R. and Seoighe, C. (2010) A flexible R package for nonnegative matrix factorization, BMC Bioinformatics, 11, 367. Gebuhr, T. C. et al. (2003) The Role of Brg1, a Catalytic Subunit of Mammalian Chromatin-remodeling Complexes, in T Cell Development, Journal of Experimental Medicine, 198(12), 1937–1949. Gfrerer, L. et al. 
(2014) Functional Analysis of SPECC1L in Craniofacial Development and Oblique Facial Cleft Pathogenesis, Plastic and Reconstructive Surgery, 134(4), 748–759. Glaser, R. L., Ramsay, J. P. and Morison, I. M. (2006) The imprinted gene and parent-of-origin effect database now includes parental origin of de novo mutations, Nucleic Acids Research, 34, D29–D31. Gojo, J. et al. (2020) Single-Cell RNA-Seq Reveals Cellular Hierarchies and Impaired Developmental Trajectories in Pediatric Ependymoma, Cancer Cell, 38(1), 44–59. Gong, P. et al. (2010) The Ubiquitin-Like Protein FAT10 Mediates NF-κB Activation, Journal of the American Society of Nephrology, 21(2), 316–326. Goya, R. et al. (2010) SNVMix: predicting single nucleotide variants from next-generation sequencing of tumors, Bioinformatics, 26(6), 730–736. Greger, V. et al. (1989) Epigenetic changes may contribute to the formation and spontaneous regression of retinoblastoma, Human Genetics, 83(2), 155–158. Grupenmacher, A. T. et al. (2013) Study of the gene expression and microRNA expression profiles of malignant rhabdoid tumors originated in the brain (AT/RT) and in the kidney (RTK), Child’s Nervous System, 29(11), 1977–1983. Guo, X. et al. (2010) Homeobox gene IRX1 is a tumor suppressor gene in gastric carcinoma, Oncogene, 29, 3908–3920. Gupta, R. A. et al. (2010) Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis, Nature, 464(7291), 1071–1076. Ha, G. et al. (2012) Integrative analysis of genome-wide loss of heterozygosity and mono-allelic expression at nucleotide resolution reveals disrupted pathways in triple negative breast cancer, Genome Research, 22, 1995–2007. Van Haaften, G. et al. 
(2009) Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer, Nature Genetics, 41(5), 521–523. Hahne, F. and Ivanek, R. (2016) Statistical Genomics: Methods and Protocols - Chapter Visualizing genomic data using Gviz and Bioconductor. 1st edn. Edited by E. Mathé and S. Davis. New York: Springer New York. Han, Z.-Y. et al. (2016) The occurrence of intracranial rhabdoid tumours in mice depends on temporal control of Smarcb1 inactivation, Nature Communications, 7(10421), 1–11. Hanahan, D. and Weinberg, R. A. (2000) The Hallmarks of Cancer, Cell, 100, 57–70. Hanahan, D. and Weinberg, R. A. (2011) Hallmarks of cancer: The next generation, Cell, 144(5), 646–674. Hansen, K. D., Langmead, B. and Irizarry, R. A. (2012) BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions, Genome Biology, 13(10), R83. Hasselblatt, M. et al. (2013) High-resolution genomic analysis suggests the absence of recurrent genomic alterations other than SMARCB1 aberrations in Atypical Teratoid/Rhabdoid Tumors, Genes, Chromosomes & Cancer, 52, 185–190. Hasselblatt, M. et al. (2016) Poorly differentiated chordoma with SMARCB1/INI1 loss: a distinct molecular entity with dismal prognosis, Acta Neuropathologica, 132(1), 149–151. Heck, J. E. et al. (2013) Epidemiology of rhabdoid tumors of early childhood, Pediatric Blood & Cancer, 60(1), 77–81. Heinz, S. et al. (2010) Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities, Molecular Cell, 38(4), 576–589. Heinz, S. et al. (2015) The selection and function of cell type-specific enhancers, Nature Reviews Molecular Cell Biology, 16(3), 144–154. Hellmann, M. D., Nathanson, T., et al. 
(2018) Genomic Features of Response to Combination Immunotherapy in Patients with Advanced Non-Small-Cell Lung Cancer, Cancer Cell, 33, 843–852. Hellmann, M. D., Callahan, M. K., et al. (2018) Tumor Mutational Burden and Efficacy of Nivolumab Monotherapy and in Combination with Ipilimumab in Small-Cell Lung Cancer, Cancer Cell, 33(5), 853–861. Herbst, R. S. et al. (2014) Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients, Nature, 515(7528), 563–567. Hinck, L. (2004) The versatile roles of “axon guidance” cues in tissue morphogenesis, Developmental Cell, 7(6), 783–793. Hirth, A. et al. (2003) Cerebral Atypical Teratoid/Rhabdoid Tumor of Infancy: Long-Term Survival after Multimodal Treatment, also Including Triple Intrathecal Chemotherapy and Gamma Knife Radiosurgery-Case Report, Pediatric Hematology and Oncology, 20(4), 327–332. Hisano, M. et al. (2013) Genome-wide chromatin analysis in mature mouse and human spermatozoa, Nature Protocols, 8(12), 2449–2470. Hnisz, D. et al. (2013) Super-enhancers in the control of cell identity and disease, Cell, 155(4), 934–947. Ho, L. et al. (2011) esBAF facilitates pluripotency by conditioning the genome for LIF/STAT3 signalling and by regulating polycomb function, Nature Cell Biology, 13(8), 903–913. Holliday, R. (1987) The inheritance of epigenetic defects, Science, 238(4824), 163–170. Horn, S. et al. (2013) TERT promoter mutations in familial and sporadic melanoma, Science, 339, 959–961. Hornung, V. et al. (2009) AIM2 recognizes cytosolic dsDNA and forms a caspase-1-activating inflammasome with ASC, Nature, 458(7237), 514–518. Horvath, S. (2013) DNA methylation age of human tissues and cell types, Genome Biology, 14(R115), 1–19. Hovestadt, V. et al. 
(2013) Robust molecular subgrouping and copy-number profiling of medulloblastoma from small amounts of archival tumour material using high-density DNA methylation arrays, Acta Neuropathologica, 125(6), 913–916. Huang, D. W., Sherman, B. T. and Lempicki, R. A. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nature Protocols, 4(1), 44–57. ICGC, C. (2018) The landscape of genomic alterations across childhood cancers, Nature, 555(7696), 321–327. Isakoff, M. S. et al. (2005) Inactivation of the Snf5 tumor suppressor stimulates cell cycle progression and cooperates with p53 loss in oncogenic transformation, Proceedings of the National Academy of Sciences, 102(49), 17745–17750. Ivashkiv, L. B. and Donlin, L. T. (2014) Regulation of type I interferon responses, Nature Reviews Immunology, 14(1), 36–49. Jackson, E. M. et al. (2009) Genomic analysis using high-density single nucleotide polymorphism-based oligonucleotide arrays and multiplex ligation-dependent probe amplification provides a comprehensive analysis of INI1/SMARCB1 in malignant rhabdoid tumors, Clinical Cancer Research, 15(6), 1923–1930. Jagani, Z. et al. (2010) Loss of the tumor suppressor Snf5 leads to aberrant activation of the Hedgehog-Gli pathway, Nature Medicine, 16(12), 1429–1434. Jang, H. et al. (2009) Cabin1 restrains p53 activity on chromatin, Nature Structural & Molecular Biology, 16(9), 910–916. Jelinic, P. et al. (2014) Recurrent SMARCA4 mutations in small cell carcinoma of the ovary, Nature Genetics, 46(5), 424–426. Jelinic, P. et al. (2018) Immune-Active Microenvironment in Small Cell Carcinoma of the Ovary, Hypercalcemic Type: Rationale for Immune Checkpoint Blockade, Journal of the National Cancer Institute, 110(7), 1–4. Johann, P. D. et al. 
(2016) Atypical Teratoid/Rhabdoid Tumors Are Comprised of Three Epigenetic Subgroups with Distinct Enhancer Landscapes, Cancer Cell, 29(3), 379–393. Johansson, P. et al. (2020) A patient-derived cell atlas informs precision targeting of glioblastoma, Cell Reports, 32, 107897. Jones, D. et al. (2012) Dissecting the genomic complexity underlying medulloblastoma, Nature, 488(7409), 100–105. Jones, P. L. et al. (1998) Methylated DNA and MeCP2 recruit histone deacetylase to repress transcription, Nature Genetics, 19(2), 187–191. Jones, S. et al. (2010) Frequent Mutations of Chromatin Remodeling Gene ARID1A in Ovarian Clear Cell Carcinoma, Science, 330, 228–231. Juraschka, K. and Taylor, M. D. (2019) Medulloblastoma in the age of molecular subgroups: A review, Journal of Neurosurgery: Pediatrics, 24(4), 353–363. Kadoch, C. et al. (2013) Proteomic and bioinformatic analysis of mammalian SWI/SNF complexes identifies extensive roles in human malignancy, Nature Genetics, 45(6), 592–601. Kadoch, C. et al. (2017) Dynamics of BAF–Polycomb complex opposition on heterochromatin in normal and oncogenic states, Nature Genetics, 49(2), 213–222. Kadoch, C., Copeland, R. A. and Keilhack, H. (2016) PRC2 and SWI/SNF Chromatin Remodeling Complexes in Health and Disease, Biochemistry, 55(11), 1600–1614. Kadoch, C. and Crabtree, G. R. (2013) Reversible disruption of mSWI/SNF (BAF) complexes by the SS18-SSX oncogenic fusion in synovial sarcoma, Cell, 153(1), 71–85. Kadoch, C. and Crabtree, G. R. (2015) Mammalian SWI/SNF chromatin remodeling complexes and cancer: Mechanistic insights gained from human genomics, Science Advances, 1(5), e1500447. Kahles, A. et al. (2018) Comprehensive Analysis of Alternative Splicing Across Tumors from 8,705 Patients, Cancer Cell, 34, 1–14. 
Kammerer, S. et al. (2016) KCNJ3 is a new independent prognostic marker for estrogen receptor positive breast cancer patients, Oncotarget, 7(51), 84705–84717. Kandoth, C. et al. (2013) Mutational landscape and significance across 12 major cancer types, Nature, 502(7471), 333–339. Katsumi, Y. et al. (2008) Trastuzumab activates allogeneic or autologous antibody-dependent cellular cytotoxicity against malignant rhabdoid tumor cells and Interleukin-2 augments the cytotoxicity, Clinical Cancer Research, 14(4), 1192–1199. Keishi, H. et al. (2003) Epigenetic inactivation of RASSF1A candidate tumor suppressor gene at 3p21.3 in brain tumors, Oncogene, 22(49), 7862–7865. Keshet, I., Lieman-Hurwitz, J. and Cedar, H. (1986) DNA methylation affects the formation of active chromatin, Cell, 44(4), 535–543. Kia, S. K. et al. (2008) SWI/SNF Mediates Polycomb Eviction and Epigenetic Reprogramming of the INK4b-ARF-INK4a Locus, Molecular and Cellular Biology, 28(10), 3457–3464. Kim, K. H. et al. (2015) SWI/SNF-mutant cancers depend on catalytic and non-catalytic activity of EZH2, Nature Medicine, 21(12), 1491–1496. Kim, K. H. and Roberts, C. W. M. (2014) Mechanisms by which SMARCB1 loss drives rhabdoid tumor growth, Cancer Genetics, 207(9), 365–372. Kircher, M., Heyn, P. and Kelso, J. (2011) Addressing challenges in the production and analysis of illumina sequencing data, BMC Genomics, 12, 382.e1–14. Klochendler-Yeivin, A. et al. (2000) The murine SNF5/INI1 chromatin remodeling factor is essential for embryonic development and tumor suppression, EMBO Reports, 1(6), 500–506. Knudson, A. G. (1971) Mutation and Cancer: Statistical study of retinoblastoma, Proceedings of the National Academy of Sciences, 68(4), 820–823. Knutson, S. K. et al. 
(2012) A selective inhibitor of EZH2 blocks H3K27 methylation and kills mutant lymphoma cells, Nature Chemical Biology, 8(11), 890–896. Knutson, S. K. et al. (2013) Durable tumor regression in genetically altered malignant rhabdoid tumors by inhibition of methyltransferase EZH2, Proceedings of the National Academy of Sciences, 110(19), 7922–7927. Kohashi, K. et al. (2016) Reclassification of rhabdoid tumor and pediatric undifferentiated/unclassified sarcoma with complete loss of SMARCB1/INI1 protein expression: three subtypes of rhabdoid tumor according to their histological features, Modern Pathology, 29(10), 1232–1242. Krueger, F. and Andrews, S. R. (2011) Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications, Bioinformatics, 27(11), 1571–1572. Krug, B. et al. (2019) Pervasive H3K27 Acetylation Leads to ERV Expression and a Therapeutic Vulnerability in H3K27M Gliomas, Cancer Cell, 35(5), 782–797. Langer, L. F., Ward, J. M. and Archer, T. K. (2019) Tumor suppressor SMARCB1 suppresses super-enhancers to govern hESC lineage determination, eLife, 8, 1–23. Langlais, D., Barreiro, L. B. and Gros, P. (2016) The macrophage IRF8/IRF1 regulome is required for protection against infections and is associated with chronic inflammation, The Journal of Experimental Medicine, 213(4), 585–603. Langmead, B. et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome, Genome Biology, 10(3), R25. Laouar, Y. et al. (2003) STAT3 Is Required for Flt3L-Dependent Dendritic Cell Differentiation, Immunity, 19(6), 903–912. Lawrence, M. S. et al. (2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes, Nature, 499(7457), 214–218. Lee, R. S. et al. 
(2012) A remarkably simple genome underlies highly malignant pediatric rhabdoid cancers, Journal of Clinical Investigation, 122(8), 2983–2988. Lee, S. et al. (2011) Aurora A is a repressed effector target of the chromatin remodeling protein INI1/hSNF5 required for rhabdoid tumor cell survival, Cancer Research, 71(9), 3225–3235. Leruste, A. et al. (2019) Clonally Expanded T Cells Reveal Immunogenicity of Rhabdoid Tumors, Cancer Cell, 36, 597–612. Lessard, J. et al. (2007) An essential switch in subunit composition of a chromatin remodeling complex during neural development, Neuron, 55(2), 201–215. Lever, J. and Jones, S. J. M. (2017) Painless Relation Extraction with Kindred, Proceedings of the BioNLP 2017 workshop, 176–183. Li, E. (2002) Chromatin modification and epigenetic reprogramming in mammalian development, Nature Reviews Genetics, 3(9), 662–673. Li, H. et al. (2009) The Sequence Alignment/Map format and SAMtools, Bioinformatics, 25(16), 2078–2079. Li, H. and Durbin, R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler transform, Bioinformatics, 26(5), 589–595. Li, Z. et al. (2005) Developmental stage-selective effect of somatically mutated leukemogenic transcription factor GATA1, Nature Genetics, 37(6), 613–619. Lin, R. J. et al. (1998) Role of the histone deacetylase complex in acute promyelocytic leukaemia, Nature, 391(6669), 811–814. Le Loarer, F. et al. (2015) SMARCA4 inactivation defines a group of undifferentiated thoracic malignancies transcriptionally related to BAF-deficient sarcomas, Nature Genetics, 47(10), 1200–1205. Lord, C. J. and Ashworth, A. (2010) Biology-driven cancer drug development: Back to the future, BMC Biology, 8, 38. Lu, C. and Allis, D. C. (2017) SWI/SNF complex in cancer, Nature Genetics, 49(24), 178–179. 
Lue, H. et al. (2002) Macrophage migration inhibitory factor (MIF): mechanisms of action and role in disease, Microbes and Infection, 4, 449–460. Mack, S. C. et al. (2014) Epigenomic alterations define lethal CIMP-positive ependymomas of infancy, Nature, 506(7489), 445–450. Margueron, R. and Reinberg, D. (2011) The Polycomb complex PRC2 and its mark in life, Nature, 469(7330), 343–349. Mariathasan, S. et al. (2018) TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells, Nature, 554(7693), 544–548. Martinon, F. et al. (2010) TLR activation of the transcription factor XBP1 regulates innate immune responses in macrophages, Nature Immunology, 11(5), 411–418. Mashtalir, N. et al. (2018) Modular Organization and Assembly of SWI/SNF Family Chromatin Remodeling Complexes, Cell, 175(5), 1272–1288. Mathur, R. et al. (2017) ARID1A loss impairs enhancer-mediated gene regulation and drives colon cancer in mice, Nature Genetics, 49(2), 296–302. McKenna, E. S. et al. (2008) Loss of the Epigenetic Tumor Suppressor SNF5 Leads to Cancer without Genomic Instability, Molecular and Cellular Biology, 28(20), 6223–6233. Meehan, R., Lewis, J. D. and Bird, A. P. (1992) Characterization of MECP2, a vertebrate DNA binding protein with affinity for methylated DNA, Nucleic Acids Research, 20(19), 5085–5092. Merico, D. et al. (2010) Enrichment map: A network-based method for gene-set enrichment visualization and interpretation, PLoS ONE, 5(11), e13984. Merk, D. J. et al. (2018) Opposing Effects of CREBBP Mutations Govern the Phenotype of Rubinstein-Taybi Syndrome and Adult SHH Medulloblastoma, Developmental Cell, 44, 1–16. Miao, D. et al. (2018) Genomic correlates of response to immune checkpoint therapies in clear cell renal cell carcinoma, Science, 5951, 1–11. 
Mobley, B. C. et al. (2010) Loss of SMARCB1/INI1 expression in poorly differentiated chordomas, Acta Neuropathologica, 120(6), 745–753. Mohrmann, L. and Verrijzer, C. P. (2005) Composition and functional specificity of SWI2/SNF2 class chromatin remodeling complexes, Biochimica et Biophysica Acta - Gene Structure and Expression, 1681(2–3), 59–73. Morin, R. D. et al. (2008) Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing, BioTechniques, 45, 81–94. Morin, R. D. et al. (2010) Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin, Nature Genetics, 42(2), 181–185. Msaouel, P. et al. (2020) Comprehensive Molecular Characterization Identifies Distinct Genomic and Immune Hallmarks of Renal Medullary Carcinoma, Cancer Cell, 37, 1–15. Nan, J. et al. (2018) IRF9 and unphosphorylated STAT2 cooperate with NF-κB to drive IL6 expression, Proceedings of the National Academy of Sciences, 115(15), 3906–3911. Neiswender, H. et al. (2017) Early craniofacial defects in zebrafish that have reduced function of a wnt-interacting extracellular matrix protein, Tinagl1, Cleft Palate-Craniofacial Journal, 54(4), 381–390. Nemes, K. and Frühwald, M. C. (2018) Emerging therapeutic targets for the treatment of malignant rhabdoid tumors, Expert Opinion on Therapeutic Targets, 22(4), 365–379. Newman, A. M. et al. (2015) Robust enumeration of cell subsets from tissue expression profiles, Nature Methods, 12(5), 453–457. Nguyen, C. T. et al. (2002) Histone H3-lysine 9 methylation is associated with aberrant gene silencing in cancer cells and is rapidly reversed by 5-aza-2′-deoxycytidine, Cancer Research, 62(22), 6456–6461. Nik-Zainal, S. et al.
(2012) Mutational processes molding the genomes of 21 breast cancers, Cell, 149(5), 979–993. Northcott, P. A. et al. (2012) Subgroup-specific structural variation across 1,000 medulloblastoma genomes, Nature, 487(7409), 49–56. Noy, R. and Pollard, J. W. (2014) Tumor-Associated Macrophages: From Mechanisms to Therapy, Immunity, 41(1), 49–61. Ohtani-Fujita, N. et al. (1993) CpG methylation inactivates the promoter activity of the human retinoblastoma tumor-suppressor gene, Oncogene, 8(4), 1063–1067. Okamura, R. et al. (2020) ARID1A alterations function as a biomarker for longer progression-free survival after anti-PD-1/PD-L1 immunotherapy, Journal for ImmunoTherapy of Cancer, 8(1), 1–6. Ota, S. et al. (1993) Malignant rhabdoid tumor: A study with two established cell lines, Cancer, 71(9), 2862–2872. Packer, R. J. et al. (2002) Atypical Teratoid/Rhabdoid Tumor of the Central Nervous System: Report on Workshop, Journal of Pediatric Hematology/Oncology, 24(5), 337–342. Palakurthy, R. K. et al. (2009) Epigenetic Silencing of the RASSF1A Tumor Suppressor Gene through HOXB3-Mediated Induction of DNMT3B Expression, Molecular Cell, 36(2), 219–230. Palena, C. et al. (2007) The human T-box mesodermal transcription factor Brachyury is a candidate target for T-cell-mediated cancer immunotherapy, Clinical Cancer Research, 13(8), 2471–2478. Pan, D. et al. (2018) A major chromatin regulator determines resistance of tumor cells to T cell–mediated killing, Science, 1710, 1–12. Peebles, P. T., Trisch, T. and Papageorge, A. G. (1978) Isolation of four unusual pediatric solid tumor cell lines, Pediatric Research, 12(s4), 485–485. Petrilli, A. M. and Fernandez-Valle, C. (2016) Role of Merlin/NF2 Inactivation in Tumor Biology, Oncogene, 35(5), 537–548.
Phelan, M. L. et al. (1999) Reconstitution of a core chromatin remodeling complex from SWI/SNF subunits, Molecular Cell, 3(2), 247–253. Pinto, E. M. et al. (2018) Malignant rhabdoid tumors originating within and outside the central nervous system are clinically and molecularly heterogeneous, Acta Neuropathologica, 136(2), 315–326. Pleasance, E. et al. (2020) Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes, Nature Cancer, 1, 452–468. Plisov, S. Y. et al. (2001) TGF beta 2, LIF and FGF2 cooperate to induce nephrogenesis, Development, 128, 1045–1057. Pomeroy, S. L. et al. (2002) Prediction of central nervous system embryonal tumour outcome based on gene expression, Nature, 415, 436–442. Pozharny, Y. et al. (2010) Genomic loss of imprinting in first-trimester human placenta, American Journal of Obstetrics and Gynecology, 202(391), e1-8. Prescott, S. L. et al. (2015) Enhancer Divergence and cis-Regulatory Evolution in the Human and Chimp Neural Crest, Cell, 163(1), 68–84. Quinlan, A. R. and Hall, I. M. (2010) BEDTools: A flexible suite of utilities for comparing genomic features, Bioinformatics, 26(6), 841–842. Ramos, P. et al. (2014) Small cell carcinoma of the ovary, hypercalcemic type, displays frequent inactivating germline and somatic mutations in SMARCA4, Nature Genetics, 46(5), 427–429. Ravindra, K. V. et al. (2002) Long-term survival after spontaneous rupture of a malignant rhabdoid tumor of the liver, Journal of Pediatric Surgery, 37(10), 1488–1490. Rea, S. et al. (2000) Regulation of chromatin structure by site-specific histone H3 methyltransferases, Nature, 406(6796), 593–599. Reddy, E. P. et al.
(1982) A point mutation is responsible for the acquisition of transforming properties by the T24 human bladder carcinoma oncogene, Nature, 300(5888), 149–152. Reimand, J. et al. (2007) g:Profiler-a web server for functional interpretation of gene lists from large-scale experiments, Nucleic Acids Research, 35, 193–200. Roadmap Epigenomics Consortium et al. (2015) Integrative analysis of 111 reference human epigenomes, Nature, 518(7539), 317–330. Roberts, C. W. M. et al. (2002) Highly penetrant, rapid tumorigenesis through conditional inversion of the tumor suppressor gene Snf5, Cancer Cell, 2(5), 415–425. Robertson, G. et al. (2010) De novo assembly and analysis of RNA-seq data, Nature Methods, 7(11), 909–912. Robertson, M. (1980) Biology in the 1980s, plus or minus a decade, Nature, 285(5764), 358–359. Roulois, D. et al. (2015) DNA-Demethylating Agents Target Colorectal Cancer Cells by Inducing Viral Mimicry by Endogenous Transcripts, Cell, 162(5), 961–973. Rozen, S. and Skaletsky, H. (2000) Primer3 on the WWW for general users and for biologist programmers, Methods in Molecular Biology, 132, 365–386. Rubin, E. H. and Gilliland, D. G. (2012) Drug development and clinical trials - The path to an approved cancer drug, Nature Reviews Clinical Oncology, 9(4), 215–222. Rubio-Perez, C. et al. (2015) In Silico Prescription of Anticancer Drugs to Cohorts of 28 Tumor Types Reveals Targeting Opportunities, Cancer Cell, 27(3), 382–396. Rudin, C. M. et al. (2009) Treatment of medulloblastoma with hedgehog pathway inhibitor GDC-0449, New England Journal of Medicine, 361(12), 1173–1178. Saadi, I. et al. (2011) Deficiency of the cytoskeletal protein SPECC1L leads to oblique facial clefting, American Journal of Human Genetics, 89(1), 44–55.
Sanger, F., Nicklen, S. and Coulson, A. R. (1977) DNA sequencing with chain-terminating inhibitors, Proceedings of the National Academy of Sciences, 74(12), 5463–5467. Sasaki, T. et al. (2003) Identification of eight members of the Argonaute family in the human genome, Genomics, 82(3), 323–330. Saunders, C. T. et al. (2012) Strelka: Accurate somatic small-variant calling from sequenced tumor-normal sample pairs, Bioinformatics, 28(14), 1811–1817. Sawyers, C. L. (1999) Chronic myeloid leukemia, The New England Journal of Medicine, 340(17), 1330–1340. Schroeder, D. I. et al. (2013) The human placental methylome, Proceedings of the National Academy of Sciences, 110(15), 6037–6042. Schumacher, T. N. and Schreiber, R. D. (2015) Neoantigens in cancer immunotherapy, Science, 348(6230), 69–74. Schwartz, R. H. (2003) T Cell Anergy, Annual Review of Immunology, 21(1), 305–334. Schwartzentruber, J. et al. (2012) Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma, Nature, 482(7384), 226–231. Selvadurai, H. J. et al. (2020) Medulloblastoma Arises from the Persistence of a Rare and Transient Sox2+ Granule Neuron Precursor, Cell Reports, 31(2), 107511. Shah, S. P. et al. (2006) Integrating copy number polymorphisms into array CGH analysis using a robust HMM, Bioinformatics, 22(14), 431–439. Shain, A. H. and Pollack, J. R. (2013) The Spectrum of SWI/SNF Mutations, Ubiquitous in Human Cancers, PLoS ONE, 8(1), 1–11. Shannon, P. et al. (2003) Cytoscape: A software environment for integrated models of biomolecular interaction networks, Genome Research, 13(22), 2498–2504. Sharifnia, T. et al. (2019) Small-molecule targeting of brachyury transcription factor addiction in chordoma, Nature Medicine, 25, 292–300. Shen, H. and Laird, P. W.
(2013) Interplay between the cancer genome and epigenome, Cell, 153, 38–55. Shen, J. et al. (2015) ARID1A deficiency impairs the DNA damage checkpoint and sensitizes cells to PARP inhibitors, Cancer Discovery, 5(7), 752–767. Shen, J. et al. (2018) ARID1A deficiency promotes mutability and potentiates therapeutic antitumor immunity unleashed by immune checkpoint blockade, Nature Medicine, 24, 556–562. Sherry, S. T., Ward, M. and Sirotkin, K. (1999) dbSNP - Database for single nucleotide polymorphisms and other classes of minor genetic variation, Genome Research, 9(8), 677–679. Shore, E. M. et al. (2006) A recurrent mutation in the BMP type I receptor ACVR1 causes inherited and sporadic fibrodysplasia ossificans progressiva, Nature Genetics, 38(5), 525–527. Shugay, M. et al. (2015) VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires, PLoS Computational Biology, 11(11), 1–16. Sica, A. et al. (2006) Tumour-associated macrophages are a distinct M2 polarised population promoting tumour progression: Potential targets of anti-cancer therapy, European Journal of Cancer, 42(6), 717–727. Simoes-Costa, M. and Bronner, M. E. (2015) Establishing neural crest identity: a gene regulatory recipe, Development, 142(2), 242–257. Simpson, A. J. G. et al. (2005) Cancer/testis antigens, gametogenesis and cancer, Nature Reviews Cancer, 5(8), 615–625. Simpson, J. T. et al. (2009) ABySS: A parallel assembler for short read sequence data, Genome Research, 19(6), 1117–1123. Sinha, S. et al. (2020) Pbrm1 Steers Mesenchymal Stromal Cell Osteolineage Differentiation by Integrating PBAF-Dependent Chromatin Remodeling and BMP/TGF-β Signaling, Cell Reports, 31(4), 1–17. Smit, A. F., Hubley, R. and Green, P. (2013) RepeatMasker Open-4.0. 2013-2015. Solovyov, A. et al.
(2018) Global Cancer Transcriptome Quantifies Repeat Element Polarization between Immunotherapy Responsive and T Cell Suppressive Classes, Cell Reports, 23(2), 512–521. Son, E. Y. and Crabtree, G. R. (2014) The role of BAF (mSWI/SNF) complexes in mammalian neural development, American Journal of Medical Genetics Part C, 166(3), 333–349. Stanton, B. Z. et al. (2017) Smarca4 ATPase mutations disrupt direct eviction of PRC1 from chromatin, Nature Genetics, 49(2), 282–288. Stehelin, D. et al. (1976) DNA related to the transforming gene(s) of avian sarcoma viruses is present in normal avian DNA, Nature, 260, 170–173. Strahl, B. D. et al. (1999) Methylation of histone H3 at lysine 4 is highly conserved and correlates with transcriptionally active nuclei in Tetrahymena, Proceedings of the National Academy of Sciences, 96(26), 14967–14972. Stricker, S. H., Koferle, A. and Beck, S. (2017) From profiles to function in epigenomics, Nature Reviews Genetics, 18(1), 51–66. Sturm, D. et al. (2012) Hotspot Mutations in H3F3A and IDH1 Define Distinct Epigenetic and Biological Subgroups of Glioblastoma, Cancer Cell, 22(4), 425–437. Subramanian, A. et al. (2005) Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles, Proceedings of the National Academy of Sciences, 102(43), 15545–15550. Sugimoto, T. et al. (1999) Malignant rhabdoid-tumor cell line showing neural and smooth-muscle-cell phenotypes, International Journal of Cancer, 82(5), 678–686. Sullivan, L. M. et al. (2013) Epithelioid sarcoma is associated with a high percentage of SMARCB1 deletions, Modern Pathology, 26(3), 385–392. Sun, L. et al. (1998) Cabin 1, a negative regulator for Calcineurin signaling in T lymphocytes, Immunity, 8(6), 703–711. Tekautz, T. M. et al.
(2005) Atypical Teratoid/Rhabdoid Tumors (ATRT): Improved Survival in Children 3 Years of Age and Older With Radiation Therapy and High-Dose Alkylator-Based Chemotherapy, Journal of Clinical Oncology, 23(7), 1491–1499. Thathia, S. H. et al. (2012) Epigenetic Inactivation Of TWIST2 In Acute Lymphoblastic Leukemia Modulates Proliferation, Cell Survival And Chemosensitivity, Haematologica, 97, 371–378. Theruvath, J. et al. (2020) Locoregionally administered B7-H3-targeted CAR T cells for treatment of atypical teratoid/rhabdoid tumors, Nature Medicine, 26(5), 712–719. Thomson, J. A. et al. (1998) Embryonic stem cell lines derived from human blastocysts, Science, 282(5391), 1145–1147. Timp, W. and Feinberg, A. P. (2013) Cancer as a dysregulated epigenome allowing cellular growth advantage at the expense of the host, Nature Reviews Cancer, 13(7), 497–510. Titus, A. J. et al. (2016) methyLiftover: cross-platform DNA methylation data integration, Bioinformatics, 32(16), 2517–2519. Tomlinson, G. E. et al. (2005) Rhabdoid Tumor of the Kidney in The National Wilms’ Tumor Study: Age at Diagnosis As a Prognostic Factor, Journal of Clinical Oncology, 23(30), 7641–7645. Torchia, J. et al. (2015) Molecular subgroups of atypical teratoid rhabdoid tumours in children: An integrated genomic and clinicopathological analysis, The Lancet Oncology, 16(5), 569–582. Torchia, J. et al. (2016) Integrated (epi)-Genomic Analyses Identify Subgroup-Specific Therapeutic Targets in CNS Rhabdoid Tumors, Cancer Cell, 30(6), 891–908. Toyota, M. et al. (1999) CpG island methylator phenotype in colorectal cancer, Proceedings of the National Academy of Sciences, 96, 8681–8686. Tripathi, S. et al.
(2015) Meta- and orthogonal integration of Influenza “OMICs” data defines a role for UBR4 in virus budding, Cell Host and Microbe, 18(6), 723–735. Tumeh, P. C. et al. (2014) PD-1 blockade induces responses by inhibiting adaptive immune resistance, Nature, 515(7528), 568–571. Vanpouille-Box, C. et al. (2018) Cytosolic DNA Sensing in Organismal Tumor Control, Cancer Cell, 34, 1–18. Varela, I. et al. (2011) Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma, Nature, 469(7331), 539–542. Veigl, M. L. et al. (1998) Biallelic inactivation of hMLH1 by epigenetic gene silencing, a novel mechanism causing human MSI cancers, Proceedings of the National Academy of Sciences, 95(15), 8698–8702. Venneti, S. et al. (2011) Malignant Rhabdoid Tumors Express Stem Cell Factors, Which Relate To the Expression of EZH2 and Id Proteins, The American Journal of Surgical Pathology, 35(10), 1463–1472. Versteege, I. et al. (1998) Truncating mutations of hSNF5/INI1 in aggressive paediatric cancer, Nature, 394(6689), 203–206. Versteege, I. et al. (2002) A key role of the hSNF5/INI1 tumour suppressor in the control of the G1-S transition of the cell cycle, Oncogene, 21(42), 6403–6412. Vinagre, J. et al. (2013) Frequency of TERT promoter mutations in human cancers, Nature Communications, 4(2185), 1–6. Vitte, J. et al. (2017) Timing of Smarcb1 and Nf2 inactivation determines schwannoma versus rhabdoid tumor development, Nature Communications, 8(300), 1–13. Vogelstein, B. et al. (2013) Cancer genome landscapes, Science, 339(6127), 1546–1558. Vries, R. G. J. et al.
(2005) Cancer-associated mutations in chromatin remodeler hSNF5 promote chromosomal instability by compromising the mitotic checkpoint, Genes and Development, 19(6), 665–670. Waddington, C. H. (1942) The epigenotype, Endeavour, 18–20. Waddington, C. H. (1957) The Strategy of the Genes. Edited by G. Allen. London: Ruskin House. Wang, W. et al. (1996) Purification and biochemical heterogeneity of the mammalian SWI-SNF complex, EMBO Journal, 15(19), 5370–5382. Wang, X. et al. (2017) SMARCB1-mediated SWI/SNF complex function is essential for enhancer regulation, Nature Genetics, 49(2), 289–295. Wang, Z. et al. (2012) Epigenetic silencing of the 3p22 tumor suppressor DLEC1 by promoter CpG methylation in non-Hodgkin and Hodgkin lymphomas, Journal of Translational Medicine, 10(209), 1–7. Weinhold, N. et al. (2014) Genome-wide analysis of noncoding regulatory mutations in cancer, Nature Genetics, 46(11), 1160–1165. Whyte, W. A. et al. (2013) Master transcription factors and mediator establish super-enhancers at key cell identity genes, Cell, 153(2), 307–319. Wiegand, K. C. et al. (2017) Mutations in Endometriosis-Associated Ovarian Carcinomas, New England Journal of Medicine, 363(16), 1532–1543. Wilson, B. G. et al. (2010) Epigenetic antagonism between polycomb and SWI/SNF complexes during oncogenic transformation, Cancer Cell, 18(4), 316–328. Wilson, N. R. et al. (2016) SPECC1L deficiency results in increased adherens junction stability and reduced cranial neural crest cell delamination, Scientific Reports, 6(17735), 1–15. Wu, J. I., Lessard, J. and Crabtree, G. R. (2009) Understanding the Words of Chromatin Regulation, Cell, 136(2), 200–206. Wu, T. D. and Watanabe, C. K.
(2005) GMAP: A genomic mapping and alignment program for mRNA and EST sequences, Bioinformatics, 21(9), 1859–1875. Yamaguchi, T. P. et al. (1999) T (Brachyury) is a direct target of Wnt3a during paraxial mesoderm specification, Genes and Development, 13(24), 3185–3190. Yamakawa, K. et al. (1998) DSCAM: a novel member of the immunoglobulin superfamily maps in a Down syndrome region and is involved in the development of the nervous system, Human Molecular Genetics, 7(2), 227–237. Yan, X. et al. (2018) Prognostic factors for checkpoint inhibitor based immunotherapy: An update with new evidences, Frontiers in Pharmacology, 9, 1–17. Ying, J. et al. (2009) DLEC1 is a functional 3p22.3 tumour suppressor silenced by promoter CpG methylation in colon and gastric cancers, British Journal of Cancer, 100, 663–669. Ying, Q. et al. (2003) BMP induction of Id proteins suppresses differentiation and sustains embryonic stem cell self-renewal in collaboration with STAT3, Cell, 115, 281–292. Yu, J. et al. (2010) Epigenetic inactivation of T-box transcription factor 5, a novel tumor suppressor gene, is associated with colon cancer, Oncogene, 29, 6464–6474. Zhang, L. et al. (2019) Single-cell transcriptomics in medulloblastoma reveals tumor-initiating progenitors and oncogenic cascades during tumorigenesis and relapse, Cancer Cell, 36, 1–17. Zhang, Y. et al. (2008) Model-based analysis of ChIP-Seq (MACS), Genome Biology, 9(9), R137. Zhao, M., Sun, J. and Zhao, Z. (2013) TSGene: A web resource for tumor suppressor genes, Nucleic Acids Research, 41(D1), 970–976. Zhao, X. et al. (2018) Noninflammatory Changes of Microglia Are Sufficient to Cause Epilepsy, Cell Reports, 22(8), 2080–2093. Zhou, W., Laird, P. W. and Shen, H.
(2017) Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes, Nucleic Acids Research, 45(4), e22. "@en . "Thesis/Dissertation"@en . "2021-05"@en . "10.14288/1.0395355"@en . "eng"@en . "Bioinformatics"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "Attribution-NonCommercial-NoDerivatives 4.0 International"@* . "http://creativecommons.org/licenses/by-nc-nd/4.0/"@* . "Graduate"@en . "Molecular characterization of rhabdoid tumours from multiple anatomical sites"@en . "Text"@en . "Dataset"@en . "http://hdl.handle.net/2429/76852"@en .