@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en ; edm:dataProvider "DSpace"@en ; ns0:degreeCampus "UBCV"@en ; dcterms:creator "Lee, Stephen D."@en ; dcterms:issued "2020-06-25T18:07:42Z"@en, "2020"@en ; vivo:relatedDegree "Master of Science - MSc"@en ; ns0:degreeGrantor "University of British Columbia"@en ; dcterms:description """The Capicua transcriptional repressor (CIC) is a transcription factor whose target genes are relieved from its repressive activity upon activation of receptor tyrosine kinase signalling. Loss of CIC function is implicated in oligodendroglioma (ODG) etiology, since ODGs are defined by loss of heterozygosity of CIC (through chromosome 1p/19q loss) and exhibit deleterious somatic mutation in the remaining allele in 50-80% of cases. However, CIC’s role in this context remains obscure, primarily because of our currently limited knowledge regarding its biological functions. Moreover, CIC mutations are invariably found in ODGs with a neomorphic IDH1 or IDH2 mutation, yet the functional relationship between these two genetic events is also unclear. Global epigenetic alterations are an established downstream effect of mutant IDH1/2, and CIC was recently found to physically interact with various chromatin modulators, highlighting the relevance of epigenetic regulation in CIC function as well. Under the hypothesis that CIC and mutant IDH1/2 cooperatively dysregulate gene expression to contribute to ODG, we performed transcriptomic and epigenomic profiling of CIC-wildtype (WT) and CIC-knockout (KO) cell lines, with and without mutant IDH1 expression. Comprehensive analyses across these molecular landscapes revealed a recurrence of neurodevelopmental gene dysregulation in association with CIC loss. CIC ChIP-seq was also performed to expand upon the currently limited ensemble of known CIC target genes.
Among the newly identified direct CIC target genes were EPHA2 and ID1, whose functions are linked to neurodevelopment and the tumourigenicity of in vivo glioma tumour models. NFIA, a known mediator of gliogenesis, was discovered to be uniquely overexpressed in cells with both mutant IDH1 and lack of functional CIC. These results illuminate neurodevelopment and specific genes within this context as candidate targets through which CIC alterations may contribute to the onset or early progression of IDH-mutant gliomas."""@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/74788?expand=metadata"@en ; skos:note "Characterization of the Effects of CIC Loss and Neomorphic IDH1 Mutation on the Transcriptome and Epigenome, by Stephen D. Lee, B.Sc., University of Toronto, 2016. A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Genome Science and Technology). THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver). June 2020. © Stephen D. Lee, 2020. The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled: Characterization of the Effects of CIC Loss and Neomorphic IDH1 Mutation on the Transcriptome and Epigenome, submitted by Stephen Lee in partial fulfillment of the requirements for the degree of Master of Science in Genome Science and Technology. Examining Committee: Marco Marra, Genome Science and Technology (Supervisor); Philip Hieter, Genome Science and Technology (Supervisory Committee Member); Martin Hirst, Genome Science and Technology (Supervisory Committee Member); Samuel Aparicio, Genome Science and Technology (Supervisory Committee Member). Abstract. The Capicua transcriptional repressor (CIC) is a transcription factor whose target genes are relieved from its repressive activity upon activation of receptor tyrosine kinase signalling.
Loss of CIC function is implicated in oligodendroglioma (ODG) etiology, since ODGs are defined by loss of heterozygosity of CIC (through chromosome 1p/19q loss) and exhibit deleterious somatic mutation in the remaining allele in 50-80% of cases. However, CIC’s role in this context remains obscure, primarily because of our currently limited knowledge regarding its biological functions. Moreover, CIC mutations are invariably found in ODGs with a neomorphic IDH1 or IDH2 mutation, yet the functional relationship between these two genetic events is also unclear. Global epigenetic alterations are an established downstream effect of mutant IDH1/2, and CIC was recently found to physically interact with various chromatin modulators, highlighting the relevance of epigenetic regulation in CIC function as well. Under the hypothesis that CIC and mutant IDH1/2 cooperatively dysregulate gene expression to contribute to ODG, we performed transcriptomic and epigenomic profiling of CIC-wildtype (WT) and CIC-knockout (KO) cell lines, with and without mutant IDH1 expression. Comprehensive analyses across these molecular landscapes revealed a recurrence of neurodevelopmental gene dysregulation in association with CIC loss. CIC ChIP-seq was also performed to expand upon the currently limited ensemble of known CIC target genes. Among the newly identified direct CIC target genes were EPHA2 and ID1, whose functions are linked to neurodevelopment and the tumourigenicity of in vivo glioma tumour models. NFIA, a known mediator of gliogenesis, was discovered to be uniquely overexpressed in cells with both mutant IDH1 and lack of functional CIC. These results illuminate neurodevelopment and specific genes within this context as candidate targets through which CIC alterations may contribute to the onset or early progression of IDH-mutant gliomas. Lay Summary. Biological processes rely on molecular mechanisms that control which genes are turned ‘on’ or ‘off’.
Loss of function in CIC, a gene that can control gene expression, is implicated in a type of brain cancer called oligodendroglioma (ODG), but the details underlying this role are unclear. Thus, I investigated the effects of CIC loss by comparing cells with and without functional CIC. My analyses identified altered expression of genes linked to neurodevelopment, highlighting the potential relevance of dysregulated neurodevelopment in CIC’s role in ODG. I also discovered that cells that both lacked CIC and possessed an IDH1 mutation, a defining feature of ODG, uniquely overexpressed NFIA, an important regulator of glial cell development. Overall, my work highlights neurodevelopmental gene and pathway control as a potential means by which CIC mutations can drive the initiation of ODG. Preface. Investigation into the epigenomic and transcriptomic alterations associated with CIC-KO in an IDH-WT and -R132H background was conceived by Dr. Marco Marra. Dr. Suganthi Chittaranjan, Susanna Chan and Jungeun Song generated the isogenic CIC-KO and -WT cell lines. Veronique LeBlanc performed the CIC ChIP sample preparation. RNA-seq, ChIP-seq and Whole Genome Bisulfite Sequencing library construction, sequencing and alignment of sequence data were performed by staff of the British Columbia Genome Sciences Centre. Stephen Lee conceived strategies regarding the characterization of epigenome and transcriptome data sets and conducted all bioinformatics analyses included in this thesis project. Elizabeth Chun provided technical advice regarding several of the bioinformatics analyses. Table of Contents: Abstract; Lay Summary;
Preface; Table of Contents; List of Figures; List of Abbreviations; List of Genes; Acknowledgments; Dedication; Chapter 1: Background; 1.1 Introduction;
1.2 Core Concepts in Cancer Biology; 1.3 Transcription Factors and Epigenetic Regulators; 1.4 The Capicua Transcriptional Repressor; 1.4.1 CIC's Role in Signal Transduction, Development and Homeostasis; 1.4.2 The CIC Protein and Its Interactors; 1.4.3 CIC Alterations in Cancer; 1.5 Thesis Overview; Chapter 2: Materials and Methods; Chapter 3: Results; 3.1 Characterization of the Transcriptomic Consequences of CIC-KO in Cell Lines Expressing IDH1-WT and IDH1-R132H; 3.1.1 Differential Expression Analysis Identifies Known Targets of CIC and Yields Divergent Results According to IDH1 Status; 3.1.2 CIC and IDH1-Associated DE Genes Are Enriched for Pathways Related to Neural Development and the Extracellular Matrix; 3.1.3 CIC ChIP-seq Identifies Known and Potentially Novel Direct CIC Target Genes; 3.2 Characterization of the Epigenomic Consequences of CIC-KO in Cell Lines Expressing IDH1-WT and IDH1-R132H;
3.2.1 Differential Enrichment Analysis Identifies CIC and IDH1-Associated Changes in the Chromatin Landscape; 3.2.2 CIC Binding Is Not Associated With Changes in the Histone Modification Landscape; 3.2.3 CIC-KO Is Associated With Dysregulation of Enhancers at Neurodevelopmental Genes; 3.2.4 Analysis of Differentially Methylated Regions Identifies CIC and IDH1-Associated Changes in the DNA Methylation Landscape; Chapter 4: Discussion; 4.1 Transcriptomic Consequences of CIC-KO and Candidate Target Genes Yield Insights Into Its Role in Neurodevelopment and Cell Cycle Regulation; 4.2 CIC-KO Cell Lines Exhibit Dysregulation of Neurodevelopmental Gene Enhancers; 4.3 Possible Synergy Between CIC-KO and IDH1-R132H in Gliomagenesis; 4.4 Conclusion; Bibliography; Appendices;
Appendix A: RNA-seq Summary Statistics; Appendix B: Top 50 Most Significant Protein-Coding Genes in All DE Analyses; Appendix C: Top 25 Most Significant Gene Ontology Terms in All DE Gene Sets; Appendix D: CIC ChIP-seq Summary Statistics; Appendix E: 150 High-Confidence CIC Peaks; Appendix F: Histone Modification ChIP-seq Summary Statistics; Appendix G: WGBS Summary Statistics. List of Figures: Figure 1.1 CIC Interacts With Chromatin Modifier Proteins; Figure 1.2 Distribution of CIC Somatic Alterations in Glioma; Figure 2.1 Isogenic CIC-Wildtype and CIC-Knockout Immortalized Human Astrocyte Cell Line Models; Figure 3.1 Known CIC Target Genes Are Upregulated in CIC-KO Cell Lines;
Figure 3.2 Summary of RNA-seq DE Analysis; Figure 3.3 Top Enriched Pathways for Each Set of DE Genes; Figure 3.4 Quality Assessment and Comparison of CIC ChIP-seq Replicates; Figure 3.5 Quality Assessment of an Independent CIC ChIP-seq Dataset; Figure 3.6 Summary of High-Confidence CIC Peaks; Figure 3.7 CIC ChIP-seq Peaks at Known and Novel Candidate Target Genes; Figure 3.8 Candidate CIC Target Genes Are Upregulated in CIC-KO Cell Lines; Figure 3.9 Histone Modification Peak Signal Cluster Cell Lines According to CIC and IDH1 Status; Figure 3.10 Comparison of Peaks Across All Cell Lines for Each Histone Modification; Figure 3.11 Summary of DER Peaks; Figure 3.12 CIC-Associated DER Peaks Are Largely Conditional on IDH1 Status; Figure 3.13 DER Peaks Are Predominantly Distal From Transcriptional Start Sites; Figure 3.14 CIC Loss Is Associated With Enhancer Dysregulation; Figure 3.15 CIC-Associated DER Enhancers Are Positively Correlated With Target Gene Expression and Are Enriched for Motifs Related to Direct CIC Target Genes;
Figure 3.16 Enhancers at PDGFRA and NFIA Are Dysregulated in Association with CIC and IDH1 Status; Figure 3.17 IDH1-R132H Cell Lines Exhibit a DNA Hypermethylator Phenotype; Figure 3.18 Summary of Differentially Methylated Regions; Figure 3.19 CIC Binding Is Not Associated With Differential Methylation; Figure 3.20 CIC-Associated Differential Methylation Is Not Associated With Differential Gene Expression; Figure 4.1 Conceptual Model for the Synergistic Relationship Between CIC Loss and Mutant IDH1 in Promoting Gliomagenesis. List of Abbreviations: 2HG: 2-Hydroxyglutarate; αKG: Alpha-ketoglutarate; Acetyl-CoA: Acetyl coenzyme A; AP-MS: Affinity purification followed by mass spectrometry; ChIP: Chromatin immunoprecipitation; ChIP-seq: Chromatin immunoprecipitation sequencing; CIC-L: CIC-long form; CIC-S: CIC-short form; CIMP: CpG island methylator phenotype; CSC: Cancer stem cell; CNS: Central nervous system; DE: Differentially expressed; DER: Differentially enriched; DMR: Differentially methylated region; DNA: Deoxyribonucleic acid; DNMT: DNA methyltransferase; EGFR: Epidermal growth factor receptor; ERK: Extracellular signal-regulated kinase; GBM: Glioblastoma; G-CIMP: Glioma-CpG island methylator phenotype; H2A: Histone 2A; H2B: Histone 2B; H3: Histone 3; H3K27ac: Histone 3 lysine 27 acetylation; H3K27me3: Histone 3 lysine 27 tri-methylation; H3K36me3: Histone 3 lysine 36 tri-methylation; H3K4me1: Histone 3 lysine 4 mono-methylation; H3K4me3: Histone 3 lysine 4 tri-methylation; H3K9me3: Histone 3 lysine 9 tri-methylation; H4: Histone 4; HMG: High mobility group; IP-MS: Immunoprecipitation followed by mass spectrometry; IP-WB:
Immunoprecipitation followed by western blot; KO: Knockout; MAPK: Mitogen activated protein kinase; mRNA: Messenger RNA; NF-κB: Nuclear factor kappa-light-chain-enhancer of activated B cells; NGS: Next-generation sequencing; NSC: Neural stem cell; ODG: Oligodendroglioma; OPC: Oligodendrocyte precursor cell; RNA: Ribonucleic acid; RPKM: Reads per kilobase per million mapped reads; RTK: Receptor tyrosine kinase; SWI/SNF: Switch/sucrose non-fermentable; T-ALL: T-cell acute lymphoblastic leukemia; TET: Ten-eleven translocation; TF: Transcription factor; TSS: Transcription start site; WGBS: Whole genome bisulfite sequencing; WT: Wildtype. List of Genes: ACLY: ATP citrate lyase; ARID1A: AT-rich interaction domain 1A; ATXN1: Ataxin 1; ATXN1L: Ataxin 1-like; CBP: CREB-binding protein; CCND1: Cyclin D1; CCND2: Cyclin D2; CCNE1: Cyclin E1; CDC14A: Cell division cycle 14A; CDC14B: Cell division cycle 14B; CIC: Capicua transcriptional repressor; CRABP1: Cellular retinoic acid binding protein 1; CTCF: CCCTC-binding factor; DCAF7: DDB1 and CUL4 associated factor 7; DUSP4: Dual specificity phosphatase 4; DUX4: Double homeobox 4; DYRK1A: Dual specificity tyrosine phosphorylation regulated kinase 1A; EPHA2: EPH receptor A2; ETV1: ETS variant transcription factor 1; ETV4: ETS variant transcription factor 4; ETV5: ETS variant transcription factor 5; EZH2: Enhancer of zeste homolog 2; FOS: Fos proto-oncogene; FOSL1: FOS like 1; GPR3: G protein-coupled receptor 3; HMGA1: High mobility group AT-hook 1; ID1: Inhibitor of DNA binding 1; IDH1: Isocitrate dehydrogenase 1; IDH2: Isocitrate dehydrogenase 2; KMT5B: Lysine methyltransferase 5B; LIN28A: Lin-28 homolog A; LIN28B: Lin-28 homolog B; MAFF: MAF BZIP transcription factor F; MAFG: MAF BZIP transcription factor G; MECP2: Methyl-CpG binding protein 2; MMP24: Matrix metallopeptidase 24; MYC: MYC proto-oncogene; NFIA: Nuclear factor 1A; P300: E1A-binding protein P300; PDGFRA: Platelet derived growth factor receptor alpha; PJA1: Praja ring finger ubiquitin ligase 1; PLK3: Polo like kinase 3; RUNX1: RUNX family transcription factor 1; SHC3: SHC adaptor protein 3; SHC4: SHC adaptor protein 4; SIN3A: SIN3 transcription
regulator family member A; SMARCA2: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily A, member 2; SPRY4: Sprouty RTK signaling antagonist 4; TP53: Tumour protein 53. Acknowledgments. I would like to thank Dr. Marco Marra for your mentorship and support towards elevating my scientific acuity. Thank you Elizabeth Chun for your friendship, your unwavering encouragement and your motivating presence. Thank you Dr. Suganthi Chittaranjan and Dr. Alessia Gagliardi for your guidance and thoughtful feedback. I am grateful to Susanna Chan and Jungeun (Jay) Song for your camaraderie and always providing a friendly hand in the lab. Thank you to James Topham, who took the time to impart his bioinformatics expertise to me. I would also like to thank all other members of the Marra lab (Dr. Emilia Lim, Dr. Rodrigo Goya, Dr. Dan Jin, Veronique LeBlanc, Lisa Wei, Vanessa Porter and Emma Titmuss) for your collaborative spirit and helpful critique. I wish to thank Lulu Crisostomo as well, for your continual administrative help. Thank you to my committee members Dr. Sam Aparicio, Dr. Martin Hirst and Dr. Phil Hieter for supporting me throughout the course of my program. Dedication. To Anita Chou. Chapter 1: Background. 1.1 Introduction. Alterations of transcription factors and epigenetic regulators are commonly observed across cancer types, highlighting the importance of transcriptional modulators in tumourigenesis and neoplastic development. The Capicua transcriptional repressor (CIC) gene has been reported to harbour recurrent somatic mutations in oligodendrogliomas (ODGs) and in several other malignancies [1-6], yet its role in tumour progression remains poorly understood. Moreover, CIC alterations exclusively occur with a neomorphic IDH1/2 mutation in the context of ODG, yet the functional interaction between these two somatic events is unknown.
Understanding the mechanistic basis underlying CIC alterations in cancer is complicated by the currently limited characterization of its function in the context of normal mammalian biology. To expand upon our current state of knowledge regarding CIC’s normal molecular functions, I studied the regulatory network of CIC through the analysis of transcriptomic and epigenomic landscapes in normal cells with and without functional CIC. In addition, I sought to address the knowledge gap with regard to the CIC and IDH1 interaction by including CIC-knockout and wildtype cell lines in a mutant IDH1 background. The remainder of Chapter 1 reviews the literature that has directly motivated my thesis work. 1.2 Core Concepts in Cancer Biology. Cancer, despite its singular designation, comprises a heterogeneous spectrum of hundreds of tumour types. However, certain biological commonalities across cancer types have become increasingly apparent over the past few decades and have been consolidated into 8 defining capabilities called the cancer hallmarks, which are i) sustained proliferative signalling, ii) resistance to growth suppressors, iii) evasion of cell death, iv) unlimited replicative capacity, v) promotion of angiogenesis, vi) enhanced invasive and metastatic potential, vii) metabolic adaptation and viii) circumvention of immune destruction [7,8]. Somatic genetic alterations can dysregulate the function of genes to promote hallmark capabilities and thereby endow a selective growth advantage. These selectively advantageous mutations, termed driver mutations [9], facilitate the expansion of cells into clones and present a source of genetic diversity upon which Darwinian natural selection can act [10].
Accordingly, tumour evolution is currently understood to involve the reiterative proliferation of differentially fit subclones as a result of the stepwise acquisition of multiple driver alterations [11]. In addition to the 8 hallmarks, driver mutations can also encourage tumour progression by contributing to genomic instability and tumour-promoting inflammation [7,8]. These states have been called the two enabling characteristics of cancer, since genomic instability increases the likelihood that a hallmark-promoting mutation will occur, while pro-oncogenic inflammation can supply cells with tumour-promoting molecules such as growth factors and mutagenic reactive oxygen species [7,8]. Yet another broadly applicable feature of cancer, despite not being included in Hanahan and Weinberg’s conceptualization of the cancer hallmarks [7,8], is the loss of differentiation or a stem-like phenotype. Cancer stem cells (CSCs) are named for their ability to self-renew and propagate all the different cell types in a given tumour and have been identified in multiple malignancies, including leukemia [12], ovarian cancer [13], and brain cancer [14]. CSCs are subjects of great interest, not only due to their involvement in carcinogenesis, but also their apparent roles in progression, metastasis and therapy resistance [15]. CSCs can arise either from their normal counterparts or from more differentiated cell types that have acquired stem-cell properties [16]. The property of stemness is tied to cellular programs relevant to the cancer hallmarks such as proliferation and unlimited replicative potential. And just as mutations can promote the cancer hallmarks, they can also serve in the attainment of stem cell characteristics. The hallmark capabilities and other cancer-promoting traits arise through the subversion or exploitation of existing cellular machinery.
This is evident from the fact that a large proportion of cancer genes (genes harbouring driver mutations in cancer) comprise those that play important roles in normal cellular processes and whose dysregulation results in the malfunction of those processes in a manner that may be beneficial for tumourigenic initiation or progression. Thus, it is not surprising that many cancer genes are transcriptional regulators, since they function as central nodes within the molecular circuitry underlying virtually all cellular processes. Indeed, domains implicated in transcriptional regulation were found to be the second most common among all proteins encoded by cancer genes [17]. Due to their importance in cancer biology, the next section will broadly review the literature on genes involved in transcriptional regulation. 1.3 Transcription Factors and Epigenetic Regulators. Genes that regulate transcription broadly encompass two categories: transcription factors (TFs) and epigenetic regulators. Together, TFs and epigenetic regulators are able to orchestrate gene expression patterns and thereby coordinate development and establish cellular identity [18,19]. Furthermore, these genes also elicit appropriate transcriptional responses to environmental or intracellular stimuli. As acknowledged in the previous section, oncogenic characteristics can manifest as a result of the dysregulation of transcriptional regulators. This section will expound on this statement starting with TFs. TFs are a family of regulatory DNA-binding proteins whose genes are among the most frequently mutated and the most extensively characterized cancer genes, such as TP53 (P53) [20] and MYC [21]. TFs interact with their target DNA loci via their affinity to specific nucleotide sequences and possess the capacity to influence the expression of multiple genes.
P53, for example, recognizes RRRCWWGYYY (R = A/G, W = A/T, and Y = C/T) and other ‘noncanonical’ sites to mediate the transactivation of genes involved in cell cycle progression, apoptosis and DNA repair, among many other processes [22,23]. One of P53’s key roles in cancer revolves around its position as an integral DNA damage monitor, whereby the detection of DNA damage triggers a P53-mediated delay in cell cycle progression as a means to facilitate repair, or apoptosis in the case where the damage is too severe [24]. Thus, it is well accepted that loss-of-function alterations in TP53, which abrogate P53’s ability to activate its target genes, serve to remove this safety mechanism and consequently engender the cancer-enabling characteristic of genomic instability. As mentioned above, MYC is another TF-encoding gene that is frequently altered in cancer, particularly by amplifications [25]. Among the many processes MYC regulates in a non-tumour context, cellular proliferation is one that can be hyperactivated upon MYC overexpression [21]. While TFs undoubtedly represent an important facet of transcriptional regulation and cancer biology, they comprise just one layer of many; below I will address another major class of transcriptional regulators, the epigenetic regulators, and their relevance to cancer. Epigenetic regulators control gene expression by modulating the local chromatin state. The fundamental unit of chromatin is the nucleosome, which consists of 147 bp of DNA wrapped around a histone octamer (2 subunits each of H2A, H2B, H3 and H4). Modification of histone proteins occurs through the activities of histone modifying enzymes, which regulate chromatin state through the addition or removal of chemical moieties at specific tail residues. Tri-methylation at lysine 27 of histone 3 (denoted as H3K27me3), for instance, is deposited by the epigenetic regulator EZH2 and is associated with densely packed chromatin (heterochromatin) and transcriptional repression [26].
Conversely, acetylation of the same residue (H3K27ac) is established by CBP/P300 and is primarily associated with enhancer activity [27]. Changes in the local chromatin state can also involve the repositioning of nucleosomes, which is primarily carried out by the SWI/SNF chromatin remodelling complex. With the advent of next-generation sequencing (NGS) came the discovery that mutations in many epigenetic regulator genes occur across many cancer types, bringing chromatin biology to the forefront of cancer research. For example, somatic mutations in the genes encoding EZH2, CBP, P300 and other histone modifiers were found at high frequencies in B-cell malignancies [28]. Furthermore, the SWI/SNF complex is the most recurrently altered protein complex across all cancer types, when the mutational frequencies of its subunits are considered collectively [29]. Although significant progress has been made in the identification of histone modifier gene alterations across cancers, the functional and mechanistic basis by which many of these lesions contribute to cancers remains obscure. Apart from histone modifications, another well-established layer of epigenetic regulation is DNA methylation. DNA methylation in different genomic contexts has been associated with diverse functional roles. Specifically, methylation of CpG sites (cytosine followed by guanine) within regions termed CpG islands (loci with high CpG density) is well understood to mediate transcriptional repression [30]. Conversely, CpG methylation within the gene body is positively correlated with gene expression [30]. Like histone marks, DNA methylation can be added and removed by distinct enzymes; DNA methyltransferases (DNMTs) are responsible for establishing DNA methylation while the ten-eleven translocation (TET) family of proteins is responsible for DNA de-methylation [30]. Like histone modifiers, these writers and erasers of DNA methylation are also recurrently dysregulated across several cancer types [31].
One well-known cancer-associated anomaly is the CpG island methylator phenotype (CIMP), a broadly defined term characterized by a high degree of DNA methylation [32]. While the etiological relevance of this phenotype in many cancer types is poorly understood, it has been linked to the silencing of tumour suppressor genes in various contexts [32]. Importantly, the control of gene expression involves the synergistic interplay of TFs, histone modifications, DNA methylation and many more regulatory modules. DNA methylation, for instance, can directly impede DNA binding of a TF such as CTCF [33], or generate novel binding sites for regulators that specifically recognize and bind methylated nucleotides such as MECP2 [34]. The degree of chromatin compaction can also permit or prevent TF access to a particular region and, reciprocally, TFs can deliver epigenetic modifier proteins and thereby elicit a change in the local chromatin landscape [18]. The dynamics between TFs, epigenetic regulators and other genes involved in transcriptional control therefore comprise a complex landscape that will require models that capture these combinatorial relationships in order for their functions to be fully elucidated. Investigating the crosstalk between diverse transcriptional regulators is now feasible with the emergence of genome-wide assays such as whole transcriptome sequencing (RNA-seq) and chromatin immunoprecipitation followed by sequencing (ChIP-seq). ChIP-seq involves the isolation of DNA-protein complexes through the use of antibodies specific for the protein of interest and the subsequent mapping of their genomic locations using sequencing [35]. In addition to profiling the genome-wide distribution of TFs, ChIP-seq can be applied to histone modifications [35]. Whole genome bisulfite sequencing (WGBS), methylated DNA immunoprecipitation sequencing and other methodologies have also emerged as high-throughput platforms for profiling genome-wide DNA methylation [36].
Integration of multiple -omic platforms is now empowering studies to comprehensively explore the regulatory landscape in the context of cancer.

In summary, the importance of TFs and epigenetic modifiers in cancer is evident from the high frequency with which the genes encoding them are found to be somatically mutated, yet our understanding of the biological consequences of these altered genes is largely incomplete. The Capicua transcriptional repressor (CIC) is one such TF gene that is frequently inactivated through somatic mutation in oligodendrogliomas (ODGs) and other cancer types [1-6], and whose putative tumour suppressor function is yet to be elucidated. CIC’s role in cancer remains obscure, presumably because of our currently limited understanding of its regulatory role in human cells. A potentially significant hint regarding the role of CIC deficiency in the context of ODG is its invariable co-occurrence with a neomorphic IDH1 or IDH2 mutation, which is well known to dysregulate multiple epigenetic processes (discussed in a later section). For these reasons, the next section will comprehensively address the literature relevant to CIC and its connection to IDH1/2.

1.4 The Capicua Transcriptional Repressor

1.4.1 CIC’s Role in Signal Transduction, Development and Homeostasis

Cic was first discovered in Drosophila melanogaster as a transcriptional repressor involved in the differentiation of the antero-posterior poles, hence its name Capicua, which means head-and-tail in Catalan [37]. Cic has since been shown to regulate other developmental patterning processes such as the establishment of the dorsal-ventral axis during oogenesis [38-40] and the specification of vein cells in the developing wing [41].
In addition to tissue patterning, Cic was demonstrated to regulate cell proliferation in the developing eye [42,43] and in intestinal stem cells [44]. Despite our understanding of Cic’s role in Drosophila, our knowledge of the extent of CIC’s functional significance in mammals has been relatively limited. Inactivation of mammalian CIC has been associated with defects in lung alveolarization [45], bile acid homeostasis [46], and T-cell development [47,48], demonstrating its relevance across a wide range of physiological processes. Moreover, normal CIC activity seems to be important to the central nervous system (CNS), as its dysfunction has been linked to a spectrum of neurobehavioural syndromes [49], neurodegeneration [50], and altered lineage specification of neural stem cells (NSCs) [51,52]. Finally, CIC dysfunction has been implicated in the etiology of several cancer types (discussed in a later section), and its roles in tumour progression are just beginning to be understood.

In both Drosophila and mammals, CIC functions downstream of receptor tyrosine kinase (RTK) signalling through a mechanism called default repression: in the absence of RTK signals, CIC maintains its transcriptional repressor activity, whereas the induction of RTK signalling results in the inactivation of CIC and subsequent de-repression of its target genes [37,40-44,53,54]. Inactivation of CIC occurs via ERK (also known as MAPK), which phosphorylates CIC and other substrates in the nucleus downstream of RTK stimulation [53,55]. In Drosophila, ERK-mediated phosphorylation of Cic appears to result in its accelerated export from the nucleus into the cytoplasm followed by its eventual degradation [56]. In mammals, in addition to being phosphorylated by ERK, CIC is also phosphorylated by the ERK-activated p90 ribosomal S6 kinase at serine 173 (CIC-S173) [53]. Phosphorylation at CIC-S173 appears to promote its interaction with 14-3-3 proteins, thereby reducing its affinity to DNA [53].
This post-translational modification also appears to facilitate its recognition by the E3 ligase PJA1 and its subsequent degradation within the nucleus [57].

In addition to RTK signalling, several recent studies have highlighted the capacity for Cic to participate in other signal transduction pathways. For instance, Cic mRNA abundance was demonstrated to be repressed by the microRNA Bantam, whose expression is upregulated by both Hippo and EGFR signalling [58]. Moreover, Minibrain (Mnb) and Wings apart (Wap) were reported to inhibit Cic’s repressor function in parallel to the RTK pathway, and this RTK-independent mode of Cic regulation was important in the growth of the eye and brain and in the normal patterning of the wing [59]. Interestingly, Mnb and Wap have mammalian orthologs (DYRK1A and DCAF7, respectively) which were identified as CIC interacting proteins in a recent affinity purification followed by mass spectrometry (AP-MS) experiment [54], suggesting a conservation of this mechanism in mammals. Finally, Cic was recently described to repress Toll/IL-1 signalling genes during dorsoventral patterning, once again independently of RTK activity [60]. Notably, this repression was demonstrated to involve the binding of Cic to enhancers via sub-optimal AT sites (explained in more detail below) and was dependent on the nuclear localization of Dorsal/NFκB upon Toll/IL-1 activation [60]. Although whether CIC can integrate signals beyond RTK in mammals remains to be demonstrated, these results illustrate that CIC’s regulatory circuitry may be much more multifaceted than currently appreciated.

1.4.2 The CIC Protein and Its Interactors

Both Drosophila and mammals express at least two CIC protein isoforms, the CIC-long form (CIC-L) and the CIC-short form (CIC-S). Both isoforms possess two highly conserved protein domains, the high mobility group (HMG) domain and the C1 domain.
The former is well understood to confer sequence-specific DNA-binding capabilities and the latter has recently been reported to also be important for this activity [61]. In both flies and mammals, CIC binds DNA through recognition of its octameric consensus binding site: 5’-T(G/C)AATG(G/A)(G/A)-3’ [54,62-64]. However, as mentioned above, Cic can also bind with lower affinity to AT sites, which contain one mismatch relative to the consensus binding site, although this sub-optimal binding appears to require the presence of Dorsal/NFκB in the nucleus [60].

CIC-L differs structurally from CIC-S due to an extension of several hundred amino acids at its N-terminus. CIC-L is localized predominantly in the nucleus, while CIC-S can be found both in the nucleus and in the cytoplasm, where it appears to be localized in close proximity to the mitochondria [65]. Profiling of the protein interactors of cytoplasmic CIC-S by AP-MS identified several mitochondria-related proteins including ATP citrate lyase (ACLY) [65], an enzyme that catalyzes the conversion of citrate into acetyl coenzyme A (acetyl-CoA). Expression of mutant forms of CIC-S (CIC-R1515H or CIC-R201W) resulted in reduced levels of ACLY [65], highlighting a possible metabolic role for CIC-S in the cytoplasm. Beyond this observation, however, the functional disparities between CIC-L and CIC-S remain obscure.

In Drosophila, Cic-mediated silencing of its targets involves the recruitment of Groucho (Gro), which functions as a transcriptional co-repressor, at least in some contexts [60,62,66]. Since the N2 motif responsible for the interaction between Cic and Gro is only present in dipteran insects, this partnership is unlikely to be present in mammals [66]. Rather, mammalian CIC appears to function with ATAXIN1 (ATXN1) or its paralogue, ATAXIN1-like (ATXN1L), as its co-repressor [45,50,66]. The interaction between ATXN1 and CIC is important in spinocerebellar ataxia type 1 pathology, since ablating this interaction results in a reduction in disease toxicity [67].
Moreover, CIC and ATXN1L appear to have a reciprocal relationship, such that knockout of either CIC or ATXN1L leads to altered localization and/or protein instability of the respective partner [45,68].

To gain a more extensive view of CIC’s protein interaction network, our group conducted CIC immunoprecipitation followed by mass spectrometry (IP-MS), the results of which are available in a pre-print on bioRxiv [69]. Among the novel interactors of CIC were proteins involved in chromatin organization, including several components of the SWI/SNF complex, which I validated by performing reciprocal IPs followed by Western Blots (IP-WB) [69]. IP-MS was also applied in the context of exogenously expressed Flag-tagged CIC-L by S. Weissmann and colleagues (2018), who also identified various epigenetic modulators that overlapped with our results [54]. Furthermore, Weissmann and others reported an association between CIC recruitment and increased histone acetylation, consistent with the observation that histone deacetylases including SIN3A were identified as CIC protein interactors [54]. Specific CIC protein interactors associated with the Gene Ontology (GO) terms “Chromatin Organization” and “Chromatin Binding” identified in both studies, along with IP-WB validations of CIC pulldown in ARID1A and SMARCA2 IPs, are displayed in Figure 1.1. Altogether, these results implicate an epigenetic component in CIC’s molecular function.

1.4.3 CIC Alterations in Cancer

CIC was first implicated in cancer from the observation of recurrent CIC-DUX4 fusions in a subset of undifferentiated round cell sarcomas with histological features similar to Ewing sarcoma [70,71]. Despite the histological likeness to Ewing sarcomas, CIC-rearranged tumours exhibit significantly lower survival rates, greater aggressiveness and distinct immunological and transcriptional profiles, supporting the designation of these malignancies as a stand-alone pathological entity [72,73].
The CIC-DUX4 chimeric protein structurally retains most of the CIC protein excluding the very C-terminal end, which is replaced by the DUX4 trans-activation domain, resulting in a functional switch from a transcriptional repressor to a transcriptional activator [70,74]. The oncogenic role of CIC-DUX4 chimeras has been ascribed, at least in part, to the upregulation of genes involved in metastasis, including ETV4, and cell cycle regulation, including CCND2 and CCNE1 [74,75].

CIC mutations have been identified at high frequency in oligodendrogliomas (ODGs) [1-4], a subgroup of adult diffuse gliomas whose diagnosis traditionally relied on histology. Due in part to the superior accuracy of molecular markers in predicting clinical outcome, the classification of adult diffuse gliomas was recently updated to include the status of the IDH1/IDH2 genes and the chromosome arms 1p and 19q [76]. According to this current schema, ODGs are diagnosed according to the presence of a neomorphic IDH1 or IDH2 mutation in addition to deletion of one copy each of chromosome arms 1p and 19q. Investigations aiming to identify mutations in the retained copy of chromosome arms 1p and 19q revealed alterations in CIC (encoded on 19q13.2) in ~50-80% of tumours within this molecularly defined subgroup [1,2]. Around half of CIC alterations in ODGs consist of deleterious nonsense and frameshift mutations throughout the entire gene, while the remainder consist of recurrent missense mutations within the regions encoding the HMG and C1 domains [61] (Figure 1.2). CIC thus exhibits an enigmatic mutational profile reminiscent of both an oncogene and a tumour suppressor. Functional studies involving the expression of CIC constructs harbouring the most common missense mutations demonstrated a reduction in CIC occupancy at target promoters and their derepression, indicating that these missense mutations also result in a loss of function, although they appear to retain some weak repressive activity [61,63].
While the possibility remains that missense forms of CIC may retain some molecular function (e.g. metabolic dysregulation in the cytosol, as mentioned previously), CIC’s role in cancer seems to fit more appropriately with that of a tumour suppressor. Reinforcing this view is the association of CIC deficiency with the promotion of cancer hallmarks in multiple tumour contexts, which is discussed below.

Recently, our group observed increased frequencies of chromosome segregation defects in CIC-knockout (KO) cell lines [69]. Correspondingly, CIC-KO cell lines exhibited enhanced aneuploidy and copy number alterations, raising the possibility that CIC loss confers genome instability. As mentioned previously, RTK signalling inactivates CIC, and this regulatory process involves its degradation by the nuclear E3 ligase PJA1, downstream of ERK activation. Consistent with hyperactive RTK/ERK signalling being a canonical feature of glioblastoma (GBM), CIC protein levels were shown to be invariably low or absent in GBM tumour samples compared to normal brain tissue [57]. The same study demonstrated that stabilization of CIC protein expression through inhibition of PJA1 or deletion of its ERK binding interface antagonized GBM growth in mouse xenograft models. In prostate cancer cell lines, it was reported that knockdown of CIC was linked to increased proliferation and metastatic potential and that these associations were at least partially attributed to derepression of ETV5 or CRABP1 [77]. CIC deficiency was also associated with enhanced invasion and metastasis in an orthotopic lung cancer model in mice [5]. This phenotype was demonstrated to be conferred by the derepression of ETV4 and subsequent upregulation of its target gene, MMP24 [5].
While germline inactivation of CIC in mice caused perinatal lethality due to inadequate lung maturation, ubiquitous CIC inactivation in adult mice was found to induce T-cell acute lymphoblastic leukemia (T-ALL) [78]. Derepression of ETV4 was again identified to be responsible for this phenomenon, at least to a certain degree. Another study independently reported the presence of T-ALL in adult mice in which CIC was deleted specifically in the hematopoietic cell compartment [48]. Loss of CIC in this context was associated with an expansion of early T-cell precursors, suggesting a role for CIC in regulating early T-cell development. Interestingly, CIC has also been linked to altered development of neural stem cells, whereby forebrain-specific deletion of CIC led to an increased population of NSCs and a biased propensity for these progenitors to commit to a glial cell lineage rather than a neuronal cell lineage [6]. As evident from these findings, CIC is a tumour suppressor whose role appears to involve the suppression of proliferation, metastatic capacity and possibly genome instability. Notably, no mouse model in which CIC was deleted across all tissues or specifically in the brain developed ODG, suggesting that loss of CIC alone is insufficient for gliomagenesis. As mentioned above, ODGs are defined by co-deletion of chromosome arms 1p/19q and neomorphic IDH1 or IDH2 mutations, which may constitute the necessary molecular background for CIC loss to drive ODG.

In ODG, CIC mutations occur exclusively in tumours that possess a heterozygous point mutation in key residues of either IDH1 or IDH2 (R132 and R172, respectively) [1-4]. These mutations are described as neomorphic since they confer on the protein a novel ability to convert alpha-ketoglutarate (αKG) into 2-hydroxyglutarate (2-HG), whereas wildtype IDH1/2 converts isocitrate into αKG as part of the tricarboxylic acid cycle [79,80].
The production of 2-HG, which is referred to as an oncometabolite [81], drives widespread epigenetic changes through its antagonistic activity towards various epigenetic modulators [82]. One of the cardinal consequences of neomorphic IDH1/2 mutations is the glioma CpG-island methylator phenotype (G-CIMP), where a large number of CpG loci are observed to be hypermethylated relative to wildtype IDH1/2 gliomas [83,84]. Our group previously noted a cooperation between IDH1-R132H (the most common neomorphic IDH1 mutant in ODG) expression and mutant CIC-S in elevating 2-HG levels and in promoting clonogenicity [65]. We postulated that, due to the association between mutant CIC-S expression and lower levels of active ACLY, which consumes citrate to produce acetyl-CoA, the elevated 2-HG could have resulted from the greater availability of citrate. While missense forms of CIC and mutant IDH may interact through downstream metabolic alterations, it is still unknown how loss of CIC expression (as a result of the truncating mutations found in ODG) could cooperate with neomorphic IDH mutations to promote ODG. Bearing in mind that mutant IDH causes widespread epigenetic aberrations, together with the recent findings that CIC interacts with chromatin modifiers, it is reasonable to posit that the functional intersection between CIC and IDH may involve collaborative transcriptional and/or epigenetic dysregulation.

1.5 Thesis Overview

As reviewed above, CIC is a tumour suppressor whose inactivation appears to promote proliferation and metastatic propensity, thereby contributing to tumourigenesis and tumour progression in multiple tissue contexts. Many of these attributions primarily revolved around the derepression of PEA3 transcription factors such as ETV4 and ETV5. Beyond these genes, however, the extent of the molecular consequences resulting from CIC loss is relatively unexplored.
It is also unclear how the effects mediated by CIC inactivation converge with the effects of neomorphic IDH1/IDH2 mutations to drive ODG, which neither alteration alone is sufficient to initiate. Since the cardinal molecular consequence associated with neomorphic IDH1/2 mutations is widespread epigenetic dysregulation, the emerging link between CIC and epigenetic regulation presents an incentive to investigate the effects of CIC loss on the epigenome in addition to the transcriptome. Thus, we obtained genome-wide profiles of RNA expression, DNA methylation and several histone modifications in CIC-WT and CIC-KO cell lines based on the hypothesis that an interrogation of these landscapes would yield novel mechanistic insights as to how CIC loss may contribute to ODG. Concomitantly, we also characterized these landscapes in an isogenic CIC-WT and KO cell line model expressing IDH1-R132H to explore the cooperative interplay between CIC loss and mutant IDH.

Figure 1.1: CIC Interacts With Chromatin Modifier Proteins
(A) Selected CIC interacting proteins associated with the GO terms “Chromatin Organization” and “Chromatin Binding” identified by IP-MS across two independent studies. CIC-IP was performed on endogenous CIC in NHA cells in the first study [69], and Flag-IP was performed on exogenous Flag-HA-CIC-L in HEK293 cells in the second [54]. Proteins are coloured and grouped according to functionality.
(B) Western Blots validating CIC pulldown in ARID1A and SMARCA2 reciprocal IPs performed on whole cell lysates of NHA cells.

Figure 1.2: Distribution of CIC Somatic Alterations in Glioma
Lollipop plot displaying somatic mutations in gliomas mapped to the CIC protein isoforms. Height and size of the lollipops are proportional to the frequency of mutation and colour indicates the type of mutation.
Figure was adapted from the ProteinPaint web application tool [85].

Chapter 2: Materials and Methods

Cell Culture and Conditions

The IDH1-WT and IDH1-R132H expressing immortalized human astrocyte cell lines were obtained from Applied Biological Materials Inc (T3022; Richmond, BC, Canada). All cell lines were cultured in Dulbecco’s Modified Eagle’s Medium supplemented with 10% (v/v) heat-inactivated fetal bovine serum (Invitrogen) and incubated in a humidified, 37°C, 5% CO2 incubator.

Generation of Isogenic CIC Wildtype and Knockout Immortalized Astrocyte Cell Line Models Using CRISPR-Cas9

The original normal human astrocyte cell line was immortalized by the introduction of lentiviral constructs containing E6, E7 and hTERT [86]. Lentiviral constructs containing wildtype IDH1 (IDH1-WT) or IDH1-R132H were later transduced to establish cell lines expressing the respective forms of IDH1 [87]. The E6/E7/hTERT + IDH1-WT cell line is referred to in this thesis as the NHA cell line. With the E6/E7/hTERT + IDH1-R132H cell line, we observed gradual loss of IDH1-R132H protein expression over serial passages. Hence, single clone screens involving iterative Western Blot checks over multiple passages were conducted to obtain a monoclonal cell line that stably expresses IDH1-R132H, which I refer to as the F8 cell line throughout the thesis. CRISPR-Cas9 sgRNA sequences were designed to target exon 2 of the CIC gene [63] (chr19:42791005-42791024, Figure 2.1A) and used to generate several CIC-KO cell lines from NHA and F8, including NHA-A2, NHA-H9, F8-A2 and F8-E10 (Figure 2.1B). Absence of CIC and stable IDH1-R132H protein expression were confirmed using Western Blot on whole cell lysates harvested from a split of each sample.
Other splits were used for ChIP-seq (to profile histone marks) and Whole Genome Bisulfite Sequencing (WGBS) (Figure 2.1C).

Whole Transcriptome Library Construction and Sequencing

To remove cytoplasmic and mitochondrial ribosomal RNA (rRNA) species from total RNA, the NEBNext rRNA Depletion Kit for Human/Mouse/Rat was used (NEB, E6310X). Enzymatic reactions were set up in a 96-well plate (Thermo Fisher Scientific) on a Microlab NIMBUS liquid handler (Hamilton Robotics, USA). 100 ng of DNase I treated total RNA in 6 µL was hybridized to rRNA probes in a 7.5 µL reaction. Heat-sealed plates were incubated at 95°C for 2 minutes followed by incremental reduction in temperature by 0.1°C per second to 22°C (730 cycles). The rRNA in DNA hybrids was digested using RNase H in a 10 µL reaction incubated in a thermocycler at 37°C for 30 minutes. To remove excess rRNA probes (DNA) and residual genomic DNA contamination, DNase I was added in a total reaction volume of 25 µL and incubated at 37°C for 30 minutes.
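As a quick consistency check on the hybridization ramp described above, the step count follows directly from the temperature span and the ramp rate. The following is a minimal sketch; the variable names are illustrative and are not part of the original protocol.

```python
# Probe hybridization ramp: 95 °C down to 22 °C at 0.1 °C per second.
start_c = 95.0
end_c = 22.0
step_c = 0.1  # temperature drop per one-second step

# round() guards against floating-point error in the division.
n_steps = round((start_c - end_c) / step_c)
ramp_minutes = n_steps / 60  # one step per second

print(n_steps, round(ramp_minutes, 1))
```

Under these assumptions the ramp comprises 730 one-second steps, in agreement with the cycle count stated in the protocol, and takes roughly 12 minutes.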
RNA was purified using RNA MagClean DX beads (Aline Biosciences, USA) with 15 minutes of binding time, 7 minutes clearing on a magnet followed by two 70% ethanol washes, 5 minutes to air dry the RNA pellet and elution in 36 µL DEPC water. The plate containing RNA was stored at -80°C prior to cDNA synthesis.

First-strand cDNA was synthesized from the purified RNA (minus rRNA) using the Maxima H Minus First Strand cDNA Synthesis kit (Thermo-Fisher, USA) and random hexamer primers at a concentration of 8 ng/µL along with a final concentration of 0.04 µg/µL Actinomycin D, followed by PCR Clean DX bead purification on a Microlab NIMBUS robot (Hamilton Robotics, USA). The second strand cDNA was synthesized following the NEBNext Ultra Directional Second Strand cDNA Synthesis protocol (NEB) that incorporates dUTP in the dNTP mix, allowing the second strand to be digested using USER™ enzyme (NEB) in the post-adapter ligation reaction and thus achieving strand specificity.

cDNA was fragmented using Covaris LE220 sonication for 100 seconds (2 x 50 seconds) at a “Duty cycle” of 30%, 450 Peak Incident Power (W) and 200 Cycles per Burst in a microTUBE Strip (P/N: 520053) to achieve 200-250 bp average fragment lengths. The paired-end sequencing library was prepared following the BC Cancer Genome Sciences Centre strand-specific, plate-based library construction protocol on a Microlab NIMBUS robot (Hamilton Robotics, USA). Briefly, the sheared cDNA was subject to end-repair and phosphorylation in a single reaction using an enzyme premix (NEB) containing T4 DNA polymerase, Klenow DNA Polymerase and T4 polynucleotide kinase, and incubated at 20°C for 30 minutes. Repaired cDNA was purified in 96-well format using PCR Clean DX beads (Aline Biosciences, USA), and 3’ A-tailed (adenylation) using Klenow fragment (3’ to 5’ exo minus) and incubation at 37°C for 30 minutes prior to enzyme heat inactivation. Illumina PE adapters were ligated at 20°C for 15 minutes.
The adapter-ligated products were purified using PCR Clean DX beads, then digested with USER™ enzyme (1 U/µL, NEB) at 37°C for 15 minutes followed immediately by 13 cycles of indexed PCR using Phusion DNA Polymerase (Thermo Fisher Scientific Inc. USA) and Illumina’s PE primer set. PCR parameters were: 98°C for 1 minute followed by 13 cycles of 15 seconds at 98°C, 30 seconds at 65°C and 30 seconds at 65°C, and then 72°C for 5 minutes. The PCR products were twice purified and size-selected using a 1:1 PCR Clean DX beads-to-sample ratio, and the eluted DNA quality was assessed using an Agilent DNA1000 Assay (Agilent, USA) and quantified using a Quant-iT dsDNA High Sensitivity Assay Kit on a Qubit fluorometer (Invitrogen). Libraries (3 replicates per cell line) were then pooled and size-corrected using a final molar concentration calculation for Illumina HiSeq2500 sequencing with paired-end 75 base reads. Sequencing summary statistics such as total number of reads, number of duplicate reads, number of mapped reads and percentage of total reads that were mapped for each library are shown in Appendix A.

Sequence Alignment

For all sequence libraries, with the exception of WGBS libraries, raw reads were aligned to the human reference genome GRCh37-lite (http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome) using BWA [88] v0.5.7. Resulting BAM files were sorted and indexed using SAMtools [89] v0.1.18.

Differential Expression Analysis

Raw read counts were mapped onto Ensembl 75 gene annotations using Jaguar [90]. DESeq2 v.1.8.2 [91] was used to conduct four independent differential expression analyses between each CIC-KO line and its CIC-WT counterpart. A differential expression analysis between F8 (CIC-WT, IDH1-R132H) and NHA (CIC-WT, IDH1-WT) was also performed to identify genes whose expression levels were impacted by the presence of IDH1-R132H.
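The way results from the two replicate CIC-KO comparisons within one IDH1 context are combined (an adjusted p-value cutoff of 0.05 together with a requirement for a consistent direction of change) can be illustrated with a minimal sketch. It is not the code used in this thesis: the function name, the data layout and the toy values are hypothetical, and it assumes that a gene must pass the cutoff in both CIC-KO lines to be retained.

```python
from typing import Dict, Tuple

# Hypothetical layout: gene -> (log2 fold change KO vs WT, adjusted p-value).
Results = Dict[str, Tuple[float, float]]

def concordant_de_genes(ko_a: Results, ko_b: Results,
                        q_cutoff: float = 0.05) -> Dict[str, str]:
    # Keep genes that pass the q-value cutoff in BOTH knockout lines
    # (assumed reading of the filter) and change in the same direction.
    kept = {}
    for gene in sorted(ko_a.keys() & ko_b.keys()):
        lfc_a, q_a = ko_a[gene]
        lfc_b, q_b = ko_b[gene]
        significant = q_a < q_cutoff and q_b < q_cutoff
        same_direction = (lfc_a > 0) == (lfc_b > 0)
        if significant and same_direction:
            kept[gene] = 'up' if lfc_a > 0 else 'down'
    return kept

# Made-up toy values, for illustration only.
toy_a = {'geneA': (2.1, 0.001), 'geneB': (1.5, 0.20),
         'geneC': (-1.2, 0.01), 'geneD': (0.8, 0.03)}
toy_b = {'geneA': (1.8, 0.004), 'geneB': (1.4, 0.01),
         'geneC': (1.1, 0.02), 'geneD': (0.9, 0.04)}
print(concordant_de_genes(toy_a, toy_b))
```

In this toy input, geneB is dropped because it is not significant in one line and geneC is dropped because its direction is discordant, leaving geneA and geneD as concordantly upregulated.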
Differentially expressed (DE) genes were considered statistically significant if they met an adjusted (Benjamini-Hochberg) p-value (q-value) threshold of 0.05. For CIC-associated DE genes, those that were not significant in either of the replicate CIC-KO cell lines or had inconsistent direction (e.g. upregulated in one CIC-KO cell line and downregulated in the other) were also filtered out.

Functional Enrichment Analysis

Metascape is a web portal that integrates over 40 knowledge bases to enable pathway enrichment and interactome analyses [92]. CIC-associated DE genes within each IDH1 context were submitted separately for pathway enrichment analysis. For IDH1-associated DE genes (n = 5750), those that had a fold change of at least 2 were submitted (n = 2722), since Metascape does not perform pathway enrichment with gene lists that exceed 3000 genes.

CIC Chromatin Immunoprecipitation

For each NHA replicate, two ~70-80% confluent 15 cm plates were treated with 1% formaldehyde (Sigma) in PBS for 12 minutes with gentle rocking, followed by treatment with 0.125 M glycine (Sigma) for five minutes. Crosslinked cells were combined and pelleted by centrifugation at 1,200 rpm for 5 minutes at room temperature, resuspended in 450 μL ChIP lysis buffer (50 mM Tris-HCl pH 8.0, 1% SDS, 10 mM EDTA, 1X cOmplete EDTA-free Protease Inhibitor cocktail [PIC, Roche]), and lysed on ice for 30 minutes. Cells were homogenized by 6 passages through a 20-gauge needle, and nuclei pellets were obtained by centrifugation at 5,000 rpm for 10 minutes at 4°C. The pellet was resuspended in 900 μL shearing buffer (10 mM Tris-HCl pH 8.0, 0.1% SDS, 1 mM EDTA, 1X EDTA-free PIC) and transferred to a 1 mL milliTUBE with AFA fiber (D-Mark Biosciences) using a 30-gauge needle. Chromatin was sonicated using a Covaris S2 sonicator with the following settings: 10% duty cycle, 6 intensity, 500 burst, 16 cycles (20s on, 40s off) at 4-6°C.
Insoluble debris and unfragmented chromatin were removed by centrifugation at 14,000 rpm for 12 minutes at 4°C. An aliquot of chromatin was de-crosslinked overnight at 68°C (0.2 M NaCl, 0.05 mg/mL RNase), treated with proteinase K for 30 min at 42°C, and purified using the MinElute PCR Purification Kit (28006, Qiagen). Concentration of the purified chromatin was determined using the Qubit dsDNA high sensitivity assay (Life Technologies) and the presence of DNA fragments in the 150-300 bp size range was confirmed by separation on a 2% agarose gel. Protein A Dynabeads (Life Technologies) blocked with bovine serum albumin and salmon sperm DNA for 3 h were used to pre-clear chromatin for 2 h. For immunoprecipitation, three volumes of IP buffer (10 mM Tris-HCl pH 8.0, 1% Triton X-100, 0.1% deoxycholate, 0.1% SDS, 90 mM NaCl, 2 mM EDTA, 1X EDTA-free PIC) were added to 22.5 μg of chromatin, which was then incubated with 5 μg of anti-CIC antibody (Sigma, HPA044341) for 1 h at 4°C. Blocked Protein A Dynabeads (25 μL) were then added to the chromatin and antibody and incubated overnight at 4°C. The samples were then washed twice in low salt buffer (20 mM Tris-HCl pH 8.0, 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 150 mM NaCl) and twice in high salt buffer (same as low salt buffer except 300 mM NaCl). DNA was eluted in 100 mM sodium bicarbonate with 1% SDS and 0.05 mg/mL RNase at 68°C for 6 hours or overnight, followed by treatment with proteinase K for 30 min at 42°C. For each IP, 20 μL of chromatin was subjected to the same treatment to serve as input controls.
Eluted DNA was purified using the MinElute PCR Purification Kit.

CIC ChIP Library Construction and Sequencing

Libraries were prepared following a modified paired-end library protocol (Illumina Inc., USA). Briefly, the DNA was subjected to end-repair and phosphorylation using T4 DNA polymerase with Klenow DNA Polymerase and T4 polynucleotide kinase, respectively, in a single reaction. 3’ A overhangs were generated using Klenow fragment (3’ to 5’ exo minus), and ligated to Illumina PE adapters containing 5’ T overhangs. The adapter-ligated products were purified using PCR Clean DX beads (ALINE Biosciences), then PCR-amplified with Phusion DNA Polymerase in 13 cycles using Illumina’s PE primer set (Illumina). PCR product was purified using PCR Clean DX beads (ALINE Biosciences), and the DNA quality was assessed and quantified using the Caliper LabChip GX DNA High Sensitivity assay (PerkinElmer) and the Quant-iT dsDNA high sensitivity assay (ThermoFisher Scientific). Libraries were normalized and pooled and the final concentration of the pooled library was determined using a Qubit dsDNA HS Assay Kit and a Qubit fluorometer (ThermoFisher Scientific). Clusters were generated on the Illumina cluster station and sequence data were generated using an Illumina HiSeq2500 platform following the manufacturer’s instructions. Summary statistics for ChIP-seq libraries are shown in Appendix D.

CIC ChIP-seq Analysis

BigWig files containing normalized reads (reads per kilobase per million mapped reads [RPKM]) in 10 bp bins were generated using Deeptools [93] v.3.0.1. RPKM values within genome-wide 200 bp bins (obtained using Bedtools [94]) were obtained using Deeptools and used to calculate pairwise Spearman correlation values across all libraries. ChIP and input libraries were visualized by uploading BigWig files into IGV [95]. Regions in which CIC ChIP signal was enriched relative to the matched input (i.e. peaks) were identified using MACS2 [96] v.2.1.1 with a q-value threshold of 0.05.
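For illustration, the pairwise Spearman comparison of binned RPKM values described above amounts to ranking the bins within each library and computing a Pearson correlation on the ranks. The sketch below is a plain-Python stand-in written for this explanation, not the Deeptools implementation; the function names and the toy coverage vectors are hypothetical, and ties are given average ranks.

```python
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    # Rank values from 1..n, giving tied values the mean of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    # Pearson correlation of the ranks; undefined if either input is constant.
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up RPKM values for four 200 bp bins in two libraries.
lib_1 = [0.0, 1.5, 1.5, 8.0]
lib_2 = [0.2, 1.0, 3.0, 2.0]
print(round(spearman(lib_1, lib_2), 3))
```

Because only ranks enter the calculation, the measure is insensitive to differences in overall signal scale between libraries, which is one reason rank correlation suits comparisons of ChIP and input coverage.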
Peaks that overlapped with regions prone to sequencing errors (i.e. blacklisted regions) [97] were identified using Bedtools and filtered out.

Published ChIP-seq data derived from CIC-WT and CIC-KO MEKi-treated (24 h) G144 cell lines [54] were obtained from the ArrayExpress database under the accession number E-MTAB-6682. The obtained data were processed in a manner consistent with my CIC ChIP-seq data, as described above. The ChIP-seq library obtained from the matching CIC-KO cell line was used as the background control for peak calling. Peaks that overlapped between our CIC ChIP-seq and the published set were identified using Bedtools. The top 150 significant peaks in our dataset were deemed high-confidence CIC peaks based on the inflection point at which the presence of reproducibly identified peaks increased as a function of peak rank relative to all peaks (see Results).

Information associated with high-confidence CIC peaks, such as the genomic feature with which they overlapped, the nearest gene, and the peak distance from transcriptional start sites (TSS), was obtained using ChIPseeker [98]. De novo motif enrichment analysis was conducted on high-confidence CIC peaks centred on their summits (defined as the coordinate at which fold enrichment of ChIP read coverage relative to its matched control was greatest) using Homer [99] v.4.9.1, with the size parameter set to 200 bp as recommended for identifying primary and co-enriched motifs for TFs (http://homer.ucsd.edu/homer/ngs/peakMotifs.html).

Histone Modification ChIP Library Construction and Sequencing

Samples for Native chromatin immunoprecipitation (N-ChIP) were prepared from ~100,000 cells per immunoprecipitation, with two replicates per cell line. Briefly, cells were lysed using lysis buffer (0.1% Triton X-100, 0.1% Deoxycholate) and protease inhibitor for 20 minutes on ice. The extracted chromatin was then digested using 90 U of MNase enzyme (New England Biolabs) for 6 minutes at 25°C. Reactions were quenched using 250 µM of EDTA.
A mix of 1% Triton X-100 and 1% Deoxycholate was then added to the digested samples and the 96-well plate of samples was chilled on ice for 20 minutes. The digested chromatin was then pooled and 12 µL of chromatin was reserved for use as input control. The rest of the digested chromatin was pre-cleared using IP buffer (20 mM Tris-HCl [pH 7.5], 2 mM EDTA, 150 mM NaCl, 0.1% Triton X-100, 0.1% Deoxycholate) plus protease inhibitor with 20 µL of pre-washed Protein A/G Dynabeads (Invitrogen) at 4°C for 1.5 hours. Supernatants were removed from the beads and transferred to a 96-well plate containing the antibody-bead complex. The plate was sealed and incubated at 4°C on a rotating platform overnight. The reaction plate containing the immunoprecipitation samples was placed on a magnetic plate and samples were washed twice with low salt buffer (20 mM Tris-HCl [pH 8.0], 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 150 mM NaCl) and twice with high salt buffer (20 mM Tris-HCl [pH 8.0], 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 500 mM NaCl). DNA-antibody complexes were eluted in 30 µL Elution Buffer (100 mM NaHCO3, 1% SDS), incubated at 65°C for 1.5 hours with mixing at 1,350 rpm on a thermomixer. Protein was digested by adding 1.75 µL of Qiagen Protease to the eluted DNA samples at 50°C for 30 minutes with mixing at 600 rpm on a thermomixer. ChIP DNA was then purified using Sera-Mag beads (Fisher Scientific) with 30% PEG before library construction. Library construction and sequencing were done identically to the CIC ChIP-seq samples with the only difference being that the number of PCR cycles used to amplify the adapter-ligated products was between 8 and 10. Sequencing summary statistics for histone modification ChIP-seq libraries can be found in Appendix F.

Histone Modification ChIP-seq Analysis

Duplicated reads were removed using Picard v.1.114 prior to peak calling. FindER [100] version 1.0.1e was used with default parameters to identify peaks. Peaks that overlapped blacklisted regions [97] were first removed.
Peaks consistently observed in both replicates were identified and merged using Bedtools to generate the final peak list for subsequent analyses. RPKM values within peaks (obtained using Bedtools [94]) were obtained using Deeptools [93] and used to calculate pairwise Spearman correlation values across all libraries.

For differential enrichment analysis, the union of all peaks across all cell lines for each mark was first obtained using Bedtools. Raw read counts within these regions were obtained using Deeptools [93]. DESeq2 [91] was used to identify differentially enriched (DER) peaks between CIC-KO and CIC-WT cells (CIC-associated) or between IDH1-R132H and IDH1-WT cells (IDH1-associated), in a manner analogous to the identification of differentially expressed genes. DER peaks were required to meet a q-value threshold of 0.05 and a fold change of at least 2 to be considered significant. CIC-associated DER peaks were additionally required to show directional concordance between both CIC-KO replicate cell lines. DER peaks were annotated with their associated genomic feature and nearest gene using ChIPseeker [98].

As previously described [101], putative enhancer regions were first identified by filtering for H3K4me1 peaks that were at least 450bp long and merging those within 600bp of each other. H3K4me1 regions within 2kb of a known TSS were excluded on the basis that these likely corresponded to promoter regions, rather than enhancers. Enhancers overlapping with H3K9me3 peaks were considered to be heterochromatic regions and disregarded in this analysis. Enhancers were assessed for an overlap with DER H3K4me1, H3K27ac and H3K27me3 peaks using Bedtools and were considered candidate dysregulated enhancers or differentially enriched enhancers (DER enhancers) if an overlap was present. The nearest genes associated with DER enhancers were considered to be putative targets of such enhancers.
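The enhancer-calling steps described above (length filter, merging, TSS exclusion and H3K9me3 exclusion) can be sketched as follows. This is a minimal illustration rather than the Bedtools commands used in this work. The function names and toy coordinates are hypothetical, intervals are assumed to be half-open (chrom, start, end) tuples, and the treatment of distances exactly at the 600bp and 2kb boundaries is an assumption.

```python
def merge_within(intervals, max_gap):
    '''Merge (chrom, start, end) intervals separated by at most max_gap bp.'''
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]

def overlaps(a, b):
    '''True if two half-open intervals on the same chromosome share a base.'''
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def putative_enhancers(h3k4me1_peaks, tss_sites, h3k9me3_peaks,
                       min_len=450, merge_gap=600, tss_window=2000):
    # 1. Keep H3K4me1 peaks of at least min_len bp, then merge nearby ones.
    long_peaks = [p for p in h3k4me1_peaks if p[2] - p[1] >= min_len]
    regions = merge_within(long_peaks, merge_gap)
    enhancers = []
    for region in regions:
        chrom, start, end = region
        # 2. Exclude regions within tss_window of a TSS (likely promoters).
        near_tss = any(c == chrom and start - tss_window <= pos <= end + tss_window
                       for c, pos in tss_sites)
        # 3. Exclude regions overlapping H3K9me3 peaks (heterochromatin).
        heterochromatic = any(overlaps(region, h) for h in h3k9me3_peaks)
        if not near_tss and not heterochromatic:
            enhancers.append(region)
    return enhancers

# Hypothetical toy data.
h3k4me1 = [('chr1', 1000, 1500), ('chr1', 1900, 2400), ('chr1', 5000, 5300),
           ('chr1', 20000, 20600), ('chr2', 100, 700)]
tss = [('chr1', 21500)]
h3k9me3 = [('chr2', 650, 900)]
enhancers = putative_enhancers(h3k4me1, tss, h3k9me3)
```

In this toy example only the merged region on chr1 from 1000 to 2400 survives: the 300bp peak is too short, the region at chr1:20000 lies within 2kb of the TSS, and the chr2 region overlaps an H3K9me3 peak.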
De novo motif analysis was performed on downregulated and upregulated DER enhancers with the size parameter set to 500bp as recommended for histone marked regions (http://homer.ucsd.edu/homer/ngs/peakMotifs.html).

Whole Genome Bisulfite Sequencing and Data Processing

To track the efficiency of bisulfite conversion, 10 ng lambda DNA (Promega) was spiked into 1 µg genomic DNA, quantified using Qubit fluorometry, and arrayed in a 96-well microtitre plate. DNA was sheared to a target size of 300 bp using Covaris sonication and the fragments were subject to end-repair and phosphorylation in a single reaction, using an enzyme premix (New England Biolabs) containing T4 DNA polymerase, Klenow DNA Polymerase and T4 polynucleotide kinase, and incubated at 20°C for 30 minutes. Repaired DNA was purified in 96-well format using PCR Clean DX beads (Aline Biosciences, USA), and 3’ A-tailed (adenylation) using Klenow fragment (3’ to 5’ exo minus) at 37°C for 30 minutes prior to enzyme heat inactivation. Cytosine methylated paired-end adapters (5’-AmCAmCTmCTTTmCmCmCTAmCAmCGAmCGmCTmCTTmCmCGATmCT-3’ and 3’-GAGmCmCGTAAGGAmCGAmCTTGGmCGAGAAGGmCTAG-5’) were ligated to the DNA at 20°C for 15 minutes and adapter flanked DNA fragments were bead purified. Bisulfite conversion of the methylated adapter-ligated DNA fragments was achieved using the EZ Methylation-Gold kit (Zymo Research), following the manufacturer’s protocol. Five cycles of PCR using HiFi polymerase (Kapa Biosystems) were used to enrich the bisulfite converted DNA. Post-PCR purification and size-selection of bisulfite converted DNA was performed using 1:1 PCR Clean DX beads. To determine final library concentrations, fragment sizes were assessed using the DNA1000 assay (Agilent) and DNA was quantified using Qubit fluorometry. Clusters were generated on the Illumina cluster station and sequence data were collected on the Illumina HiSeq X platform following the manufacturer’s instructions.
Sequencing reads were aligned to human reference genome GRCh37-lite using NovoAlign (http://www.novocraft.com/products/novoalign/). Fractional methylation values were obtained for each aligned CpG with a minimum coverage of 5 using NovoMethyl (Bilenky et al., unpublished).

Differential Methylation Analysis

Differentially methylated regions (DMRs) were identified using Defiant [102] with default parameters (minimum CpG coverage of 5, p-value cutoff of 0.05, minimum methylation difference = 10% and minimum number of CpGs in a DMR = 5). DMRs between replicate CIC-KO cell lines were assessed for both overlap and concordant directionality and were considered CIC-associated if they met these criteria. Bedtools [94] was used to overlap DMRs with genomic features and ChIPseeker [98] was used to identify their nearest genes. Calculation of average fractional methylation of genomic regions (e.g. CIC peaks) was also performed using Bedtools.

Figure 2.1: Isogenic CIC-Wildtype and CIC-Knockout Immortalized Human Astrocyte Cell Line Models
(A) The target site of the CRISPR-Cas9 sgRNA used to generate CIC-KO cell lines located at the first commonly shared exon between the two CIC isoforms.
(B) The CIC-WT and KO cell lines used in this thesis work. The colours of the depicted cells illustrate the clonal composition of the cell lines: NHA^(CIC-WT, IDH1-WT) is a polyclonal cell line from which CIC-KO cell lines were independently derived from distinct single clones, whereas F8^(CIC-WT, IDH1-R132H) is a monoclonal cell line and thus its CIC-KO derivative cell lines share the same clonal origin.
(C) Western blots for CIC, IDH1-R132H and Vinculin confirming the absence of CIC protein in the CIC-KO cell lines and the presence of IDH1-R132H protein in F8 and its progeny.
Western blots were performed on whole cell lysates harvested from the same cell line thaw as the ones submitted for transcriptomic and epigenomic profiling.

Chapter 3: Results

This chapter details my analyses of transcriptomic and epigenomic landscapes in CIC-WT and CIC-KO cells, with and without the IDH1-R132H mutation. For each profiled -omic landscape, I perform comparative analyses between CIC-KO cells and their CIC-WT counterparts in each IDH1 context to identify alterations associated with CIC loss. Characterizations of CIC-associated alterations are conducted to derive insight into the biological functions of CIC. CIC-associated alterations between IDH1-WT and IDH1-R132H cells are also compared to identify those that consistently appear regardless of IDH1 status and those that exhibit specificity to one IDH1 context, under the rationale that this analysis would reveal potential regulatory interactions between CIC and IDH1. I also perform comparative analyses between IDH1-WT and IDH1-R132H parental lines to identify alterations associated with the IDH1 mutation. IDH1-associated alterations are compared to those associated with CIC-KO to examine the convergence of CIC and IDH1-R132H’s impacts on the transcriptome and epigenome. Chapter 3.1 presents the results of my analyses of transcriptomes, which involved the characterization of CIC-associated and IDH1-associated differentially expressed (DE) genes. I also integrate CIC ChIP-seq data to distinguish between direct and indirect CIC-associated DE genes. Chapter 3.2 pertains to my analyses of epigenomes, which involved the characterization of CIC-associated and IDH1-associated differentially enriched (DER) histone modification peaks and differentially methylated regions (DMRs).
In addition, results from analyzing the epigenomes are combined with the results from the transcriptomic analyses described in Chapter 3.1, to investigate associations between epigenetic and transcriptional alterations.

3.1 Characterization of Transcriptomic Consequences of CIC-KO in Cell Lines Expressing IDH1-WT and IDH1-R132H

3.1.1 Differential Expression Analysis Identifies Known Targets of CIC and Yields Divergent Results According to IDH1 Status

To investigate the effects of CIC loss and neomorphic IDH1 mutation on transcriptomes, we performed RNA-seq on 3 replicates each of NHA^(CIC-WT, IDH1-WT), NHA-A2^(CIC-KO, IDH1-WT), NHA-H9^(CIC-KO, IDH1-WT), F8^(CIC-WT, IDH1-R132H), F8-A2^(CIC-KO, IDH1-R132H), and F8-E10^(CIC-KO, IDH1-R132H). Differential expression (DE) analyses were conducted to compare gene expression levels between each CIC-KO cell line and its CIC-WT parental counterpart (see Methods). Genes were considered significantly DE if their differential expression met an adjusted p-value (i.e. q-value) cutoff of 0.05. Considering CIC’s known role as a transcriptional repressor, the presence of known targets of CIC such as ETV4 and ETV5 in the list of significantly up-regulated genes in all CIC-KO cell lines validated the analysis. Other genes that were previously confirmed [65] to be bound by CIC at their promoters using ChIP-qPCR (ETV1, DUSP4, GPR3, SPRY4, SHC3, SHC4) were also generally up-regulated (Figure 3.1). Interestingly, with the exception of DUSP4, these genes appeared to have lower expression in the F8^(CIC-WT, IDH1-R132H) parental line than in the NHA^(CIC-WT, IDH1-WT) parental line (Figure 3.1).

Overall, 1,621 (32.3%) and 2,470 (67.7%) protein coding genes in NHA-A2^(CIC-KO, IDH1-WT) were up- and down-regulated relative to NHA^(CIC-WT, IDH1-WT) (q<0.05), respectively (Figure 3.2A). Similarly, 1,692 (40.3%) up-regulated and 2,219 (59.7%) down-regulated genes were identified in NHA-H9^(CIC-KO, IDH1-WT).
In CIC-KO cell lines expressing IDH1-R132H, fewer protein coding genes were significantly DE relative to their CIC-WT counterpart, and proportionally more of them were up-regulated (1,437 [55.3%] up and 1,160 [44.7%] down in F8-A2^(CIC-KO, IDH1-R132H); 783 [55.1%] up and 601 [44.9%] down in F8-E10^(CIC-KO, IDH1-R132H); Figure 3.2A). The fold changes in gene expression between CIC-KO and WT lines were also more restricted in the IDH1-R132H background than in the IDH1-WT background (Figure 3.2A). To identify genes that are reproducibly DE in association with CIC status, only the genes that were significantly and concordantly DE (i.e. with consistent direction) in both CIC-KO cell lines in each IDH1 context were kept for subsequent analyses (henceforth referred to as CIC-associated DE genes). This filtering step resulted in 1,529 CIC-associated DE genes in IDH1-WT cells and 923 CIC-associated DE genes in IDH1-R132H cells (Figure 3.2B).

Genes DE as a consequence of IDH1-R132H (IDH1-associated DE genes) were also identified by repeating the analysis comparing expression between the F8^(CIC-WT, IDH1-R132H) and NHA^(CIC-WT, IDH1-WT) parental cell lines. IDH1-associated DE genes comprised a greater number of genes (3,473 [57.5%] up and 2,571 [42.5%] down) and exhibited greater dispersion in expression levels compared to CIC-associated DE genes (Figure 3.2C). More than half of CIC-associated DE genes in either IDH1 background were also found to be IDH1-associated (Figure 3.2D), indicating a considerable overlap between the transcriptional consequences of CIC loss and mutant IDH1. Interestingly, the intersection between CIC-associated DE genes in IDH1-WT cells and CIC-associated DE genes in IDH1-R132H cells comprised only 275 genes, highlighting that more than ~60% of the transcriptomic effects attributed to CIC loss were subject to IDH1 context dependency in our cell line model.
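The concordance filter used to define CIC-associated DE genes can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in this work. The function name, the placeholder gene names and all fold changes and q-values are hypothetical, and each comparison is assumed to be available as a mapping from gene to (log2 fold change, q-value).

```python
def concordant_de_genes(results_a, results_b, q_cutoff=0.05):
    '''Keep genes significant in both KO-vs-WT comparisons with the same direction.

    results_a, results_b: dict mapping gene -> (log2_fold_change, q_value),
    one dict per CIC-KO cell line compared against the shared CIC-WT parent.
    '''
    kept = {}
    for gene, (lfc_a, q_a) in results_a.items():
        if gene not in results_b:
            continue
        lfc_b, q_b = results_b[gene]
        significant = q_a < q_cutoff and q_b < q_cutoff
        same_direction = lfc_a * lfc_b > 0
        if significant and same_direction:
            kept[gene] = 'up' if lfc_a > 0 else 'down'
    return kept

# Hypothetical results for two CIC-KO lines derived from the same parent.
ko_line_1 = {'GENE_A': (2.1, 0.001), 'GENE_B': (-1.3, 0.01),
             'GENE_C': (0.8, 0.2), 'GENE_D': (1.5, 0.03)}
ko_line_2 = {'GENE_A': (1.7, 0.004), 'GENE_B': (0.9, 0.02),
             'GENE_C': (1.1, 0.01), 'GENE_D': (1.2, 0.04)}
cic_associated = concordant_de_genes(ko_line_1, ko_line_2)
```

Here GENE_B is dropped because its direction differs between the two lines, and GENE_C is dropped because it is significant in only one of them.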
Among those consistently DE upon CIC-KO, regardless of IDH1 status, were ETV1, ETV4, ETV5 and GPR3, which is consistent with their appearance as CIC-targeted genes across all cell/tissue types in which the effects of CIC loss were studied [5,65,77,78].

3.1.2 CIC and IDH1-Associated DE Genes Are Enriched for Pathways Related to Neural Development and the Extracellular Matrix

To glean insight into the CIC-associated transcriptional alterations at the level of biological pathways, I performed functional enrichment analyses of all CIC-associated DE genes (q<0.05) within each IDH1 context using Metascape [92]. Since Metascape does not accept gene lists exceeding 3,000 genes for functional enrichment analysis, I applied an absolute fold change cutoff of 2 for IDH1-associated DE genes. This reduced the list of IDH1-associated DE genes from 5,750 to 2,033 genes. The top 10 most significantly enriched gene ontology terms for each DE gene set are displayed in Figure 3.3 and an additional 15 terms are listed in Appendix C. Consistent with previous links between CIC and CNS development [6,49,51,52], many pathways related to neuron differentiation, synapse formation and projection were among the most significantly enriched terms within CIC-associated DE genes in both IDH1 backgrounds (Figure 3.3). Possibly related to CIC’s role in promoting invasive phenotypes [5,77], terms consistent with the extracellular matrix (ECM) were also detected (Appendix C). These same terms were among the top enriched pathways for IDH1-associated DE genes, indicating an overlap between the consequences of CIC-KO and IDH1-R132H at the level of biological processes in addition to the overlap at the level of DE genes, as noted in the previous section.
Interestingly, vasculature development was specifically present among the top pathways for IDH1-associated DE genes, which may be of relevance regarding the role of IDH1-R132H in hematopoietic malignancies [79,80].

The lists of CIC-associated DE genes likely comprise both direct and indirect consequences, since the genes directly dysregulated by CIC-KO can promote additional transcriptional dysregulation further downstream. To help delineate direct CIC targets from indirect ones, I sought to map the genome-wide occupancy of CIC using ChIP-seq, as described in the next section.

3.1.3 CIC ChIP-seq Identifies Known and Potentially Novel Direct CIC Target Genes

ChIP-seq is widely used to generate genome-wide maps of protein-DNA interactions and to identify candidate direct target genes of transcription factors [35]. Although we generated two replicate CIC ChIP-seq libraries from NHA^(CIC-WT, IDH1-WT) cell lines, we encountered multiple issues concerning their quality. For one, peak calling (using MACS2 at q < 0.05 with input control) between the two replicates was discrepant, with 623,418 peaks called in replicate (rep) 1 and over 10-fold fewer (59,514) called in rep 2, of which only ~23% overlapped with a rep 1 peak (>= 1 bp). Moreover, Spearman correlation values calculated on genome-wide RPKM coverage within 200bp bins revealed that only the rep 2 ChIP library was dissimilar from the input libraries (Figure 3.4A).

Closer investigation of the control input libraries revealed pronounced ‘peaky’ read distributions reminiscent of ChIP libraries (Figure 3.4B). A common technical source for artifactual enrichment is PCR amplification bias [103], which can result in high duplicate rates and be quantified using the PCR bottleneck coefficient (PBC). However, the low duplicate rates and high PBCs of the input libraries indicated that PCR amplification bias was unlikely to be the cause of their irregular read distributions (Appendix D).
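For reference, the two library complexity measures mentioned above can be sketched as follows. This is a simplified illustration, not the software used to produce Appendix D. It assumes the commonly used definition of the PBC as the number of genomic positions covered by exactly one read divided by the number of distinct positions covered by at least one read, and a duplicate rate defined as the fraction of reads beyond the first at each position. The function name and read positions are hypothetical.

```python
from collections import Counter

def library_complexity(read_positions):
    '''Return the duplicate rate and PBC from a list of read start positions.

    read_positions: iterable of hashable positions, e.g. (chrom, start, strand).
    '''
    per_position = Counter(read_positions)
    total = sum(per_position.values())
    distinct = len(per_position)
    singletons = sum(1 for n in per_position.values() if n == 1)
    return {
        'duplicate_rate': (total - distinct) / total,
        'pbc': singletons / distinct,
    }

# Hypothetical library: nine reads falling on six distinct positions.
reads = [('chr1', 100, '+')] * 3 + [('chr1', 250, '-')] * 2 + [
    ('chr1', 400, '+'), ('chr2', 50, '+'), ('chr2', 900, '-'), ('chr3', 10, '+')]
stats = library_complexity(reads)
```

In this toy library three of nine reads are duplicates (rate of about 0.33) and four of six distinct positions are covered exactly once (PBC of about 0.67). A library dominated by PCR duplicates would show a higher duplicate rate and a lower PBC.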
Rather, considering the input libraries’ low predominant fragment lengths (Appendix D), I posit that their ‘peakiness’ may have been attributable to sonication bias [104], which can arise from excessive mechanical shearing. The ChIP libraries displayed considerably higher duplicate rates of ~20-30% and lower PBC values, which may have been due to the low amounts of ChIP’ed DNA that went into sequencing (2.5 ng for rep 1, 4 ng for rep 2). This raised the possibility that both PCR amplification and sonication bias were present in the ChIP samples. Rep 1 in particular appeared to contain a large number of spurious peaks which tended to mirror those in the input libraries but with greater signal (Figure 3.4B), explaining the ~10-fold increase in the number of peaks that were called in rep 1. Despite the noisy input samples, inspection of peaks at validated CIC target sites, such as the promoters of GPR3, ETV4 and ETV5, illustrated a clear enrichment of reads, especially in rep 2 and not in the inputs (Figure 3.4C), indicating that the experiment was at least partially successful in capturing true CIC binding sites. Taken together, these observations indicated a high prevalence of noise in our samples and identified rep 2 as the more reliable source of CIC ChIP-seq data.

I thus decided to focus my analysis on rep 2. To help rationalize an appropriate significance threshold, I utilized an independent CIC ChIP-seq dataset generated by Weissmann and colleagues [54]. The CIC ChIP-seq experiment conducted by Weissmann and others used IgG and CIC-KO cells as controls instead of an input control. I chose to use their CIC-KO ChIP-seq library as the background control for my analysis, on the basis that this library accounts for genomic areas of non-specific binding of their CIC-antibody, unlike an IgG control.
I obtained the BAM files of their CIC ChIP-seq libraries produced from CIC-WT and CIC-KO G144 glioma cell lines and analyzed them in a manner consistent with my CIC ChIP-seq data (see Methods). The Spearman correlation value calculated on genome-wide RPKM coverage within 200bp bins between the CIC-WT ChIP-seq library and the CIC-KO ChIP-seq library was 0.2, indicating a greater degree of dissimilarity than those observed between our CIC ChIP and matched input libraries (0.68 for rep 1 and 0.48 for rep 2). Visual examination of these libraries using IGV revealed relatively uniform read distributions, in contrast to those observed in our CIC ChIP-seq and matched input libraries (Figure 3.5A). Furthermore, far fewer peaks were identified by MACS2 at a q-value of 0.05 using the published CIC ChIP-seq data (n=1,463), again in contrast to the number of peaks identified in our CIC ChIP-seq dataset (623,418 peaks in rep 1 and 59,514 peaks in rep 2). Consistent with our CIC ChIP-seq data, the promoters of GPR3, ETV4 and ETV5 exhibited a prominent enrichment of reads in the published CIC ChIP-seq library, and not in the matching CIC-KO control (Figure 3.5B). I interpreted these results to indicate that the published CIC ChIP-seq data contained less noise than our CIC ChIP-seq libraries and was of sufficient quality to use as an independent dataset of candidate CIC binding sites.

Using Bedtools, I identified the peaks called in the published CIC ChIP-seq data that were in common (i.e. reproducibly identified) with the peaks called in rep 2 of our CIC ChIP-seq library (n=99 peaks). I then ranked rep 2 peaks from least to most significant and identified the inflection point at which the number of reproducibly identified peaks spiked (Figure 3.6A). This point was approximately at the 150th ranked peak, the ranking above which included almost all of the empirically validated [65,69] (i.e.
confirmed by ChIP-qPCR) CIC binding sites at the promoters of ETV4, ETV5, GPR3, DUSP4, SPRY4 and PLK3. Thus, I decided to focus my analyses on the top 150 most significant peaks (henceforth referred to as high-confidence CIC peaks).

The 150 high-confidence CIC peaks were annotated using ChIPseeker [98] (see Methods). Of these, 47 (~31%) were within 2 kb of a known TSS (i.e. promoters), 51 (34%) were associated with other genic features such as exons, introns and untranslated regions, and 53 (~35%) were found in intergenic regions (Figure 3.6B). To investigate whether these peaks contained the known CIC consensus binding site (T[G/C]AATG[G/A][G/A]) [54,62-64] and to identify other TF binding sites in their close proximity, I conducted a de novo motif enrichment analysis using Homer v.4.9.1 (see Methods). The most significant motif identified using Homer contained the CIC consensus binding site and matched most closely (p-value of 1e-49) with a transcription factor belonging to the same HMG-box family as CIC, SOX17 (Figure 3.6C). Motifs matching those of YY1 and MED1 were the second and third most significantly enriched (p-values of 1e-17 and 1e-16, respectively) (Figure 3.6C). YY1 was previously identified as a direct interactor of CIC (S. Weissmann, et al. [54], Supplementary table 3) and MED1 is a subunit of the RNA polymerase II mediator complex which contains another CIC interactor, POLR2A [54,69].
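As a simple complement to de novo motif discovery, the presence of the CIC consensus site T[G/C]AATG[G/A][G/A] in a peak sequence can be checked directly on both strands. The sketch below is illustrative only and is not part of the Homer analysis. The function names and the example sequence are hypothetical, and exact matching of the consensus (with no mismatches allowed) is an assumption.

```python
import re

# Consensus T[G/C]AATG[G/A][G/A]; the lookahead lets overlapping sites be reported.
CIC_MOTIF = re.compile('(?=(T[GC]AATG[GA][GA]))')
MOTIF_LEN = 8
COMPLEMENT = str.maketrans('ACGT', 'TGCA')

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_cic_sites(seq):
    '''Return sorted (start, strand, site) tuples; start is 0-based on the given strand.'''
    seq = seq.upper()
    hits = [(m.start(), '+', m.group(1)) for m in CIC_MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    for m in CIC_MOTIF.finditer(rc):
        # Convert the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + MOTIF_LEN)
        hits.append((start, '-', m.group(1)))
    return sorted(hits)

# Hypothetical sequence with one site on each strand.
example = 'GGTGAATGGACCCTCATTGATT'
sites = find_cic_sites(example)
```

For this example sequence, the forward-strand site TGAATGGA is found at position 2 and the reverse-strand site TCAATGAG at position 12.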
It is thus possible that YY1 and MED1 may form a complex with CIC and contribute to CIC’s binding onto chromatin through their own motif recognition.

In addition to the previously established [65,69] CIC target sites at the promoters of ETV4, ETV5, GPR3, DUSP4, SPRY4, and PLK3, high-confidence peaks were found at the promoters of cell cycle regulators (CCND1, CDC14A, CDC14B), AP-1 transcription factor subunits (FOS, FOSL1), small MAF transcription factors (MAFF, MAFG), genes involved in neurodevelopment (EPHA2, ID1), lysine methyltransferase 5B (KMT5B), and runt-related transcription factor 1 (RUNX1) (Figure 3.7). With the exception of the peak associated with KMT5B, all of these peaks were reproducibly identified using the published CIC ChIP-seq dataset, supporting the notion that they are true CIC binding sites. I next looked into whether these genes and other candidate direct targets exhibited differential expression in our CIC-KO lines.

Of the 12 genes noted above, CDC14A, CDC14B, EPHA2, MAFF, MAFG and RUNX1 were significantly (q<0.05) upregulated in at least one CIC-KO cell line compared to its CIC-WT counterpart (Figure 3.8). CCND1, FOSL1, ID1 and PLK3 also generally exhibited increased transcript levels in CIC-KO cells but not at statistical significance (Figure 3.8). These observations support the notion that these genes may be direct targets of CIC-mediated transcriptional repression. For the rest of the CIC peak-associated genes, I defined a gene as a candidate direct target of CIC if its TSS was within 100kb of a high-confidence CIC peak, based on the finding that most empirically validated enhancer-promoter interactions fall within this range [105]. Of the 110 unique candidate CIC target genes, 7 were significantly downregulated and 11 were significantly upregulated (18 in total) in CIC-KO cells expressing IDH1-WT compared to NHA^(CIC-WT, IDH1-WT).
Thirteen total candidate CIC target genes were found to be significantly DE, all upregulated, in CIC-KO cells expressing IDH1-R132H compared to F8^(CIC-WT, IDH1-R132H). These results indicate that for many of these candidate targets, the absence of CIC alone may not be sufficient for their differential expression.

As discussed in Chapter 1.4, recent evidence has highlighted a link between CIC and epigenetic regulators, implicating a functional relationship between CIC and various proteins involved in the modulation of chromatin. I thus sought to investigate the epigenomic differences between CIC-WT and KO cell lines and whether they could be related to the CIC-associated transcriptional alterations. Also discussed in Chapter 1.4 was the co-occurrence of deleterious CIC mutations and neomorphic IDH1/2 alterations, which are known to dysregulate the epigenome. Since this co-occurrence implies a functional relationship between the two mutated genes, I also endeavoured to explore the epigenomic impact of CIC loss in the presence of IDH1-R132H.

3.2 Characterization of the Epigenomic Consequences of CIC-KO in Cell Lines Expressing IDH1-WT and IDH1-R132H

3.2.1 Differential Enrichment Analysis Identifies CIC and IDH1-Associated Changes in the Chromatin Landscape

I chose to use FindER [100] (Bilenky et al., unpublished) for peak calling (see Methods) due to its ability to accommodate histone modifications with different profile types (localized, broad or a mixture of both). As a quality control experiment, I sought to assess the similarity of global peak signal across samples. To do this, I calculated pairwise Spearman correlations across all ChIP libraries using RPKM coverage values within the called peaks, performed unsupervised hierarchical clustering and inspected the correlation matrix for each histone mark. As expected, all replicate pairs were grouped together (Figure 3.9).
Samples were consistently separated into two distinct clusters according to IDH1 status for all marks, indicating that the IDH1-R132H mutation had a pronounced effect on the histone modification landscape, consistent with its role in promoting epigenome-wide changes. Samples were also grouped according to CIC status, indicating that CIC-KO also had an impact on the chromatin landscape, although the correlation differences appeared to be subtle compared to those between IDH1-WT and IDH1-R132H samples.

To obtain a broad overview of the histone modification landscape and to assess for global differences between CIC-KO and CIC-WT cells, and IDH1-R132H and IDH1-WT cells, I compared the number of peaks for each histone mark across all cell lines. Consistent with previous research reporting an association between mutant IDH and increased histone methylation [84], the F8 (IDH1-R132H) cell lines generally displayed greater numbers of peaks for the methylated histone modifications (Figure 3.10). In particular, the differences in peak numbers between IDH1-R132H and IDH1-WT cell lines were statistically significant for H3K4me1, H3K4me3, H3K27me3 and H3K36me3 (Mann-Whitney U test, p < 0.05). The number of peaks can be dependent on the distribution of peak sizes. For example, several small peaks in close proximity in one library could be identified as a larger, single peak in another, resulting in fewer peaks being called in the latter sample despite its peaks encompassing more base pairs. I therefore also compared the number of base pairs covered by peaks across libraries (Figure 3.10). Consistent with the number of peaks, base pairs covered by H3K4me1, H3K4me3 and H3K27me3 peaks exhibited significant increases in IDH1-R132H cell lines (Mann-Whitney U test, p < 0.05). Notably, H3K36me3 peaks covered fewer base pairs in IDH1-R132H cell lines, in contrast to the observation that they yielded greater numbers of peaks compared to those that expressed IDH1-WT.
This indicates that the difference in peak numbers for this mark was attributable to shifts in the distribution of peak widths rather than a change in global enrichment.

While CIC-associated differences in global enrichment of histone marks were not apparent, this finding does not completely discount the presence of CIC-associated alterations on the chromatin landscape. For example, CIC’s influence on chromatin may be restricted to a relatively small portion of the genome (around its binding sites, for example). Also, the comparison of global enrichment does not account for differences in read coverage within peaks, which may also indicate a change in chromatin state. Hence, I employed a differential enrichment analysis using raw read counts within the union of all peaks for each mark. Analogous to the differential expression analysis using RNA-seq data, the differential enrichment analysis used DESeq2 to statistically assess the differences of read counts within a set of genomic regions (in this case, peaks) across conditions (see Methods). For this analysis, peaks were considered differentially enriched (DER) between two conditions (CIC-KO vs CIC-WT or IDH1-R132H vs IDH1-WT) if they met a q-value cutoff of 0.05 and an absolute fold difference of at least 2. DER peaks were additionally required to have directional concordance in both CIC-KO replicate cell lines to be considered CIC-associated, as in the RNA-seq DE gene analysis.

A summary of each DER peak analysis is displayed in Figure 3.11, showing the number of DER peaks, the number of base pairs within DER peaks and the proportion of base pairs within DER peaks relative to the total number of base pairs within all peaks for that mark. In an IDH1-WT background, there were generally more peaks that exhibited a loss of enrichment in CIC-KO cells than there were peaks that gained enrichment (Figure 3.11).
Similar to the results I obtained analysing the RNA-seq DE genes, there were fewer CIC-associated DER peaks in an IDH1-R132H background than in an IDH1-WT background. Interestingly, in terms of genomic breadth, there was a similar degree of CIC-associated loss of H3K27me3 in both IDH1 backgrounds, indicating that this association may be independent of IDH1 status. While the loss of H3K27me3 appeared to be distinctly more pronounced in absolute measures compared to the other marks, the loss may be exaggerated by the fact that H3K27me3 covers more of the genome than the other modifications. In relative terms, the CIC-associated loss of H3K27me3 affected a proportion of the genome that was comparable in base pairs to the differential enrichment of H3K4me1, H3K4me3 and H3K27ac. The comparison of IDH1-R132H to IDH1-WT cells revealed a much greater degree of difference, as expected, with over 10% of each set of histone modification peaks (except H3K36me3) displaying differential enrichment (Figure 3.11). Since H3K9me3 and H3K36me3 displayed a limited degree of differential enrichment in these analyses, I decided to focus my analysis on the other four histone modifications.

Examination of the intersection between all three DER peak analyses for these four marks revealed a result similar to the RNA-seq analysis, in which a considerable number of CIC-associated DER peaks were also IDH1-associated, while the overlap between CIC-associated DER peaks across the two IDH1 backgrounds was comparatively small (Figure 3.12). This may indicate that the majority of changes in the chromatin landscape attributed to CIC loss are dependent on the status of IDH1. I next annotated the DER peaks using ChIPseeker to investigate their relationship to genomic features. For all four histone modifications, DER peaks were predominantly located >10kb away from a TSS and at introns and intergenic regions (Figure 3.13), indicating that CIC loss and IDH1-R132H may primarily affect distal regulatory elements.
Motivated by the emergence of the link between CIC and chromatin modifiers [54,69], I next explored whether the CIC binding sites identified in Chapter 3.1.3 were associated with the changes in the histone modification landscape identified in this section.

3.2.2 CIC Binding Is Not Associated With Changes in the Histone Modification Landscape

CIC’s role as a transcriptional repressor appears to involve its interaction with various protein complexes with known roles in epigenetic modification such as the SIN3 histone deacetylase (HDAC) complex and SWI/SNF nucleosome remodelling complex [54,69]. Indeed, a study by Weissmann and colleagues showed that CIC binding was associated with decreased histone acetylation and that treatment with an HDAC inhibitor partially abrogated CIC’s ability to repress its target genes [54]. Thus, I hypothesized that CIC binding sites would be associated with an increase in H3K27ac in CIC-KO cells. Since CIC was found to interact with chromatin modifiers other than those involved in histone deacetylation, I also hypothesized that CIC binding sites would exhibit a gain in the active H3K4me3 mark and/or a loss in the H3K27me3 repressive mark.

To explore the association between CIC binding and differences in chromatin state, I first inspected whether the 150 high-confidence CIC peaks (see Chapter 3.1.3) overlapped with any DER histone modification peaks and found that none of them did. Alternatively, CIC may influence histone modification levels around the TSSs of its associated targets rather than at the binding sites themselves. I thus examined whether any CIC-associated DER peaks were present around the TSSs of candidate CIC target genes. Of the 110 candidate CIC target genes, only PLEKHA5 and ETV4 had a DER peak within 2kb of their TSSs, the former gene displaying a CIC-associated gain in H3K4me3 only in an IDH1-WT background and the latter gene exhibiting a CIC-associated decrease in H3K27me3 in both IDH1 contexts.
Both genes were also significantly upregulated in CIC-KO cells, suggesting that their increased expression may have been due to these chromatin state changes. Beyond these two genes, however, neither CIC binding sites nor the TSSs of their putative targets appeared to show a difference in chromatin state, at least from my investigation of the six histone modifications that were profiled.

3.2.3 CIC-KO Is Associated With Dysregulation of Enhancers at Neurodevelopmental Genes

The lack of an apparent association between CIC binding and histone modification levels indicates that the CIC-associated DER peaks may largely consist of indirect consequences of CIC loss. Rather than CIC itself, the genes targeted by CIC may be responsible for mediating these histone mark changes. This is plausible since the regulatory functions of several CIC target genes, such as the AP-1 TF complex components FOS and FOSL1, and ETV5, have been shown to involve chromatin modulation, particularly at enhancers106-108. Since the majority of DER peaks occupied regions distal to a TSS (Figure 3.13), I hypothesized that these regions, or at least a considerable subset of them, included enhancers. To address this hypothesis, I first used H3K4me1 data to identify putative enhancer regions in our cell line model as previously described101 (see Methods). These putative enhancer regions were subsequently assessed for an overlap with DER peaks for the enhancer marks (H3K4me1, H3K27ac and H3K27me3). Consistent with my postulation that a substantial proportion of DER peaks occurred at enhancers, ~23-36% of CIC-associated H3K4me1 DER peaks and ~58-62% of CIC-associated H3K27ac DER peaks were found at enhancer regions (Figure 3.14A). Comparatively, only ~5-6% of DER H3K27me3 peaks overlapped with an enhancer region, indicating that the majority of CIC-associated H3K27me3 peaks that were found in intergenic regions were not at enhancers. 
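
The proportion of DER peaks falling at enhancers can be illustrated with a small sketch. It assumes both DER peaks and putative enhancers are (chrom, start, end) tuples in half-open coordinates and counts a peak as being at an enhancer if the two share at least one base; that overlap criterion, the function names and the toy coordinates are assumptions for illustration only, not the procedure described in the Methods.

```python
def overlaps(a, b):
    '''True if two (chrom, start, end) half-open intervals share at least one base.'''
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_at_enhancers(der_peaks, enhancers):
    '''Fraction of DER peaks that overlap any putative enhancer region.'''
    if not der_peaks:
        return 0.0
    hits = sum(
        1 for peak in der_peaks
        if any(overlaps(peak, enh) for enh in enhancers)
    )
    return hits / len(der_peaks)

# Toy data (hypothetical coordinates)
enhancers = [('chr1', 1000, 2000), ('chr2', 500, 900)]
der_peaks = [
    ('chr1', 1500, 2500),   # overlaps the chr1 enhancer
    ('chr1', 5000, 6000),   # no enhancer nearby
    ('chr2', 100, 500),     # ends exactly where the enhancer starts: no shared base
    ('chr2', 800, 1200),    # overlaps the chr2 enhancer
]

print(fraction_at_enhancers(der_peaks, enhancers))
```

The pairwise scan is quadratic and only suitable for small examples; at genome scale an interval tool or a sorted sweep would be the more practical choice.
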
Henceforth, I will refer to enhancers that displayed a loss of H3K4me1 and/or H3K27ac or gain of these marks as down-regulated and up-regulated enhancers, respectively, and collectively as DER enhancers. As a technical control, I visualized the H3K4me1 and H3K27ac coverage profiles at these down and up-regulated enhancers and confirmed the differences in mean coverage between CIC-WT cells and their CIC-KO counterparts, primarily around the center of the enhancer regions (Figure 3.14B). I then investigated whether these enhancers were accompanied by changes in gene expression, focusing on those that exhibited differential enrichment of H3K4me1 or H3K27ac. Using the DER peak annotations obtained previously, I inspected the relationship between H3K4me1 and H3K27ac signal at DER enhancers and the fold change in expression of the nearest gene. Consistent with H3K4me1 and H3K27ac being histone modifications associated with enhancer activity, there was a clear positive correlation between both H3K4me1 and H3K27ac enhancer signal and proximate gene expression (Figure 3.15A). DER enhancers whose associated genes had the largest changes in gene expression also tended to exhibit concordant differential enrichment of both H3K4me1 and H3K27ac. In the IDH1-WT model, the putative gene targets of such enhancers included Cadherin 8 (CDH8), Transmembrane Protein 108 (TMEM108), Platelet Derived Growth Factor Receptor Alpha (PDGFRA) and Nuclear Factor 1A (NFIA). In the IDH1-R132H model, prominent DER enhancer associated genes included Neuregulin 1 (NRG1), EPH Receptor A4 (EPHA4), ETV1, and NFIA. Curiously, NFIA was down-regulated upon CIC loss in IDH1-WT cells yet up-regulated upon CIC loss in IDH1-R132H cells (Figure 3.15A), suggesting a possible interaction between IDH1 and CIC in regulating this gene. No other gene exhibited such behaviour. 
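
The relationship between enhancer signal change and nearest-gene expression change can be sketched as follows. The text does not state which correlation statistic or pseudocount was used, so the Pearson coefficient and the pseudocount of 1 below are assumptions, and the RPKM values are invented purely for illustration.

```python
import math

def log2_fold_change(ko, wt, pseudocount=1.0):
    '''log2(KO / WT); the pseudocount is an assumption to avoid division by zero.'''
    return math.log2((ko + pseudocount) / (wt + pseudocount))

def pearson(xs, ys):
    '''Pearson correlation coefficient of two equal-length sequences.'''
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical records: (enhancer RPKM in KO, in WT, gene RPKM in KO, in WT)
records = [
    (12.0, 3.0, 40.0, 9.0),   # enhancer gained, nearest gene up
    (2.0, 9.0, 1.0, 6.0),     # enhancer lost, nearest gene down
    (7.0, 3.0, 15.0, 7.0),    # modest gain, modest up
    (1.0, 5.0, 4.0, 11.0),    # loss, down
]

signal_lfc = [log2_fold_change(rec[0], rec[1]) for rec in records]
expr_lfc = [log2_fold_change(rec[2], rec[3]) for rec in records]
rho = pearson(signal_lfc, expr_lfc)
print(round(rho, 3))
```

A positive coefficient in such a comparison corresponds to gains of enhancer marks accompanying increased expression of the nearest gene, and losses accompanying decreased expression.
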
Notably, NFIA and many of the above-mentioned genes are involved in neurodevelopment and cell adhesion, which is in agreement with the gene ontology terms that were found to be significantly enriched for CIC-associated DE genes (Figure 3.3). NFIA is particularly interesting, due to its demonstrated role in the control of gliogenesis. Kang and others109, for example, showed that NFIA expression promoted the onset of gliogenesis in the embryonic chick spinal cord. Moreover, in a mouse model study conducted by Glasgow and colleagues110, NFIA overexpression was shown to result in the formation of gliomas. As discussed previously, CIC binding sites did not appear to be associated with DER peaks, illustrating that DER enhancers are likely downstream products of CIC’s direct effects. Since CIC target genes comprised those with demonstrated roles in enhancer regulation, I speculated that their motifs would be enriched within these DER enhancers. I therefore conducted a motif enrichment analysis as was done for the high-confidence CIC peaks, separately for down-regulated enhancers and up-regulated enhancers in each model. As expected, motifs related to several CIC target genes were enriched within all sets of CIC-associated DER enhancers. For example, motifs related to the AP-1 complex, which encompasses CIC candidate targets FOS and FOSL1, were the most significantly enriched across all DER enhancers (Figure 3.15B). Motifs that matched best with ETS-family TFs (ERG and ETS:E-box), to which ETV1, 4 and 5 belong, also emerged, illustrating that the de-repression of ETV genes may have contributed to these altered enhancers. Along the same lines, motifs matching those of additional candidate CIC targets, RUNX1 and MAF genes (MAFB and BACH1 - a common binding partner for MAF TFs111), were among the top 5 most significantly enriched. As mentioned above, CIC loss was associated with NFIA down-regulation in IDH1-WT cells but was associated with up-regulation in IDH1-R132H cells. 
This observation, coupled with the finding that the NFIA motif was enriched specifically in CIC-associated down-regulated enhancers in IDH1-WT cells, is compatible with the notion that the loss of NFIA expression may be linked to the inactivation of additional enhancers in this model. As noted above, an enhancer associated with the well-known GBM associated gene PDGFRA112 displayed a loss of both H3K4me1 and H3K27ac in CIC-KO lines expressing IDH1-WT with a corresponding loss of PDGFRA gene expression (Figure 3.15A). The same enhancer also exhibited H3K4me1 and H3K27ac loss with concurrent loss of PDGFRA expression in F8CIC-WT,IDH1-R132H cells relative to NHACIC-WT,IDH1-WT cells (Figure 3.16A,B), indicating that both the loss of CIC and the presence of the IDH1-R132H mutation independently resulted in an inactivation of this enhancer. An intragenic enhancer within NFIA exhibited a reduction of H3K27ac in CIC-KO cells compared to their CIC-WT counterparts and in the F8CIC-WT,IDH1-R132H parental line compared to the NHACIC-WT,IDH1-WT parental line, also illustrating the independent impacts of CIC-KO and IDH1-R132H in downregulating the same enhancer. Intriguingly, however, CIC-KO cells also containing the IDH1-R132H mutation (i.e. double mutant cells) displayed a gain of H3K27ac, which I interpret as a potential reactivation of this enhancer (Figure 3.16A). This unique pattern of enhancer dysregulation was also accompanied by concurrent changes in NFIA gene expression (Figure 3.16B). 
In summary, these examples highlight the convergence of CIC-KO and IDH1-R132H on impacting specific enhancers at genes associated with CNS tumours and, regarding NFIA, illustrate an interesting case of an enhancer whose activity is differentially regulated by CIC loss based on the status of IDH1.

3.2.4 Analysis of Differentially Methylated Regions Identifies CIC and IDH1-Associated Changes in the DNA Methylation Landscape

Since the stereotypical molecular consequence of the neomorphic IDH1-R132H mutation is widespread DNA hypermethylation, I first sought to verify this phenotype in the WGBS data. Similar to previous studies84, I performed a clustering analysis on all cell lines based on the top 10,000 most variably methylated CpG sites. As expected, this resulted in a clear bifurcation according to IDH1 status, with the majority of CpGs exhibiting hypermethylation in IDH1-R132H samples (Figure 3.17A). Interestingly, cell lines also clustered by CIC status, indicating that the fractional methylation values of these 10,000 CpGs also distinguish CIC-KO samples from their CIC-WT counterparts, although these differences appear to be subtle. I next compared mean CpG methylation across all cell lines, genome-wide and within CpG islands and CpG shores. I found that mean CpG methylation was significantly higher (p<0.0005) in all IDH1-R132H cell lines compared to all IDH1-WT lines, consistent with the expected association between global DNA hypermethylation and expression of IDH1-R132H (Figure 3.17B). To identify regions of DNA methylation that were affected by CIC loss or IDH1-R132H, I conducted a differentially methylated region (DMR) analysis using Defiant102 (see Methods). This tool was chosen due to its superior precision and recall compared to other available tools102. Once again, similar to the DE and DER analyses described above, DMRs were considered to be CIC-associated based on a q-value threshold of 0.05 in both replicate CIC-KO cell lines and concordant directionality. 
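
The concordance criterion for CIC-associated DMRs can be sketched in a few lines. The sketch assumes each DMR is a (chrom, start, end, methylation difference, q-value) tuple and treats DMRs from the two replicate CIC-KO lines as shared when they overlap by at least one base; how regions were actually matched between replicates is described in the Methods and may differ, so the matching rule, function names and toy values here are hypothetical.

```python
def sign(x):
    '''Return 1 for gain of methylation, -1 for loss, 0 for no change.'''
    return (x > 0) - (x < 0)

def concordant_dmrs(rep1, rep2, q_max=0.05):
    '''Keep rep1 DMRs that are significant, overlap a significant rep2 DMR,
    and change methylation in the same direction.'''
    kept = []
    for chrom1, start1, end1, delta1, q1 in rep1:
        if q1 >= q_max:
            continue
        for chrom2, start2, end2, delta2, q2 in rep2:
            shared = chrom1 == chrom2 and start1 < end2 and start2 < end1
            if shared and q2 < q_max and sign(delta1) == sign(delta2):
                kept.append((chrom1, start1, end1, delta1, q1))
                break
    return kept

# Toy DMR calls (hypothetical) for two replicate CIC-KO lines versus CIC-WT
rep1 = [
    ('chr1', 100, 200, 0.30, 0.01),
    ('chr1', 500, 600, 0.25, 0.01),
    ('chr2', 100, 200, -0.20, 0.20),
    ('chr3', 0, 50, 0.40, 0.001),
]
rep2 = [
    ('chr1', 150, 250, 0.20, 0.02),    # same direction: concordant
    ('chr1', 550, 650, -0.30, 0.01),   # opposite direction: discordant
    ('chr2', 100, 200, -0.20, 0.01),   # rep1 partner fails the q-value threshold
    ('chr3', 10, 40, 0.30, 0.04),      # same direction: concordant
]

shared = concordant_dmrs(rep1, rep2)
print(len(shared), [d[:3] for d in shared])
```

Requiring agreement between two independently derived knockout lines is a conservative filter: calls that are significant in only one line, or that disagree in direction, are discarded.
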
Of the 43,302 DMRs (CIC-KO vs. CIC-WT) identified in NHA-A2CIC-KO,IDH1-WT and 42,259 DMRs identified in NHA-H9CIC-KO,IDH1-WT, only 5,683 (~13%) were in common and had consistent directionality (see Methods). A much lower number of DMRs was identified between CIC-KO and CIC-WT cells expressing IDH1-R132H: 2,605 in F8-A2CIC-KO,IDH1-R132H, 1,835 in F8-E10CIC-KO,IDH1-R132H and only 115 in common and concordant between both. Unsurprisingly, many more IDH1-associated DMRs were identified, totaling 83,572. I next summarized each set of DMRs regarding their distributions in directionality (hypomethylated or hypermethylated), genomic span, absolute difference in fractional methylation and genomic features. Strikingly, the impact of CIC loss on the DNA methylome in both IDH1-WT and IDH1-R132H backgrounds appeared to almost exclusively involve increased DNA methylation, as was expected and observed with IDH1-R132H associated DMRs (Figure 3.18). While the absolute fractional methylation differences were universally distributed around a median of 20-30%, CIC-associated DMRs in IDH1-R132H cells were notably smaller in size and comprised a higher proportion of DMRs at CpG islands/shores than CIC-associated DMRs in IDH1-WT cells and IDH1-associated DMRs (Figure 3.18). Considering CIC’s established role as a transcriptional repressor, the prominence of hypermethylated DMRs relative to hypomethylated DMRs in CIC-KO cells was unexpected. However, as with the DER enhancers, these results may largely comprise indirect consequences of CIC-KO. As illustrated previously, CIC binding sites did not appear to be associated with differential enrichment of histone marks. Since the mechanism behind CIC target de-repression upon loss of CIC was unexplained by differential enrichment of histone modifications, I asked whether hypomethylation of CIC target promoters could constitute such a mechanism. 
I thus inspected whether any CIC-associated hypomethylated DMRs overlapped with the high-confidence CIC peaks but found that none did. I also compared the average methylation levels of high-confidence CIC peak regions and around the TSSs of CIC targets between CIC-WT and CIC-KO cells and observed no differences (Figure 3.19). These results illustrate a lack of an association between direct CIC binding and DNA methylation, reinforcing the notion that CIC-associated DMRs arose independently of CIC binding. Considering the established link between promoter CpG island DNA methylation and downstream gene expression, I next sought to examine the relationship between promoter CpG island DMRs and the expression levels of their target genes to characterize the downstream consequences of CIC-associated differential methylation. To do this, I visualized the distributions of expression levels of said genes, separating targets of hypomethylated DMRs and targets of hypermethylated DMRs, to assess for differences between CIC-WT and CIC-KO cell lines. Following conventional knowledge, I expected to observe lower expression levels in CIC-KO lines for hypermethylated DMRs and higher levels for hypomethylated DMRs relative to CIC-WT cells. However, I observed no such differences (Figure 3.20A). I also plotted the fractional methylation difference versus the fold change in expression (CIC-KO vs CIC-WT) to inspect their relationship at the level of individual genes. Consistent with the lack of differences in Figure 3.20A, there was no overall correlation between methylation and expression (Figure 3.20B). While some promoter DMR associated genes were significantly differentially expressed, the directionality in their altered expression did not consistently align with expectations. For example, PDGFRA had a hypomethylated DMR at its promoter yet it exhibited a loss of expression in CIC-KO, IDH1-WT cells (Figure 3.20B). 
A possible explanation for this contradiction is that the inactivation of the intragenic enhancer within PDGFRA (see section 3.2.3) could have precluded any potential influence that the hypomethylated DMR may otherwise have had on PDGFRA expression. Overall, differential methylation as a consequence of CIC loss appeared to have a minimal impact on gene expression.

Figure 3.1 : Known CIC Target Genes Are Upregulated in CIC-KO Cell Lines
Expression levels (RPKM) of known CIC target genes in each cell line. Colours denote CIC status of each cell line (legend on the bottom left). Statistical significance of differential expression is indicated by the number of stars (legend on the bottom right) which corresponds to the q-value obtained from DESeq2. No star indicates no statistical significance (q >= 0.05).

Figure 3.2 : Summary of RNA-seq DE Analysis
(A) Scatterplots displaying average gene expression (Log10 RPKM) across all replicates for each CIC-KO cell line (y-axis) against their CIC-WT counterpart (x-axis). Each dot represents a protein coding gene and is coloured green if it was upregulated, red if downregulated and grey if it did not meet the significance threshold (q<0.05).
(B) Venn diagrams depicting the overlap of concordantly DE genes (q<0.05 and consistent directionality) between the two independent CIC-KO cell lines within each IDH1 context.
(C) Same plot as in (A) but comparing gene expression between the NHACIC-WT, IDH1-WT parental cell line and the F8CIC-WT, IDH1-R132H parental cell line.
(D) Venn diagram displaying the intersections of all DE analyses. Only the genes concordantly DE between replicate CIC-KO cell lines (the intersections in [B]) were considered to be CIC-associated DE genes. 
For IDH1-associated DE genes, all of those that were significant (q<0.05) between NHACIC-WT, IDH1-WT and F8CIC-WT, IDH1-R132H were considered.

Figure 3.3 : Top Enriched Pathways for Each Set of DE Genes
Top 10 gene ontology terms enriched within each DE gene set identified using Metascape92. The numbers beside each bar denote the number of DE genes over the total number of genes within the corresponding gene ontology term.

Figure 3.4 : Quality Assessment and Comparison of CIC ChIP-seq Replicates
(A) Heatmap of pairwise Spearman correlation values across CIC ChIP-seq and input libraries calculated on RPKM values within 200bp bins, genome-wide.
(B) IGV signal tracks displaying RPKM coverage across CIC ChIP-seq (blue) and input (grey) libraries at a region spanning 100kb in chromosome 9. Numbers to the left of the first track indicate the scale of RPKM values (all tracks within each panel were set to the same scale).
(C) IGV signal tracks displaying RPKM across all CIC ChIP-seq (blue) and input (grey) libraries at known CIC target genes GPR3, ETV4 and ETV5. As in (B), numbers on the left refer to the RPKM scale of all tracks for each gene. RefSeq gene models are shown on the bottom with strand orientation denoted by the direction of the arrowheads.

Figure 3.5 : Quality Assessment of an Independent CIC ChIP-seq Dataset
(A) IGV signal tracks displaying RPKM coverage across published54 ChIP-seq libraries derived from CIC-WT G144 cells (blue) and CIC-KO G144 cells (grey) at a region spanning 100kb in chromosome 9, as in Figure 3.4A. Numbers to the left of the first track indicate the scale of RPKM values of both tracks.
(B) IGV signal tracks displaying RPKM across published54 ChIP-seq libraries at known CIC target genes GPR3, ETV4 and ETV5. As in (A), numbers on the left refer to the RPKM scale of all tracks for each gene. 
Gene models are presented below each panel, as in Figure 3.4C.

Figure 3.6 : Summary of High-Confidence CIC Peaks
(A) CIC peaks ranked by significance (MACS2 q-value) vs. the number of peaks in common with a published CIC ChIP-seq dataset (see Methods). The left plot shows all peaks while the right plot shows a subset of the top 1000 most significant peaks. Genes for which a CIC peak was detected in the vicinity of their TSS and that were validated by ChIP-qPCR are shown in red. The point of inflection at which the number of reproducibly identified peaks increased relative to the entire curve was identified to be approximately at the 150th ranked peak (red line).
(B) Distribution of high-confidence CIC peaks in relation to their association with genomic features.
(C) Top 3 significantly enriched de novo motifs within high-confidence CIC peaks along with their p-values and the TF with the closest matching motif identified by Homer. The known CIC consensus binding sequence is highlighted in pink.

Figure 3.7 : CIC ChIP-seq Peaks at Known and Novel Candidate Target Genes
CIC ChIP-seq (replicate 2) RPKM coverage in 10bp bins, after subtraction of the RPKM coverage in its matched input library, at high-confidence peaks in proximity to gene promoters visualized on IGV. The portion of the track highlighted in blue indicates the region called as a CIC peak in replicate 2. As in Figure 3.4A,B, numbers on the left refer to the RPKM scale of the coverage track for each gene. RefSeq gene models are shown on the bottom with strand orientation denoted by the direction of the arrowheads.

Figure 3.8 : Candidate CIC Target Genes Are Upregulated in CIC-KO Cell Lines
Expression levels (RPKM) of genes for which a reproducibly identified CIC peak was present near their TSS across all cell lines. Colours represent CIC status of each cell line, as in Figure 3.1. 
Statistical significance of differential expression is indicated by the number of stars (legend on the bottom right) which corresponds to the q-value obtained from DESeq2. No star indicates no statistical significance (q >= 0.05).

Figure 3.9 Histone Modification Peak Signal Clusters Cell Lines According to CIC and IDH1 Status
Unsupervised hierarchical clustering (on Euclidean distance) using pairwise Spearman correlations of peak signal across all samples. Spearman correlations were calculated using RPKM within the union of all peaks in all samples for each mark (see Methods).

Figure 3.10 Comparison of Peaks Across All Cell Lines for Each Histone Modification
The mean number of peaks identified using FindER (q < 0.05) and base pairs within peaks (in megabases) across the two replicates for each histone modification library is shown. Colours represent CIC status of each cell line, as in Figure 3.1. Exact numbers for each replicate can be found in Appendix F.

Figure 3.11 : Summary of DER Peaks
Summary of significant DER peaks (q<0.05, fold change > 2) across all histone modifications and for all three comparisons. Metrics shown are as follows: number of DER peaks (left), number of base pairs covered by DER peaks (centre) and base pairs within DER peaks as a percentage of total base pairs within peaks for each mark (right). Bars are coloured according to the direction of differential enrichment (loss = green, gain = red).

Figure 3.12 : CIC-Associated DER Peaks Are Largely Conditional on IDH1 Status
Venn diagrams displaying the intersections of all DER peak analyses (CIC-associated DER peaks in IDH1-WT, CIC-associated DER peaks in IDH1-R132H, and IDH1-associated DER peaks) for H3K4me1, H3K4me3, H3K27ac and H3K27me3. 
CIC-associated DER peaks comprised those that were concordantly DER between replicate CIC-KO cell lines.

Figure 3.13 : DER Peaks Are Predominantly Distal From Transcriptional Start Sites
Distribution of DER peaks for H3K4me1, H3K4me3, H3K27ac and H3K27me3 with respect to their distance to the nearest TSS (left) and their associated genomic feature (right).

Figure 3.14 : CIC Loss Is Associated With Enhancer Dysregulation
(A) Proportions of CIC-associated H3K4me1, H3K27ac and H3K27me3 DER peaks (of all DER peaks for each mark) found at enhancer regions. Colours represent the proportion of DER peaks that exhibited loss in CIC-KO cells (red) or gain in CIC-KO cells (green).
(B) Signal profiles for H3K4me1 and H3K27ac at CIC-associated DER enhancers. Profiles represent RPKM in 100bp bins, averaged across all up or down-regulated DER enhancers for each cell line.

Figure 3.15 : CIC-Associated DER Enhancers Are Positively Correlated With Target Gene Expression and Are Enriched for Motifs Related to Direct CIC Target Genes
(A) Change in ChIP signal (log2 fold change RPKM, CIC-KO vs CIC-WT) for H3K4me1 and H3K27ac peaks at CIC-associated DER enhancers (y-axis) vs. change in gene expression (log2 fold change RPKM, CIC-KO vs CIC-WT) of associated genes (x-axis) in IDH1-WT and IDH1-R132H contexts. Colour intensity of points corresponds to absolute fold change in gene expression (darker for greater fold changes). The top 10 genes that displayed the greatest changes in gene expression are labelled in each plot.
(B) Top 5 significantly enriched de novo motifs within CIC-associated downregulated and upregulated enhancers in each IDH1 context. Motifs are presented with their associated p-values and the TF with the closest matching motif identified using Homer.

Figure 3.16 Enhancers at PDGFRA and NFIA Are Dysregulated in Association With CIC and IDH1 Status
(A) IGV tracks displaying H3K4me1 and H3K27ac signal (RPKM) across all cell lines at the PDGFRA (top) and NFIA (bottom) gene loci. 
RPKM signal range for each mark at each locus was set at the same scale across all cell lines (numbers on the top right corner of each set of tracks indicate the RPKM value corresponding to the height of each track within that set). Regions highlighted in pink or under a red arrow represent the enhancer regions identified to possess a DER H3K4me1 and/or H3K27ac peak.
(B) PDGFRA and NFIA gene expression (Log10 RPKM) across all cell lines. ***, q<0.0005.

Figure 3.17 : IDH1-R132H Cell Lines Exhibit a DNA Hypermethylator Phenotype
(A) Heatmap displaying fractional methylation of the top 10,000 most variably methylated CpG sites across all samples. Samples and CpGs were clustered using unsupervised hierarchical clustering (Euclidean).
(B) Average fractional methylation for CpGs with greater than 5 raw read coverage across all CpGs, CpG islands and CpG shores. Colours represent CIC status of each cell line, as in Figure 3.1. *** (Mann-Whitney U Test, p < 0.0005).

Figure 3.18 : Summary of Differentially Methylated Regions
Summaries of hypomethylated and hypermethylated CIC- and IDH1-associated DMRs displaying their proportions, size, absolute fractional methylation differences, and genomic feature distributions within each comparison. Size and methylation change distributions are displayed as both boxplots and violin plots with median values denoted as a red crossbar. DMRs were assigned to a genomic feature based on overlap calculated using Bedtools.

Figure 3.19 : CIC Binding Is Not Associated With Differential Methylation
Violin plots of fractional CpG methylation of CIC peak regions (left) and at +/- 2kb around the TSSs of putative CIC targets (right) across all cell lines. Very little difference in the distributions comparing CIC-KO cells to their CIC-WT counterparts is evident. 
Red crossbars represent the means.

Figure 3.20 : CIC-Associated Differential Methylation Is Not Associated With Differential Gene Expression
(A) Expression levels (RPKM) of genes associated with CIC-associated hypo- and hypermethylated promoter CpG islands across CIC-WT and CIC-KO cell lines. Since no CIC-associated hypomethylated DMRs were found at promoters in the IDH1-R132H cell lines, only the expression of genes associated with hypermethylated DMRs is shown.
(B) Fractional methylation differences of promoter CpG island DMRs (x-axis) and changes in expression (log2 fold change RPKM, CIC-KO vs CIC-WT) of downstream genes (y-axis). Non-significant DE genes are coloured grey while significant DE genes are coloured purple and labeled.

Chapter 4: Discussion

Recurrent somatic mutations in CIC and IDH1 are linked to the development of ODG yet their roles in this context are not understood. In this thesis, I formulated two overarching hypotheses: 1) a comprehensive investigation of CIC binding sites and the transcriptomic and epigenomic consequences of CIC-KO will implicate mechanisms by which its loss of function promotes ODG and 2) contrasting these CIC-associated effects between IDH1-WT and IDH1-R132H expressing cell lines will reveal insight into the synergistic manner in which CIC deficiency and mutant IDH can collaboratively drive low grade gliomas. To address these hypotheses, I characterized and compared the effects of CIC-KO on global gene expression, histone modifications and DNA methylation in IDH1-WT and IDH1-R132H backgrounds, factoring in analyses of reproducibly identified high-confidence CIC binding sites. 
In this Chapter, I summarize my research findings and their relevance and discuss potential directions for future studies.

4.1 Transcriptomic Consequences of CIC-KO and Candidate Target Genes Yield Insights Into Its Role in Neurodevelopment and Cell Cycle Regulation

In Chapter 3.1, I provided analyses of the transcriptional dysregulation that resulted from loss of functional CIC in IDH1-WT and IDH1-R132H backgrounds. An examination of the pathways represented in the sets of CIC-associated DE genes highlighted neurodevelopmental and ECM related processes as the most enriched (Figure 3.3). The presence of ECM pathways may speak to the association between CIC loss and enhanced invasiveness that previous studies have described5,77. In one of these studies, the pro-invasive phenotype of a CIC-deficient orthotopic lung cancer mouse model was attributed to the derepression of ETV4 and the subsequent upregulation of a matrix metalloproteinase gene, MMP245. While MMP24 specifically did not display significant upregulation in our CIC-KO cell lines, other MMP genes, namely MMP2 and MMP11, were significantly overexpressed. Additionally, several other ECM components and regulators of cell migration such as VCAN, LAMA3, LAMA4 and ADAMTS3 were also significantly upregulated, expanding upon the regulatory network in which CIC may participate to regulate invasive capacity. Further, recent studies have indicated a regulatory role for CIC within the neurodevelopmental hierarchy whereby its elimination resulted in neural maturation defects52 and the expansion of NSC and oligodendrocyte precursor cell (OPC) populations6. One study showed that conditional knockout of CIC in mice obviated the need for EGF for NSC proliferation51. In addition, the authors observed a defect in oligodendrocyte differentiation in association with CIC loss51. 
Another study highlighted ETV5 to be a relevant effector, downstream of CIC loss, in the neurodevelopmental context, in which ETV5 knockdown partially diminished the effects of CIC absence in promoting NSC and OPC expansion6. While CIC’s role in neurodevelopment remains largely uncharacterized, the results of these few studies support the notion that the presence of CIC may be important in the maintenance of NSC quiescence and that loss of CIC may somehow promote increased proliferation and partial commitment towards the oligodendrocyte lineage. There is an important connection between the processes of cellular migration and neurodevelopment. Throughout the development and maintenance of the embryonic and adult brain, nascent neuronal or glial cells are required to travel from their birthplace to their destination in a tightly controlled and coordinated manner113. It is thus natural to question whether the prevalence of both neurodevelopmental and cellular migration pathways in my results arose due to the dysregulation of a gene expression program that encompasses both ontological domains rather than a series of independent transcriptional circuits. Several points of insight into CIC’s biological role are revealed by the genes identified to be candidate target genes in Chapter 3.1.3, which included novel or previously unexplored targets. In particular, high-confidence CIC peaks were identified at the promoter regions of RUNX1, ID1, EPHA2, CDC14A and CDC14B. RUNX1 encodes a TF involved in the differentiation of hematopoietic stem cells and participates in leukemogenesis114. The finding that CIC may directly regulate RUNX1 may provide mechanistic insight into the reports of CIC loss resulting in altered T-cell development and the onset of T-ALL in mice47,48,78. RUNX1 is also commonly overexpressed in the Mesenchymal subtype of GBM and has been associated with enhanced proliferation and invasion115. 
Furthermore, expression of RUNX1 was correlated with the survival and proliferation of adult neural precursor cells116, indicating a pro-neurogenic role in addition to its well-established function in hematopoietic lineage specification. ID1 is an inactivator of basic helix-loop-helix transcription factors whose gene knockout was demonstrated to result in a reduction of tumour progression of GBM cells117. It was also demonstrated to play a part in the control of NSC quiescence during regenerative neurogenesis118, highlighting the possibility that it may also contribute to glioma via this function. EPHA2 encodes a receptor tyrosine kinase and is the most recurrently altered of its 14 member gene family across human malignancies119, including GBM120. In a study conducted by Miao et al.121, EPHA2 overexpression was observed to promote the invasive infiltration of glioma stem cells (GSCs) in vivo and promote neurosphere formation. Moreover, they showed that EPHA2 knockdown reduced self-renewal and intracranial tumorigenicity121, reinforcing the notion that EPHA2 may drive GBM through the promotion of both stemness and invasiveness. Notably, while ID1 was upregulated in all CIC-KO lines, RUNX1 and EPHA2 were specifically upregulated in CIC deficient cells expressing IDH1-R132H (Figure 3.6), perhaps hinting at some interplay between CIC loss and mutant IDH1 to promote the activation of RUNX1 and EPHA2. Thus, my identification of RUNX1, ID1 and EPHA2 as candidate CIC targets, and their increased expression in CIC-KO cell lines, reveals potential mechanistic insights underpinning the link between CIC and the modulation of invasiveness, NSC cell fate, and ODG. CDC14A and CDC14B are of interest due to their involvement in cell cycle regulation, which may be of relevance regarding our laboratory’s recent finding of mitotic defects in CIC deficient cells69. 
CDC14A is particularly conspicuous, since its ectopic overexpression was demonstrated to result in multipolar centrosomes and chromosome segregation defects122, which matches the phenotypes we observed in our CIC-KO cell lines69. CDC14A de-repression, and subsequent centrosome malfunction, may therefore constitute a possible mechanism underlying these mitotic defects in cells lacking CIC. An experiment involving the silencing of CDC14A with siRNA or other means of downregulation and subsequent assessment of mitotic defects in CIC-KO cells could be conducted to examine whether these phenotypes are modulated by CDC14A expression. If, indeed, the mitotic defects in CIC-KO cells are underpinned by CDC14A de-repression, the experiment outlined above may produce a phenotypic rescue whereby CDC14A knockdown would lead to a reduction in or disappearance of mitotic defects.

4.2 CIC-KO Cell Lines Exhibit Dysregulation of Neurodevelopmental Gene Enhancers

In Chapter 3.2, I investigated and compared epigenetic landscapes at the level of histone modifications and DNA methylation, comparing CIC-KO cell lines to their CIC-WT counterparts in the presence and absence of IDH1-R132H. Motivating this work was the identification of several chromatin modifying genes, such as histone deacetylase complex components, as CIC protein interactors. I expected to at least observe an increase in histone acetylation at CIC binding sites in cell lines lacking CIC based on a previous study demonstrating a loss of histone acetylation upon CIC binding54. I also hypothesized that CIC binding sites would be associated with a loss of one or more repressive marks or gain of one or more activating marks given its established transcriptional repressor function. Contrary to my hypothesis, I did not observe significant differences in histone modifications around CIC binding sites. 
While others have attributed histone deacetylation to CIC’s role in transcriptional repression, it is important to note that they did not investigate the changes in histone acetylation upon loss of CIC binding54. As presented in Chapter 1.4.1, CIC’s transcriptional repressor function is downregulated by RTK signalling. The cell line models used by Weissmann et al.54 did not exhibit CIC’s transcriptional repressor activity under normal culturing conditions (presumably due to RTK-mediated CIC inactivation from the addition of growth factors). Thus, they treated their cell line models with an inhibitor of MEK, a downstream effector of RTK signalling, which was sufficient to result in CIC binding and repression of its targets. Therefore, the changes in histone acetylation they reported were in the context of CIC binding, and not in the context of CIC loss. Our cell lines, on the other hand, exhibited CIC activity under normal culturing conditions, presumably due to an insufficient amount of growth factors in our culturing media (DMEM + 10% FBS) to induce RTK-mediated downregulation of CIC. Thus, since my analysis pertained to the consequences of CIC loss rather than the consequences of CIC binding, my investigation of the chromatin landscape is distinct from that of the previous study. The lack of an apparent change in histone acetylation around high-confidence CIC peaks in our CIC-KO lines is therefore not a direct contradiction to the findings of the previous study and can be explained by a divergence in the mechanisms involved in CIC-mediated transcriptional repression versus de-repression. In other words, CIC may recruit histone deacetylases and thereby mediate transcriptional silencing, but the maintenance, and therefore the relief, of this repression may involve other mechanisms. Loss of H3K27ac, for instance, can allow for H3K27me3 to be deposited in its absence in order for transcriptional repression to be maintained. 
It is then possible for transcriptional de-repression, in this scenario, to be enabled through the loss of H3K27me3 rather than the re-establishment of H3K27ac, as might be the case for ETV4, whose associated CIC peak was accompanied by a loss of H3K27me3. It is also possible for the de-repression of CIC targets to involve multiple mechanisms that could differ on a gene-by-gene basis, which may further explain the lack of a recurrent CIC binding-associated change in the histone modification landscape in my results. Rather than acting directly at CIC binding sites, CIC loss appeared to affect histone modifications largely at enhancers, whose dysregulation may have arisen as a result of the up- or downregulation of direct CIC target genes. Supporting this argument was the significant enrichment of motifs directly related to known and candidate CIC targets such as ETV1/4/5, RUNX1, MAFF/G and FOS/FOSL1 at downregulated and upregulated enhancers (Figure 3.14B). Several genes whose enhancers were dysregulated have been linked to the tumourigenicity of GBM models and/or neural progenitor cell fate decisions, including PDGFRA112 and NFIA109,110. PDGFRA has been primarily framed as an oncogene in high-grade malignancies including GBM112, and hence the inactivation of its enhancer upon loss of CIC, a supposed tumour suppressor, was unexpected. Ablation of PDGFRA has been demonstrated to lead to precocious differentiation of OPCs in the developing spinal cord123, highlighting its importance in the regulation of this developmental pathway. The relationship between CIC and PDGFRA dysregulation may therefore be of relevance in the context of CIC’s emerging function in neural cell fate specification and warrants further investigation. The CIC-associated dysregulation of a genic enhancer in NFIA and its expression showcased a striking phenomenon in which CIC loss appeared to have had opposite effects between cells expressing IDH1-WT and cells expressing IDH1-R132H. 
The finding that NFIA appeared to be uniquely activated in CIC-KO + IDH1-R132H cell lines relative to their CIC-WT counterpart, and its demonstrated role in gliomagenesis, lend merit to the view that NFIA dysregulation may underlie the synergistic relationship between CIC loss and mutant IDH in the initiation of ODG. Moreover, in a study that performed single-cell transcriptome profiling on primary ODGs4, NFIA appeared to be overexpressed in CIC-mutant cells relative to CIC-wildtype cells, further supporting this notion. POU3F2, another important neurodevelopmental TF and a known direct regulator of NFIA110, was significantly overexpressed specifically in the CIC-KO + IDH1-R132H cell lines (mean fold change ~3, adjusted p-value < 0.0001), providing a plausible mechanism explaining NFIA dysregulation. What may be upregulating POU3F2 specifically in these cell lines, however, remains unknown and warrants further investigation.
4.3 Possible Synergy Between CIC-KO and IDH1-R132H in Gliomagenesis
Consistent throughout all of the molecular landscapes I explored was the greater degree of CIC-associated differences in the IDH1-WT cell lines compared to the IDH1-R132H cell lines. As described in the Methods section and in Figure 2.1, the set of IDH1-WT cell lines consists of a polyclonal CIC-WT parent with two CIC-KO progeny lines that were derived from two independent clones. Conversely, the set of mutant IDH1 lines had an identical clonal origin since the parental CIC-WT line was expanded from a single clone in the process of obtaining a model that stably expressed the IDH1 mutation. This distinction introduces potential confounding effects of clonal heterogeneity specifically in the IDH1-WT cell lines. Therefore, it is possible that the differences between the CIC-KO and CIC-WT samples in this model may be attributed to the variability in clonal composition, rather than the loss of CIC function. 
My methodologies involved filtering based on the concordance in both CIC-KO cell lines, thereby increasing the confidence in a change being attributed to CIC activity rather than clonal heterogeneity. Still, it is possible that some changes that satisfied this condition were influenced by the intrinsic variability in clonal composition, contributing to the larger number of CIC-associated differences in IDH1-WT cells than in IDH1-R132H cells. There may also be a biological reason, where the consequences of the IDH1-R132H mutation could be masking the effects of CIC loss. Mutant IDH1-mediated CpG island hypermethylation, for instance, could repress a gene that is otherwise dysregulated by CIC loss and thereby preclude the impact of CIC loss on that gene. PDGFRA, which exhibited decreased expression in association with either CIC loss or IDH1-R132H but no difference in expression between double mutants and the F8 (CIC-WT, IDH1-R132H) parental line (Figure 3.15B), may be an example of such a possibility. Neomorphic IDH1/2 mutations and 1p/19q co-deletions are the defining features of ODG while CIC alterations are found in ~50-80% of primary tumours1,2. These mutational frequencies imply an order of events in which CIC mutations occur after the IDH1/2 mutation and 1p/19q co-deletion. The report that distinct CIC mutations were found in different spatially sampled regions of primary ODGs4 also supports this notion. The prevalence of neurodevelopmental pathways and genes in my results aligns with recent evidence of CIC being an important mediator of neural/glial cell fate specification, in which its loss appears to encourage NSC expansion and bias NSCs towards the oligodendroglial lineage6,51. The neomorphic IDH1 mutation and consequent DNA hypermethylation have also been demonstrated to affect the neural developmental hierarchy, specifically in blocking differentiation. 
This is supported by a study in which DNMT inhibitor treatment of patient-derived IDH1-R132H glioma cells resulted in the induction of differentiation124. Moreover, NSCs transduced with mutant IDH1-R132H were shown to exhibit insensitivity to stimulated differentiation compared to empty vector controls and IDH1-WT transduced NSCs125. Integrating these results, I conceptualized a model that describes the possible synergistic nature of CIC loss and mutant IDH, using the Waddington landscape126 as a visual means of outlining the possible developmental trajectories of an NSC (Figure 4.1). In this model, and as a consequence of CIC loss, the tri-potent precursor is presumed to progress towards the ODC lineage. With the contextual background of the neomorphic IDH1 mutation and its differentiation-blocking influence, it is plausible for both lesions to collaborate in the expansion and maintenance of OPC-like cells. This amplification of self-renewing cells could then present a suitable environment for additional mutations to arise and confer cancer hallmark characteristics (Figure 4.1). Ahmad and colleagues showed that forebrain-specific deletion of CIC biased neural stem cells towards the oligodendrocyte lineage6. Conversely, Hwang and colleagues showed that CNS-wide knockout of CIC resulted in a neural maturation defect without evidence of an expansion of cells resembling OPCs52. While both studies are consistent with the notion that CIC operates as a regulator of NSC cell fate, their disparities suggest that the phenotypic outcome in the context of lineage commitment depends on the cellular context in which CIC function is abrogated. 
On this point, since our cell line models consisted of immortalized and terminally differentiated astrocytes, the molecular alterations in CIC-KO cells reported in this thesis work may not accurately reflect what happens when CIC is lost in an NSC or other precursor cell type. Supporting this view is the fact that many important neural/glial precursor marker genes such as SOX2, SOX9, OLIG2, and ASCL1 were undetected, or present at extremely low read counts, in the RNA-seq data obtained from our cell lines. Therefore, future studies should investigate the effects of CIC deficiency in the context of undifferentiated NSCs or other immature cell types to obtain accurate insights into CIC’s role in lineage commitment. Despite these caveats of the model used in this study, the findings gained from this thesis work were nonetheless valuable, yielding several novel results and hypotheses that may be relevant in the context of CIC’s roles in cell cycle regulation, CNS development and ODG.
4.4 Conclusion
The research featured in this thesis presents insights that expand our current understanding of CIC’s molecular function in normal human and cancer biology. Through investigating the regions associated with high-confidence CIC binding sites, I identified several candidate target genes whose roles encompass cell cycle regulation and neurodevelopment. My analyses of transcriptomic and epigenomic landscapes also independently converged onto the dysregulation of neurodevelopmental pathways and genes. I discovered that NFIA, an integral regulator of gliogenesis, was uniquely overexpressed in CIC-deficient cells that possessed the neomorphic IDH1-R132H mutation, thus presenting a possible mechanism by which both CIC loss and mutant IDH could contribute to gliomagenesis. 
Overall, my findings provide a rationale for future research examining the functional relationship between CIC loss and mutant IDH1 in the context of early neural/glial cell fate.
Figure 4.1 panels: IDH1-WT, IDH1-R132H
Figure 4.1 Conceptual Model for the Synergistic Relationship Between CIC Loss and Mutant IDH1 in Promoting Gliomagenesis
A visualization of my conceptual model for how CIC loss and neomorphic IDH1 mutation may contribute to gliomagenesis. The model employs the Waddington landscape126 to represent the possible developmental trajectories of an undifferentiated NSC and the impact of CIC deficiency and mutant IDH1 in this context. In a wildtype IDH background, CIC loss directs NSCs to initiate development towards the ODC lineage. Conversely, an epigenetic landscape distorted by the effects of mutant IDH could prevent this developmental cue from continuing beyond a certain point (denoted by the red barrier). Expansion of this precursor cell pool could then provide opportunities for additional alterations to occur and accrue if they provide some selective advantage. AC - astrocyte, ODC - oligodendrocyte.
Bibliography
1. Bettegowda, Chetan, et al. \"Mutations in CIC and FUBP1 contribute to human oligodendroglioma.\" Science 333.6048 (2011): 1453-1455.
2. Yip, Stephen, et al. \"Concurrent CIC mutations, IDH mutations, and 1p/19q loss distinguish oligodendrogliomas from other cancers.\" The Journal of Pathology 226.1 (2012): 7-16.
3. Cancer Genome Atlas Research Network. \"Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas.\" New England Journal of Medicine 372.26 (2015): 2481-2498.
4. Suzuki, Hiromichi, et al. \"Mutational landscape and clonal architecture in grade II and III gliomas.\" Nature Genetics 47.5 (2015): 458.
5. Okimoto, Ross A., et al. \"Inactivation of Capicua drives cancer metastasis.\" Nature Genetics 49.1 (2017): 87.
6. Ahmad, Shiekh Tanveer, et al. \"Capicua regulates neural stem cell proliferation and lineage specification through control of Ets factors.\" Nature Communications 10.1 (2019): 2000.
7. Hanahan, Douglas, and Robert A. Weinberg. \"The hallmarks of cancer.\" Cell 100.1 (2000): 57-70.
8. Hanahan, Douglas, and Robert A. Weinberg. \"Hallmarks of cancer: the next generation.\" Cell 144.5 (2011): 646-674.
9. Pon, Julia R., and Marco A. Marra. \"Driver and passenger mutations in cancer.\" Annual Review of Pathology: Mechanisms of Disease 10 (2015): 25-50.
10. Nowell, Peter C. \"The clonal evolution of tumor cell populations.\" Science 194.4260 (1976): 23-28.
11. Greaves, Mel, and Carlo C. Maley. \"Clonal evolution in cancer.\" Nature 481.7381 (2012): 306.
12. Ciurea, Marius Eugen, et al. \"Cancer stem cells: biological functions and therapeutically targeting.\" International Journal of Molecular Sciences 15.5 (2014): 8169-8185.
13. Zhang, Shu, et al. \"Identification and characterization of ovarian cancer-initiating cells from primary human tumors.\" Cancer Research 68.11 (2008): 4311-4320.
14. Singh, Sheila K., et al. \"Identification of human brain tumour initiating cells.\" Nature 432.7015 (2004): 396.
15. Ciurea, Marius Eugen, et al. \"Cancer stem cells: biological functions and therapeutically targeting.\" International Journal of Molecular Sciences 15.5 (2014): 8169-8185.
16. Friedmann‐Morvinski, Dinorah, and Inder M. Verma. \"Dedifferentiation and reprogramming: origins of cancer stem cells.\" EMBO Reports 15.3 (2014): 244-253.
17. Futreal, P. Andrew, et al. \"A census of human cancer genes.\" Nature Reviews Cancer 4.3 (2004): 177-183.
18. Iwafuchi-Doi, Makiko, and Kenneth S. Zaret. \"Cell fate control by pioneer transcription factors.\" Development 143.11 (2016): 1833-1837.
19. Kiefer, Julie C. \"Epigenetics in development.\" Developmental Dynamics 236.4 (2007): 1144-1156.
20. Olivier, Magali, Monica Hollstein, and Pierre Hainaut. \"TP53 mutations in human cancers: origins, consequences, and clinical use.\" Cold Spring Harbor Perspectives in Biology 2.1 (2010): a001008.
21. Dang, Chi V. \"MYC on the path to cancer.\" Cell 149.1 (2012): 22-35.
22. Beckerman, Rachel, and Carol Prives. \"Transcriptional regulation by p53.\" Cold Spring Harbor Perspectives in Biology 2.8 (2010): a000935.
23. Sullivan, Kelly D., et al. \"Mechanisms of transcriptional regulation by p53.\" Cell Death and Differentiation 25.1 (2018): 133.
24. Lane, David P. \"Cancer. p53, guardian of the genome.\" Nature 358 (1992): 15-16.
25. Beroukhim, Rameen, et al. \"The landscape of somatic copy-number alteration across human cancers.\" Nature 463.7283 (2010): 899.
26. Schuettengruber, Bernd, et al. \"Genome regulation by polycomb and trithorax proteins.\" Cell 128.4 (2007): 735-745.
27. Raisner, Ryan, et al. \"Enhancer activity requires CBP/P300 bromodomain-dependent histone H3K27 acetylation.\" Cell Reports 24.7 (2018): 1722-1729.
28. Morin, Ryan D., et al. \"Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma.\" Nature 476.7360 (2011): 298.
29. Kadoch, Cigall, et al. \"Proteomic and bioinformatic analysis of mammalian SWI/SNF complexes identifies extensive roles in human malignancy.\" Nature Genetics 45.6 (2013): 592.
30. Jones, Peter A. \"Functions of DNA methylation: islands, start sites, gene bodies and beyond.\" Nature Reviews Genetics 13.7 (2012): 484.
31. Zhang, Wu, and Jie Xu. \"DNA methyltransferases and their roles in tumorigenesis.\" Biomarker Research 5.1 (2017): 1.
32. Mojarad, Ehsan Nazemalhosseini, et al. \"The CpG island methylator phenotype (CIMP) in colorectal cancer.\" Gastroenterology and Hepatology from Bed to Bench 6.3 (2013): 120.
33. Flavahan, William A., et al. \"Insulator dysfunction and oncogene activation in IDH mutant gliomas.\" Nature 529.7584 (2016): 110.
34. Chen, Lin, et al. \"MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome.\" Proceedings of the National Academy of Sciences 112.17 (2015): 5509-5514.
35. Mundade, Rasika, et al. \"Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond.\" Cell Cycle 13.18 (2014): 2847-2852.
36. Yong, Wai-Shin, Fei-Man Hsu, and Pao-Yang Chen. \"Profiling genome-wide DNA methylation.\" Epigenetics & Chromatin 9.1 (2016): 26.
37. Jiménez, Gerardo, et al. \"Relief of gene repression by torso RTK signaling: role of capicua in Drosophila terminal and dorsoventral patterning.\" Genes & Development 14.2 (2000): 224-231.
38. Goff, Deborah J., Laura A. Nilson, and Donald Morisato. \"Establishment of dorsal-ventral polarity of the Drosophila egg requires capicua action in ovarian follicle cells.\" Development 128.22 (2001): 4553-4562.
39. Atkey, Matthew R., et al. \"Capicua regulates follicle cell fate in the Drosophila ovary through repression of mirror.\" Development 133.11 (2006): 2115-2123.
40. Andreu, María José, et al. \"Mirror represses pipe expression in follicle cells to initiate dorsoventral axis formation in Drosophila.\" Development 139.6 (2012): 1110-1114.
41. Roch, Fernando, Gerardo Jiménez, and Jordi Casanova. \"EGFR signalling inhibits Capicua-dependent repression during specification of Drosophila wing veins.\" Development 129.4 (2002): 993-1002.
42. Tseng, Ai-Sun Kelly, et al. \"Capicua regulates cell proliferation downstream of the receptor tyrosine kinase/ras signaling pathway.\" Current Biology 17.8 (2007): 728-733.
43. Krivy, Kate, Mary-Rose Bradley-Gill, and Nam-Sung Moon. \"Capicua regulates proliferation and survival of RB-deficient cells in Drosophila.\" Biology Open 2.2 (2013): 183-190.
44. Jin, Yinhua, et al. \"EGFR/Ras signaling controls Drosophila intestinal stem cell proliferation via Capicua-regulated genes.\" PLoS Genetics 11.12 (2015): e1005634.
45. Lee, Yoontae, et al. \"ATXN1 protein family and CIC regulate extracellular matrix remodeling and lung alveolarization.\" Developmental Cell 21.4 (2011): 746-757.
46. Kim, Eunjeong, et al. \"Deficiency of Capicua disrupts bile acid homeostasis.\" Scientific Reports 5 (2015): 8272.
47. Park, Sungjun, et al. \"Capicua deficiency induces autoimmunity and promotes follicular helper T cell differentiation via derepression of ETV5.\" Nature Communications 8 (2017): 16037.
48. Tan, Qiumin, et al. \"Loss of Capicua alters early T cell development and predisposes mice to T cell lymphoblastic leukemia/lymphoma.\" Proceedings of the National Academy of Sciences 115.7 (2018): E1511-E1519.
49. Lu, Hsiang-Chih, et al. \"Disruption of the ATXN1–CIC complex causes a spectrum of neurobehavioral phenotypes in mice and humans.\" Nature Genetics 49.4 (2017): 527.
50. Lam, Yung C., et al. \"ATAXIN-1 interacts with the repressor Capicua in its native complex to cause SCA1 neuropathology.\" Cell 127.7 (2006): 1335-1347.
51. Yang, Rui, et al. \"Cic loss promotes gliomagenesis via aberrant neural stem cell proliferation and differentiation.\" Cancer Research 77.22 (2017): 6097-6108.
52. Hwang, Inah, et al. \"CIC is a Critical Regulator of Neuronal Differentiation.\" JCI Insight 5.9 (2020): e13582.
53. Dissanayake, Kumara, et al. \"ERK/p90RSK/14-3-3 signalling has an impact on expression of PEA3 Ets transcription factors via the transcriptional repressor capicua.\" Biochemical Journal 433.3 (2011): 515-525.
54. Weissmann, Simon, et al. \"The tumor suppressor CIC directly regulates MAPK pathway genes via histone deacetylation.\" Cancer Research 78.15 (2018): 4114-4125.
55. Astigarraga, Sergio, et al. \"A MAPK docking site is critical for downregulation of Capicua by Torso and EGFR RTK signaling.\" The EMBO Journal 26.3 (2007): 668-677.
56. Grimm, Oliver, et al. 
\"Torso RTK controls Capicua degradation by changing its subcellular localization.\" Development 139.21 (2012): 3962-3968.
57. Bunda, Severa, et al. \"CIC protein instability contributes to tumorigenesis in glioblastoma.\" Nature Communications 10.1 (2019): 661.
58. Herranz, Héctor, et al. \"Mutual repression by Bantam miRNA and Capicua links the EGFR/MAPK and Hippo pathways in growth control.\" Current Biology 22.8 (2012): 651-657.
59. Yang, Liu, et al. \"Minibrain and Wings apart control organ growth and tissue patterning through down-regulation of Capicua.\" Proceedings of the National Academy of Sciences 113.38 (2016): 10583-10588.
60. Papagianni, Aikaterini, et al. \"Capicua controls Toll/IL-1 signaling targets independently of RTK regulation.\" Proceedings of the National Academy of Sciences 115.8 (2018): 1807-1812.
61. Fores, Marta, et al. \"A new mode of DNA binding distinguishes Capicua from other HMG-box factors and explains its mutation patterns in cancer.\" PLoS Genetics 13.3 (2017): e1006622.
62. Ajuria, Leiore, et al. \"Capicua DNA-binding sites are general response elements for RTK signaling in Drosophila.\" Development 138.5 (2011): 915-924.
63. LeBlanc, Veronique G., et al. \"Comparative transcriptome analysis of isogenic cell line models and primary cancers links capicua (CIC) loss to activation of the MAPK signalling cascade.\" The Journal of Pathology 242.2 (2017): 206-220.
64. Kim, Yoosik, et al. \"Context-dependent transcriptional interpretation of mitogen activated protein kinase signaling in the Drosophila embryo.\" Chaos 23.2 (2013): 025105.
65. Chittaranjan, Suganthi, et al. \"Mutations in CIC and IDH1 cooperatively regulate 2-hydroxyglutarate levels and cell clonogenicity.\" Oncotarget 5.17 (2014): 7960.
66. Forés, Marta, et al. \"Origins of context-dependent gene repression by Capicua.\" PLoS Genetics 11.1 (2015): e1004902.
67. Rousseaux, Maxime WC, et al. 
\"ATXN1-CIC complex is the primary driver of cerebellar pathology in spinocerebellar ataxia type 1 through a gain-of-function mechanism.\" Neuron 97.6 (2018): 1235-1243.e5.
68. Wong, Derek, et al. \"Transcriptomic analysis of CIC and ATXN1L reveal a functional relationship exploited by cancer.\" Oncogene 38.2 (2019): 273.
69. Chittaranjan, Suganthi, et al. \"Loss of CIC promotes mitotic dysregulation and chromosome segregation defects.\" bioRxiv (2019): 533323.
70. Kawamura-Saito, Miho, et al. \"Fusion between CIC and DUX4 up-regulates PEA3 family genes in Ewing-like sarcomas with t (4; 19)(q35; q13) translocation.\" Human Molecular Genetics 15.13 (2006): 2125-2137.
71. Italiano, Antoine, et al. \"High prevalence of CIC fusion with double‐homeobox (DUX4) transcription factors in EWSR1‐negative undifferentiated small blue round cell sarcomas.\" Genes, Chromosomes and Cancer 51.3 (2012): 207-218.
72. Specht, Katja, et al. \"Distinct transcriptional signature and immunoprofile of CIC‐DUX4 fusion–positive round cell tumors compared to EWSR1‐rearranged ewing sarcomas: Further evidence toward distinct pathologic entities.\" Genes, Chromosomes and Cancer 53.7 (2014): 622-633.
73. Antonescu, Cristina R., et al. \"Sarcomas with CIC-rearrangements are a distinct pathologic entity with aggressive outcome: a clinicopathologic and molecular study of 115 cases.\" The American Journal of Surgical Pathology 41.7 (2017): 941-949.
74. Yoshimoto, Toyoki, et al. \"CIC-DUX4 induces small round cell sarcomas distinct from Ewing sarcoma.\" Cancer Research 77.11 (2017): 2927-2937.
75. Okimoto, Ross A., et al. \"The CIC-DUX4 fusion oncoprotein drives metastasis and tumor growth via distinct downstream regulatory programs and therapeutic targets in sarcoma.\" bioRxiv (2018): 476283.
76. Louis, D. N., et al. \"The 2016 World Health Organization classification of tumors of the central nervous system: a summary.\" Acta Neuropathologica 131.6 (2016): 803-820.
77. Choi, Nahyun, et al. \"miR-93/miR-106b/miR-375-CIC-CRABP1: a novel regulatory axis in prostate cancer progression.\" Oncotarget 6.27 (2015): 23533.
78. Simón-Carrasco, Lucía, et al. \"Inactivation of Capicua in adult mice causes T-cell lymphoblastic lymphoma.\" Genes & Development 31.14 (2017): 1456-1468.
79. Dang, Lenny, et al. \"Cancer-associated IDH1 mutations produce 2-hydroxyglutarate.\" Nature 462.7274 (2009): 739.
80. Ward, Patrick S., et al. \"The common feature of leukemia-associated IDH1 and IDH2 mutations is a neomorphic enzyme activity converting α-ketoglutarate to 2-hydroxyglutarate.\" Cancer Cell 17.3 (2010): 225-234.
81. Sciacovelli, Marco, and Christian Frezza. \"Oncometabolites: Unconventional triggers of oncogenic signalling cascades.\" Free Radical Biology and Medicine 100 (2016): 175-181.
82. Xu, Wei, et al. \"Oncometabolite 2-hydroxyglutarate is a competitive inhibitor of α-ketoglutarate-dependent dioxygenases.\" Cancer Cell 19.1 (2011): 17-30.
83. Noushmehr, Houtan, et al. \"Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma.\" Cancer Cell 17.5 (2010): 510-522.
84. Turcan, Sevin, et al. \"IDH1 mutation is sufficient to establish the glioma hypermethylator phenotype.\" Nature 483.7390 (2012): 479.
85. Zhou, Xin, et al. \"Exploring genomic alteration in pediatric cancer using ProteinPaint.\" Nature Genetics 48.1 (2016): 4-6.
86. Sonoda, Yukihiko, et al. \"Formation of intracranial tumors by genetically modified human astrocytes defines four pathways critical in the development of human anaplastic astrocytoma.\" Cancer Research 61.13 (2001): 4956-4960.
87. Ohba, Shigeo, et al. \"Mutant IDH1-driven cellular transformation increases RAD51-mediated homologous recombination and temozolomide resistance.\" Cancer Research 74.17 (2014): 4836-4844.
88. Li, Heng, and Richard Durbin. \"Fast and accurate short read alignment with Burrows–Wheeler transform.\" Bioinformatics 25.14 (2009): 1754-1760.
89. Li, Heng, et al. \"The sequence alignment/map format and SAMtools.\" Bioinformatics 25.16 (2009): 2078-2079.
90. Butterfield, Yaron S., et al. \"JAGuaR: junction alignments to genome for RNA-seq reads.\" PloS One 9.7 (2014).
91. Love, Michael I., et al. \"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.\" Genome Biology 15.12 (2014): 550.
92. Zhou, Y., et al. \"Metascape provides a biologist-oriented resource for the analysis of systems-level datasets.\" Nature Communications 10.1 (2019): 1523.
93. Ramírez, Fidel, et al. \"deepTools: a flexible platform for exploring deep-sequencing data.\" Nucleic Acids Research 42.W1 (2014): W187-W191.
94. Quinlan, Aaron R., and Ira M. Hall. \"BEDTools: a flexible suite of utilities for comparing genomic features.\" Bioinformatics 26.6 (2010): 841-842.
95. Robinson, James T., et al. \"Integrative genomics viewer.\" Nature Biotechnology 29.1 (2011): 24.
96. Feng, Jianxing, et al. \"Identifying ChIP-seq enrichment using MACS.\" Nature Protocols 7.9 (2012): 1728.
97. Amemiya, Haley M., Anshul Kundaje, and Alan P. Boyle. \"The ENCODE Blacklist: Identification of Problematic Regions of the Genome.\" Scientific Reports 9.1 (2019): 9354.
98. Yu, Guangchuang, Li-Gen Wang, and Qing-Yu He. \"ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization.\" Bioinformatics 31.14 (2015): 2382-2383.
99. Heinz, Sven, et al. \"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities.\" Molecular Cell 38.4 (2010): 576-589.
100. Misha Bilenky. FindER: A sensitive analytical tool to study epigenetic modifications and protein-DNA interactions from ChIP-seq data. Canadian Epigenetics, Environment and Health Research Consortium Network. Available at: http://www.epigenomes.ca/tools-andsoftware/finder/index.html. (Accessed: August 2019)
101. Pellacani, Davide, et al. \"Analysis of normal human mammary epigenomes reveals cell-specific active enhancer states and associated transcription factor networks.\" Cell Reports 17.8 (2016): 2060-2074.
102. Condon, David E., et al. \"Defiant:(DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus.\" BMC Bioinformatics 19.1 (2018): 31.
103. Aird, Daniel, et al. \"Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries.\" Genome Biology 12.2 (2011): R18.
104. Landt, Stephen G., et al. \"ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia.\" Genome Research 22.9 (2012): 1813-1831.
105. He, Bing, et al. \"Global view of enhancer–promoter interactome in human cells.\" Proceedings of the National Academy of Sciences 111.21 (2014): E2191-E2199.
106. Vierbuchen, Thomas, et al. \"AP-1 transcription factors and the BAF complex mediate signal-dependent enhancer selection.\" Molecular Cell 68.6 (2017): 1067-1082.
107. Pham, Duy, et al. \"The transcription factor Etv5 controls TH17 cell development and allergic airway inflammation.\" Journal of Allergy and Clinical Immunology 134.1 (2014): 204-214.
108. Kalkan, Tüzer, et al. \"Complementary Activity of ETV5, RBPJ, and TCF3 Drives Formative Transition from Naive Pluripotency.\" Cell Stem Cell 24.5 (2019): 785-801.
109. Kang, Peng, et al. \"Sox9 and NFIA coordinate a transcriptional regulatory cascade during the initiation of gliogenesis.\" Neuron 74.1 (2012): 79-94.
110. Glasgow, Stacey M., et al. \"Glia-specific enhancers and chromatin structure regulate NFIA expression and glioma tumorigenesis.\" Nature Neuroscience 20.11 (2017): 1520.
111. Katsuoka, Fumiki, and Masayuki Yamamoto. \"Small Maf proteins (MafF, MafG, MafK): history, structure and function.\" Gene 586.2 (2016): 197-205.
112. Ozawa, Tatsuya, et al. \"PDGFRA gene rearrangements are frequent genetic events in PDGFRA-amplified glioblastomas.\" Genes & Development 24.19 (2010): 2205-2218.
113. 
Florio, Marta, and Wieland B. Huttner. \"Neural progenitors, neurogenesis and the evolution of the neocortex.\" Development 141.11 (2014): 2182-2194.
114. Sood, Raman, Yasuhiko Kamikubo, and Paul Liu. \"Role of RUNX1 in hematological malignancies.\" Blood, The Journal of the American Society of Hematology 129.15 (2017): 2070-2082.
115. Zhao, Kai, et al. \"RUNX1 contributes to the mesenchymal subtype of glioblastoma in a TGFβ pathway-dependent manner.\" Cell Death & Disease 10.12 (2019): 1-15.
116. Fukui, Hirokazu, et al. \"Transcription factor Runx1 is pro-neurogenic in adult hippocampal precursor cells.\" PloS One 13.1 (2018).
117. Sachdeva, Rohit, et al. \"ID1 Is Critical for Tumorigenesis and Regulates Chemoresistance in Glioblastoma.\" Cancer Research 79.16 (2019): 4057-4071.
118. Viales, Rebecca Rodriguez, et al. \"The helix‐loop‐helix protein id1 controls stem cell proliferation during regenerative neurogenesis in the adult zebrafish telencephalon.\" Stem Cells 33.3 (2015): 892-903.
119. Miao, Hui, and Bingcheng Wang. \"EphA receptor signaling—complexity and emerging themes.\" Seminars in Cell & Developmental Biology 23.1 (2012): 16-25.
120. Wykosky, Jill, et al. \"EphA2 as a novel molecular marker and target in glioblastoma multiforme.\" Molecular Cancer Research 3.10 (2005): 541-551.
121. Miao, Hui, et al. \"EphA2 promotes infiltrative invasion of glioma stem cells in vivo through cross-talk with Akt and regulates stem cell properties.\" Oncogene 34.5 (2015): 558.
122. Mailand, Niels, et al. \"Deregulated human Cdc14A phosphatase disrupts centrosome separation and chromosome segregation.\" Nature Cell Biology 4.4 (2002): 318.
123. Zhu, Qiang, et al. \"Genetic evidence that Nkx2.2 and Pdgfra are major determinants of the timing of oligodendrocyte differentiation in the developing CNS.\" Development 141.3 (2014): 548-555.
124. Turcan, Sevin, et al. \"Efficient induction of differentiation and growth inhibition in IDH1 mutant glioma cells by the DNMT Inhibitor Decitabine.\" Oncotarget 4.10 (2013): 1729.
125. Rosiak, Kamila, et al. \"IDH1R132H in neural stem cells: differentiation impaired by increased apoptosis.\" PloS One 11.5 (2016).
126. Waddington, CH. The Strategy of the Genes. A Discussion of Some Aspects of Theoretical Biology. Allen & Unwin, 1957.
Appendices
Appendix A: RNA-seq Summary Statistics
Reads were aligned to the human reference genome GRCh37-lite using BWA88. Sequencing statistics were obtained using SAMtools89.
Sample | Total Reads | Duplicate Reads | Mapped Reads | % Mapped Reads
F8_rep1 | 189,074,704 | 29,107,790 | 186,628,612 | 98.71
F8_rep2 | 213,576,364 | 32,263,314 | 210,734,184 | 98.67
F8_rep3 | 202,716,158 | 25,116,903 | 200,176,182 | 98.75
F8A2_rep1 | 192,846,030 | 30,614,741 | 190,198,244 | 98.63
F8A2_rep2 | 205,317,890 | 31,505,770 | 202,855,219 | 98.80
F8A2_rep3 | 199,557,902 | 25,253,529 | 196,078,176 | 98.26
F8E10_rep1 | 192,881,560 | 29,592,442 | 190,259,983 | 98.64
F8E10_rep2 | 218,435,418 | 34,228,966 | 215,091,661 | 98.47
F8E10_rep3 | 209,931,290 | 26,350,938 | 204,185,369 | 97.26
NHA_rep1 | 196,548,574 | 23,717,029 | 194,426,174 | 98.92
NHA_rep2 | 224,926,016 | 31,520,776 | 222,351,474 | 98.86
NHA_rep3 | 221,355,914 | 33,761,109 | 218,889,522 | 98.89
NHAA2_rep1 | 210,044,704 | 26,501,182 | 207,180,740 | 98.64
NHAA2_rep2 | 215,858,232 | 25,660,243 | 213,166,136 | 98.75
NHAA2_rep3 | 197,866,280 | 24,080,785 | 195,459,633 | 98.78
NHAH9_rep1 | 225,539,490 | 36,287,077 | 222,971,482 | 98.86
NHAH9_rep2 | 215,263,620 | 32,816,555 | 212,806,796 | 98.86
NHAH9_rep3 | 225,423,678 | 31,309,100 | 222,682,024 | 98.78
Appendix B: Top 50 Most Significant Protein-Coding Genes in All DE Analyses.
Top 50 most significant DE genes with their q-values and fold changes obtained by DESeq2 (ref. 91) in all differential expression analyses. 
A q-value of 0 was returned for values below the smallest representable floating-point value in R.

CIC-Associated DE Genes (IDH1-WT)
Chromosome  Gene Name  q-value (NHA-A2 vs NHA)  Fold Change (NHA-A2 vs NHA)  q-value (NHA-H9 vs NHA)  Fold Change (NHA-H9 vs NHA)  Direction
1 NFIA 0 -306.6 1.3E-50 -15.4 down
11 GDPD5 1.1E-128 -53.8 3.3E-46 -8.0 down
6 GABBR1 3.2E-117 -213.3 1.5E-113 -173.3 down
11 DGAT2 8.1E-113 -29.5 2.0E-10 -2.5 down
X CNKSR2 3.6E-103 -122.9 1.2E-102 -92.0 down
11 TENM4 1.6E-101 -44.6 9.5E-19 -8.0 down
8 TSNARE1 7.4E-101 -48.4 2.7E-02 -1.8 down
10 MGMT 2.5E-96 -202.4 2.6E-109 -174.3 down
5 FAM169A 1.5E-92 -12.2 1.3E-15 -2.8 down
19 CACNG8 2.2E-90 -42.1 1.7E-155 -186.4 down
10 KIAA1217 5.8E-88 -24.3 3.1E-64 -14.5 down
6 GFOD1 9.5E-88 -38.0 6.1E-28 -6.3 down
6 DSP 6.5E-87 -293.3 1.1E-244 -459.1 down
10 CPN1 4.3E-82 -59.2 2.3E-80 -99.6 down
2 TNS1 2.9E-79 -182.5 1.0E-39 -30.0 down
13 SLAIN1 3.9E-78 8.8 8.3E-03 1.7 up
21 BACE2 1.2E-74 -132.9 4.3E-87 -119.1 down
8 ZNF704 8.4E-74 17.9 2.7E-08 3.1 up
1 C1orf106 5.7E-73 -20.5 1.6E-57 -16.5 down
12 CACNA1C 4.4E-71 -16.0 1.6E-76 -31.6 down
21 ADARB1 4.6E-70 -21.4 1.9E-61 -16.8 down
12 CD9 9.9E-70 -36.2 1.9E-12 -2.8 down
1 CREG1 8.0E-68 -72.9 7.1E-38 -8.1 down
19 INSR 4.0E-67 -63.8 5.1E-72 -54.7 down
20 C20orf194 1.2E-65 -24.2 2.7E-02 -1.5 down
3 SCN5A 6.2E-65 -56.5 2.7E-73 -68.4 down
17 CCL2 3.8E-63 -175.8 8.1E-39 -15.8 down
11 SYT9 1.1E-61 -112.1 2.0E-23 -8.8 down
7 FOXP2 2.1E-61 8.7 6.1E-19 5.3 up
11 OSBPL5 2.8E-61 -19.1 4.8E-30 -13.2 down
11 JAM3 3.2E-60 -6.9 2.0E-33 -4.8 down
10 PTPRE 3.6E-60 -54.2 7.2E-06 -2.8 down
19 KCTD15 2.6E-59 -126.5 4.1E-31 -10.5 down
1 SERINC2 1.4E-58 -41.0 2.6E-28 -5.7 down
17 MYH10 9.2E-58 -81.7 1.1E-62 -41.5 down
1 KIF21B 3.9E-57 -110.3 6.6E-53 -32.1 down
10 ARMC4 5.4E-57 4.8 5.2E-10 2.7 up
19 ZFR2 8.8E-57 -38.3 3.9E-64 -46.1 down
6 SOX4 1.8E-56 -77.2 6.5E-31 -15.3 down
6 BAI3 9.7E-56 9.9 3.0E-05 2.8 up
X ZDHHC15 2.8E-55 -19.3 9.6E-64 -31.1 down
12 IRAK3 4.3E-52 -18.3 4.6E-54 -19.9 down
12 HSPB8 8.5E-51 -76.4 6.6E-21 -5.9 down
7 PEG10 8.5E-51 -35.5 5.5E-66 -27.8 down
8 FBXO16 1.4E-50 -25.9 2.6E-29 -8.0 down
1 SPATA6 8.9E-49 -35.5 3.5E-51 -35.4 down
X MID2 3.1E-48 -59.4 9.0E-12 -3.7 down
4 HTRA3 1.1E-47 -49.5 1.1E-46 -46.4 down
10 UNC5B 7.1E-47 -94.7 2.2E-53 -95.3 down
10 PALD1 7.4E-47 -31.0 3.7E-37 -24.2 down

CIC-Associated DE Genes (IDH1-R132H)
Chromosome  Gene Name  q-value (F8-A2 vs F8)  Fold Change (F8-A2 vs F8)  q-value (F8-H9 vs F8)  Fold Change (F8-H9 vs F8)  Direction
17 NXN 0 -16.3 4.8E-28 -2.3 down
X NHS 3.1E-150 -5.8 5.9E-76 -5.1 down
1 SERINC2 4.3E-135 -5.7 7.8E-19 -2.0 down
19 MYADM 2.0E-130 -41.9 7.3E-03 -1.5 down
1 IGFN1 2.0E-119 23.0 1.6E-29 4.5 up
10 GFRA1 1.2E-117 -4.5 1.8E-08 -1.6 down
1 DNAJC6 6.3E-99 -2.8 8.2E-25 -1.8 down
X PAK3 5.7E-92 4.3 1.6E-08 1.6 up
3 TMEM108 7.3E-92 -3.9 8.0E-11 -1.5 down
1 NFIA 5.5E-88 3.2 1.4E-137 4.3 up
11 CREB3L1 1.4E-87 3.9 1.0E-109 4.5 up
7 TMEM178B 1.7E-87 -3.8 1.0E-04 -1.5 down
9 TLR4 1.7E-77 -7.4 3.8E-06 -2.1 down
3 ETV5 2.6E-77 3.4 1.8E-69 3.1 up
5 CAMK2A 1.6E-68 4.8 2.2E-29 3.2 up
1 LPAR3 3.6E-67 4.1 3.4E-37 2.7 up
5 IL31RA 1.4E-66 3.6 7.1E-58 3.3 up
18 ANKRD30B 1.4E-65 -4.4 1.1E-13 -2.2 down
X DOCK11 2.0E-63 3.5 6.6E-07 1.7 up
16 CDH8 1.3E-61 -13.1 2.9E-15 -3.1 down
5 ADCY2 3.9E-59 3.8 6.4E-78 4.3 up
14 TRIM9 5.1E-59 10.4 4.0E-20 3.7 up
1 SLC44A5 1.4E-55 9.6 9.7E-13 2.9 up
12 PPM1H 6.0E-55 -2.2 7.2E-28 -1.8 down
1 PIK3C2B 2.4E-54 4.0 2.3E-24 3.0 up
9 FIBCD1 1.9E-53 3.2 4.4E-25 2.3 up
13 COL4A1 6.1E-51 -2.8 3.4E-05 -1.4 down
9 GLIPR2 3.7E-50 -2.6 1.8E-22 -2.0 down
4 HHIP 6.8E-50 -6.0 1.2E-02 -1.6 down
2 SH3RF3 9.7E-48 -3.9 1.2E-06 -1.7 down
19 PSG1 1.1E-47 -3.6 9.9E-21 -2.1 down
20 JAG1 6.2E-47 -2.7 3.8E-18 -1.8 down
9 MVB12B 6.0E-43 -3.6 1.2E-32 -3.2 down
2 LOXL3 7.7E-43 -3.1 1.7E-45 -3.1 down
14 PYGL 8.2E-43 2.5 6.1E-21 1.8 up
1 ACOT11 8.8E-43 6.9 7.5E-04 1.8 up
7 ETV1 2.3E-42 3.6 3.0E-37 3.2 up
5 ARSI 2.3E-42 2.4 8.5E-23 1.8 up
1 RGS7 3.1E-42 -2.0 5.7E-13 -1.9 down
22 KIAA1644 5.4E-42 2.2 2.0E-14 1.7 up
5 NPR3 1.6E-40 -2.6 2.2E-35 -2.4 down
X SRPX2 2.0E-40 2.6 4.7E-27 2.1 up
4 KCTD8 3.9E-40 -2.6 5.3E-03 -1.4 down
4 TMEM156 2.9E-39 -2.2 2.3E-03 -1.3 down
6 ALDH5A1 1.8E-38 -3.4 2.2E-55 -4.0 down
X CLCN4 1.6E-37 3.2 2.2E-45 3.4 up
19 EMC10 1.8E-36 -10.2 1.3E-36 -6.4 down
3 DGKG 2.8E-36 4.3 1.9E-02 1.5 up
5 F2RL1 3.0E-35 2.5 6.1E-29 2.5 up

IDH1-Associated DE Genes
Chromosome  Gene Name  q-value  Fold Change  Direction
X FGF13 0 322.6 up
11 SHANK2 1.6E-294 174.5 up
8 SLC7A2 3.4E-286 -55.5 down
1 RYR2 2.3E-280 125.0 up
21 COL6A2 4.5E-261 -307.3 down
6 DSP 1.7E-242 -784.6 down
4 PLAC8 6.7E-211 -42.9 down
7 COL1A2 2.3E-210 -1159.8 down
X PAK3 1.0E-208 -22.6 down
9 LPPR1 1.6E-184 258.8 up
5 CTNND2 6.0E-162 88.4 up
18 FHOD3 5.1E-147 -35.6 down
19 CACNG8 5.0E-146 -545.9 down
6 SAMD5 2.6E-145 41.1 up
10 SLC16A9 2.7E-143 -50.4 down
17 ANKFN1 3.4E-132 24.0 up
1 CHRM3 4.6E-130 29.5 up
3 WNT5A 6.1E-129 -40.0 down
19 GRIK5 1.5E-128 -40.8 down
X BGN 3.8E-124 -87.8 down
17 MYH10 2.7E-121 -79.1 down
1 NFIA 4.6E-119 -7.1 down
4 CCSER1 9.4E-113 26.5 up
10 MKX 1.5E-112 20.9 up
16 MMP2 3.1E-112 -10.9 down
1 C1orf106 2.4E-110 -37.6 down
11 NOX4 7.2E-109 -20.4 down
X NHS 8.6E-107 -6.3 down
21 DSCR4 2.6E-106 261.8 up
5 CDX1 3.1E-106 114.1 up
3 RPL39L 1.2E-105 59.3 up
22 MN1 8.8E-104 -26.2 down
2 ACTG2 5.8E-101 -81.4 down
1 HIVEP3 4.6E-100 -37.0 down
2 IGFBP2 1.5E-98 120.7 up
3 ABI3BP 1.5E-96 8.4 up
4 DKK2 3.7E-96 -16.0 down
20 PMEPA1 4.7E-96 -254.5 down
X CNKSR2 1.3E-95 -184.9 down
11 CREB3L1 1.7E-92 -10.1 down
3 CNTN6 6.5E-89 -17.4 down
2 TNS1 6.4E-88 -201.7 down
10 CPN1 7.7E-84 -118.7 down
5 PPP2R2B 2.2E-83 25.4 up
7 CREB5 3.2E-83 -24.7 down
5 ADCY2 3.7E-82 56.8 up
2 ADAM23 1.5E-81 14.2 up
5 ADAM19 2.4E-80 -9.2 down
2 CYP27C1 2.2E-79 -143.0 down
13 SLAIN1 1.9E-76 -30.6 down

Appendix C: Top 25 Most Significant Gene Ontology Terms in All DE Gene Sets
Top 25 most significant GO terms identified by Metascape [92] on the results of all DE analyses.

CIC-KO vs WT (IDH1-WT)
GO Term  Pathway  Adj. p-value
GO:0050808 synapse organization 2.3E-10
GO:0045664 regulation of neuron differentiation 9.6E-10
GO:0032990 cell part morphogenesis 3.2E-09
GO:0048667 cell morphogenesis involved in neuron differentiation 3.2E-09
GO:0048858 cell projection morphogenesis 6.1E-09
GO:0048812 neuron projection morphogenesis 6.7E-09
GO:0120039 plasma membrane bounded cell projection morphogenesis 1.9E-08
GO:0120035 regulation of plasma membrane bounded cell projection organization 5.2E-08
GO:0000904 cell morphogenesis involved in differentiation 5.8E-08
GO:0031344 regulation of cell projection organization 9.0E-08
GO:0010975 regulation of neuron projection development 1.3E-07
GO:0061564 axon development 4.7E-07
GO:0007409 axonogenesis 2.0E-06
GO:0060322 head development 7.8E-06
GO:0007268 chemical synaptic transmission 1.3E-05
GO:0050807 regulation of synapse organization 2.8E-05
GO:0051962 positive regulation of nervous system development 2.9E-05
GO:0097485 neuron projection guidance 2.9E-05
GO:0007416 synapse assembly 3.9E-05
GO:0007420 brain development 4.0E-05
GO:0050803 regulation of synapse structure or activity 5.1E-05
GO:0007411 axon guidance 5.1E-05
GO:0042330 taxis 5.2E-05
GO:0050769 positive regulation of neurogenesis 8.4E-05
GO:0006935 chemotaxis 8.4E-05

CIC-KO vs WT (IDH1-R132H)
GO:0050808 synapse organization 1.3E-16
GO:0050803 regulation of synapse structure or activity 7.1E-14
GO:0050807 regulation of synapse organization 7.1E-14
GO:0030198 extracellular matrix organization 1.0E-12
GO:0000904 cell morphogenesis involved in differentiation 1.1E-12
GO:0043062 extracellular structure organization 3.6E-12
GO:0045664 regulation of neuron differentiation 8.4E-11
GO:0120039 plasma membrane bounded cell projection morphogenesis 1.2E-10
GO:0048812 neuron projection morphogenesis 1.3E-10
GO:0048858 cell projection morphogenesis 1.4E-10
GO:0032990 cell part morphogenesis 1.7E-10
GO:0031344 regulation of cell projection organization 6.8E-10
GO:0120035 regulation of plasma membrane bounded cell projection organization 9.4E-10
GO:0051962 positive regulation of nervous system development 9.4E-10
GO:0010975 regulation of neuron projection development 1.0E-08
GO:0007416 synapse assembly 1.8E-08
GO:0022604 regulation of cell morphogenesis 2.8E-08
GO:0051963 regulation of synapse assembly 3.4E-08
GO:0010720 positive regulation of cell development 2.5E-07
GO:0050769 positive regulation of neurogenesis 2.7E-07
GO:0010769 regulation of cell morphogenesis involved in differentiation 4.0E-07
GO:0045666 positive regulation of neuron differentiation 9.8E-07
GO:0044089 positive regulation of cellular component biogenesis 4.8E-05
GO:0051965 positive regulation of synapse assembly 1.8E-04

IDH1-R132H vs IDH1-WT
GO:0050808 synapse organization 1.3E-16
GO:0050803 regulation of synapse structure or activity 7.1E-14
GO:0050807 regulation of synapse organization 7.1E-14
GO:0030198 extracellular matrix organization 1.0E-12
GO:0000904 cell morphogenesis involved in differentiation 1.1E-12
GO:0043062 extracellular structure organization 3.6E-12
GO:0045664 regulation of neuron differentiation 8.4E-11
GO:0120039 plasma membrane bounded cell projection morphogenesis 1.2E-10
GO:0048812 neuron projection morphogenesis 1.3E-10
GO:0048858 cell projection morphogenesis 1.4E-10
GO:0032990 cell part morphogenesis 1.7E-10
GO:0031344 regulation of cell projection organization 6.8E-10
GO:0120035 regulation of plasma membrane bounded cell projection organization 9.4E-10
GO:0051962 positive regulation of nervous system development 9.4E-10
GO:0010975 regulation of neuron projection development 1.0E-08
GO:0007416 synapse assembly 1.8E-08
GO:0022604 regulation of cell morphogenesis 2.8E-08
GO:0051963 regulation of synapse assembly 3.4E-08
GO:0010720 positive regulation of cell development 2.5E-07
GO:0050769 positive regulation of neurogenesis 2.7E-07
GO:0010769 regulation of cell morphogenesis involved in differentiation 4.0E-07
GO:0045666 positive regulation of neuron differentiation 9.8E-07
GO:0044089 positive regulation of cellular component biogenesis 4.8E-05
GO:0051965 positive regulation of synapse assembly 1.8E-04

Appendix D: CIC ChIP-seq Summary Statistics
Sequencing statistics for the CIC ChIP-seq libraries and their matched input libraries were obtained using Samtools [89]. Peaks were called using MACS2 [96] at a q-value cutoff of 0.05.
Sample  Total Reads  Mapped Reads  % Mapped Reads  % Duplicate Reads  PCR Bottlenecking Coefficient  Predominant Fragment Length (bp)  # Peaks
CIC_ChIP_rep1 130,204,904 125,674,081 96.52 21.36 0.74 285 623,418
input_rep1 78,589,976 70,695,857 89.96 2.09 0.89 130 NA
CIC_ChIP-rep2 93,073,398 80,471,044 86.46 24.79 0.70 235 59,514
input_rep2 176,946,404 160,368,245 90.63 4.34 0.86 150 NA

Appendix E: 150 High-Confidence CIC Peaks
The top 150 most significant CIC peaks (in replicate 2) identified by MACS2 [96]. Peaks were annotated using ChIPseeker [98].
Chromosome  Start Coordinate  End Coordinate  -Log10 q-value  Fold Enrichment  Distance to TSS  Genomic Feature  Closest Gene
1 27718653 27719403 2404.67 20.78 0 Promoter GPR3
3 185825182 185827244 1984.55 11.51 0 Promoter ETV5
11 69257994 69258635 1122.07 15.12 196372 Intergenic MYEOV
8 29208023 29208912 1084.40 14.34 0 Promoter DUSP4
13 87602854 87603166 865.04 17.08 667829 Intergenic MIR4500
7 157253956 157255357 726.74 16.43 50041 Intergenic DNAJB6
17 41623429 41624300 693.96 18.34 0 Promoter ETV4
5 141703905 141705956 516.12 6.42 0 Promoter SPRY4
5 180875889 180876396 451.23 9.70 81601 Intergenic OR4F16
20 30191762 30192411 443.97 10.11 -675 Promoter ID1
1 45265662 45266320 406.23 6.31 0 Promoter PLK3
18 75795746 75796189 387.87 29.02 833738 Intergenic GALR1
2 1217305 1218534 356.80 85.48 -159461 Intron TPO
19 13905575 13906255 331.58 9.85 -19 Promoter ZSWIM4
17 36956021 36956673 271.45 7.00 0 Promoter PIP4K2B
1 53791832 53792421 225.64 6.33 650 Promoter LRP8
15 75871470 75871861 222.40 7.45 0 Promoter PTPN9
17 79885506 79885864 213.22 5.24 0 Promoter MAFG
15 86439488 86440729 184.96 16.10 -101299 Intergenic KLHL25
11 69453128 69454345 156.86 5.25 -1528 Promoter CCND1
3 54433591 54434282 155.29 4.84 202193 Intron CACNA2D3
10 368743 369264 148.93 5.99 46643 Intron DIP2C
11 65667584 65668145 141.08 4.61 0 Promoter FOSL1
14 77735557 77736139 133.99 17.52 1516 Promoter NGB
22 34792391 34792728 132.47 26.07 -473807 Intergenic LARGE1
11 67981061 67981542 127.15 4.39 0 Promoter KMT5B
20 34562936 34563423 120.45 7.40 6407 Intron CNBD2
8 86567412 86569379 115.37 2.84 0 Promoter REXO1L2P
8 145180520 145180976 102.54 5.20 -11696 Intergenic HGH1
14 75745201 75746019 100.83 4.34 0 Promoter FOS
X 53935317 53935893 100.33 15.97 105103 Intergenic PHF8
9 99381934 99382294 98.66 5.80 0 Promoter CDC14B
8 86803867 86804939 98.65 2.82 -14561 Intron REXO1L1P
22 38597867 38598237 98.56 4.52 0 Promoter MAFF
19 48453271 48453719 96.98 3.00 0 Promoter SNAR-C3
8 86788551 86789860 94.25 3.07 0 Promoter REXO1L1P
8 86758010 86759300 93.96 2.93 -249 Promoter REXO1L2P
10 99035512 99036033 92.91 5.33 -5065 Intron ARHGAP19
12 122289628 122289926 92.44 21.28 6843 Intron HPD
11 57435176 57435606 91.71 3.17 0 Promoter ZDHHC5
2 65657655 65658855 87.86 3.32 801 Promoter SPRED2
6 468994 469609 85.87 37.66 71938 Intergenic IRF4
11 11268014 11268598 83.01 19.66 106306 Intergenic CSNK2A3
11 69065787 69066570 82.99 7.96 4165 Intron MYEOV
14 57031242 57031530 82.37 16.11 -14808 Intergenic TMEM260
2 239197511 239198122 80.24 4.30 -304 Promoter PER2
8 86815303 86817381 76.74 2.64 22790 Exon REXO1L1P
8 86746940 86748277 73.40 2.76 9484 Intron REXO1L2P
2 91776065 91776613 71.15 3.33 71362 Intergenic LOC654342
8 58118369 58119243 70.92 6.05 -12405 Intergenic LINC01606
17 77437797 77438510 67.88 17.66 -17595 Intron RBFOX3
19 50636873 50637377 67.02 3.76 0 Promoter SNAR-B2
1 16482364 16482743 64.61 4.44 0 Promoter EPHA2
10 62703285 62704667 63.58 2.88 0 Promoter RHOBTB1
1 16862880 16863466 61.81 5.20 -11943 3' UTR MIR3675
1 145043635 145044220 61.75 3.84 -3643 Intron LOC653513
8 58124251 58125072 60.11 4.46 -6576 Intergenic LINC01606
13 66981970 66982584 58.95 6.78 -416717 Exon PCDH9-AS2
1 100817566 100818537 58.54 3.80 0 Promoter CDC14A
4 132663690 132664972 56.14 2.71 -1405498 Intergenic PCDH10
1 205270355 205270834 55.94 5.87 7026 Intergenic NUAK2
19 48458658 48459151 55.59 6.73 0 Promoter SNAR-C2
22 45723898 45724255 55.33 3.15 -1270 Promoter FAM118A
8 86776624 86777710 54.33 2.63 11596 Intron REXO1L1P
16 83986500 83987717 50.95 3.81 0 Promoter OSGIN1
10 112257454 112257732 47.43 3.45 0 Promoter DUSP5
12 122185207 122186415 46.23 3.21 34549 Exon TMEM120B
4 190767226 190768204 46.17 3.24 -93770 Intergenic FRG1
X 147045049 147045497 46.02 8.70 -17352 Intergenic FMR1NB
19 46151551 46151873 44.98 6.68 -2776 Intergenic EML2
2 202897713 202898477 44.57 4.97 -833 Promoter FZD7
19 17797414 17797903 44.09 6.32 1147 Promoter UNC13A
18 15165058 15165501 43.95 4.41 160417 Intergenic LOC644669
2 3184183 3185113 42.09 2.67 138332 Intergenic TSSC1
2 65662929 65663919 41.66 3.36 -3273 Exon SPRED2
3 77817934 77818824 41.61 6.17 180032 Intergenic ROBO2
10 64367948 64368507 41.03 5.01 -34995 Intron ZNF365
3 127141372 127142135 40.71 5.70 157145 Intergenic TPRA1
21 40114966 40115282 40.59 16.79 8899 Intron LINC00114
1 32083639 32084112 40.52 3.87 338 Promoter HCRTR1
10 64564371 64564652 39.94 3.05 0 Promoter ADO
16 32489651 32490258 39.76 5.31 -188349 Intergenic LOC390705
7 65037507 65037998 39.50 6.86 -74779 Intergenic INTS4P2
10 106544513 106544993 39.49 5.12 143654 Intron SORCS3
10 63425242 63425807 39.14 5.58 2523 Intron C10orf107
9 115823524 115824240 38.67 3.03 -4528 Intergenic ZFP37
5 69194728 69198929 38.14 6.30 -122143 5' UTR SERF1A
6 34211974 34212925 38.08 4.46 3460 3' UTR HMGA1
7 158995277 158996348 38.07 5.46 -57628 Intergenic VIPR2
18 22613463 22613819 38.01 19.63 318395 Intergenic ZNF521
1 82882024 82882325 37.92 7.66 446026 Intergenic ADGRL2
8 123269395 123269977 37.91 6.58 -523924 Intergenic ZHX2
3 170105432 170105812 37.85 10.52 28021 Intron SKIL
11 113569900 113570554 37.37 9.69 6514 5' UTR TMPRSS5
21 36260820 36261192 37.21 4.13 0 Promoter RUNX1
17 67210244 67210832 37.17 7.09 160 Promoter ABCA10
5 39070242 39070687 37.06 5.04 3814 Intron RICTOR
1 16945223 16945736 37.00 2.96 -5123 Exon NBPF1
1 150753790 150754562 36.99 3.05 -15357 Intergenic CTSS
2 1801429 1802144 36.84 3.35 44514 Intron MYT1L
17 36348250 36349125 36.80 7.57 0 Promoter TBC1D3
5 1760935 1761941 36.46 6.38 38015 Intergenic MRPL36
10 122409806 122410403 36.21 5.98 146715 Intergenic PLPP4
17 29421698 29422017 36.21 3.83 0 Promoter NF1
12 19493073 19493556 35.97 4.59 17621 Intron PLEKHA5
1 120787861 120788493 35.50 8.20 -50512 Intergenic FAM72B
19 16145607 16147222 35.45 5.10 677 Promoter LINC00905
7 23211262 23211813 35.45 6.95 -9633 Intron NUPL2
17 30412121 30412698 35.44 6.80 44766 Intergenic SH3GL1P1
16 81607606 81608225 35.39 7.44 -35687 Intron CMIP
3 129096916 129097624 35.39 7.61 18567 Intergenic SNORA7B
20 3240594 3241126 34.76 7.56 -20707 Exon SLC4A11
20 24913847 24914455 34.71 8.72 -15411 Intergenic CST7
15 102299848 102302220 34.58 2.88 -35203 Exon TARSL2
10 95123520 95124048 34.48 8.08 17105 Exon MYOF
22 47254851 47256796 34.32 9.84 85027 Intron TBC1D22A
3 98290107 98290676 34.32 8.61 21779 Intergenic CPOX
5 27912990 27913565 34.30 7.48 440591 Intergenic LINC01021
2 78928613 78929064 34.27 9.68 -323748 Intergenic REG3G
5 250412 251771 34.27 9.49 -19965 Exon PDCD6
8 118106842 118107554 34.20 5.33 -39783 Intron SLC30A8
8 56601874 56602527 34.12 5.89 73060 Intergenic TMEM68
5 68926942 68931869 34.02 5.75 0 Promoter NA
20 60459353 60460013 33.81 4.87 68705 Intron MIR1257
8 58119906 58120631 33.72 3.70 -11017 Intergenic LINC01606
1 234953439 234954070 33.64 15.08 93650 Intergenic LINC01132
2 91763820 91764368 33.36 4.34 83607 Intergenic LOC654342
16 86156112 86156781 33.19 7.13 -163256 Intergenic LOC146513
10 444792 445393 33.13 5.19 14657 Exon DIP2C
3 187516577 187517161 32.97 7.07 -53064 Intergenic BCL6
10 126248966 126249581 32.87 6.29 -44189 Intron LHPP
10 72083479 72084624 32.75 3.79 -40029 Exon NPFFR1
1 145026661 145027216 32.59 6.45 12776 Intron LOC653513
4 6998130 6998591 32.52 8.50 -2221 Exon TBC1D14
4 65765685 65766192 32.45 4.07 104026 Intergenic LOC401134
1 120838940 120839373 32.40 5.22 0 Promoter FAM72B
2 9933866 9934482 32.39 9.61 -49089 Intergenic TAF1B
20 56797357 56797924 32.29 5.53 23384 Intergenic PPP4R1L
1 186248895 186249478 32.22 5.46 -15927 Intron PRG4
16 85159595 85160706 32.20 3.32 -10050 Intergenic LOC400548
12 662413 663440 32.08 5.26 714 Promoter B4GALNT3
12 3775321 3775741 31.93 11.51 7017 Intron CRACR2A
20 60226213 60226812 31.91 4.90 51396 Intron CDH4
1 152186058 152186846 31.89 2.92 9826 Exon HRNR
20 37140364 37140775 31.69 3.28 2644 Intron RALGAPB
1 206138400 206138879 31.61 4.48 -32 Promoter FAM72A
17 57118089 57118574 31.60 4.79 65692 Intron TRIM37
3 106539547 106540222 31.60 6.44 419263 Intergenic LINC00882
5 69207158 69207626 31.54 2.67 -113446 Intron SERF1A
1 161860225 161860944 31.53 6.63 94078 Intron OLFML2B

Appendix F: Histone Modification ChIP-seq Summary Statistics
Sequencing and peak summary statistics for each histone modification ChIP-seq library. The number and proportion of mapped reads were obtained using Samtools [89], and peaks were called using FindER [100] with a q-value cutoff of 0.05.
Library  # Mapped Reads  % Mapped Reads  # Peaks  Total bp in Peaks  % Mapped Reads in Peaks
F8_rep1_h3k27ac 95,760,982 98.8 51,909 63,571,191 44.6
F8_rep1_h3k27me3 179,688,646 98.9 303,326 414,109,284 46
F8_rep1_h3k36me3 193,986,989 98.9 333,162 386,728,634 49.4
F8_rep1_h3k4me1 193,412,496 99.3 128,818 156,658,770 43.9
F8_rep1_h3k4me3 92,358,554 99.2 59,047 77,936,781 51.8
F8_rep1_h3k9me3 154,219,713 94.9 114,071 98,946,862 13.4
F8_rep1_input 201,177,018 97.4 NA NA NA
F8_rep2_h3k27ac 90,747,284 98.9 46,074 59,404,992 53.2
F8_rep2_h3k27me3 182,910,273 98.8 390,018 386,161,015 44.2
F8_rep2_h3k36me3 172,736,891 98.9 384,672 384,939,786 51.8
F8_rep2_h3k4me1 198,566,537 99.3 158,716 168,558,198 48.4
F8_rep2_h3k4me3 84,622,260 99.4 58,468 76,708,452 61.8
F8_rep2_h3k9me3 148,486,039 94.8 118,625 101,594,940 14.9
F8_rep2_input 196,872,944 97.4 NA NA NA
F8A2_rep1_h3k27ac 90,507,638 99 49,482 62,824,848 51.1
F8A2_rep1_h3k27me3 178,117,409 98.7 392,246 369,754,707 44.1
F8A2_rep1_h3k36me3 338,758,604 99 316,303 387,782,693 50.7
F8A2_rep1_h3k4me1 192,980,185 99.3 136,119 160,372,810 47.3
F8A2_rep1_h3k4me3 83,810,994 99.4 51,343 70,468,371 63.2
F8A2_rep1_h3k9me3 157,461,502 95.1 125,176 103,949,976 14.5
F8A2_rep1_input 213,504,198 97.5 NA NA NA
F8A2_rep2_h3k27ac 124,393,186 98.8 47,965 61,941,835 44.1
F8A2_rep2_h3k27me3 177,502,588 98.8 318,837 421,260,480 47.4
F8A2_rep2_h3k36me3 258,160,962 98.9 349,042 385,264,605 49.6
F8A2_rep2_h3k4me1 185,331,530 99.3 127,767 151,857,654 45
F8A2_rep2_h3k4me3 84,981,067 99.4 60,629 77,433,168 61.6
F8A2_rep2_h3k9me3 149,435,319 94.5 107,267 97,571,820 14.5
F8A2_rep2_input 157,127,332 97.4 NA NA NA
F8E10_rep1_h3k27ac 96,881,804 98.9 56,307 68,643,132 55.3
F8E10_rep1_h3k27me3 211,177,945 98.8 271,252 435,721,110 47.6
F8E10_rep1_h3k36me3 189,630,960 98.9 355,963 360,312,766 48.2
F8E10_rep1_h3k4me1 202,090,506 99.4 128,835 153,862,656 45.5
F8E10_rep1_h3k4me3 87,317,541 99.2 60,160 73,747,260 53.5
F8E10_rep1_h3k9me3 168,478,782 95 121,583 102,401,884 14.6
F8E10_rep1_input 224,601,318 97.5 NA NA NA
F8E10_rep2_h3k27ac 98,451,656 98.9 48,098 62,725,180 53.7
F8E10_rep2_h3k27me3 218,799,583 98.6 302,644 333,608,352 38.7
F8E10_rep2_h3k36me3 245,290,755 98.8 314,365 350,409,330 43.8
F8E10_rep2_h3k4me1 198,345,809 99.3 172,855 182,684,880 49.5
F8E10_rep2_h3k4me3 96,696,624 99.2 60,183 79,811,634 61.3
F8E10_rep2_h3k9me3 175,924,375 95.5 126,952 105,674,247 14.7
F8E10_rep2_input 181,507,024 97.3 NA NA NA
NHA_rep1_h3k27ac 116,261,440 98.2 46,289 60,150,394 35
NHA_rep1_h3k27me3 176,654,876 98.9 261,103 401,215,482 46.5
NHA_rep1_h3k36me3 202,715,477 99 297,261 448,464,138 62.2
NHA_rep1_h3k4me1 165,992,038 99.5 125,179 155,519,534 46.8
NHA_rep1_h3k4me3 79,754,427 99.4 44,969 60,770,082 57.9
NHA_rep1_h3k9me3 154,158,413 93.4 134,212 106,239,708 13.3
NHA_rep1_input 208,415,418 97.7 NA NA NA
NHA_rep2_h3k27ac 127,242,193 96.3 42,120 62,281,086 41.6
NHA_rep2_h3k27me3 154,092,500 98.8 252,979 349,801,970 38.6
NHA_rep2_h3k36me3 190,487,012 99 240,091 450,400,948 59.2
NHA_rep2_h3k4me1 158,378,279 99.3 106,860 147,330,586 41.1
NHA_rep2_h3k4me3 56,071,104 99.1 43,162 61,629,405 50.6
NHA_rep2_h3k9me3 114,260,941 93.4 106,484 63,053,166 5.2
NHA_rep2_input 191,288,550 91.8 NA NA NA
NHAA2_rep1_h3k27ac 84,886,572 97.5 39,876 55,896,339 44.9
NHAA2_rep1_h3k27me3 127,792,075 98.8 246,472 399,897,484 46.4
NHAA2_rep1_h3k36me3 182,251,716 98.8 267,845 409,279,818 53.6
NHAA2_rep1_h3k4me1 155,694,331 99.4 106,852 145,255,200 46.2
NHAA2_rep1_h3k4me3 71,547,298 99.2 48,703 61,666,130 58.2
NHAA2_rep1_h3k9me3 101,230,677 94 92,101 73,624,332 7.5
NHAA2_rep1_input 201,382,995 97.8 NA NA NA
NHAA2_rep2_h3k27ac 116,892,215 98.9 47,976 67,273,575 44.5
NHAA2_rep2_h3k27me3 146,525,931 98.8 299,632 374,943,102 44.5
NHAA2_rep2_h3k36me3 190,734,614 99 360,017 408,172,056 54.7
NHAA2_rep2_h3k4me1 197,950,485 99.4 127,867 160,324,596 50.7
NHAA2_rep2_h3k4me3 78,980,859 99.3 45,768 59,691,116 64.3
NHAA2_rep2_h3k9me3 164,034,572 94.3 104,131 93,838,220 13.5
NHAA2_rep2_input 202,364,160 97.7 NA NA NA
NHAH9_rep1_h3k27ac 110,001,381 99 47,535 63,408,910 43.8
NHAH9_rep1_h3k27me3 171,993,360 98.8 231,310 392,330,553 49.4
NHAH9_rep1_h3k36me3 195,062,543 99.1 273,494 432,415,995 60.6
NHAH9_rep1_h3k4me1 188,801,459 99.5 122,145 158,105,582 49.6
NHAH9_rep1_h3k4me3 81,489,805 99.5 41,130 59,599,130 63.4
NHAH9_rep1_h3k9me3 163,867,566 92.3 106,653 99,967,998 16
NHAH9_rep1_input 199,579,342 97.9 NA NA NA
NHAH9_rep2_h3k27ac 93,099,987 99.1 55,765 71,681,442 40.8
NHAH9_rep2_h3k27me3 172,119,287 98.9 236,308 399,423,550 49.9
NHAH9_rep2_h3k36me3 175,311,655 99 373,424 428,673,534 57.4
NHAH9_rep2_h3k4me1 191,699,428 99.5 134,092 169,475,939 51.2
NHAH9_rep2_h3k4me3 86,662,789 99.4 44,668 62,583,138 60.4
NHAH9_rep2_h3k9me3 171,198,529 91.9 95,911 96,741,522 16.3
NHAH9_rep2_input 149,295,677 97.7 NA NA NA

Appendix G: WGBS Summary Statistics
Sequencing summary statistics for each WGBS library, obtained using Samtools [89].
Library  # Mapped Reads  % Mapped Reads  # CpGs with >5 Coverage
F8_rep1 744,731,704 80.9 27,039,289
F8_rep2 813,208,945 82.4 27,101,426
F8A2_rep1 781,073,529 81.9 27,318,639
F8A2_rep2 822,008,290 79.9 27,314,548
F8E10_rep1 803,788,079 83 27,367,223
F8E10_rep2 805,552,585 80.1 27,372,183
NHA_rep1 747,204,272 76.8 27,227,589
NHA_rep2 837,313,478 84.7 27,353,412
NHAA2_rep1 808,887,087 82.8 27,319,479
NHAA2_rep2 809,738,836 82.9 27,249,838
NHAH9_rep1 839,614,940 84.4 27,173,849
NHAH9_rep2 810,311,737 83.2 27,369,575"@en ; edm:hasType "Thesis/Dissertation"@en ; vivo:dateIssued "2020-11"@en ; edm:isShownAt "10.14288/1.0391990"@en ; dcterms:language "eng"@en ; ns0:degreeDiscipline "Genome Science and Technology"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "University of British Columbia"@en ; dcterms:rights "Attribution-NonCommercial-NoDerivatives 4.0 International"@* ; ns0:rightsURI "http://creativecommons.org/licenses/by-nc-nd/4.0/"@* ; ns0:scholarLevel "Graduate"@en ; dcterms:title "Characterization of the effects of CIC loss and neomorphic IDH1 mutation on the transcriptome and epigenome"@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/74788"@en .