"Medicine, Faculty of"@en . "Medical Genetics, Department of"@en . "DSpace"@en . "UBCV"@en . "Liu, Sheng"@en . "2016-01-06T02:18:03"@en . "2015"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "Transcription of endogenous retroviruses (ERVs) is inhibited by de novo DNA methylation during gametogenesis, a process initiated after birth in oocytes and at approximately embryonic day 15.5 (E15.5) in prospermatogonia. However, earlier in germline development, the genome, including most retrotransposons, is progressively demethylated. DNA methylation reaches a low point in E13.5 primordial germ cells (PGCs) of both sexes, raising the question of whether repressive histone methylation marks play a role in silencing retrotransposons at this stage of development. To answer this question, I first focused on developing low-input assays for profiling histone modifications, DNA methylation and transcription in rare cell populations. In close collaboration with Dr. Julie Brind’Amour, I developed the “SmallCell” protocol package, which enables chromatin immunoprecipitation, bisulfite conversion of DNA, and RNA isolation with reverse transcription using ~1000 cells per assay, as well as construction of sequencing libraries from picograms of DNA. This allows profiling of epigenetic information at both locus-specific and genome-wide scales. I then developed the “InterSeq” software (an R package) to intersect and explore different types of epigenomic data. This package converts sequencing data into genomic interval measures in a spreadsheet (SeqData), interfaces this spreadsheet with flow cytometry data (SeqFrame), and provides an intuitive graphical interface for gating and exploring the inter-relationships between different types of epigenomic sequencing data, in a manner similar to flow cytometry (SeqViz). 
With these tools, we first determined whether retrotransposons are marked by H3K9me3 and H3K27me3. Although these repressive histone modifications are found predominantly on distinct genomic regions in E13.5 PGCs, they concurrently mark partially methylated long terminal repeats and LINE1 elements. Germline-specific conditional knockout of the H3K9 methyltransferase SETDB1 decreases both marks and DNA methylation at H3K9me3-enriched retrotransposon families. Strikingly, Setdb1 knockout E13.5 PGCs show concomitant derepression of many marked ERVs, including IAP, ETn, and ERVK10C elements, and ERV-proximal genes, a subset in a sex-dependent manner. Furthermore, Setdb1 deficiency is associated with a reduced number of male E13.5 PGCs and postnatal hypogonadism in both sexes. Taken together, these observations reveal that SETDB1 is an essential guardian against proviral expression prior to the onset of de novo DNA methylation in the germline."@en . "https://circle.library.ubc.ca/rest/handle/2429/56191?expand=metadata"@en . "Transcriptional Repression of Retrotransposons in Mouse Germline by Sheng Liu A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Medical Genetics) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) December 2015 © Sheng Liu, 2015
Abstract
Transcription of endogenous retroviruses (ERVs) is inhibited by de novo DNA methylation during gametogenesis, a process initiated after birth in oocytes and at approximately embryonic day 15.5 (E15.5) in prospermatogonia. However, earlier in germline development, the genome, including most retrotransposons, is progressively demethylated. DNA methylation reaches a low point in E13.5 primordial germ cells (PGCs) of both sexes, raising the question of whether repressive histone methylation marks play a role in silencing retrotransposons at this stage of development. 
To answer this question, I first focused on developing low-input assays for profiling histone modifications, DNA methylation and transcription in rare cell populations. In close collaboration with Dr. Julie Brind’Amour, I developed the “SmallCell” protocol package, which enables chromatin immunoprecipitation, bisulfite conversion of DNA, and RNA isolation with reverse transcription using ~1000 cells per assay, as well as construction of sequencing libraries from picograms of DNA. This allows profiling of epigenetic information at both locus-specific and genome-wide scales. I then developed the “InterSeq” software (an R package) to intersect and explore different types of epigenomic data. This package converts sequencing data into genomic interval measures in a spreadsheet (SeqData), interfaces this spreadsheet with flow cytometry data (SeqFrame), and provides an intuitive graphical interface for gating and exploring the inter-relationships between different types of epigenomic sequencing data, in a manner similar to flow cytometry (SeqViz). With these tools, we first determined whether retrotransposons are marked by H3K9me3 and H3K27me3. Although these repressive histone modifications are found predominantly on distinct genomic regions in E13.5 PGCs, they concurrently mark partially methylated long terminal repeats and LINE1 elements. Germline-specific conditional knockout of the H3K9 methyltransferase SETDB1 decreases both marks and DNA methylation at H3K9me3-enriched retrotransposon families. Strikingly, Setdb1 knockout E13.5 PGCs show concomitant derepression of many marked ERVs, including IAP, ETn, and ERVK10C elements, and ERV-proximal genes, a subset in a sex-dependent manner. Furthermore, Setdb1 deficiency is associated with a reduced number of male E13.5 PGCs and postnatal hypogonadism in both sexes. 
Taken together, these observations reveal that SETDB1 is an essential guardian against proviral expression prior to the onset of de novo DNA methylation in the germline.
Preface
Collaborators:
Dr. Kenjiro Shirane, Dr. Hiroyuki Sasaki (Sasaki Lab, Division of Epigenomics and Development, Department of Molecular Genetics, Medical Institute of Bioregulation, Kyushu University)
Dr. Yoichi Shinkai (Shinkai Lab, Cellular Memory Laboratory, RIKEN)
Aaron Bogutz, Dr. Louis Lefebvre (Lefebvre Lab, Department of Medical Genetics, Life Sciences Institute, University of British Columbia)
A version of chapter 2 has been published. Brind'Amour, J., Liu, S., Hudson, M., Chen, C., Karimi, M.M., and Lorincz, M.C. (2015). An ultra-low-input native ChIP-seq protocol for genome-wide profiling of rare cell populations. Nat Commun 6, 6033. I initiated development of the low-input native ChIP assay for my PGC project. Combining the library construction protocol optimized by Dr. Brind'Amour with the antibody-bead binding step that I introduced, we developed the 1000-cell “ultra-low-input” native ChIP-seq protocol and applied it to study PGCs. Dr. Brind'Amour then added data from ES cells for different cell quantities (1k, 10k, 100k).
A version of chapter 3 will be submitted shortly. Liu, S., and Lorincz, M.C. (2015). InterSeq, a bioinformatic tool for intersecting multiple types of epigenomic datasets. I wrote the bioinformatic software package and this software method article with guidance from Dr. Lorincz.
A version of chapter 4 has been published. Liu, S.*, Brind'Amour, J.*, Karimi, M.M., Shirane, K., Bogutz, A., Lefebvre, L., Sasaki, H., Shinkai, Y., and Lorincz, M.C. (2014). Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells. Genes Dev 28, 2041–2055. 
I conducted experiments from isolation of PGCs to submission of constructed ChIP-seq and RNA-seq libraries from these cells; I also analyzed the data regarding endogenous retroviruses and participated in writing the manuscript, specifically the section on the role of Setdb1 in regulation of endogenous retroviruses. Timed embryos were provided by Dr. Brind’Amour, who also carried out the imaging experiment on postnatal gonads, analyzed the data regarding genes, and wrote the section of the manuscript focused on the role of Setdb1 in gene regulation. * Indicates co-first author.
Animal experimentation followed the guidelines from the Canadian Council on Animal Care (CCAC) under University of British Columbia animal care license numbers A13-0115 and A12-0208.
The software code and protocols have been shared on my GitHub repository (https://sheng-liu.github.io). Individual packages can also be accessed directly: R package InterSeq (https://github.com/sheng-liu/InterSeq.git) and protocol package SmallCell (https://github.com/sheng-liu/SmallCell.git).
Table of Contents
Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
SECTION I INTRODUCTION
Chapter 1: Retrotransposons, germ cell development and associated epigenetic events
1.1 Retrotransposons and the host genome
1.2 Epigenetic mechanisms involved in ERV silencing
1.2.1 DNA methylation and ERV silencing
1.2.2 H3K9 methylation and ERV silencing
1.3 Germ cell development and associated epigenetic events
1.3.1 Primordial germ cells (PGCs) before sex differentiation
1.3.2 Male germ cell development
1.3.3 Female germ cell development
1.3.4 The piRNA-DNMT3L pathway and ERV silencing in the germ line
1.4 Aims of the thesis
SECTION II METHODS DEVELOPMENT
Chapter 2: Survey epigenetic information of “rare” cell populations in vivo at locus-specific and genome-wide scales
2.1 Design and implementation
2.1.1 Cell collection
2.1.2 Locus-specific assays of low input samples for exploration or validation
2.1.3 Genome-wide transcriptome and epigenome profiling of “rare” cell populations
2.2 Results
2.2.1 Complexity of ULI NChIP-seq libraries from 10^3-10^5 mESCs
2.2.2 Correlation between ULI and standard-input NChIP-seq libraries
2.2.3 Sex-specific H3K27me3 profiles in PGCs isolated from single embryos
2.3 Conclusion
2.4 Availability and future directions
Chapter 3: Visualize and intersect multidimensional epigenomic datasets
3.1 Design and implementation
3.1.1 Calculations from a bam file with SeqData
3.1.2 Interface tables to flowCore with SeqFrame
3.1.3 Graphical interface for visualization and gating with SeqViz
3.2 Results
3.2.1 Compute H3K9me3, H3K27me3 and DNA methylation genome-wide using 1kb bins
3.2.2 Gate on distribution of one variable and see changes of distribution in other variables
3.2.3 Gate on correlation of two variables and see changes of distribution in other variables
3.3 Conclusion
3.4 Availability and future directions
SECTION III RESULTS
Chapter 4: Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells
4.1 Materials and methods
4.1.1 Breeding and mating
4.1.2 Isolation of PGCs
4.1.3 Genotyping
4.1.4 Chromatin immunoprecipitation
4.1.5 RNA extraction and double stranded cDNA preparation
4.1.6 qPCR
4.1.7 Library construction and sequencing
4.1.8 ChIP and RNA sequencing analysis
4.1.9 PBAT Sequencing
4.1.10 Imaging
4.2 Results
4.2.1 E13.5 gonadal somatic cells and PGCs express distinct retroelements
4.2.2 H3K9me3 and H3K27me3 mark distinct and overlapping genomic regions in E13.5 PGCs
4.2.3 A subset of ERVs are marked by H3K9me3, H3K27me3 and DNA methylation in E13.5 PGCs
4.2.4 H3K9me3 and H3K27me3 are reduced at ERVs in SETDB1 deficient E13.5 PGCs
4.2.5 H3K9me3 depleted IAP LTRs show reduced DNA methylation in Setdb1 KO PGCs
4.2.6 A subset of ERVs are reactivated upon Setdb1 depletion in E13.5 PGCs
4.2.7 Transcription of a subset of genes de-repressed in Setdb1 KO PGCs initiates in LTRs
4.2.8 Setdb1 deletion early in germline development leads to gametogenesis defects in postnatal and adult mice
SECTION IV DISCUSSION
Chapter 5: Setdb1, ERVs silencing and germ cell development
5.1 Co-occurrence of H3K9me3 and H3K27me3 at retrotransposons
5.2 Influence of Setdb1 deletion on DNA methylation
5.3 H3K9me3 and DNA demethylation at ERVs
5.4 LTR-initiated chimaeric transcripts
5.5 Gender difference in ERV reactivation
5.6 Setdb1 deletion and germ cell viability
5.7 Setdb1 and the DNMT3L-piRNA pathway
References
Appendices
Appendix A Supplementary figures of Chapter 2
Appendix B Supplementary methods of Chapter 2
Appendix C Supplementary figures of Chapter 4
Appendix D Supplementary Tables for Chapter 4
List of Tables
Table 1-1 Detailed stages and nomenclature of male germ cell development.
Table 1-2 Female germ cell development.
List of Figures
Figure 1-1 Composition of the mouse genome.
Figure 1-2 Diagram of autonomous retrotransposons.
Figure 1-3 Phylogenetic classification of endogenous retroviruses in the mouse genome based on ERV reverse transcriptase domains.
Figure 1-4 Influence of retrotransposons on the host genome and transcriptome.
Figure 1-5 Dynamics of covalent histone modifications in germ cell development.
Figure 1-6 Key developmental events in male germ cell development.
Figure 1-7 Female germline development nomenclature and key events.
Figure 2-1 Diagram of chromatin immunoprecipitation (ChIP).
Figure 2-2 Bisulfite conversion of DNA.
Figure 2-3 Diagram of ChIP protocol for rare cell populations.
Figure 2-4 Diagram of Bisulfite Sanger Sequencing protocol for rare cell population.
Figure 2-5 Diagram of RNA isolation and reverse transcription protocols used for rare cell populations.
Figure 2-6 Library construction from low input assays for the Illumina sequencing platform.
Figure 2-7 A native ChIP-sequencing protocol to generate genome-wide chromatin maps from low cell numbers.
Figure 2-8 Correlation between standard and ultra low input native ChIP-seq libraries built from 10^3 to 10^5 ESCs.
Figure 2-9 High-resolution gender-specific H3K27me3 profiles generated from E13.5 PGCs isolated from single embryos.
Figure 2-10 Gender-specific H3K27me3 profiles from E13.5 PGCs isolated from single embryos.
Figure 3-1 GTK+ based Graphical interface for counting and linking to the UCSC genome browser.
Figure 3-2 Gating and plot generation using SeqViz plot pane.
Figure 4-1 Enrichment of H3K9me3, H3K27me3 and gene expression in E13.5 PGCs.
Figure 4-2 H3K9me3, H3K27me3 and DNA methylation at ERVs in E13.5 PGCs.
Figure 4-3 Influence of Setdb1 KO on PGC number and chromatin marks.
Figure 4-4 DNA methylation is reduced in Setdb1 deficient PGCs in regions highly enriched for H3K9me3.
Figure 4-5 Reactivation of ERVs upon Setdb1 depletion in mouse E13.5 PGCs.
Figure 4-6 Genes up-regulated in Setdb1 KO E13.5 PGCs.
Figure 4-7 Gonadal defects in male and female germline Setdb1 deficient mice.
Figure 5-1 CpG DNA methylation in Setdb1 HET and KO E13.5 PGCs.
Figure 5-2 DNA methylation is reduced at H3K9me3 marked ERVs in both male and female Setdb1 KO PGCs.
Figure 5-3 Distribution of methylation levels of sequenced reads aligning to IAPLTR1 in control Setdb1 HET and KO PGCs.
Figure 5-4 Distribution of methylation levels of sequenced reads aligning to IAPLTR1a in control Setdb1 HET and KO PGCs.
Figure 5-5 H3K9me3 at ERVs persists in the male germline until at least P10.
List of Abbreviations
5hmC: 5-hydroxymethylcytosine
5mC: 5-methylcytosine
Bam: A BAM file (.bam) is the binary version of a SAM file. A SAM file (.sam) is a tab-delimited text file that contains sequence alignment data.
ChIP: Chromatin immunoprecipitation
CKO: Conditional knockout
CpG: Cytosine-guanine dinucleotide
DAPI: 4',6-diamidino-2-phenylindole, a fluorescent stain that binds strongly to A-T rich regions in DNA
DMRs: Differentially Methylated Regions
DNMT: DNA methyltransferase
DNMT1: DNA methyltransferase 1
DNMT3A: DNA methyltransferase 3A
DNMT3B: DNA methyltransferase 3B
DNMT3L: DNA methyltransferase 3-like
ENCODE: Encyclopedia of DNA Elements
ENSEMBLE: A joint project between EMBL-EBI and the Wellcome Trust Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes
ERV: Endogenous retroviruses. Remnants of exogenous retroviruses that infected the germline millions of years ago.
ERV1: Endogenous virus type 1, phylogenetically close to gamma- and epsilon-retroviruses. Also termed Class I ERV.
ERVK: Endogenous virus type K, phylogenetically close to lentivirus, alpha-, beta- and delta-retroviruses. Also termed Class II ERV.
ERVL: Endogenous virus type L, phylogenetically close to spumaviruses. Also termed Class III ERV.
ES: Embryonic stem cells
ETN: Early transposon
FACS: Fluorescence-activated cell sorting
FBXL10: Synonym of Kdm2b (lysine (K)-specific demethylase 2B)
FSH: Follicle-stimulating hormone
G9A: H3K9-specific lysine methyltransferase, synonym of EHMT2 (Euchromatic Histone-Lysine N-Methyltransferase 2) or KMT1C (Histone lysine methyltransferase 1C)
GASZ: Synonym of ASZ1 (ankyrin repeat, SAM and basic leucine zipper domain containing 1)
GFP: Green fluorescent protein
GLP: G9A-like protein, synonym of EHMT1 (euchromatic histone-lysine N-methyltransferase 1) or KMT1D (Histone lysine methyltransferase 1D)
GTK: GIMP Toolkit, a toolkit for building graphical user interfaces
GVBD: Germinal vesicle breakdown
H3K27me3: Histone 3 lysine 27 tri-methylation
H3K9me3: Histone 3 lysine 9 tri-methylation
HET: Heterozygote
IAP: Intracisternal A-type particle
IAPEz: IAP (Intracisternal A-type particle) subfamily Ez
INT: Internal region of retrovirus
IP: Immunoprecipitation
KO: Knockout
L1Md_GF: LINE1 (long interspersed elements 1) subfamily Md_GF
LINE: Long interspersed elements
LSH: Lymphoid-specific helicase
LTR: Long terminal repeat
MACS: A ChIP-seq signal enrichment analysis software, “Model-based Analysis for ChIP-Seq”, which uses a Poisson distribution model to identify peaks from ChIP-seq data. The improved second version is named MACS2.
MAEL: Maelstrom homolog
MALR: Mammalian apparent LTR retrotransposons
MERVK10C: Mouse endogenous virus type K subfamily 10C
MERVL: Mouse endogenous virus type L
MILI: Synonym of PIWIL2 (piwi-like RNA-mediated gene silencing 2)
MIWI: Mouse homolog of PIWI (P-element induced wimpy testis)
MIWI2: Synonym of PIWIL4 (piwi-like RNA-mediated gene silencing 4)
MLV: Moloney murine leukemia virus
MNase: Micrococcal nuclease, a relatively non-specific endo-exonuclease that digests double-stranded, single-stranded, circular and linear nucleic acids. In ChIP-seq applications, the enzyme makes double-stranded cuts between nucleosomes.
MOV10L1: Mov10 RISC complex RNA helicase like 1
MVH: Synonym of Ddx4 (DEAD (Asp-Glu-Ala-Asp) box polypeptide 4)
NCHIP: Native ChIP (chromatin immunoprecipitation)
NGS: Next generation sequencing
PBAT: Post-bisulfite adaptor tagging
PBS: Phosphate-buffered saline
PCR: Polymerase chain reaction
PE: Paired-end (sequencing)
PF: Primary follicle
PGC: Primordial germ cell
PIWI: P-element induced wimpy testis
RPKM: Reads Per Kilobase per Million mapped reads
SA: Splice acceptor
SD: Splice donor
SETDB1: SET Domain Bifurcated 1, a histone-lysine N-methyltransferase, synonym of ESET (ERG-associated protein with a SET domain) or KMT1E (histone lysine methyltransferase 1E)
SF: Secondary follicle
SINE: Short interspersed elements
SSEA1: Stage specific embryonic antigen 1
SUV39h1/h2: Suppressor of variegation 3-9 homolog 1/homolog 2
TDRD1: Tudor domain containing 1
TF: Tertiary follicle
TSS: Transcription start site
UCSC: An on-line genome browser hosted by the University of California, Santa Cruz
ULI: Ultra low input
UTR: Untranslated region
WGBS: Whole genome bisulfite sequencing
Acknowledgements
First of all, I would like to thank my wife Qin Lei for sticking together when we were both going through this rough road of pursuing our PhD degrees at opposite ends of the globe. 
The tremendous effort and sacrifice we’ve made to maintain the family as a whole while raising our little angel Jonathan is itself a monumental task. I thank her for standing by my side in completing this journey. I would like to thank my supervisor Dr. Matthew C. Lorincz for his professional and continuous support throughout my study. The free, open and respectful atmosphere he has created in the lab was an essential basis for the creative discussions and research that happen every day. His probing questions stimulated endless creative thoughts and passion for my project; his style of communication as a friend made his advice and comments a joy to take, and they proved invaluable in finding solutions to the numerous problems I encountered in my work. I would also like to thank Dr. Jacob W. Hodgson for his professional support later in my study and many insightful bench-side conversations about science as a community, the evolutionary aspects of scientific discovery, and the joy of conducting innovative research. These have helped shape science as a career and passion for me, and helped me greatly during the downtimes. I would also like to thank my colleagues, in particular Dr. Julie Brind’Amour, who joined my project and played an essential role in helping me move it forward. We’ve had wonderful times together developing techniques to address the question I set out to answer. I’d also like to thank our technician Preeti Goyal and Aaron Bogutz from Dr. Louis Lefebvre’s lab, who helped in setting up the initial technical platform. My thanks also go to Peter Thompson, with whom I have shared the ups and downs and many constructive discussions along the way to obtaining our PhDs together. My special thanks to my committee members, Dr. Louis Lefebvre, Dr. Dixie Mager, and Dr. 
Wyeth Wasserman for their professional and constructive suggestions on the direction of the research throughout my study. Last but not least, I would also like to thank Cheryl Bishop, our program secretary, for her compassionate help when I first started and the kind advice I received throughout my study.

Dedication
to my parents

SECTION I INTRODUCTION

Chapter 1: Retrotransposons, germ cell development and associated epigenetic events

1.1 Retrotransposons and the host genome
Retrotransposons compose ~38% of the mouse genome and ~42% of the human genome (Mouse Genome Sequencing Consortium et al. 2002) (Figure 1-1). All retrotransposons replicate via a “copy and paste” mechanism that involves transcription to generate an RNA intermediate, followed by reverse transcription and subsequent integration into the host genome at a new location. Retrotransposons can be further classified into non-LTR (LINE, SINE) and LTR retrotransposons based on whether they have a Long Terminal Repeat (LTR) in their sequence (Figure 1-2).

Figure 1-1 Composition of the mouse genome. Data from the mouse genome consortium (Mouse Genome Sequencing Consortium et al. 2002) were analyzed for repetitive sequence content. Pie chart shows the composition of the major classes of repetitive sequences. Note that LTR elements make up almost 10% of the mouse genome.

LINE elements are autonomous non-LTR retrotransposons that compose ~20% of the mouse and human genomes. They are generally ~6kb in length and contain a 5’ untranslated region (UTR), two open reading frames (ORFs), and a 3’ UTR (Figure 1-2). LINE1 elements account for 0.1% of spontaneous germline mutations in humans and 2.5% in mice (Kazazian and Moran 1998). SINE elements comprise 8% and 10% of the mouse and human genomes, respectively, and these non-autonomous elements require LINE elements to retrotranspose.
As a consequence of the reverse transcription mechanism of LINE1 elements, most of the integrated copies are truncated at the 5’ end and thus lack the 5’ UTR of full-length elements, which contains the promoter region of these non-LTR retrotransposons (Beck et al. 2011).

Figure 1-2 Diagram of autonomous retrotransposons. Figure adapted from (Jurka et al. 2007). Full-length LTR and LINE1 elements and the proteins they encode are shown.

LTR retrotransposons, which will be described in greater detail below, are also referred to as Endogenous Retroviruses (ERVs) due to their similarity with their counterpart exogenous retroviruses (Jurka et al. 2007). Based on their reverse transcriptase sequences, ERVs can be further divided into three classes. As illustrated in Figure 1-3, a subset of ERVs still maintain the ability to retrotranspose. It is estimated that about 10% of spontaneous germline mutations in the mouse are the result of retrotransposition of ERVs, reflecting the large number of active elements in this species (Kazazian and Moran 1998). In contrast, only ~0.1% of human spontaneous mutations are due to retrotransposons, and none are due to ERV insertions, consistent with the observation that very few, if any, human ERVs are actively retrotransposing (Grow et al. 2015).

Figure 1-3 Phylogenetic classification of endogenous retroviruses in the mouse genome based on ERV reverse transcriptase domains. The phylogenetic tree was adapted from Stocking et al. and McCarthy et al. (Stocking and Kozak 2008; McCarthy and McDonald 2004). Only active members within the three classes of ERVs in the mouse genome, which are included in our own bioinformatics analyses described below, are shown.

Among the different subclasses of ERVs, the most extensively studied noninfectious subfamilies are the IAP and ETn subfamilies.
About 700 full-length and 300 partially deleted IAP elements are present in the mouse genome (Kuff and Lueders 1988). The most abundant form is the IΔ1 subclass, which has a 1.9-kb deletion in gag-pol and is responsible for the majority of IAP insertional mutations (Maksakova et al. 2006). The other major family of active mouse ERVs is the ETn family. ETn RNA expression levels are significantly elevated in certain tissues during early development, from E3.5 (Brûlet et al. 1985) to E13.5 (Loebel et al. 2004). There are two types of ETn elements (I and II), which differ in the 3’ portion of the LTR and the 5’ internal segment. There are ~200 ETnI and ~40 ETnII elements (Baust et al. 2003). ETnII elements are more transcriptionally active than ETnI elements (Baust et al. 2002). A related subfamily, MusD (Mager and Freeman 2000), which shares nearly identical LTRs with ETns, was found to provide the proteins necessary for ETn retrotransposition (Ribet 2004).

Retrotransposons impact the genome and transcriptome in multiple ways. In addition to obligatory insertional mutagenesis, retrotransposons can also promote rearrangements through homologous recombination between closely related ERV copies (Figure 1-4). At the RNA level, retrotransposon sequences can promote transcriptional initiation (Di Cristofano et al. 1995), intergenic splicing (Feuchter-Murthy et al. 1993) and exonization (Jenkins et al. 1981), transcription enhancement (Samuelson et al. 1990), and polyadenylation (Mager 1989; Cavanagh et al. 2006). A well-studied example of an LTR-derived promoter driving an alternative genic transcript is the Avy allele. Transcription originating in a cryptic promoter in the 3’ LTR of an IAP element 100kb upstream of the Avy allele results in constitutive expression of the agouti gene in various tissues; DNA methylation of this IAP element is reported to influence the level of expression of agouti, and may be inherited in a transgenerational manner when transmitted maternally (Morgan et al. 1999).

Figure 1-4 Influence of retrotransposons on the host genome and transcriptome.
Retrotransposons can influence the host genome by: A) Insertional mutagenesis. B) Rearrangement of the genome through homologous recombination. C) Initiation of transcription/cryptic promoter function. D) Enhancing genic transcription in cis (enhancer function). E) Acting as a splice donor (SD) or splice acceptor (SA) to initiate alternative splicing or exonization. F) Acting as a polyadenylation signal. Two shades of blue indicate two short inverted repeats within the LTR and the arrow indicates the transcriptional start site.

1.2 Epigenetic mechanisms involved in ERV silencing
Several epigenetic marks are involved in the silencing of ERVs in mammals, including DNA methylation and histone H3 lysine 9 tri-methylation (H3K9me3) as well as H3K9me2, which act directly on chromatin at the transcriptional level (Rowe and Trono 2011).

1.2.1 DNA methylation and ERV silencing
DNA methylation describes the covalent addition of a methyl group to the 5th position of cytosine, predominantly in the context of CpG dinucleotides. The presence of a methyl group on cytosine in genic promoter regions is proposed to interfere with binding of the transcriptional machinery or ubiquitous transcription factors, resulting in transcriptional inhibition. DNA methylation can also promote silencing via the recruitment of specific transcriptional repressor complexes, including methyl-binding proteins (Nan et al. 1993), UHRF1 (Hashimoto et al. 2008) and zinc-finger proteins (Prokhortchouk 2001; Filion et al. 2005), which can recognize methylated DNA (Moore et al. 2012). The three DNA methyltransferases (DNMTs) in mammals, DNMT1, DNMT3A and DNMT3B, share a conserved catalytic domain structure (Cheng and Blumenthal 2008). DNA methylation is inherited following DNA replication, as a result of the activity of the maintenance methyltransferase DNMT1 (Leonhardt et al.
1992), which is recruited to the replication fork by the hemi-methylated DNA binding protein UHRF1/NP95 (Hashimoto et al. 2008; Sharif et al. 2007). In addition, the de novo methyltransferases DNMT3A and DNMT3B act to methylate unmethylated sequences during specific stages in development (Goll and Bestor 2005). DNMT3L does not have enzymatic activity, but is reported to stimulate the activity of DNMT3A/3B, specifically in the germline and in the early embryo (Chedin et al. 2002; Suetake et al. 2004; Webster et al. 2005).
DNA methylation of Intracisternal A-particle (IAP) elements, an active class II ERV family (Stocking and Kozak 2008), decreases only slightly in Dnmt3a or Dnmt3b single knockout (KO) 9.5dpc (days post coitum) embryos. The decrease is more profound in Dnmt3a and Dnmt3b double KO embryos, but still mild relative to the hypomethylation of IAP elements observed in Dnmt1 KO embryos (Okano et al. 1999). The expression of IAP elements increases ~50-100 fold in somatic cells of 9.5dpc Dnmt1 KO embryos (Walsh et al. 1998), which show embryonic lethality at 9.5dpc. This lethality may be a result of p53-dependent apoptosis induced by the loss of methylation in somatic cells (Jackson-Grusby et al. 2001). Intriguingly, although DNA methylation of IAP elements continues to decrease in primordial germ cells (PGCs) until E13.5 (Seisenberger et al. 2012), induction of IAP expression was not observed in Dnmt1 knockout PGCs (Walsh et al. 1998). The mechanism for silencing of ERVs in the germ cells of Dnmt1 KO mice is still an open question.
Demethylation and dramatic reactivation of IAP elements is also observed in the postnatal testes of Dnmt3l KO male mice, which are infertile (Bourc'his and Bestor 2004). Methylation of IAP elements in Dnmt3l null oocytes is also lower than in wild type (WT) oocytes (Lane et al. 2003; Lucifero et al. 2007), but female Dnmt3l KO mice show normal oogenesis, likely due to posttranscriptional targeting by siRNAs and/or piRNAs (Watanabe et al.
2008). Interestingly, the heterozygous progeny of female Dnmt3l KO mice die at 9.5dpc due to abnormal maternal imprinting inherited from the KO oocytes (Kaneda et al. 2004). Similar to Dnmt3l KO females, methylation of IAP elements in oocytes of Dnmt3a germline conditional KO mice is reduced by 50% relative to WT (Kaneda et al. 2004). Little is known about the expression of these elements in the oocyte when they are hypomethylated. One of my research goals was to determine the mechanism for silencing of ERVs during the stage in germ cell development when ERVs are hypomethylated, as discussed in detail in the research section.

1.2.2 H3K9 methylation and ERV silencing
The core histone proteins (H2A, H2B, H3, H4), in association with ~147bp of DNA, form the basic unit of chromatin, the nucleosome core particle (Tollefsbol 2010). Each histone in the nucleosome is composed of a globular domain and an unstructured tail domain. Covalent modifications of core histone N-terminal tails can recruit various co-factors, which in turn shift chromatin toward an open euchromatic state or a closed heterochromatic state, and thereby affect transcription (Jenuwein 2001).
H3K9me2 is deposited by the lysine methyltransferases (KMTases) G9A and GLP. This mark has been reported to protect the maternal genome against DNA demethylation in early embryos (Nakamura et al. 2012), and is required for de novo DNA methylation of MLVs (Leung et al. 2011). Intriguingly, the class III MALR MERVL was shown to be silenced by H3K9me2 deposited by G9A and/or GLP in ES cells (Maksakova et al. 2013; Macfarlan et al. 2012).
H3K9me3 is associated with heterochromatin formation and transcriptional silencing (Mikkelsen et al. 2007). This mark is deposited by the KMTases Suv39h1/2 and SETDB1 in mammals (Peters et al. 2001; Martin and Zhang 2005). Suv39 KMTases specifically target pericentric regions (Peters, Kubicek et al. 2003).
KO of Suv39h1/2 leads to impaired viability, chromosomal instability, increased tumor risk, and infertility in both male and female mice, possibly due to nonhomologous chromosome associations. Although Suv39h1/h2 double null ES cells show a 4-fold increase in IAP expression, double KO embryos can still develop to adulthood (Peters, O’Carroll et al. 2001). On the other hand, SETDB1 was originally reported to target various developmental genes in ES cells (Bilodeau et al. 2009; Yuan et al. 2009) and Setdb1 KO mice die at ~3.5-5.5dpc (Dodge et al. 2004). Previous work in our lab revealed that conditional KO of Setdb1 in ES cells induces a ~25-fold increase in IAP expression and causes cell death at ~8 days post-induction of the deletion (Matsui et al. 2010). In contrast, there is only a modest increase in ERV expression in either Suv39h1/h2 double KO or Dnmt1, Dnmt3a/3b triple KO cells, indicating that SETDB1, but not Suv39h1/h2, is required for silencing of ERVs in ES cells, independent of DNA methylation (Hutnick et al. 2010). H3K9me3 may therefore be an important repressive mark for silencing of ERVs at developmental stages when the genome is hypomethylated, such as in E13.5 primordial germ cells (PGCs). However, nothing is known about the role of this mark in silencing of ERVs in the germ line.

1.3 Germ cell development and associated epigenetic events
Endogenous retroviruses become “endogenized” when they infect germ cells. Such endogenization has occurred over the course of millions of years of mammalian evolution, with some ERVs becoming endogenized only recently in the rodent genomes. There are 631,000 copies of LTR retrotransposons in the reference mouse genome (Mouse Genome Sequencing Consortium et al. 2002). Some families, such as IAP and ETn elements, recently colonized the mouse genome and a subset of these maintains the ability to retrotranspose.
Here I will introduce concepts on germ cell development and associated epigenetic events to provide a context for understanding the relationship between ERV regulation and germ cell development.

1.3.1 Primordial germ cells (PGCs) before sex differentiation
Germ cells are specified in the proximal epiblast at ~E6.5. About 200 cells emerge as a cluster of alkaline phosphatase positive cells detectable at 7.5dpc, then migrate to and settle in the gonadal ridge at ~11.5dpc (Ginsburg et al. 1990). These PGCs undergo rapid propagation, with an average population doubling time of ~16h (Ginsburg et al. 1990). Once their numbers reach about 10,000 at 13.5dpc (Haston et al. 2009), they stop dividing. From E6.5 until E12.5, PGCs migrate through the hindgut to the gonadal ridges and proliferate mitotically. Concurrently, PGCs undergo global DNA demethylation, with methylation levels reaching a low point at ~E13.5.
A unique epigenetic reprogramming process occurs at ~10.5-12.5dpc (De Felici 2011): the genome undergoes rapid DNA demethylation, imprinting marks are erased, the X-chromosome is reactivated and most of the genes in the genome show demethylation at their promoter region. In addition, repetitive elements, including most ERVs, are hypomethylated at this stage. However, some ERVs, in particular IAP elements, are somewhat resistant to this demethylation compared with other regions in the genome, but nevertheless show less DNA methylation in PGCs than in somatic tissues (Lane et al. 2003; Seisenberger et al. 2012).
Histone modifications at this reprogramming stage have not been studied in detail due to the limitation of cell numbers. Early studies using immunostaining (Daujat et al. 2009; Hajkova et al. 2008; Seki et al. 2005) (Figure 1-5) showed that H3K9me2 is lost in developing PGCs, coincident with downregulation of G9A at this stage.
A global increase of H3K27me3 is also observed, which was later confirmed by ChIP-seq of pooled E13.5 PGCs (Ng et al. 2013). Notably, H3K9me3 apparently remains unchanged throughout this developmental window, while H3K64me3, which was shown to depend on H3K9me3, is apparently absent at this stage, as determined by immunostaining (Daujat et al. 2009).

Figure 1-5 Dynamics of covalent histone modifications in germ cell development. Data were obtained from (Daujat et al. 2009; Seki et al. 2007; Hajkova et al. 2008). Numbers (1~3) on the diagram indicate the reference in which the corresponding histone modification was examined. The light colors (orange and yellow) represent open chromatin marks; the dark colors (blue and black) represent repressive chromatin marks. Note that most of these observations were made using immunostaining-based methods.

Studies of early germline development indicate that male and female PGCs undergo similar epigenetic reprogramming processes prior to sex differentiation at ~E13.5 (Saitou and Yamaji 2012). However, while differences in gene expression between male and female PGCs are clearly detected at subsequent developmental stages, little is known about gender-specific differences in histone modifications at these critical developmental stages, given the small number of high-resolution studies of the distribution of such marks in PGCs.

1.3.2 Male germ cell development
Upon sex differentiation, male germ cells (termed “gonocytes”) initiate mitotic arrest at E13.5 and G0 mitotic arrest is maintained through the perinatal stage (~P2).
At ~P10 male gonadal stem cells begin to proliferate and at the same time start to differentiate and initiate meiosis. By ~P20, round spermatids are formed, which then undergo a series of morphological changes. After meiosis II, male germ cells develop into mature spermatozoa, and become mature sperm at ~P30. During this final step (called “spermiogenesis”), histone proteins are replaced by protamines (Gaucher et al. 2009; Kimmins and Sassone-Corsi 2005). Critical events in spermatogenesis are summarized in Table 1-1 and Figure 1-6 below (Saitou and Yamaji 2012).

DEVELOPMENTAL STAGE | DURATION (DAYS) | NOMENCLATURE | CELL CYCLE & PHYSIOLOGICAL EVENTS | EPIGENETIC EVENTS
E6.5~E12.5 | 7 | PGCs | Mitotic proliferation (migration) | DNA demethylation
E13.5~E15.5 | 10 | Gonocytes | Mitotic arrest (settled at gonadal ridge) | Re-methylation
E16.5~E18.5 | | | |
BIRTH~P2 | | | |
P3~P9 | 7 | Spermatogonia | Mitotic proliferation | …
P10~P19 | 10 | Spermatocytes | Meiosis in germ cells (Mitotic in GSCs) | …
P20 | 10 | Spermatid | Round spermatid | Spermiogenesis
P27 | | | Elongated spermatid |
P30 | | Sperm | |
Table 1-1 Detailed stages and nomenclature of male germ cell development.

After completion of genome-wide demethylation (~E13.5), the genome is remethylated starting in male gonocytes at ~E15.5, with the upregulation of DNMT3L and DNMT3A, a process that continues until birth (Saitou et al. 2012). During this process, paternal imprinting is established, genes are re-methylated and ERVs become hypermethylated. Importantly, a germ cell-specific piRNA-DNMT3L pathway, discussed in detail in the next section, plays a critical role in the silencing of ERVs during this timeframe. Little is known about the fate of histone marks at these later stages. Figure 1-6 shows the key epigenetic regulators, including components of the piRNA pathway (MILI/MIWI2), in the context of male germ cell development.

Figure 1-6 Key developmental events in male germ cell development.
Figure 1-7 Female germline development nomenclature and key events.

1.3.3 Female germ cell development
Female PGCs have a developmental trajectory similar to male PGCs until after sex determination at ~E13.5. Unlike their male counterparts, which are arrested at G0 at E13.5, female PGCs initiate meiosis and arrest at the diplotene stage of meiotic prophase I at E17.5 (Pepling 2006).
From birth (~E19.5/P0, postnatal day 0) to P2, ~70% of female germ cells undergo apoptosis in a process termed “germline cyst breakdown”, which leads to the formation of primary follicles (Pepling and Spradling 2001; Pepling 2006). During this time, oocytes remain in meiotic arrest and the genome is hypomethylated. From P3 until P20, primary follicles remain arrested at meiosis I and only grow in size. During this time, the genome undergoes re-methylation, coincident with the upregulation of expression of both the de novo and maintenance DNMTs (La Salle et al. 2004). Maternal imprinting is established by the fully-grown oocyte stage, and ERVs also become hypermethylated during oocyte growth (Seisenberger et al. 2012). After puberty (~P25), following stimulation by FSH (Follicle-Stimulating Hormone), a subset of oocytes proceed through meiosis and become arrested at meiosis II, coincident with the completion of de novo DNA methylation. The sequence of events in female gametogenesis is briefly outlined in Table 1-2 and Figure 1-7.

DEVELOPMENTAL STAGE | NOMENCLATURE | DAYS | CELL CYCLE | EPIGENETIC EVENTS
E6.5~E12.5 | Primordial germ cells (PGCs) | 7 | Mitotic proliferation | DNA demethylation
E13.5~E17.5 | Oogonia; germ cell cyst | 5 | Meiotic progress & arrested at prophase of meiosis I | DNA hypo-methylation
E17.5~E19.5/BIRTH (P0) | Oogonia; germ cell cyst | 2 | Arrested at prophase of meiosis I |
P1~P2 (E20.5~E22.5) | Germ cell cysts breakdown (70% germ cells undergone apoptosis) | 2 | |
P3~P20 | Primary oocytes; primary follicles→secondary follicles→antral follicles; non-growing oocytes→growing oocytes→fully grown oocytes; MI | 18 | | DNA re-methylation
P20~P25 | Secondary oocytes; MII (after first polar body exclusion); GVBD (germinal vesicle breakdown, marks the progression of meiosis I from prophase) | 5 | Meiosis progress & arrested at meiosis II | Maintenance of DNA methylation
Table 1-2 Female germ cell development. Note that for most of female germ cell development, oocytes are arrested in meiosis I (following the first 7 days of pre-meiotic proliferation); while a subset of oocytes will enter meiosis II, most will die or never progress and will remain in prophase of meiosis I throughout adulthood. Many important epigenetic changes take place in arrested meiosis I oocytes, such as the aforementioned DNA re-methylation and germ cell apoptosis (germline cyst breakdown) (Pepling and Spradling 2001).

1.3.4 The piRNA-DNMT3L pathway and ERV silencing in the germ line
A number of studies have implicated small RNAs, in particular piRNAs, in silencing of ERVs in the germ line. piRNAs are 24~30-nt-long RNAs that associate with the PIWI subfamily of Argonaute proteins. piRNA biogenesis involves primary and secondary processing mechanisms (Thomson and Lin 2009). Primary RNA precursors transcribed from transposons, intergenic repetitive elements and piRNA clusters are processed by unknown factors and become mature piRNAs. These piRNAs are then processed in a germ cell-specific perinuclear structure called the chromatoid body (Kotaja and Sassone-Corsi 2007). According to their structure and components, chromatoid bodies can be further divided into pi-bodies (which contain MILI, TDRD1, GASZ, MOV10L1 and MVH) and pi-pbodies (which contain MIWI2, TDRD9, MAEL, and MVH) (van der Heijden et al. 2010). When loaded with mature piRNAs, the Piwi proteins MIWI and MILI can degrade complementary ERV transcripts, and the exchange of piRNAs between MILI and MIWI2 amplifies and extends the piRNA cycle (Thomson and Lin 2009). Intriguingly, MIWI2 loaded with piRNAs is proposed to translocate to the nucleus and induce de novo DNA methylation of the complementary DNA sequence, in association with DNMT3L (Aravin et al. 2008).
The components in the pi-body or pi-pbody are essential for piRNA processing and germ cell development. KO of any of these factors causes a dramatic reactivation of ERVs postnatally and male infertility (Ma et al. 2009; Reuter et al. 2009; Shoji et al. 2009; Pastor et al. 2014).
In contrast, the piRNA pathway is apparently not essential in female germ cells and MIWI2 is not expressed in oocytes. KO of MILI results in a 4-fold increase in expression of IAP elements, but interestingly does not affect fertility (Watanabe et al. 2008). Furthermore, KO of any single piRNA biogenesis gene has no obvious phenotype in female germ cells (Deng and Lin 2002; Kuramochi-Miyagawa 2004). Sexual dimorphism in the phenotypes of piRNA biogenesis mutants may be explained by the role of siRNAs in post-transcriptional control of ERVs, a pathway which seems to be unique to the female germline (Watanabe et al. 2008), as well as chromatin remodelers such as LSH, which has been shown to be required for genome stability during female meiosis (La Fuente et al. 2006).

1.4 Aims of the thesis
Retrotransposons proliferate within species by retrotransposition in the host’s germline, which poses a major threat to genomic integrity and can lead to catastrophic developmental defects. Transcription of these parasitic elements is reported to be inhibited in the germline by DNA methylation, which is initiated at ~E15.5 in males and after birth in females. However, the genome is progressively demethylated earlier in germline development of both sexes, reaching a low point in primordial germ cells (PGCs) at ~E13.5. This raises the question of whether repressive histone modifications play a role in silencing of retrotransposons at this stage. In particular, I hypothesized that the histone lysine methyltransferase SETDB1/ESET, which was shown previously by the Lorincz lab (Matsui et al.
2010) to be required for silencing of retrotransposons in cultured embryonic stem (ES) cells, is required for silencing of these parasitic elements in PGCs.
The main tasks that stood at the beginning of this work were thus:
1) To determine whether repressive histone modifications, in particular H3K9me3 and H3K27me3, play a role in silencing of retrotransposons in E13.5 PGCs.
2) To determine the impact of losing Setdb1 and associated H3K9me3 on genomic DNA methylation and germline development.
The main work of this thesis was therefore:
1) To develop a low cell number native ChIP-seq protocol to profile chromatin state in E13.5 PGCs.
2) To optimize DNA methylation and transcription profiling methods for low input PGCs.
3) To develop computational pipelines to intersect epigenomic datasets (i.e. histone modification, DNA methylation and expression) generated from E13.5 PGCs.
4) To use the information generated with the above tools to delineate the inter-relationship between H3K9me3, H3K27me3 and DNA methylation, and to examine their effect on retrotransposon silencing as well as their impact on germline development.

SECTION II METHODS DEVELOPMENT

Chapter 2: Survey epigenetic information of “rare” cell populations in vivo at locus-specific and genome-wide scales#

Epigenetics concerns information that is inherited through replication but independently of DNA sequence. This includes modifications deposited on histones (methyl, acetyl, and phosphate etc. on different residues of histone tails or core regions) and covalent DNA modifications (methyl, hydroxymethyl, formyl and carboxyl groups on cytosine bases), as well as proteins that bind to chromatin (chromatin readers, writers, and transcription factors, etc.).
Intersecting data on the genomic distribution of these modifications with transcriptional profiling of genes, transposons or any genomic features of interest provides a powerful means of dissecting the interplay between chromatin marks and transcriptional regulation. A number of techniques have been developed for characterizing the localization of these epigenetic marks. The most commonly used are chromatin immunoprecipitation (ChIP, Figure 2-1), which allows for the analysis of enrichment of DNA- or chromatin-binding proteins or specific histone modifications, and bisulfite conversion of DNA (Figure 2-2), which allows for the analysis of the methylation status of DNA.

For ChIP (Figure 2-1), chromatin is either cross-linked with interacting proteins (cross-linked ChIP) or processed in its native state (native ChIP), and then fragmented to single nucleosomes (through either sonication or enzymatic digestion, e.g. with MNase). Subsequently, antibodies that specifically recognize either a chromatin binding protein or a histone modification are used to immunoprecipitate or “pull-down” bound nucleosomes. Following purification of the associated DNA, qPCR or next-generation sequencing of the DNA can be carried out, revealing the target genomic regions of a chromatin binding protein (in the case of cross-linked ChIP) or the genomic regions in which a specific histone modification is enriched (in the case of native ChIP).

# The results section of this chapter is published in Nature Communications (2015), vol. 6, p. 6033. This work was done in close collaboration with Julie Brind’Amour in our lab, as well as with our collaborator Dr. Kenjiro Shirane from Dr. Sasaki’s lab.

Figure 2-1 Diagram of chromatin immunoprecipitation (ChIP). Chromatin is either cross-linked with associated proteins (cross-linked ChIP, left), or maintained in its native state (native ChIP, right), and then fragmented to single nucleosomes.
Antibodies that recognize a specific chromatin binding protein or histone modification are used to “pull-down” bound nucleosomes. DNA is extracted from these nucleosomes and then analyzed via qPCR at the locus level or next generation sequencing at a genome-wide scale. Lines, DNA; spheres, nucleosomes; U shapes, chromatin binding proteins/transcription factors; Y, antibodies.

For bisulfite conversion (Figure 2-2), purified DNA is first treated with sodium bisulfite, which deaminates cytosine to uracil while keeping 5-methylcytosine intact, and is subsequently amplified with Taq polymerase. During this process, unmethylated cytosines are converted to thymines while 5-methylcytosine (5mC) remains as cytosine. This difference can be exploited as a readout of DNA methylation state, either through Sanger sequencing of PCR products to generate locus-specific CpG methylation data, or through next generation sequencing for genome-wide CpG methylation profiling. Notably, as 5-hydroxymethylcytosine (5hmC) also remains “unchanged” (as cytosine), this method cannot be used to discriminate between 5mC and 5hmC.

[Figure 2-2 schematic. Example sequence with two methylated CpGs: aggCGgaagCaCagaggCGCtga; after bisulfite conversion: aggCGgaagUaUagaggCGUtga; after PCR amplification: aggCGgaagTaTagaggCGTtga. Locus-level branch: PCR amplify specific regions of the bisulfite converted genome with user designed primers; ligate PCR products into a Sanger sequencing plasmid; Sanger sequencing; locus-level CpG methylation status. Genome-wide branch: ligate bisulfite converted genome fragments to next generation sequencing adaptors; PCR amplify all adaptor-ligated sequences with sequencing adaptor primers; next generation parallel sequencing; genome-wide CpG methylation profile.]

Figure 2-2 Bisulfite conversion of DNA.
Sodium bisulfite deaminates cytosine to uracil while having no effect on 5-methylcytosine (or 5-hydroxymethylcytosine). After PCR amplification with Taq polymerase, the original cytosines are replaced with thymine in the end-product while 5-methylcytosines (or 5-hydroxymethylcytosines) remain “unchanged” as cytosine. This sequence difference can be used as a read-out of DNA methylation status, either through Sanger sequencing for CpG methylation at individual loci, or through next generation parallel sequencing for genome-wide CpG methylation profiling.

Locus-specific and genome-wide assays each have their advantages. Locus-specific assays such as ChIP-qPCR and bisulfite Sanger sequencing have the advantage of a quick turn-around time and significantly lower cost compared to genome-wide approaches, and are well suited for initial exploratory testing. On the other hand, genome-wide assays such as ChIP-seq (parallel sequencing of chromatin immunoprecipitated DNA), meDIP-seq (pull-down of methylated DNA using an antibody against methylcytosine or hydroxymethylcytosine, followed by parallel sequencing of the pulled-down material) and whole-genome bisulfite sequencing (WGBS, parallel sequencing of bisulfite treated genomic DNA) yield genome-wide information when global surveys of epigenetic modifications are desired.

Conventional methods for studying histone modifications (O'Geen et al. 2011; Massie and Mills 2011), DNA methylation (Li and Tollefsbol 2011) and transcription (Farrell 2009), either at the locus level or genome-wide, generally require large amounts of input material (>10^5 cells), which makes them unsuitable for the study of rare primary cell populations in vivo. Indeed, analysis of epigenetic information from small numbers of cells has proven technically challenging. A standard ChIP assay generally requires ~10^6 cells (Mikkelsen et al. 2007; Barski et al. 2007). Scaled-down protocols have been developed for 10^3 to 10^5 cells (Adli et al.
2010); however, most of these methods include crosslinking and pre-amplification of ChIP material before library construction. To address this need, we developed an ultra low input (ULI) ChIP protocol, which requires as few as 1000 cells per assay at either the locus-specific or the genome-wide level. We also optimized conventional DNA methylation- and transcription-profiling methods for low input material; together these make up our “SmallCell” protocol package for surveying epigenetic information from rare cell populations. Such ChIP results can be analyzed in parallel with whole genome bisulfite sequencing (WGBS) DNA methylation data and/or expression data (RNA-seq), yielding important insights into the interplay between these features in these rare cell populations.

2.1 Design and implementation

Our goal was to optimize techniques for analysis of histone modifications, DNA methylation and expression from relatively small numbers of cells. We chose 1000 cells as the goal, as this is the order of magnitude of primordial germ cells that can be obtained from gonads at the developmental stage in which we were most interested (namely E13.5). We also set out to develop a method that would allow for the isolation and storage of cells of interest for further processing at a later time point. The workflow contains three main steps:
1) Prepare cells in single-cell suspension, sort cells of interest via FACS and store isolated cells in 1000 cell “batches” for further processing;
2) Optimize the molecular biology steps for analysis of histone modifications and expression from 1000 cells for locus-specific exploratory analysis; and
3) Optimize library construction techniques for these isolated samples of chromatin or RNA to enable genome-wide profiling using NGS (next generation sequencing) on the Illumina platform.

2.1.1 Cell collection

Strategies for isolating cell types of interest vary significantly.
Generally, specific cell types are sorted using FACS via immunostaining of cell surface markers, or using GFP or an alternative intrinsically fluorescent cell marker, if available. For our studies, cells were sorted into different media depending on the intended downstream application: for native ChIP and bisulfite conversion, cells were sorted into nuclei extraction buffer (Farrell 2010), while for analysis of RNA, cells were sorted directly into Trizol (Chomczynski and Sacchi 1987; 2006). If cells are to be used for cross-linked ChIP, they should be sorted into PBS and then cross-linked. Sorted cells are then snap-frozen in liquid nitrogen and stored at -80 °C for future use.

For this thesis project, I was interested in studying the role of the histone lysine methyltransferase SETDB1 in transcriptional silencing of retrotransposons in mouse primordial germ cells (PGCs) at embryonic day (E) 13.5. Previous reports revealed that an average of ~10,000 PGCs could be collected at this developmental stage when using Oct4-GFP as a marker for FACS (Hajkova et al. 2002). However, I was able to collect only ~3000-6000 PGCs per embryo at this stage via FACS using an antibody which specifically recognizes the primordial germ cell specific marker SSEA1 (Matsui and Tokitake 2009). Since my goal was to assay more than one histone mark in each experiment (i.e. from an individual embryo), I decided to sort cells in aliquots of 3000 cells per tube, enough for three assays per sample.

2.1.2 Locus-specific assays of low input samples for exploration or validation

For exploratory testing of epigenetic marks and expression in rare cell populations, preliminary analysis of specific loci is prudent, due to the short “turn-around” time for results and validation and the relatively low cost of PCR-based assays compared to next-generation sequencing.
ChIP and preparation of cDNA take only ~2-4 days to complete and cost a fraction of an Illumina sequencing lane. PCR-based assays are also valuable for validation of results obtained by genome-wide methods.

2.1.2.1 Low input chromatin immunoprecipitation (ChIP)

The specificity of the signal in a ChIP experiment is determined by the specificity of the antibody, while the affinity of the antibody determines the amplitude of the signal. An antibody with good specificity and affinity for the antigen of interest is key for a successful ChIP. For any given antibody used in a ChIP assay, the specificity and affinity are fixed. Procedures that increase the signal frequently also increase the background (non-specific signal), while procedures that reduce the background often reduce the signal. To scale down the amount of input material required for ChIP, we tried to optimize each step to find a balance between signal and background at a ratio we deemed acceptable. We referenced a variety of ChIP related protocols, in particular a low input cross-link ChIP protocol (Dahl and Collas 2008), a commercial kit for relatively low input (Diagenode LowCell#ChIP Kit kch-maglow-016), a conventional native ChIP protocol used in our lab (see Availability and Future direction section), and a strand-seq protocol (Falconer et al. 2012), to achieve a balanced signal-to-background ratio, as the number of cells and amount of input material available for our study is fairly low. Below, I list some of the critical observations we made when developing the experimental conditions (steps 1-3), and changes that we introduced that were not rigorously tested but are based primarily on what we consider good practice (procedures we think in theory will increase the signal and reduce background) (steps 4-7), to serve as a reference for researchers who may be interested in further development of this protocol.

Figure 2-3 Diagram of ChIP protocol for rare cell populations.
Major steps in the ChIP protocol are shown. Time required for each step is included in parentheses.

[Figure 2-3 panel labels, in extraction order. Steps: binding antibodies to magnetic beads; chromatin shearing and nuclear membrane solubilization; chromatin pre-clearing; magnetic immunoprecipitation; washes; DNA isolation; qPCR validation; cell collection, lysis and storage. Durations: (2h); (30m + rotation at 4 °C for 2h); (40m); (30m + 1h); (30m + rotation at 4 °C overnight); (1h + incubate at 65 °C for 1.5h); (30m + incubate at -20 °C for 30min); (30m + 2h30min qPCR run time). Protocol version stamp: Sheng Liu V04152013.]

1. Conditions that we have tested and found critical:

1) Binding antibodies to magnetic beads
Figure 2-3 depicts a diagram of our ChIP protocol for rare cell populations, which is similar to the conventional ChIP protocol except that in the latter, antibodies and magnetic beads are not bound beforehand, but rather incubated together with chromatin at the “magnetic immunoprecipitation” step. The strategy of pre-binding antibody to beads, rather than co-incubating antibody, beads and chromatin, proved to be critical for the success of our low input ChIP protocol. As shown in the Sequencing Data List (see Availability and Future direction section), of two ChIP experiments run in parallel in ChIPseq#1, “Ab.Bds.Incubate” failed to produce quality reads while “ab.Bds.preBind” yielded the first 1000 cell low input ChIPseq data in PGCs. We believe that pre-binding the antibody to the beads helps to create more effective antibody-bead conjugates than the conventional co-incubation method. Furthermore, this additional step also removes excess antibody, thereby reducing background, which may contribute to the success of the assay.

2)
Chromatin shearing and nuclear membrane solubilization
Another change we made to the conventional protocol was adding a membrane solubilization step to release all nucleosomes from the nuclei after chromatin shearing with MNase, instead of depending upon free nucleosomes shuttling out through the nuclear pores, as in the conventional method. This step also proved to be critical: when we tested the protocol without membrane solubilization, the low input ChIP failed, suggesting that this step helps to increase the yield of nucleosomes relative to the conventional ChIP method.

3) Chromatin pre-clearing
There are two common ways of pre-clearing to minimize non-specific binding of proteins: the first is to incubate magnetic beads with chromatin without antibody, and the second is to incubate magnetic beads and chromatin with IgG antibody. The latter approach failed in our low input ChIP assay, suggesting that the IgG-bead conjugate “pulled-down” too much chromatin non-specifically during this step, given the limited total amount of chromatin.

2. Conditions we accept as good practice (procedures we think in theory will increase the signal and reduce background) without rigorous testing:

4) Magnetic immunoprecipitation
The “immunoprecipitation” step is described in Figure 2-1. We designed the steps described above to yield 100 µL of ChIP media for chromatin isolated from 1000 cells. We keep the media volume at 100 µL and use 100 µL volume siliconized strips. This protocol keeps nucleosomes and antibody-bead conjugates in suspension throughout the process; we also discarded the tube without washing off nucleosomes bound to the tube wall, as recommended (Dahl and Collas 2008), to reduce the background (non-specific signal).

5) Washes
We added an additional low salt wash compared to the conventional ChIP method to further reduce the background.

6)
DNA isolation
After ChIP with a chromatin specific antibody, only a fraction of the genome is pulled down, with the specific level dependent in part on the abundance of the epitope of interest. Presuming that each cell has ~6 pg of DNA, 1000 cells would provide 6 ng of input material. Depending on the enrichment level of the histone mark in the genome and the specificity and affinity of the antibody, various fractions of this 6 ng can be pulled down. With the H3K9me3 specific antibody we used most frequently (Active Motif 39161), we generally immunoprecipitated (IPed) ~10% of the total input material (0.6 ng). Among the methods we tested for DNA extraction following the IP, phenol-chloroform extraction was the most cost-effective and yielded the largest amount of material for further analysis.

7) qPCR validation
Due to the low amounts of DNA recovered (for example, ~0.6 ng after ChIP with the H3K9me3 Active Motif 39161 antibody), qPCR is challenging for single copy sequences. However, this method works well for repetitive sequences, which are present in hundreds to thousands of copies in the genome. In our hands, the yield of purified DNA from ChIP with the H3K9me3-specific antibody is sufficient for 48 PCR reactions, which is enough for 16 amplicons assayed in technical triplicate. If surveying more genomic regions is required, building libraries for genome-wide analysis is more practical, although more costly and potentially more time consuming (the precise turn-around time depends on the sequencing resources).

2.1.2.2 Low input bisulfite conversion and Sanger Sequencing

Bisulfite conversion has been the “gold standard” for assessing DNA methylation levels for a number of years. Figure 2-4 shows a general diagram of the steps involved. We used a commercial kit (EZ DNA Methylation-Direct Kit D5021, Zymo Research) for the purpose of rare cell population analysis and optimized the conversion and amplification steps for low input samples.
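To make the bisulfite readout concrete, the following Python sketch simulates conversion and PCR on the short example sequence from the Figure 2-2 schematic and then infers CpG methylation by comparing the converted read with the unconverted reference. It is a toy model under simplifying assumptions (complete conversion, top strand only, no sequencing errors); the function names and the chosen methylated positions are illustrative and are not part of the kit or of the protocol used in this thesis.

```python
def bisulfite_pcr(seq, methylated):
    '''Return the top-strand sequence after bisulfite conversion and PCR.

    seq: reference DNA in upper case.
    methylated: set of 0-based positions carrying 5mC (or 5hmC, which this
    readout cannot distinguish from 5mC).
    Unmethylated C is read as T; methylated C is still read as C.
    '''
    return ''.join(
        'T' if base == 'C' and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_cpg_methylation(reference, read):
    '''Compare a converted read with the reference at every CpG cytosine.

    Returns {position: bool}; True means the C was retained (methylated),
    False means it was read as T (unmethylated).
    '''
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == 'CG':
            if read[i] == 'C':
                calls[i] = True
            elif read[i] == 'T':
                calls[i] = False
    return calls


reference = 'AGGCGGAAGCACAGAGGCGCTGA'   # sequence from the Figure 2-2 schematic
methylated_positions = {3, 17}          # assumption: both CpG cytosines methylated
read = bisulfite_pcr(reference, methylated_positions)
calls = call_cpg_methylation(reference, read)
percent_methylated = 100.0 * sum(calls.values()) / len(calls)
```

In practice several cloned reads per amplicon are compared with the reference in this way, and the per-CpG calls are averaged across clones.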
As there are no substantial changes to the conventional protocol, I will not discuss the steps in detail.

Figure 2-4 Diagram of Bisulfite Sanger Sequencing protocol for rare cell populations. Major steps are shown in bold; time required for each step is indicated in parentheses.

[Figure 2-4 steps:
Day 1: Bisulfite conversion of DNA (6h30m)
Day 2: Amplify regions of bisulfite converted DNA by nested PCR (5h)
Day 3: Purification of bisulfite converted DNA by gel electrophoresis (4h)
Day 4: Cloning of bisulfite converted DNA with sequencing plasmid (ligation, 2h; transformation, 2h; plate colony formation, overnight; culture picked colony, overnight)
Day 5: Plasmid preparation and digestion test (mini prep, 2h; digestion, 30min; gel electrophoresis, 30min)
Sanger sequencing
Data analysis]

2.1.2.3 Low input RNA isolation and reverse transcription

A number of commercial RNA isolation kits (Ambion RNAqueous Micro Kit, Qiagen RNeasy Micro RNA Kit, and Arcturus PicoPure RNA Kit) are available for isolating RNA from low input material. Here we provide a protocol that uses only TRIzol Reagent (15596-026 from Life Technologies) to extract RNA from 1000 cell aliquots. The basic steps of RNA isolation are illustrated in Figure 2-5. Note that depletion of rRNA using Ribominus (RiboMinus Eukaryote Kit, A10837-08, from Life Technologies) is only required when one wants to make sequencing-ready cDNA; this may be desirable for material that is very precious.

Figure 2-5 Diagram of RNA isolation and reverse transcription protocols used for rare cell populations. Steps are highlighted in bold; time required for each step is indicated in parentheses. Steps with an asterisk are optional for making sequencing-ready RT materials.
[Figure 2-5 steps: RNA isolation and DNase I treatment (2h); depletion of rRNA by Ribominus* (2-3h); reverse transcription and first strand synthesis (2h); second strand synthesis (2h); qPCR validation (2h).]

2.1.2.4 Quality control for low input assays

“Spike-in” RNAs are synthesized RNAs with known copy number (e.g. Affymetrix Poly-A RNA Control, 900433). They can be added to the sample before RNA isolation and subsequent steps to evaluate the efficiency of the protocol. For ChIP, it may be useful to add reconstituted nucleosomes, which include specific histone modifications of interest and have a known DNA sequence. At present, the cost of reconstituting such nucleosomes is too high for their use as a control; however, this may change in the near future. An alternative would be to link a peptide to specific DNA sequences and use this peptide-DNA conjugate as a control for the immunoprecipitation process. We have not tested such controls. Instead, we used chromatin from other cells as a control for the quality of the chromatin; this, however, cannot report the efficiency of the ChIP. For bisulfite Sanger sequencing, the commercial kit (EZ DNA Methylation-Direct Kit D5021 from Zymo Research) already includes methylated DNA for use as a control. We recommend using the controls described above in all assays, particularly during the development stage, as they help to identify the step at which a problem arises if the protocol does not work, and indicate how well the protocol performs if it does work.

2.1.3 Genome-wide transcriptome and epigenome profiling of “rare” cell populations

2.1.3.1 Library construction of material from low-input assays

Unlike most low-input library construction protocols, which include an amplification step of the input material, the libraries we constructed were built directly from the material yielded from the low input assays described above.
This approach yields a significantly higher fraction of unique reads compared with protocols that include an amplification step. Libraries for ChIP and RNA analysis were constructed using a modified paired-end (PE) protocol (Illumina) based on a previously published strand-seq protocol (Falconer et al. 2012). Fragmented DNA, either from the ChIP assay or sonicated cDNA from RNA isolation and reverse transcription, is first end-repaired and A-tailed. Illumina PE adaptors (A1 and A2 as shown in Figure 2-6) are then ligated to these fragments. Ligated DNA fragments are amplified using Illumina paired-end primers (P1 and P2 as shown in Figure 2-6). We synthesized P2 with index sequences to distinguish different libraries when pooling multiple libraries (multiplexing) on the same flow cell for Illumina sequencing.

Figure 2-6 Library construction from low input assays for the Illumina sequencing platform. Fragmented DNA, either from the ChIP assay or sonicated cDNA from RNA isolation and reverse transcription, is first end-repaired and A-tailed, and then ligated to Illumina PE (paired-end) adaptors. Indexed Illumina primers are used during the PCR amplification step to enable multiplexing and paired-end sequencing. A1, Illumina paired-end adaptor 1; A2, Illumina paired-end adaptor 2; P1, Illumina paired-end primer 1; P2, indexed Illumina paired-end primer 2; 5’P, 5’ phosphate.

Because bisulfite conversion introduces nicks in DNA, library construction has to be done differently for WGBS. There are different strategies and several published protocols for dealing with this problem (Gu et al. 2011; Miura et al. 2012; Wang et al. 2013; Ekram and Kim 2014). We chose the post-bisulfite adaptor tagging (PBAT) WGBS method (Miura et al. 2012) for this study, which attaches library adaptors after bisulfite conversion. Dr. Kenjiro Shirane in Dr.
Hiroyuki Sasaki’s lab at Kyushu University carried out the PBAT analysis on DNA isolated from PGCs, which I isolated from mouse embryos and sent to him.

2.1.3.2 Quality control for library construction

Library construction is time consuming and involves numerous DNA extraction and enzymatic steps, which makes this process prone to handling errors. Therefore, proper controls are essential for trouble-shooting. We routinely use MNase digested input material as a control for the library construction process. Spike-in controls added before the library construction process can also be used to evaluate the efficiency of the protocol, revealing what percentage of the original DNA is represented in the library and in turn the “diversity” of the sampled material. The controls used in low-input assays are also useful for genome-wide analyses, as they can be used as internal normalization standards with great accuracy.

2.2 Results

2.2.1 Complexity of ULI NChIP-seq libraries from 10^3-10^5 mESCs

To improve the yield of chromatin isolated from small samples, we optimized a dilution-based NChIP-seq (Native ChIP-seq) procedure that can easily be adjusted to cell sample size. A comparison of our method to standard NChIP-seq and low-input XChIP-seq (Cross-linked ChIP-seq) protocols highlighting steps improved to prevent sample loss is presented in Figure 2-7a. ULI (Ultra Low Input) NChIP-seq allows for sorting of cells directly into a detergent-based nuclear isolation buffer, enabling extended sample storage or pooling of samples. Importantly, unlike most low-input XChIP-seq methods, no pre-amplification of ChIP material is required prior to library construction, minimizing the generation of PCR artifacts.

Figure 2-7 A native ChIP-sequencing protocol to generate genome-wide chromatin maps from low cell numbers.
(a) Overview of our improved ultra-low-input NChIP-seq protocol and comparison to previously published NChIP-seq and low input ChIP-seq protocols. Gray: steps associated with sample loss; orange: steps optimized to minimize sample loss. Extrapolation (Daley and Smith 2013) of H3K9me3 (b), H3K27me3 (c) and H3K4me3 (d) library complexity based on low input and standard ChIP-seq libraries sequenced at various depths. SE reads, single-end reads.

[Figure 2-7 is reproduced from Figure 1 of the published article (Nature Communications 6:6033, 2015; DOI: 10.1038/ncomms7033). Panel (a) compares three workflows: “gold standard” NChIP-seq (10^6-10^7 cells), low-input NChIP-seq (10^3-10^5 cells) and low-input ChIP-seq (10^4-10^5 cells). Panels (b)-(d) plot distinct read pairs (millions) against total read pairs (millions) for each number of input cells.]

Using this protocol, we prepared H3K9me3 NChIP-seq libraries from 10^3-10^5 mouse ESCs (Supplementary figure 2-1a). To serve as a reference, we also generated an H3K9me3 library
from 10^6 ESCs using a previously described native ChIP-seq (“gold-standard”) protocol (Maunakea et al. 2010). All libraries were indexed, pooled, and paired-end sequenced (100 bp reads). Depending on the input and the number of libraries pooled on a single lane, we obtained 45-145 million reads per library. We evaluated library complexity by comparing the total number of distinct reads to the number of duplicate and unaligned reads in each library (Supplementary figure 2-1b). Unmapped reads represented 7-15% of all reads, independent of sequencing depth or input size, suggesting that the low number of PCR cycles (8-10) employed for library amplification introduced relatively few PCR artifacts. The H3K9me3 library prepared from 10^6 cells was sequenced the deepest (~147 million reads) and also had the highest proportion of duplicates (28%). Independent of sequencing depth (45-100 million reads) or the number of input cells, ultra-low-input libraries prepared from 10^3-10^5 cells had a total of 21-25% uniquely and multi-aligned duplicate reads, suggesting that these libraries were sufficiently complex for deeper sequencing (Supplementary figure 2-1a). As we are comparing libraries with different sequencing depths, we used the PreSeq package (Daley and Smith 2013) to extrapolate and compare the potential complexity of our libraries (Figure 2-7b). Although our H3K9me3 libraries built from 10^3-10^5 cells display a lower potential complexity than our “gold-standard” library (Figure 2-7b, top panel), all could potentially be sequenced several times deeper than the ~20 million distinct reads recommended to generate high quality profiles for such broad chromatin marks.
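The duplicate fractions quoted for these libraries come down to simple bookkeeping over aligned read pairs. The sketch below is a simplified illustration in Python, assuming that two read pairs are duplicates when they share chromosome, strand and both fragment ends; the function name and the toy input are hypothetical, and the extrapolation of complexity to greater sequencing depths performed by PreSeq is not attempted here.

```python
from collections import Counter


def summarize_library(fragments):
    '''Summarize library complexity from aligned paired-end fragments.

    fragments: iterable of (chrom, start, end, strand) tuples, one per
    aligned read pair. Pairs with identical coordinates and strand are
    treated as duplicates of a single original molecule.
    '''
    counts = Counter(fragments)
    total = sum(counts.values())
    distinct = len(counts)
    duplicates = total - distinct
    return {
        'total': total,
        'distinct': distinct,
        'duplicates': duplicates,
        'duplicate_fraction': duplicates / total if total else 0.0,
    }


# Toy input: five read pairs, two of which repeat an earlier fragment.
toy = [
    ('chr1', 100, 250, '+'),
    ('chr1', 100, 250, '+'),
    ('chr1', 100, 250, '-'),
    ('chr2', 500, 660, '+'),
    ('chr2', 500, 660, '+'),
]
summary = summarize_library(toy)
```

Because the observed duplicate fraction rises with sequencing depth, libraries sequenced to different depths are better compared through an extrapolated complexity curve than through this raw fraction alone.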
In addition, we prepared H3K27me3 NChIP-seq libraries from 10^3-10^5 ESCs using similar conditions, and obtained 29-42 million distinct reads per library, with ~10% unmapped reads and only 3-8% total duplicate reads in each case (Supplementary figure 2-1c). As for the H3K9me3 libraries, using PreSeq (Daley and Smith 2013) to extrapolate the potential complexity of these libraries indicates that, even with the lowest input, all of the H3K27me3 libraries could be sequenced to several times the depth required to obtain high-quality profiles (Figure 2-7c).
To determine whether this method can be used to create profiles for active histone marks, we next generated ULI NChIP-seq data for H3K4me3. As H3K4me3 is less abundant than H3K9me3 and H3K27me3, H3K4me3 libraries were amplified for 2-4 additional PCR cycles in order to obtain sufficient material for sequencing (Supplementary figure 2-1a). Deep sequencing (37.7 million reads) of an H3K4me3 library built from 10^5 cells showed under 10% unaligned reads and 36% total duplicate reads (Supplementary figure 2-1d). Shallow sequencing of H3K4me3 libraries prepared from 5×10^3 and 10^4 cells (9.5 and 7.8 million reads, respectively) showed an increased proportion of unaligned reads (55-70%), indicative of lower complexity libraries. As this was a shallow round of sequencing, the proportion of duplicate reads remained very low (<5%). Extrapolation of potential library complexity indicates that, despite the increased proportion of unaligned reads, deeper sequencing of these libraries could generate enough reads to saturate H3K4me3 peaks (Figure 2-7d).
2.2.2 Correlation between ULI and standard-input NChIP-seq libraries
Visual inspection of NChIP-seq profiles from randomly chosen regions shows similar enrichment in libraries built from 10^3-10^6 cells (Figure 2-8a-c).
We compared H3K9me3 enrichment in genome-wide 2 kb bins and calculated Pearson correlation coefficients to assess the similarity between ultra-low-input and standard NChIP-seq libraries (Figure 2-8d and Supplementary figure 2-2a). H3K9me3 libraries built from 10^3-10^5 cells had correlations ranging from 0.83 to 0.9 when compared to the “gold-standard” H3K9me3 NChIP-seq library. As expected, low-input libraries had modestly higher background levels, as illustrated by an increase in variance (Supplementary figure 2-2c). We next defined regions enriched for H3K9me3 using MACS (Zhang et al. 2008). Of all H3K9me3 peaks identified in our “gold-standard” library, 76-85% were also detected in libraries generated from 10^3-10^5 cells (Supplementary figure 2-3a and c). Consistent with previous reports showing that specific endogenous retroviruses (ERVs) are marked and silenced by H3K9me3 (Karimi et al. 2011; Matsui et al. 2010), our “gold-standard” and ultra-low-input libraries show H3K9me3 enrichment at the same subset of ERV1 and ERVK subfamilies (Supplementary figure 2-4a), in the unique 1 kb 5’ flank of ERVKs (Supplementary figure 2-4b), and at individual IAP ERVK elements (Supplementary figure 2-4c).
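The binned comparison used for the genome-wide correlations can be sketched in a few lines: convert per-bin read counts to RPKM, then compute a Pearson coefficient between two libraries. This is an illustration under stated assumptions, not the analysis code used in the thesis; the counts are hypothetical, and the total number of mapped reads is approximated here by the sum of the toy bin counts.

```python
import math

def rpkm(count, bin_length_bp, total_mapped_reads):
    # Reads per kilobase of bin per million mapped reads.
    return count * 1e9 / (bin_length_bp * total_mapped_reads)

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length sequences.
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError('need two sequences of equal length >= 2')
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError('correlation undefined for constant input')
    return sxy / math.sqrt(sxx * syy)

# Hypothetical read counts in four 2 kb bins for two libraries;
# library B was sequenced twice as deeply as library A.
counts_a = [10, 40, 5, 80]
counts_b = [20, 80, 10, 160]
rpkm_a = [rpkm(c, 2000, sum(counts_a)) for c in counts_a]
rpkm_b = [rpkm(c, 2000, sum(counts_b)) for c in counts_b]
r = pearson(rpkm_a, rpkm_b)
print(round(r, 3))
```

Normalizing to RPKM removes the depth difference between the two toy libraries, so they correlate perfectly; in real data, background reads in low-input libraries pull the coefficient below 1.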
Discussion
We present a rapid ULI-NChIP-seq procedure, which can be carried out with as few as 10^3 cells, without sacrificing complexity
Figure 2-8 Correlation between standard and ultra-low-input native ChIP-seq libraries built from 10^3 to 10^5 ESCs. IGV genome browser (Thorvaldsdóttir et al. 2013) screenshots of H3K9me3 (a), H3K27me3 (b) or H3K4me3 (c) profiles at the indicated genomic locations. (d) Genome-wide Pearson correlation (50,000 random 2 kb bins) between H3K9me3, H3 and H3K27me3 enrichment (RPKM) in datasets generated from 10^3 to 10^6 cells as input material. *: library prepared using the “gold-standard” NChIP-seq protocol. ** and ***: technical replicates. †: obtained from (Rugg-Gunn et al. 2010). (e) Pearson correlation of H3K4me3 enrichment (RPKM) around genic promoters (RefSeq transcription start site +/- 500 bp) between ULI NChIP-seq libraries generated from 5×10^3 to 10^5 cells and ENCODE libraries (Consortium et al. 2012) in E14 mouse ESCs.
Similarly, H3K27me3 libraries built from 10^4-10^5 cells were highly correlated, with a genome-wide correlation (2 kb bins) of 0.9. Likely due to a modest increase in background levels, the library built from 10^3 cells had correlations of 0.77 and 0.78 to the libraries built from 10^4 and 10^5 cells, respectively (Figure 2-8d and Supplementary figure 2-2b-c).
Regardless, H3K27me3-enriched regions showed good agreement between libraries, with 80% and 70% of peaks detected in our 10^5-cell input library overlapping with peaks detected in our libraries built from 10^4 and 10^3 cells, respectively (Supplementary figure 2-3b and d). We next compared H3K27me3 enrichment levels around transcription start sites (TSSs), as H3K27me3 marks the promoter regions of bivalent or silenced genes. Libraries from all input sizes showed high correlation to each other (0.86-0.96), and H3K27me3 enrichment at gene promoters was correlated with relatively low levels of gene expression, as expected (Supplementary figure 2-5a-c).
As H3K4me3 is a narrow chromatin mark present at actively transcribed and bivalent genes, we compared H3K4me3 enrichment around transcription start sites (+/- 500 bp) between ULI NChIP-seq and ENCODE (Consortium et al. 2012) libraries (built from E14 ESCs). We obtained high Pearson correlation coefficients among our libraries built from 5×10^3 to 10^5 cells (0.90 to 0.96), and good correlations (0.71 to 0.83) with ENCODE libraries (Figure 2-8e). The lower correlation with ENCODE libraries is presumably due in part to the large difference in sequencing depth, as well as the different ES cell lines and antibodies used. Visual inspection of ChIP-seq profiles reveals that the same promoters are generally marked, but with varying intensities (Figure 2-8c). While preliminary attempts to generate H3K4me3 profiles from 10^3 cells did not yield sufficient coverage (data not shown), further optimization of the ChIP conditions for this mark will likely improve the resolution of signal above background.
2.2.3 Sex-specific H3K27me3 profiles in PGCs isolated from single embryos
As low-input methods are particularly useful for the study of cell types present in limited numbers in vivo, we validated our method on PGCs, the precursors to mature gametes.
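The peak-overlap percentages quoted in this section are fractions of reference peaks that intersect at least one peak in another library. A minimal sketch of that calculation follows, assuming peaks are given as (chromosome, start, end) tuples with half-open coordinates; the peak lists are hypothetical, and the quadratic scan is chosen for readability rather than speed.

```python
def fraction_overlapping(reference, query):
    # Fraction of reference peaks that overlap at least one query peak.
    # Peaks are (chrom, start, end) tuples with half-open coordinates.
    if not reference:
        raise ValueError('reference peak list is empty')
    by_chrom = {}
    for chrom, start, end in query:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in reference:
        candidates = by_chrom.get(chrom, [])
        if any(q_start < end and start < q_end for q_start, q_end in candidates):
            hits += 1
    return hits / len(reference)

# Hypothetical peaks: the reference stands in for a high-input library,
# the query for a low-input library.
reference = [('chr1', 100, 200), ('chr1', 500, 600), ('chr2', 100, 200),
             ('chr2', 900, 1000), ('chr3', 0, 50)]
query = [('chr1', 150, 250), ('chr1', 590, 700), ('chr2', 100, 120),
         ('chr2', 1000, 1100)]
frac = fraction_overlapping(reference, query)
print(f'{frac:.0%} of reference peaks are also detected')
```

The measure is asymmetric: swapping the two lists answers a different question, namely what fraction of the low-input peaks is supported by the high-input library.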
To determine how sex-specific H3K27me3 profiles correlate with sex-specific gene expression, we used ULI NChIP-seq datasets prepared from 10^3 PGCs purified from the gonads of single male and female E13.5 embryos (Liu et al. 2014). Comparison to previously published low-input H3K27me3 datasets generated from 5.2×10^4 to 1.8×10^5 PGCs in two independent studies (Lesch et al. 2013; Ng et al. 2013) reveals that our method yielded similar or greater sequencing depth while minimizing total duplicate generation (<15%) (Figure 2-9a). Of note, while a fraction of the reads labeled as duplicates are likely due to preferred MNase cleavage sites, the use of paired-end rather than single-end sequencing for ULI NChIP-seq allows for improved discrimination between technical (PCR) and biological duplicate reads. While H3K27me3 enrichment patterns around and upstream of the HoxC cluster are broadly similar to those described by Ng et al. (2013) and Lesch et al. (2013) (Figure 2-9b), our method yields higher resolution maps, likely due to a combination of a high number of distinct reads, longer reads and a lower number of PCR amplification cycles employed during library construction. In addition, fragmentation of chromatin using MNase generates smaller and more uniformly sized fragments than does sonication of cross-linked chromatin, while the use of paired-end sequencing allows for the determination of true fragment size. Relative H3K27me3 enrichment around all annotated transcription start sites (TSS +/- 2 kb) was similar to previously published data (Lesch et al. 2013; Ng et al. 2013) (Figure 2-9c), with Pearson correlations between 0.68 and 0.85. Of note, the more deeply sequenced of the two female libraries from Lesch et al. showed greater correlation to the female H3K27me3 dataset generated here (0.69) than to its replicate library (0.51) (Figure 2-9c).
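The point about paired-end sequencing and duplicate discrimination can be made concrete with a toy example. With single-end reads only one fragment end is observed, so distinct fragments that share a preferred MNase cut site look identical; with paired-end reads both ends are known. The sketch below is a deliberate simplification (it ignores strand and mapping quality, which real duplicate-marking tools take into account), and the fragment coordinates are hypothetical.

```python
def count_duplicates(fragments, paired_end):
    # Count fragments flagged as duplicates of an earlier fragment.
    # fragments: list of (chrom, start, end) tuples.
    # paired_end=True compares both fragment ends; False compares only the start.
    seen = set()
    duplicates = 0
    for chrom, start, end in fragments:
        key = (chrom, start, end) if paired_end else (chrom, start)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates

# Three fragments start at the same (hypothetical) preferred cut site,
# but only two of them are identical over their whole length.
fragments = [('chr1', 100, 247), ('chr1', 100, 262),
             ('chr1', 100, 247), ('chr1', 300, 450)]
se_dups = count_duplicates(fragments, paired_end=False)
pe_dups = count_duplicates(fragments, paired_end=True)
print(se_dups, pe_dups)
```

In this example the single-end view flags two fragments as duplicates, whereas the paired-end view flags only the one fragment that matches at both ends.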
Figure 2-9 High-resolution gender-specific H3K27me3 profiles generated from E13.5 PGCs isolated from single embryos. (a) Library complexity of H3K27me3 datasets prepared from E13.5 PGCs using ULI NChIP-seq compared to previously published datasets generated using alternative low-input XChIP-seq protocols. *: (Liu et al. 2014); **: (Ng et al. 2013); ***: (Lesch et al. 2013). The number of input cells is indicated above each bar (Rep.: repeat). (b) Genome browser screenshots of H3K27me3 enrichment around the HoxC cluster illustrating the complexity of libraries generated using ULI NChIP-seq and their correlation to previously published datasets (Lesch et al. 2013; Ng et al. 2013) generated using an alternative low-input ChIP-seq protocol. *: (Liu et al. 2014); **: (Ng et al. 2013); ***: (Lesch et al. 2013). (c) Pearson correlations between H3K27me3 data generated from 10^3 male PGCs using our low-input protocol and previously published data around gene promoters (RefSeq transcription start site +/- 2 kb). *: (Liu et al. 2014); **: (Ng et al. 2013); ***: (Lesch et al. 2013).
or resolution (refs. 5-10). Despite the small input size, libraries generated with this method show high resolution and complexity comparable to libraries built with 10^6 cells. Indexing and pooling multiple libraries per sequencing lane not only minimizes sequencing costs but also eliminates the need for pre-amplification of raw ChIP material, which in combination with low PCR cycles at the library construction step reduces the fraction of duplicates and unaligned reads generated. Moreover, the protocol presented here is flexible, allowing freezing, storing and pooling of samples prepared on different days, a valuable feature when working with in vivo samples. ULI-NChIP-seq may also be useful for analysis of non-histone proteins, including transcription factors, that can be immunoprecipitated in the absence of crosslinking (ref. 26).
Using this ULI-NChIP-seq method, we generated H3K27me3 libraries in PGCs isolated from single male and female embryos (ref. 19). While these data sets are correlated with previously published data generated from PGCs pooled from multiple embryos (refs. 12, 13), ULI-NChIP-seq data sets show improved resolution and a reduced proportion of reads flagged as duplicates, highlighting the benefit of minimizing the number of library amplification cycles and paired-end sequencing. Intersection of our high-resolution NChIP-seq libraries with low-input RNA-seq profiles allowed us to identify a subset of differentially expressed genes that are marked in a sex-specific manner by H3K27me3 in E13.5 PGCs, including both previously identified targets of polycomb group (PcG)-dependent silencing and novel candidates.
While it is possible to pool rare samples to generate ChIP-seq libraries, obtaining sufficient cell numbers for previously published ‘low-input’ protocols (>10^4 cells) can be impractical. For example, in our recently published study (ref. 19), only ~3×10^3 and ~6×10^3 SSEA1+ PGCs could be purified by fluorescence-activated cell sorting (FACS) from single male and female wild-type embryos, respectively. In genetically manipulated animals, cell viability can be impacted, decreasing sample yield yet further. Furthermore, embryos with the desired genotype may represent only a small fraction of each litter, so the ULI-NChIP-seq method presented here minimizes the breeding colony size required for genome-wide analyses. As multiple histone marks can be profiled simultaneously with transcription in individual embryos, the variability inherent in studies of cell types that are in the process of transcriptional reprogramming in association with developmental stage is also minimized. ULI-NChIP-seq should also be useful for studies of clinical samples, where cell numbers are frequently limiting.
Methods
Cell culture and isolation. TT2 mouse ESCs (ref. 27) were cultured in DMEM supplemented with 15% fetal bovine serum (HyClone), 20 mM HEPES, 0.1 mM non-essential amino acids, 0.1 mM 2-mercaptoethanol, 100 U ml^-1 penicillin,
Intriguingly, while male and female E13.5 PGCs have distinct differentiation programs and transcription patterns (Lesch et al. 2013; Jameson et al. 2012; Seisenberger et al. 2012), our results indicate that their H3K27me3 distribution profiles are broadly similar (Supplementary figure 2-6 and Liu et al. 2014). Using our ULI NChIP-seq datasets, we therefore sought to identify sex-specific H3K27me3-marked promoters associated with gene silencing in E13.5 PGCs. In both males and females, H3K27me3 around TSSs was associated with low levels of transcription (Figure 2-10a-b and Liu et al. 2014). Most genic promoters harboring H3K27me3 in male PGCs are also marked in females and vice versa, with approximately two-thirds of those also marked in ESCs (Supplementary figure 2-7). Interestingly, a relatively large number of promoters (~1,500) are enriched for H3K27me3 exclusively in female PGCs, while a smaller number (~270) are enriched exclusively in male PGCs. While most of the genes marked in a sex-specific manner are silenced in both male and female PGCs, we identified a subset of sex-specific H3K27me3-marked genes that show an inverse relationship with expression in PGCs (Figure 2-10c-e). In accordance with female E13.5 PGCs preparing to initiate meiosis I and male PGCs undergoing mitotic arrest (Koubova et al. 2006), several meiotic genes, including Lfhg and Stra8 (Baltus et al. 2006), show a higher level of expression in female PGCs and, conversely, a higher level of H3K27me3 in male PGCs (Supplementary figure 2-8).
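The selection of sex-specifically marked genes with an inverse expression pattern can be expressed as a simple filter over a per-gene table. The sketch below is only one plausible reading of that selection: the enrichment cutoff, the fold-change cutoff, the gene names and all values are hypothetical and do not reproduce the thresholds or data used in the thesis.

```python
def sex_specific_inverse(genes, mark_cutoff, fold_cutoff):
    # Return (male_marked, female_marked) gene lists.
    # A gene is kept when its promoter is marked in one sex only and its
    # expression in the unmarked sex is at least fold_cutoff times higher.
    male_marked, female_marked = [], []
    for name, g in genes.items():
        marked_m = g['k27_male'] >= mark_cutoff
        marked_f = g['k27_female'] >= mark_cutoff
        if marked_m and not marked_f:
            if g['expr_female'] >= fold_cutoff * g['expr_male']:
                male_marked.append(name)
        elif marked_f and not marked_m:
            if g['expr_male'] >= fold_cutoff * g['expr_female']:
                female_marked.append(name)
    return male_marked, female_marked

# Hypothetical promoter H3K27me3 and expression values (RPKM).
genes = {
    'geneA': {'k27_male': 2.0, 'k27_female': 0.1, 'expr_male': 0.5, 'expr_female': 20.0},
    'geneB': {'k27_male': 0.1, 'k27_female': 1.8, 'expr_male': 15.0, 'expr_female': 1.0},
    'geneC': {'k27_male': 2.0, 'k27_female': 2.2, 'expr_male': 0.1, 'expr_female': 0.2},
    'geneD': {'k27_male': 1.5, 'k27_female': 0.2, 'expr_male': 0.1, 'expr_female': 0.1},
}
male_marked, female_marked = sex_specific_inverse(genes, mark_cutoff=1.0, fold_cutoff=4.0)
print(male_marked, female_marked)
```

Here geneC is excluded because it is marked in both sexes, and geneD because it is silent in both sexes despite being marked only in males, which mirrors the observation that most sex-specifically marked genes are silenced in both male and female PGCs.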
On the other hand, only a small number of male-specific genes, including the TGFβ receptor binding factors Lefty1 and Lefty2, are marked by H3K27me3 in female PGCs exclusively (Supplementary figure 2-8), consistent with the recent observation that Nodal signaling is activated specifically in males (Souquet et al. 2012). Taken together, these results reveal that at this stage in PGC development, the polycomb pathway may be engaged more frequently in the male germline to regulate germ cell-specific genes.
Figure 2-10 Gender-specific H3K27me3 profiles from E13.5 PGCs isolated from single embryos. Relationship between gene promoter enrichment of H3K27me3 (RPKM, RefSeq transcription start site +/- 1 kb) and gene expression (exonic RPKM) in male (a) and female (b) E13.5 PGCs. Enrichment of H3K27me3 in genic promoter regions (RPKM, RefSeq transcription start site +/- 1 kb) (c) and expression (exonic RPKM) of annotated genes (d) in male versus female PGCs. H3K27me3-marked genes that show an inverse relationship with expression in male (gold) and female (red) PGCs are highlighted. (e) Genome browser screenshots of male versus female H3K27me3 enrichment and gene expression at selected loci, revealing gender-specific H3K27me3-associated gene silencing in E13.5 PGCs.
0.05 mM streptomycin, leukemia inhibitory factor and 2 M L-glutamine on gelatinized plates. Trypsinized cells were either FACS-sorted or aliquoted in nuclear isolation buffer (Sigma, N3408) containing protease inhibitor cocktail (Roche), flash-frozen and stored at -80 °C for a few weeks to a few months.
‘Gold-standard’ NChIP. For ‘gold-standard’ NChIP (refs. 14, 16), 10^6 cells were resuspended in douncing buffer (10 mM Tris-HCl, pH 7.5, 4 mM MgCl2, 1 mM CaCl2 and protease inhibitor cocktail) and homogenized through a syringe. Chromatin was digested in 2 U ml^-1 MNase (Worthington Biochemicals) at 37 °C for 5 min, and the reaction was quenched by 0.5 M EDTA. Chromatin was resuspended in hypotonic buffer (0.2 mM EDTA, pH 8.0, 0.1 mM benzamidine, 0.1 mM phenylmethylsulfonyl fluoride, 1.5 mM dithiothreitol and 1× protease inhibitor cocktail (PIC)) and incubated for 1 h on ice. Cellular debris was pelleted and the supernatant was recovered. Chromatin was pre-cleared with 20 µl of 1:1 protein A:protein G Dynabeads (Life Technologies) and immunoprecipitation was carried out with antibody-bead complexes (5 µl Active Motif no. 39161 H3K9me3 antibody and 20 µl 1:1 protein A:protein G Dynabeads) overnight at 4 °C. IPed complexes were washed twice with 400 µl of ChIP wash buffer I (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 2 mM EDTA and 150 mM NaCl) and twice with 400 µl of ChIP wash buffer II (20 mM Tris-HCl (pH 8.0), 0.1% SDS, 1% Triton X-100, 2 mM EDTA and 500 mM NaCl). Protein-DNA complexes were eluted in 200 µl of elution buffer (100 mM NaHCO3 and 1% SDS) for 2 h at 68 °C. IPed material was purified by phenol chloroform and 5 ng of raw ChIP material was processed for library construction.
ULI-NChIP. A detailed, step-by-step procedure is presented in Supplementary Methods. We based our chromatin preparation on a previously published MNase chromatin fragmentation and library construction from single cells (ref. 28). TT2 mouse ESCs were either FACS-sorted directly in nuclear isolation buffer (Sigma; <20,000 cells) or pelleted and re-suspended in nuclear isolation buffer (Sigma). Depending on input size, chromatin was fragmented for 5-7.5 min using MNase at 21 or 37 °C, and diluted in NChIP immunoprecipitation buffer (20 mM Tris-HCl pH 8.0, 2 mM EDTA, 15 mM NaCl, 0.1% Triton X-100, 1× EDTA-free protease inhibitor cocktail and 1 mM phenylmethanesulfonyl fluoride (Sigma)). Chromatin was pre-cleared with 5 or 10 µl of 1:1 protein A:protein G Dynabeads (Life Technologies) and IPed with 0.25 or 1 µg of H3K9me3 (Active Motif no. 39161), H3K27me3 (Diagenode pAb-069-050) or pan-H3 (Sigma, I8140) antibody-bead complexes overnight at 4 °C. IPed complexes were washed twice with 400 µl of ChIP wash buffer I (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 0.1% deoxycholate, 2 mM EDTA and 150 mM NaCl) and twice with 400 µl of ChIP wash buffer II (20 mM Tris-HCl (pH 8.0), 0.1% SDS, 1% Triton X-100, 0.1% deoxycholate, 2 mM EDTA and 500 mM NaCl). Protein-DNA complexes were eluted in 30 µl of ChIP elution buffer (100 mM NaHCO3 and 1% SDS) for 2 h at 68 °C. IPed material was purified by phenol chloroform, ethanol-precipitated and raw ChIP material was re-suspended in 10 mM Tris-HCl pH 8.0. As material obtained after ChIP is minimal, DNA concentration was not measured in samples before library construction. For optimal results, raw ChIP material was re-purified with 1.8× volume of Ampure XP DNA purification beads (Agencourt) before library construction.
RNA extraction and double-stranded cDNA preparation. Total RNA was extracted from frozen 10^3-cell aliquots using TRIzol (Invitrogen, AM9738) according to the manufacturer’s manual. Residual genomic DNA was removed by treatment with DNase I (Promega), and ribosomal RNA was depleted using the RiboMinus Transcriptome Isolation kit (Invitrogen) according to the manufacturer’s low-input protocol. First strand cDNA synthesis was carried out using Superscript III (Invitrogen 18080-093) with T4 protein 32 and a combination of random 15-mers and oligo dT (NEB), followed by second strand cDNA synthesis using the Klenow polymerase (NEB) in the presence of RNase H.
2.3 Conclusion
We have developed SmallCell, a protocol package for profiling histone modifications, DNA methylation and transcription, by adapting and optimizing existing protocols that were designed for large amounts of input material (≥10^5 cells).
This protocol package enables researchers to survey epigenetic information in “rare” cell populations in vivo at both locus-specific and genome-wide scales using only 1,000-2,000 cells. We also described the systematic approach we took to develop and optimize the ChIP protocol, and addressed the importance and use of controls in protocol development, which may be useful when developing other protocols. Alongside conventional epigenome profiling protocols, the SmallCell protocol package serves as a tool for in vivo validation or exploration of epigenomic information in rare cell populations. With the computational tools presented in the next chapter, researchers have a platform for in vivo epigenome assays with very limited numbers of cells.
2.4 Availability and future directions
Specific materials and methods for this chapter are included in Appendix B, Supplementary methods for Chapter 2. The detailed protocol package for low-input material is also available from a GitHub repository (https://github.com/sheng-liu/SmallCell/blob/master/SmallCell.md); individual protocols can also be accessed from GitHub:
- ChIP protocol: https://github.com/sheng-liu/smallcellChIP.git
- RNA extraction and reverse transcription: https://github.com/sheng-liu/smallcellRNA.git
- Bisulfite Sanger sequencing: https://github.com/sheng-liu/smallcellDNAme.git
Supplemental information as well as the Sequencing Data List can be accessed through GitHub (https://github.com/sheng-liu/sheng-liu.github.io/tree/master/Supplementals/). I am also uploading links to the sequencing data and demo data tracks to this GitHub repository.
Many different protocols can be developed from this protocol package to fit various applications; here I list only a few points on the ChIP protocol that I think can be further developed to enhance its robustness:
- It would be useful to develop a “universal” cell storage buffer.
Current cell storage buffers vary depending on the application. Cells sorted into the buffer used for RNA preparation cannot be used for ChIP and vice versa. This restricts experimental flexibility, a critical limitation when the cells of interest are rare. Using nuclei extraction buffer supplemented with RNase inhibitor may be an alternative way to store cells that allows the same material to be used for ChIP, bisulfite conversion and transcription profiling.
- Cross-linking is intended for mapping chromatin-binding proteins, as well as for staining. The concentration of cross-linker needs to be tested to achieve optimal cross-linking. We tested some of these conditions and found that freshly sorted cells need to be fixed/cross-linked before snap freezing to avoid variation in cross-linking. However, further development is needed for cross-linked ChIP to work as efficiently as native ChIP.
- Other potential advances could also be incorporated into the protocol, such as a method that allows extraction of DNA and RNA from the same batch of cells, or modifications that allow nano-liquid handling to automate the labor-intensive steps of the method.
Chapter 3: Visualize and intersect multidimensional epigenomic datasets#
Chromatin carries multiple layers of biochemical information, including DNA methylation and various covalent histone modifications, which makes epigenomic information inherently multidimensional. Genome browsers (e.g. the UCSC genome browser) are useful for visualizing the relationships between these epigenetic modifications at specific genomic regions. However, only one genomic region can be visualized in any given session, making it difficult to identify complex relationships between various epigenetic modifications, such as co-enrichment and/or exclusion, at a genome-wide level.
Furthermore, it is frequently useful to select multiple data points in one type of dataset (such as ChIPseq) and intersect the corresponding data points with other datasets (e.g. BisulfiteSeq or RNAseq). To address this need, I developed InterSeq, which allows users to intuitively recognize patterns among multiple epigenetic modifications at the genome-wide level, and to “gate”/intersect interactively on patterns of correlation and/or exclusion of different epigenetic modifications. As a proof of concept, I used InterSeq to analyze the relationship between H3K9me3 and H3K27me3 (ChIPseq) and DNA methylation (PBAT) data, together with RNAseq data (a total of 13 datasets), generated in E13.5 mouse primordial germ cells. Using this approach, I found that these three epigenetic modifications are specifically enriched at transcriptionally silenced retrotransposons (Liu et al. 2014). The results were validated by visualization on the UCSC genome browser, as well as by ChIP-qPCR and Sanger bisulfite sequencing at specific loci. Thus, InterSeq can be used as a companion tool to genome browsers to efficiently discover patterns of association between epigenetic modifications at a genome-wide level, and it conveniently connects users to a repertoire of flow cytometry tools in Bioconductor for high-level analysis (e.g. clustering) of epigenomic sequencing data.

# The illustrations, figures and segments of the discussion in this chapter can be found in a submitted manuscript: Liu, S., Lorincz, M.C. (2015) Intersect multiple types of epigenomic datasets base on pattern of distribution.

3.1 Design and implementation

The underlying strategy behind the interface is to view each genomic feature (e.g.
a gene, exon or repetitive sequence) as an entity, and the different types of epigenomic and expression data values (quantified through biochemical assays such as ChIP-seq, Bisulfite-seq and RNA-seq) as attributes of that entity. This is analogous to the concept of the cell or “event” in flow cytometry, with the fluorescence intensities collected from different laser channels corresponding to the attributes. Users can therefore gate and plot epigenetic modifications as if they were channels in a flow cytometry dataset. Instead of designing new algorithms for gating and plotting, I took advantage of the flow cytometry data analysis packages (see biocViews FlowCytometry) already available in Bioconductor (Gentleman et al. 2004), an open source platform of tools for the analysis and comprehension of high-throughput genomic data. We implemented a new data structure, SeqFrame, to interface genomic interval tables with the flow cytometry packages in Bioconductor, allowing users to apply flow cytometry resources in R directly to genomic interval data tables. The tool kit is designed to meet the basic needs of biologists, including the ability to compute, visualize and subset/gate the genome-wide distribution of enrichment of epigenetic modifications at all genomic features of interest in an intuitive interface, and to facilitate recognition of patterns that may lead to new biological insights.

3.1.1 Calculations from a bam file with SeqData

This package provides three basic ways of counting from bam files: read counts, read coverage and base percentage (e.g. percent of methyl cytosine), for RNAseq, ChIPseq and Bisulfite-seq respectively. Once the data are entered, functions for finding significantly enriched regions (findPeaks) and annotating them (annotatePeaks) can be applied.
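As a rough illustration of the three counting modes, the following Python sketch works on toy intervals rather than bam files. It is not the SeqData implementation: the function names, the half-open coordinate convention and the toy data are all assumptions made for this example.

```python
# Toy illustration only. Reads and features are (start, end) tuples on a
# single chromosome with half-open coordinates; this is NOT the SeqData API.

def read_count(reads, feature):
    '''Number of reads overlapping the feature by at least one base.'''
    f_start, f_end = feature
    return sum(1 for r_start, r_end in reads
               if r_start < f_end and r_end > f_start)

def read_coverage(reads, feature):
    '''Mean per-base read depth across the feature.'''
    f_start, f_end = feature
    covered = sum(max(0, min(r_end, f_end) - max(r_start, f_start))
                  for r_start, r_end in reads)
    return covered / (f_end - f_start)

def percent_methylation(calls, feature):
    '''Percent of cytosine calls inside the feature that are methylated.

    calls is an iterable of (position, is_methylated) pairs.
    Returns None when the feature contains no calls.
    '''
    f_start, f_end = feature
    inside = [meth for pos, meth in calls if f_start <= pos < f_end]
    if not inside:
        return None
    return 100.0 * sum(inside) / len(inside)

# Hypothetical toy data.
reads = [(90, 140), (120, 170), (300, 350)]
calls = [(110, True), (150, False), (160, True), (250, True)]
feature = (100, 200)

print(read_count(reads, feature))
print(read_coverage(reads, feature))
print(percent_methylation(calls, feature))
```

In the package itself, counting operates on bam files as described in this section; the sketch only shows what each mode measures for a single feature.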
The computations are based on an internal SeqData data structure, which requires only a bam file and an optional annotation file to construct, and is built solely upon Bioconductor infrastructure packages, e.g. GenomicRanges (Lawrence et al. 2013) and Rsamtools (Morgan et al. 2013).

3.1.2 Interface tables to flowCore with SeqFrame

The flowCore package (Hahne et al. 2009) forms the infrastructure of flow cytometry pipelines in Bioconductor. SeqFrame is designed to function as an interface between genomic interval spreadsheet data and flowCore. It is essentially the flowFrame class defined in flowCore with an additional data frame describing the corresponding genomic annotation of channels. It currently supports conversion of comma-separated values (.csv), tab-separated values (.tab), the R data type data.frame, and the genomic interval file formats bed and bigWig into SeqFrame. Genome-wide genomic feature sets can be very large (>50,000 items), particularly for genome-wide bins, making them hard to manipulate with conventional table editors such as Excel. SeqFrame provides a convenient function for merging multiple csv files into a SeqFrame, with optional sampling of 50K data points for uncluttered visualization.

3.1.3 Graphical interface for visualization and gating with SeqViz

To simplify access to the most frequently used computation and visualization functions in SeqData and SeqFrame, we implemented a graphical interface, SeqViz, based on GTK+ graphics using the RGtk2 package (Lawrence and Lang 2010). This is particularly useful for biologists who are not familiar with R programming. SeqViz includes a data pane for computing the three modes of counting (read counts, read coverage and base percentage, e.g. percent of methyl cytosine) at genomic features of interest, and a plot pane for visualizing the distributions of all attributes in one or two dimensions (Figure 3-1 A-B).
The SeqData package is the underlying workhorse for such computations. Four gating functions are available using basic interactive graphics in R: one-dimensional gating (range gate) and two-dimensional gating (quadrant gate, rectangle gate and polygon gate), for manual selection of a population of features that cluster geometrically. Three plotting functions are implemented: a one-dimensional histogram, a two-dimensional scatter plot and a 2D+ plot, which generates a two-dimensional plot (i.e. visualizing the relationship between two variables) with a third variable/dimension shown by color-coding, as in a one-dimensional heat map. The plotting functions allow simple plotting of selected data, and also plotting in parallel of the child nodes generated by gating or subsetting, so that the user can easily visualize the separation of variables/attributes that results from gating. To facilitate further analysis of regions of interest in a genome browser, double-clicking a data row in SeqViz opens the corresponding genomic coordinates in the UCSC genome browser (Figure 3-1 C). All functions in SeqViz are also available from the command line for more flexible adjustment of parameters. In addition, there are a number of “helper” functions that are useful in similar analyses, such as df2gr (convert a data.frame into genomic ranges), df2sf (convert a data.frame into a SeqFrame with the annotation slot filled), bins (generate genome-wide bins of various lengths) and .rpkm (RPKM normalization).

3.2 Results

Typically, the first questions asked of genomics data are: what is the distribution of one variable? What is the correlation between two variables? If one variable is subset, how does the distribution of the other variables change?
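The four gate types named in Section 3.1.3 can be sketched in a few lines of Python. This is a minimal illustration of the geometry only, assuming each genomic bin is a point with one value per channel; it does not reproduce the SeqViz or flowCore gating functions, and all names and toy values are hypothetical.

```python
# Minimal gate sketches; every function returns indices of selected points.

def range_gate(values, low, high):
    '''One-dimensional gate: low <= value < high.'''
    return [i for i, v in enumerate(values) if low <= v < high]

def rectangle_gate(xs, ys, x_range, y_range):
    '''Two-dimensional gate: points inside an axis-aligned rectangle.'''
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]]

def quadrant_gate(xs, ys, x_cut, y_cut):
    '''Split points into four quadrants labelled by sign on (x, y).'''
    quadrants = {'-/-': [], '-/+': [], '+/-': [], '+/+': []}
    for i, (x, y) in enumerate(zip(xs, ys)):
        label = ('+' if x >= x_cut else '-') + '/' + ('+' if y >= y_cut else '-')
        quadrants[label].append(i)
    return quadrants

def polygon_gate(xs, ys, polygon):
    '''Points inside a polygon, tested by ray casting.'''
    def inside(x, y):
        hit = False
        n = len(polygon)
        for k in range(n):
            (x1, y1), (x2, y2) = polygon[k], polygon[(k + 1) % n]
            if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    hit = not hit
        return hit
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if inside(x, y)]

# Hypothetical values for four 1 kb bins.
h3k9me3 = [5.0, 0.5, 4.5, 0.2]    # RPKM
h3k27me3 = [2.0, 3.0, 0.1, 0.3]   # RPKM
dna_meth = [70.0, 10.0, 40.0, 5.0]  # percent

gated = range_gate(h3k9me3, 4.0, float('inf'))
mean_meth = sum(dna_meth[i] for i in gated) / len(gated)
print(gated, mean_meth)
print(quadrant_gate(h3k9me3, h3k27me3, 4.0, 1.0))
```

Gating on one channel and summarizing another over the selected indices, as in the last lines, is the pattern that the three questions above call for.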
As an example of the utility of this software, we analyzed epigenomic and expression datasets for mouse cells generated in the Lorincz laboratory (Liu et al. 2014). These data are used here to demonstrate how the above questions can be addressed with SeqViz. The csv file has been embedded in the software for demonstration purposes (under Menu Help/Demo). An alternative and more flexible approach, using commands in the R console, is detailed in the vignette.

3.2.1 Compute H3K9me3, H3K27me3 and DNA methylation genome-wide using 1kb bins

SeqViz contains 5 “ready to use” annotation files. Under SeqViz’s data pane, clicking the annotation button loads the file into the annotation table. Dragging the annotation file and dropping it onto the bam file (Figure 3-1 A) automatically computes read counts, read coverage or percent methylation for the bam file, depending on the counting mode selected.

[Figure 3-1 image: panels A to C. On-image labels: “Drag&Drop annotation onto bam file to count”, “Double click data updates Information”, “Double click information opens UCSC browser”. Browser tracks: Male_PGC_HET (Lorincz), Male_PGC_SETDB1KO (Lorincz), H3K27me3.MalePGC.SETDB1CKO (ht11), H3K27me3.MalePGC.SETDB1KO (ht12), H3K9me3.MalePGC.SETDB1CKO (ht8), H3K9me3.MalePGC.SETDB1KO (ht9), RefSeq Genes, Retroposed Genes, RepeatMasker.]

Figure 3-1 GTK+ based graphical interface for counting and linking to the UCSC genome browser. (A) Drag and drop annotation file onto bam file to start count. Counting mode is selected from the radio buttons above. The default is read counts.
(B) Double-clicking a subset data node updates the information table, while double-clicking columns in the information table opens the UCSC genome browser for viewing. (C) UCSC genome browser view of the column selected in the information table. DNA methylation, H3K27me3 and H3K9me3 in male Het/CKO and male KO are shown.

3.2.2 Gate on distribution of one variable and see changes of distribution in other variables

We first gated on H3K9me3, selecting regions with RPKM>4 (Figure 3-2A). We then looked at the changes in the distribution of DNA methylation and H3K27me3 (Figure 3-2B). The plot clearly shows the separation of H3K9me3-enriched regions with high DNA methylation from the rest of the genome.

3.2.3 Gate on correlation of two variables and see changes of distribution in other variables

When we plotted H3K9me3 against H3K27me3 in male heterozygote PGCs, an interesting butterfly shape was apparent. We then divided the shape into four quadrants with a quadrant gate (Figure 3-2C) to examine the distribution of DNA methylation in each of the four areas. As illustrated in Figure 3-2D, the H3K9me3 and H3K27me3 double-positive quadrant has a higher average DNA methylation than the other quadrants. To see the effect of a third variable, the user can also use the 2D+ plot, gating on two variables and coloring by the third. As illustrated in Figure 3-2E, H3K9me3 and H3K27me3 double-positive regions have higher DNA methylation, supporting the same conclusion as the gating strategy above.

Figure 3-2 Gating and plot generation using SeqViz plot pane. (A) One-dimensional gating on the distribution (density plot) of H3K9me3 is shown. RPKM>4 is selected as the gate. (B) Two-dimensional plotting of DNA methylation and H3K27me3 in male heterozygote PGCs.
Plot shows changes of the distribution by gating on the H3K9me3 level illustrated in panel A. A “population” of H3K9me3-enriched regions with high DNA methylation levels relative to other genomic regions is clearly apparent. (C) Two-dimensional gating on H3K9me3 and H3K27me3, with 4 quadrants shown for genomic bins with different enrichment levels. (D) One-dimensional plotting of DNA methylation in male heterozygote PGCs. Histogram shows changes in the distribution of DNA methylation levels following “gating” on H3K9me3 and H3K27me3 levels. This plot shows that regions marked with both H3K9me3 and H3K27me3 (“double-positive”) have relatively high DNA methylation levels. (E) 2D+ plot: H3K9me3 and H3K27me3 are plotted against each other and the DNA methylation level of each data point is color-coded using a heat map. A positive correlation of DNA methylation with H3K9me3 and H3K27me3 co-enrichment is clearly apparent.

[Figure 3-2 image: panels A to E. Axes: H3K9me3 (RPKM), H3K27me3 (RPKM) and DNA methylation (%), genome-wide (1 kb bins). Gated populations: Data (H3K9me3 > 4), Data (H3K9me3 < 4), and quadrants H3K9me3-/H3K27me3-, H3K9me3-/H3K27me3+, H3K9me3+/H3K27me3- and H3K9me3+/H3K27me3+.]

3.3 Conclusion

We have developed a visualization tool based on the idea of viewing epigenetic modifications as attributes of genomic features, using functionality that already exists for the analysis of flow cytometry data to explore correlations between different epigenetic modifications. Users can also use this tool to explore the correlation of transcription factor binding patterns with such epigenomic datasets, if the corresponding ChIP-seq datasets are available.
The SeqViz suite serves as a companion tool to a genome browser, providing simple computation, visualization and subsetting of epigenetic modifications at genomic features on a genome-wide level. The graphical interface implements the most frequently used functions, allowing biologists to interactively explore epigenomic sequencing data.

3.4 Availability and future directions

The InterSeq suite home page is on GitHub (https://github.com/sheng-liu/InterSeq-suite/blob/wiki/ProjectHome.md), along with instructions for installation. The code for the individual packages can be accessed from GitHub repositories (https://github.com/sheng-liu/SeqData.git, https://github.com/sheng-liu/SeqFrame.git, https://github.com/sheng-liu/SeqViz.git). To view an example of the use of this package, see the vignette included in the package. A video demo is also available at: http://youtu.be/Zsv4LGTgdA0.

For the broader bioinformatics community, automatic clustering (e.g. flowClust) and machine learning (e.g. flowFP) could be added to assist automatic discovery of discrete regulatory regions based on specific patterns of epigenetic modifications. To expand upon the available functionality, we plan to provide more interactive features such as brushing or projections in future releases, making use of interactive graphics facilities in R such as ggobi or cranvas. Furthermore, to increase the accessibility of the interface for the informatics novice, we plan to make SeqViz a web-based interactive interface by employing facilities in R such as Shiny.

SECTION III RESULTS

Chapter 4: Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells#

4.1 Materials and methods

4.1.1 Breeding and mating

Germline specific Setdb1 KO animals were generated using the Setdb1tm1.1Yshk (Tan et al. 2012) and the Tnap-Cre (129-Alpltm1(cre)Nagy/J) (Lomelí et al. 2000) lines.
Briefly, Setdb1flox/flox females were mated with a Setdb1flox/+ Tnapcre/+ male to generate the genotypes described in Supplemental Figure 4-2A. For timed matings, the day of the vaginal plug was considered E0.5, and females were sacrificed at E13.5 for embryo collection. Animal experimentation followed the guidelines of the Canadian Council on Animal Care (CCAC) under UBC animal care license numbers A13-0115 and A12-0208.

4.1.2 Isolation of PGCs

PGCs were isolated from the gonadal ridges of embryos 13.5 days post-coitum, as described previously (Nagy 2003). Isolated gonadal ridges were digested with 0.05% Trypsin at 4°C for 30min and pipetted into single cell suspensions, and cells were stained with a PE-conjugated SSEA-1 antibody (BD #560142). SSEA-1 positive (PGCs) and negative (Soma) cells from individual embryos were sorted directly into 20µL of nuclear isolation buffer (Sigma NUC-101) (ChIP-seq) or 20µL TRI reagent (Invitrogen AM9738) (RNA-seq) and snap frozen. Aliquots of 1,000 to 3,000 cells were stored for up to several months at -80°C.

# The result of this chapter is published in: Liu, S., Brind'Amour, J., Karimi, M.M., Shirane, K., Bogutz, A., Lefebvre, L., Sasaki, H., Shinkai, Y., and Lorincz, M.C. (2014). Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells. Genes Dev 28, 2041–2055.

4.1.3 Genotyping

Tissue from the tails of E13.5 embryos was isolated and DNA was extracted using the HotShot method, as described previously (Truett et al. 2000). The sex of each embryo was determined using primers to Zfy1 and Xist, and the genotypes were determined using multiplex primers (primer sequences are presented in Supplemental Table 5). Setdb1 KO primers were used to confirm the deletion of the flox allele.

4.1.4 Chromatin immunoprecipitation

Chromatin was prepared from single or pooled aliquots of 3,000 cells digested with 1U/µL of micrococcal nuclease (NEB M0247) at 25°C for 5min and re-suspended in ChIP Buffer (10mM Tris-HCl pH8.0, 150mM NaCl, 2mM EDTA, 0.1% Triton X-100, 0.1% deoxycholate, proteinase inhibitor cocktail and PMSF). Chromatin was pre-cleared with protein A Dynabeads®. For ChIP-seq, 10% of the raw ChIP material was used for qPCR validation and the remaining 90% was used for library construction. A 95 cell-equivalent was removed as input. The rest of the chromatin was divided into 3 aliquots (~950 cell-equivalents each): H3K9me3 (Active Motif #39161), H3K27me3 (Diagenode pAb-069-050) and pan-H3 (Sigma I8140) antibody-bead complexes were used for immunoprecipitation (overnight at 4°C). The chromatin-antibody-bead conjugates were then washed twice with Low Salt Wash Buffer (20mM Tris-Cl pH8.0, 0.10% SDS, 1% Triton X-100, 2mM EDTA, 150mM NaCl) and twice with High Salt Wash Buffer (20mM Tris-Cl pH8.0, 0.10% SDS, 1% Triton X-100, 2mM EDTA, 500mM NaCl), and DNA was eluted at 65°C for 2h in Hot Elution Buffer (100mM NaHCO3, 1% SDS). DNA was then extracted using phenol-chloroform (Sigma) followed by ethanol precipitation. 10% aliquots were removed for qPCR validation and the remainder was used for library construction.

4.1.5 RNA extraction and double stranded cDNA preparation

Total RNA was extracted from frozen 1,000-cell aliquots using TRIzol (Invitrogen, AM9738) according to the manufacturer’s manual. RNase inhibitor (40U per sample, Fermentas, RiboLock RNase Inhibitor, EO0381) was added to the resulting RNA.
Residual genomic DNA in the RNA was removed with DNase I (Promega, RQ1 RNase-Free DNase, M6101) according to the manual, and ribosomal RNA was depleted using RiboMinus (Invitrogen, RiboMinus Transcriptome Isolation Kit, K155002) according to the low-input protocol in the manufacturer’s manual. First strand cDNA synthesis was carried out using Superscript III (Invitrogen 18080-093) with T4 protein 32 and a combination of random 15-mers and oligo dT (NEB), followed by second strand cDNA synthesis using Klenow polymerase in the presence of RNaseH. A 10% fraction of the double stranded cDNA was used for qPCR validation, and the rest was fragmented for 20 minutes (High Power mode, 30s on and 30s off) using a BioRuptor (Diagenode) for library construction.

4.1.6 qPCR

qPCR was carried out with EvaGreen supermix (BioRad SsoFast™ EvaGreen® Supermix) on an Opticon II qPCR machine (BioRad). IAP, ERVK10C and MERVL primers were used for validation of ChIP and RNA expression.

4.1.7 Library construction and sequencing

Libraries were constructed using a custom paired-end protocol (Illumina) (Falconer et al. 2012). Briefly, samples were end-repaired and A-tailed, and Illumina PE adapters were ligated. Libraries were amplified using indexed PE primers for 8 to 10 PCR cycles. Amplified indexed libraries were pooled (4 to 6 libraries per pool) and size selected for paired-end sequencing. A detailed library construction procedure is presented in Supplementary Methods. Cluster generation and paired-end sequencing (100 bp reads) were performed on the Illumina cluster station and Illumina HiSeq 2000 sequencing platform. Sequence reads were mapped to mm9 (NCBI 37) using BWA (Li and Durbin 2009). Reads passing Illumina’s default chastity filter (Li et al. 2009) were used to generate library statistics.

4.1.8 ChIP and RNA sequencing analysis

ChIP-seq and RNA-seq data are available at the Gene Expression Omnibus repository (http://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE60377. For analysis of relative ChIP or RNA enrichment at unique loci, reads with a MapQ (mapping quality) > 5 (uniquely mapped) were used (Li et al. 2009). Multi-aligned reads were included for calculating the relative ChIP enrichment at agglomerated transposable elements (TEs). As they are not properly mapped in the B6 genome (Reuss et al. 1996), IAPEy, IAPLTR3, RLTR5 and MURVY elements were not included in our analyses. Relative ChIP enrichment was normalized as Reads Per Kilobase per Million mapped reads (RPKM) (Pepke et al. 2009). For pairwise sample comparisons, a Z-score was calculated:

$$Z\text{-score} = \frac{\mathrm{RPKM}_A - \mathrm{RPKM}_B}{\mathrm{RPKM}_A + \mathrm{RPKM}_B}$$

where RPKM_A and RPKM_B are the RPKMs in the region of interest for samples A and B, respectively. For RNA-seq analysis, RPKM values were calculated for exonic reads only. To calculate the proportion of the genome covered by H3K9me3 and H3K27me3 in E13.5 PGCs, peaks were called using MACS2 (Zhang et al. 2008; Feng et al. 2012) with a p-value threshold of 0.05. False peaks detected in our pan-H3 ChIP-seq library were subtracted. Gene Ontology (GO) analysis was conducted on InnateDB (http://www.innatedb.ca/), with Benjamini-Hochberg correction of GO enrichment p-values.

4.1.9 PBAT Sequencing

2,000-3,000 PGCs isolated from individual E13.5 HET and Setdb1 KO embryos were spiked with 0.1 ng of lambda phage DNA, lysed and subjected to bisulfite treatment as described previously (Shirane et al. 2013). Whole-genome bisulfite shotgun libraries were prepared using the post-bisulfite adaptor tagging (PBAT) method (Miura et al. 2012). Single-end sequencing generated 38.1 and 34.6 Gb of uniquely-aligned sequencing data for male HET and KO PGC libraries.
Alignment of PBAT data was performed using Bismark (Krueger and Andrews 2011). The bisulfite conversion rate as determined by analysis of lambda DNA was 99.55% (converted C: 73,394,225; unconverted C: 332,718) for the HET and 99.53% (converted C: 107,386,787; unconverted C: 504,201) for the KO. PBAT data is available at the Gene Expression Omnibus repository under the accession number GSE60377.

4.1.10 Imaging

Gonads were dissected from P10, 5-week-old and adult germline Setdb1 KO animals and littermates and imaged using a LEICA MS5 stereomicroscope. Gonads isolated from 5-week-old animals were fixed in 4% formaldehyde, incubated in sucrose, embedded in OCT (Tissue-Tek) and frozen at -80°C until sectioning. 10 µm gonad sections were stained with hematoxylin and eosin according to standard procedures. Germ cells were detected in postnatal gonads using a Mouse Vasa Homologue (MVH/DDX4) antibody (Abcam ab13840).

4.2 Results

With the low-input assays and bioinformatic tools we developed (discussed in detail in Chapters 2 and 3) and the specific materials and methods described in the section above, I was able to address the biological question I proposed: does the histone lysine methyltransferase Setdb1 play a role in silencing retrotransposons in mouse PGCs at E13.5, when the genome is hypomethylated?

4.2.1 E13.5 gonadal somatic cells and PGCs express distinct retroelements

We first analyzed the expression of genes and repetitive elements in gonadal somatic cells (soma) versus PGCs at E13.5, the stage in germ cell development when DNA methylation levels are at their lowest (Kobayashi et al. 2013; Seisenberger et al. 2012; Hackett et al. 2013). PGCs and soma from E13.5 gonads were purified by fluorescence-activated cell sorting (FACS) using the germ cell surface marker SSEA-1 (Durcova-Hills et al. 1999) (Supplemental Figure 4-1A).
RNA-seq analyses of total RNA from 10^3 PGCs or soma isolated from a single male embryo revealed that expression of male germ cell markers, such as Dazl, is restricted to PGCs, while expression of genes expressed in somatic cells, such as Hoxc6, is restricted to somatic (SSEA-1-negative) cells (Jameson et al. 2012), confirming the purity of the sorted populations (Figure 4-1A and B). Furthermore, Gene Ontology analysis revealed that genes associated with processes involved in male germline development, such as spermatogenesis, stem cell maintenance and meiosis, are enriched in the SSEA-1+ population, as expected (Supplemental Figure 4-1B and Supplemental Table 4-1). Notably, analysis of ERV expression revealed high levels of VL30 expression in both PGCs and soma, but cell-type dependent differences in the expression of specific families (Supplemental Figure 4-1C). ETn elements, for example, are expressed at a higher level in PGCs, while the converse is true of several ERV1 and ERVK families, including RLTR4 (MLV) and IAPey3. Taken together, these data reveal clear differences in the regulation of retroelements in gonadal germ versus somatic cells at this developmental stage.

Figure 4-1 Enrichment of H3K9me3, H3K27me3 and gene expression in E13.5 PGCs. (A) Genome browser screenshots of Dazl and Hoxc6 loci showing RNA-seq data for PGCs and gonadal somatic cells (Soma). (B) Expression (RPKM) of germ cell-specific, somatic and housekeeping genes in PGCs and Soma. (C) Coverage of H3K9me3 and H3K27me3 genome-wide (1 kb bins, 50,000 random regions displayed), at transcription start sites (TSSs, +/- 1kb), gametic DMRs and in the 5’ 1kb region flanking annotated ERVs or LINEs. (D) Percentage of the mappable genome enriched for H3K9me3 and/or H3K27me3 in male E13.5 PGCs. Enriched regions were identified using MACS2 (p < 0.05). (E) Relationship between mean gene expression level and H3K9me3 and/or H3K27me3 enrichment at TSSs.
The number of genes in each category is indicated above. (F) Genome-wide relationship (1 kb bins, 50,000 random regions displayed) between H3K9me3 enrichment, H3K27me3 enrichment and DNA methylation in male E13.5 PGCs.

[Figure 4-1 image: panels A to F. Axes include expression (RPKM), H3K9me3 (RPKM), H3K27me3 (RPKM) and DNA methylation (%). Genes shown in panel B: Pou5f1, Kit, Klf5, Prdm1, Brdt, Gapdh, Actn1, Actn4, Gata2, Bmp1, Gnas, Cyp17a1 and Grb10.]

4.2.2 H3K9me3 and H3K27me3 mark distinct and overlapping genomic regions in E13.5 PGCs

To determine the role of SETDB1 in PGCs, we crossed the Tnap-Cre mouse line (Lomelí et al. 2000), which expresses Cre in germ cells from ~E9.5 to late gestation, with a Setdb1 conditional knockout mouse line (Tan et al. 2012). E13.5 embryos were recovered and genotyped, and soma and PGCs were purified as described above (Supplemental Figure 4-2 and see Methods). Using our recently developed low input NChIP-seq protocol (Brind'Amour et al. 2015), we generated genome-wide profiles of H3K9me3 and H3K27me3 from 1,000 PGCs of both sexes isolated from heterozygous (HET; Setdb1flox/-; Tnap+/+) and knock-out (KO; Setdb1Δ/-; TnapCre/+) littermates, focusing initially on the former. Both marks show very similar distributions in male and female HET PGCs, with genome-wide correlations (1 kb bins) of 0.89 and 0.9, respectively (Supplemental Figure 4-3A).
These results are consistent with previous microscopy-based studies of both marks in E13.5 PGCs (Lomelí et al. 2000; Hajkova et al. 2008). Intriguingly, most annotated genic TSSs are not marked, or are enriched exclusively with H3K27me3 (Figure 4-1C and Supplemental Figure 4-3B). Consistent with erasure of parental DNA methylation imprints by E13.5 (Seisenberger et al. 2012), gametic DMRs show uniformly low levels of H3K9me3, as reported previously (Henckel et al. 2011). Interestingly, several gametic DMRs are enriched for H3K27me3 at this developmental stage (Figure 4-1C and Supplemental Figure 4-3B). In contrast, the flanks of a subset of ERVs and LINE1 elements are enriched for H3K9me3 and H3K27me3, with ~3% and ~2% of the genome enriched for both marks in male and female PGCs, respectively (Figure 4-1D and Supplemental Figure 4-3C). Consistent with the known functions of these covalent modifications, genes with unmarked TSSs are expressed at higher than average levels, while genes enriched for either or both silencing marks are expressed at low or undetectable levels (Figure 4-1E and Supplemental Figure 4-4A). To determine the relationship between DNA methylation and these marks, we conducted whole genome DNA methylation analysis on male HET and KO E13.5 PGCs using Post-Bisulfite Adaptor Tagging (PBAT) (Miura et al. 2012) (Supplemental Table 4-2), focusing initially on the HET dataset. Strikingly, hypermethylated regions (>20% mean DNA methylation) are generally enriched for H3K9me3 and intermediate levels of H3K27me3 (Figure 4-1F). In total, 2.9% of genomic bins are marked by H3K9me3 (RPKM >0.9) and show elevated DNA methylation. The apparent coexistence of H3K9me3 and intermediate levels of H3K27me3 at hypermethylated regions is also evident when comparing to previously published PBAT data from E13.5 PGCs (Kobayashi et al. 2013) (Supplemental Figure 4-4B and C).
While these observations likely reflect the co-existence of all three marks at the same genomic loci, mutually exclusive enrichment of H3K9me3 and H3K27me3 at these regions in distinct subpopulations of PGCs cannot be ruled out.

[Figure 4-2 image: panels A to F. Panel A plots ChIP-seq enrichment (RPKM; H3K9me3, H3K27me3, pan-H3) and CpG methylation (%) per Repbase LINE and ERV family, for PGCs and for ESCs/blastocysts. Panel B plots H3K9me3 and IgG (% input) at IAP and ERVK10C in male and female PGCs and soma. Panels C to F plot H3K9me3 (RPKM), H3K27me3 (RPKM) and expression (RPKM) in male and female PGCs; labelled families include IAPLTR1, ETn, GLN, RLTR4, RLTR4-int and VL30.]

Figure 4-2 H3K9me3, H3K27me3 and DNA methylation at ERVs in E13.5 PGCs. (A) Enrichment of H3K9me3, H3K27me3 and H3 at Repbase annotated LINE and ERV families present at >100 copies in the BL6 genome is shown, along with the mean percentage of DNA methylation (PBAT data), for male HET E13.5 PGCs (above the X-axis). The same enrichment for ESCs is shown below the axis, along with the mean percentage of DNA methylation in blastocysts (Kobayashi et al. 2012). LINE and ERV1, ERVK, ERVL, Gypsy and MaLR classes of ERVs are presented in alphabetical order along the X-axis according to their Repbase nomenclature. Unique and multi-aligned reads were included in the analysis. (B) qPCR validation of H3K9me3 enrichment at IAP and ERVK10C ERVs in matched PGCs and soma isolated from male and female gonads. Error bars show standard deviation of technical replicates.
(C) Enrichment of H3K9me3 and H3K27me3 (inset) in male versus female E13.5 PGCs at all 385 Repbase annotated ERV families present at >100 copies. Unique and multi-aligned reads were included in the analysis. (D) Expression of ERVs in male versus female E13.5 PGCs. Relationship between ERV expression and H3K9me3 enrichment in (E) male and (F) female PGCs. Unique and multi-aligned reads were included in the analysis.
4.2.3 A subset of ERVs are marked by H3K9me3, H3K27me3 and DNA methylation in E13.5 PGCs
Having shown that the regions flanking ERV and LINE elements are frequently enriched for H3K9me3 and H3K27me3, we next analyzed the distribution of both marks within annotated repetitive elements. In male and female HET PGCs, both marks are enriched at a number of ERV1 and ERVK families, including IAP (such as IAPez-int and its cognate LTRs IAPLTR1 & IAPLTR1a) and ERVK10C (RLTR10C) elements (Figure 4-2A and Supplemental Figure 4-4D). These marks are also enriched, albeit to a lesser extent, at specific LINE1 elements, including the potentially active L1Md families (Dudley 1987), which show a 5’ bias in enrichment (Supplemental Figure 4-4E). In contrast, H3 is evenly distributed across all repetitive elements (Figure 4-2A). High H3K9me3 enrichment at IAP and ERVK10C families in male and female PGCs was confirmed by qPCR (Figure 4-2B). Surprisingly, H3K9me3 was also detected at these ERVs in soma, indicating that deposition at repetitive elements is not restricted to germ cells at this developmental stage. In contrast to mESCs (Figure 4-2A), enrichment levels of both marks are highly correlated at all ERV families in both male and female E13.5 PGCs (Figure 4-2C), indicating that these retroelements harbor a unique combination of repressive histone marks in PGCs. H3K27me3 enrichment at ERVs was also recently reported in E13.5 PGCs using a distinct H3K27me3-specific antibody (Ng et al. 2013; Tan et al. 2012).
Analysis of reads mapping uniquely to specific IAPez or ETn elements reveals a strong correlation between the levels of H3K9me3 and H3K27me3 in male and female PGCs, confirming that they mark the same individual retroelements (Supplemental Figure 4-5A). Of note, ERV families with the highest level of enrichment of these repressive marks also show the lowest mappability (Supplemental Figure 4-5B), indicating that they colonized the genome relatively recently. Thus, ERVs that are most likely to be competent for retrotransposition are preferentially targeted for deposition of H3K9me3 and H3K27me3 in PGCs.
As discussed above, specific repeats, IAP elements in particular, retain relatively high DNA methylation levels in E13.5 PGCs (Seisenberger et al. 2012; Kobayashi et al. 2013). As for genomic regions in general, ERV families enriched for H3K9me3 show relatively high levels of DNA methylation in E13.5 PGCs (Figure 4-2A and Supplemental Figure 4-4C). Analysis of recently published PBAT data (Kobayashi et al. 2012) reveals that the same subset of H3K9me3 marked retroelements are relatively hypermethylated in blastocysts (Figure 4-2A). Elevated levels of both H3K9me3 and DNA methylation extend into the unique genomic regions flanking IAPez elements in PGCs, with both marks progressively diminishing to background levels ~3 kb distal to the genomic DNA-ERV boundary (Supplemental Figure 4-5C). Similar H3K9me3 and DNA methylation profiles are found in the flanks of IAP elements in ESCs and blastocysts, respectively. In contrast, elevated DNA methylation levels were not detected in the regions flanking MERVL elements, which are unmarked in each of these cell types.
While most enriched ERV1 and ERVK families are expressed at relatively low levels in PGCs (Figure 4-2D), a subset, including ETn and RLTR4, are apparently expressed despite the presence of these repressive marks (Figure 4-2E-F).
Analysis of reads aligning uniquely to specific annotated elements reveals that while a majority of individual ERVs in both families show a low level of expression and high H3K9me3 enrichment, a subset of the mappable elements are expressed at a high level and show relatively low levels of H3K9me3 (Supplemental Figure 4-5D). Thus, while the majority of individual elements within each ERV family are repressed, likely by H3K9me3 alone or in concert with H3K27me3 and/or DNA methylation, rare members apparently evade such targeting and are constitutively expressed at this stage.
Figure 4-3 Influence of Setdb1 KO on PGC number and chromatin marks. (A) Percentage of PGCs/total cells in male and female gonads at E13.5, as determined by flow cytometry. The number of embryos analyzed for each genotype is presented below. Wilcoxon rank sum test p-values for WT (flox/+) versus KO (Δ/-): 0.0005 and 0.210; HET (Δ/+) versus KO (Δ/-): 0.001 and 0.013; and HET (flox/-) versus KO (Δ/-): 0.001 and 0.740 in male and female PGCs, respectively. (B) Box and whisker plots of global (1 kb bins) H3K9me3 and H3K27me3 in Setdb1 KO vs. HET PGCs (box: 25th and 75th percentile; whisker: 5th and 95th percentile) in unmarked regions or regions enriched for H3K9me3 only, H3K27me3 only or both marks.
(C) qPCR validation of H3K9me3 enrichment at IAP and MERVL ERVs in matched Setdb1 HET and KO male PGCs. Error bars show standard deviation of 3 biological replicates. (D) Change in H3K9me3 at LINE1, ERVK and ERV1 families in E13.5 male Setdb1 KO relative to HET PGCs versus the level of enrichment of H3K9me3 in HET PGCs. No unmarked ERV showed an increase in H3K9me3 level above the background threshold (0.5 RPKM) in Setdb1 KO PGCs (not shown). Unique and multi-aligned reads were included in the analysis. (E) Global changes of DNA methylation in E13.5 PGCs isolated from Setdb1 KO versus HET littermates. Box and whisker plots show the median percentage of CpG methylation (box: 25th and 75th percentile; whisker: 10th and 90th percentile) at all annotated CpGs in the genome (left panel), or at all CpGs within gene bodies of ENSEMBL annotated genes (right panel).
4.2.4 H3K9me3 and H3K27me3 are reduced at ERVs in SETDB1 deficient E13.5 PGCs
To determine the role of SETDB1 in H3K9 trimethylation, DNA methylation and silencing of retroelements, we analyzed the datasets generated from the Setdb1 KO (Setdb1Δ/-;TnapCre/+) littermates of the HET (Setdb1flox/-;Tnap+/+) embryos described above.
ChIP-seq, PBAT and RNA-seq experiments on HET and KO pairs were all conducted in parallel. Strikingly, a ~2-3-fold decrease in the number of germ cells relative to soma was consistently observed in E13.5 male KO gonads (Figure 4-3A and Supplemental Figure 4-6A). Nevertheless, sufficient numbers of PGCs were isolated from single embryos for genome-wide analyses. Based on relative read coverage of RNA-seq data generated from KO and HET littermates over Setdb1 exons 15 and 16, the deletion efficiency was estimated to be 60-70% (Supplemental Figure 4-6B), consistent with that originally reported for the Tnap-Cre line at E13.5 (Lomelí et al. 2000).
H3K9me3 levels were broadly reduced in Setdb1 KO PGCs at both H3K9me3-enriched and H3K9me3/H3K27me3 co-enriched regions, to 62% and 69%, respectively, of the level measured in their HET littermates (Figure 4-3B). Interestingly, while H3K27me3 levels were only modestly reduced at regions enriched solely for H3K27me3 (to ~91% of HET levels), H3K27me3 levels were reduced to 76% of normal levels at co-enriched regions (Figure 4-3B and Supplemental Figure 4-7A). Reduced levels of H3K9me3 and H3K27me3 at IAP elements were confirmed by qPCR on biological replicates (Figure 4-3C and Supplemental Figure 4-7B). Consistent with the rest of the genome, while most H3K9me3 marked ERVK and ERV1 subfamilies, including ETn, IAPLTR1 and RLTR4, show a decrease in H3K9me3 (Figure 4-3D and Supplemental Figure 4-7C) and H3K27me3 (Supplemental Figure 4-7D-E) in KO PGCs, significant levels of these marks remain. Incomplete deletion (Supplemental Figure 4-6B) may explain the residual levels of H3K9me3. Alternatively, other H3K9 methyltransferases, such as SUV39H1 and/or SUV39H2, could be responsible for the residual H3K9me3 present in Setdb1 KO germ cells (Bulut-Karslioglu et al. 2014).
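As a rough illustration of how a deletion efficiency can be derived from exon coverage, the sketch below assumes that coverage over the floxed exons (here, exons 15 and 16) is first normalized to coverage over exons that are not deleted, and that any residual normalized signal in the KO sample, relative to HET, comes only from cells in which Cre recombination has not occurred. The function name, the normalization scheme and the read counts are hypothetical; they are not taken from the analysis reported here.

```python
def deletion_efficiency(ko_floxed, ko_reference, het_floxed, het_reference):
    # Hypothetical helper: estimate the fraction of cells carrying the deletion.
    # *_floxed: read coverage over the floxed exons (e.g. exons 15 and 16)
    # *_reference: read coverage over exons that are not deleted
    if min(ko_reference, het_reference, het_floxed) <= 0:
        raise ValueError('reference and HET coverage must be positive')
    ko_ratio = ko_floxed / ko_reference
    het_ratio = het_floxed / het_reference
    # Residual floxed-exon signal in KO, relative to HET, approximates
    # the fraction of cells in which the deletion has not occurred.
    undeleted_fraction = ko_ratio / het_ratio
    return 1.0 - undeleted_fraction


# Illustrative counts only (not measured values).
efficiency = deletion_efficiency(ko_floxed=35, ko_reference=100,
                                 het_floxed=100, het_reference=100)
print(round(efficiency, 2))
```

With these toy counts the estimate falls within the 60-70% range quoted above, but only because the counts were chosen for that purpose.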
No change in expression of the remaining known H3K9 or H3K27 histone methyltransferases was observed in the Setdb1 KO (Supplemental Figure 4-7E), indicating that deposition of H3K27me3 at co-enriched regions, likely by EZH1 and/or EZH2, is at least partially dependent upon SETDB1 or H3K9me3.
Figure 4-4 DNA methylation is reduced in Setdb1 deficient PGCs in regions highly enriched for H3K9me3. (A) Relationship between % methylation in HET E13.5 PGCs and the change (Δ) in the percentage of methylation in HET PGCs relative to PGCs isolated from Setdb1 KO littermates. PBAT data from 50,000 random regions (1 kb bins) are shown. (B) Relationship between H3K9me3 coverage at LINE1 elements and ERVs (>100 copies) in HET E13.5 PGCs and the change (Δ) in the % of DNA methylation at such ERVs relative to PGCs isolated from their Setdb1 KO littermates. IAP and related ERVK10C elements are labeled in red and orange, respectively. Unique and multi-aligned reads were included in the analysis.
(C) The mean percentage of DNA methylation (mC/C) at all CpGs (with ≥5x coverage) across the consensus LTR and proximal internal (INT) regions of IAPLTR1 elements or in the 5’UTR of L1Md_T elements (homologous to L1Md_T chr10:11355585-11361471) is shown for HET and Setdb1 KO E13.5 PGCs. (D) Sanger bisulfite sequencing of the 5’ LTR region of IAPLTR1a elements in male and female Setdb1 HET and KO PGCs. The region amplified is shown above (red bar).
4.2.5 H3K9me3 depleted IAP LTRs show reduced DNA methylation in Setdb1 KO PGCs
As genomic regions retaining relatively high levels of DNA methylation were found to be enriched for H3K9me3 (Figure 4-1F), we next determined whether Setdb1 deletion perturbs DNA methylation at these regions in E13.5 PGCs. Surprisingly, analysis of PBAT data from male Setdb1 KO PGCs revealed that at a global level, DNA methylation levels were ~2.5-fold higher in Setdb1 KO than HET PGCs, with median CpG methylation values of 25.0% versus 9.0%, respectively (Figure 4-3E, left panel, and Supplemental Figure 4-8A).
Similar methylation levels were observed for all CpGs within the bodies of annotated genes (Figure 4-3E, right panel). As we did not observe an increase in expression of any of the DNMTs, or a decrease in expression of the dioxygenases TET1 or TET2 in Setdb1 KO PGCs (Supplemental Figure 4-8B), indirect effects are the most likely explanation. While we considered the possibility that development is delayed in Setdb1 KO germ cells, we did not consistently observe higher expression of genes normally expressed at a higher level at E12.5 than E13.5, or lower levels of expression of genes normally expressed at a higher level at E13.5 than E12.5 (Jameson et al. 2012) in Setdb1 KO relative to HET PGCs (Supplemental Figure 4-8C).
Intriguingly, analysis of the relationship between DNA methylation and both repressive histone modifications reveals that regions marked with H3K9me3 show a lower overall increase in DNA methylation than regions marked with H3K27me3 alone or regions lacking both marks (Supplemental Figure 4-7D). Indeed, in non-repetitive genomic regions, changes in DNA methylation levels are generally inversely correlated with H3K9me3 enrichment levels in HET PGCs (Figure 4-4A). Notably, hypermethylated regions with high levels of H3K9me3 and intermediate levels of H3K27me3 show decreased DNA methylation levels in Setdb1 KO PGCs (Figure 4-4A and Supplemental Figure 4-7E). Similarly, the degree of DNA methylation gain at ERVs shows an inverse correlation with H3K9me3 enrichment levels (Figure 4-4B), with the CpG-rich IAPLTR1 and IAPLTR1a of the IAPez family actually showing reductions in mean DNA methylation of ~8% and ~3%, respectively. This decrease is likely an underestimate of the true decrease in DNA methylation, given the significant percentage of PGCs in which the deletion has not occurred.
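To illustrate why incomplete deletion leads to an underestimate, the sketch below assumes a simple two-population mixture in which undeleted cells retain HET-like methylation, so that the observed decrease equals the true decrease scaled by the deletion efficiency. This linear model, the function name, and the treatment of the ~8% figure as an absolute difference in percent methylation are all assumptions made for illustration; no such correction was applied in the analysis described here.

```python
def corrected_decrease(observed_decrease, deletion_efficiency):
    # Hypothetical helper: under a two-population mixture,
    # observed = efficiency * true, so true = observed / efficiency.
    if not 0.0 < deletion_efficiency <= 1.0:
        raise ValueError('deletion_efficiency must be in (0, 1]')
    return observed_decrease / deletion_efficiency


# Observed decrease of ~8 at IAPLTR1, evaluated at 60% and 70% deletion efficiency.
for efficiency in (0.6, 0.7):
    print(efficiency, round(corrected_decrease(8.0, efficiency), 1))
```

Under these assumptions, an observed decrease of ~8% would correspond to a decrease of roughly 11-13% in cells that carry the deletion.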
Analysis of the distribution of DNA methylation across the consensus IAPLTR1 and proximal internal regions reveals a near complete loss of methylation across a short stretch of CpGs just 3’ of the 5’ LTR (Figure 4-4C). In contrast, a broad but modest increase in methylation was observed in the 5’UTR and ORF1 regions of L1Md_T elements (Figure 4-4C), consistent with the overall methylation dynamics observed for LINE1 families. Decreased methylation in IAPLTR1a was confirmed by Sanger bisulfite sequencing in male E13.5 PGCs, but interestingly was not observed in female E13.5 PGCs in the region amplified (Figure 4-4D). Hypermethylation of IAP and other H3K9me3 marked ERVs in PGCs (Seisenberger et al. 2012) and in preimplantation embryos (Smith et al. 2012) is consistent with the hypothesis that H3K9me3 plays a role in protecting marked genomic regions against active or passive demethylation (Leung et al. 2014).
Figure 4-5 Reactivation of ERVs upon Setdb1 depletion in mouse E13.5 PGCs. (A) Expression of Repbase annotated ERV and LINE1 families (>100 copies in the B6 genome) in male and female HET and KO E13.5 PGCs is shown, along with male H3K9me3 enrichment levels. LINE and ERV1, ERVK, ERVL, Gypsy and MaLR classes of ERVs are presented in alphabetical order along the X-axis according to their Repbase nomenclature. Unique and multi-aligned reads were included in the analysis. (B) qPCR validation of sex-specific differences in expression of IAP and ERVK10C ERVs in HET versus Setdb1 KO E13.5 PGCs isolated from male and female littermates.
(C) Expression versus H3K9me3 coverage Z-scores of individual “intact” IAP, ETn and ERVK10C ERVs. Only uniquely aligned reads were considered in this analysis.
4.2.6 A subset of ERVs are reactivated upon Setdb1 depletion in E13.5 PGCs
Comparison of RNA-seq data generated from PGCs isolated from E13.5 HET and Setdb1 KO littermates reveals that a number of ERV1 and ERVK families are dramatically up-regulated in both male and female KO PGCs (Figure 4-5A and Supplemental Table 4-3). Strikingly, for a subset of ERVs, the level of upregulation is sex-dependent (Supplemental Figure 4-9A). IAPez elements showed increased expression in male and female PGCs of 27.5-fold and 15.5-fold, respectively. While ETn elements also showed a higher level of de-repression in males, the converse was true for ERVK10C elements. These trends were validated by qPCR (Figure 4-5B) and confirmed by RNA-seq using PGCs isolated from an independent litter (Supplemental Figure 4-9B). Notably, a subset of H3K9me3 marked ERVs, including LTRIS families, are not de-repressed in Setdb1 KO PGCs, implicating alternative repressive pathways and/or a nuclear milieu that does not support expression of these elements.
To determine the relationship between expression of re-activated ERV families and H3K9me3 enrichment at the level of individual elements, we compared H3K9me3, H3K27me3 and RNA-seq read coverage at intact elements, considering only uniquely aligned reads. Numerous IAPez, ERVK10C and ETn elements show decreased H3K9me3, concomitant with increased expression (Figure 4-5C), whereas marked MLV and unmarked ERVL elements show modest or no upregulation, respectively (Supplemental Figure 4-10A). Notably, LINE1 families enriched for H3K9me3 (Figure 4-2A and Supplemental Figure 4-4C), including L1Md, exhibited very little reactivation, despite a clear decrease in H3K9me3 (Figure 4-3D and Figure 4-5C). Similar trends were observed for H3K27me3 (Supplemental Figure 4-10B).
Taken together, these results indicate that H3K9me3 at a subset of ERVs and LINE1 elements is dependent upon Setdb1 and that, while these retroelements are also marked by H3K27me3 in PGCs, deletion of Setdb1 is sufficient to disrupt silencing of a subset of co-marked ERVs but not of LINE1 elements, implicating the activity of alternative transcriptional or post-transcriptional silencing pathways for the latter.
Figure 4-6 Genes up-regulated in Setdb1 KO E13.5 PGCs. (A) Expression levels of annotated ENSEMBL genes in male HET vs. Setdb1 KO E13.5 PGCs are shown. Genes up-regulated (91) in Setdb1 KO PGCs (z-score Setdb1 KO/WT and Setdb1 KO/HET >1, RPKM Setdb1 KO/WT and Setdb1 KO/HET >1.5-fold) are highlighted in red. Genes down-regulated (23) (z-score Setdb1 KO/WT and Setdb1 KO/HET <-1, RPKM Setdb1 KO/WT and Setdb1 KO/HET <0.75-fold) are highlighted in green. Only uniquely aligned reads were considered in this analysis. (B) Venn diagram of the overlap between genes up-regulated in Setdb1 KO PGCs isolated from male and female littermates.
(C) Genome browser views showing H3K9me3 and RNA-seq coverage of the Mmp10 and Ogfod2 gene loci in E13.5 PGCs. Note the coverage over the ETn ERV upstream of the Mmp10 gene and the IAP ERV upstream of the Ogfod2 gene. (D) Percentage of up-regulated genes in male and female Setdb1 KO PGCs that are expressed from an LTR element (LTR-driven), within 20 kb of an LTR (Near LTR) or marked by H3K9me3 in the TSS (+/- 1 kb) region (left panel). Percentage of LTR-driven up-regulated genes that are driven by ETn, IAP or ERVK10C LTRs (right panel).
4.2.7 Transcription of a subset of genes de-repressed in Setdb1 KO PGCs initiates in LTRs
As discussed above, relatively few genic promoters are marked by H3K9me3. To determine how deletion of Setdb1 influences gene expression, we calculated RNA-seq coverage over all genic exons for WT, HET and KO littermates.
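The expression thresholds given in the Figure 4-6 legend can be sketched as a simple filter. The cutoffs below are those stated in the legend (z-score >1 and >1.5-fold for up-regulation; z-score <-1 and <0.75-fold for down-regulation, in both the KO/WT and KO/HET comparisons). The function name, the input layout and the example values are hypothetical, and the z-scores are taken as given because their calculation is not described in this section.

```python
def classify_gene(z_ko_wt, z_ko_het, fold_ko_wt, fold_ko_het):
    # Both comparisons (KO/WT and KO/HET) must pass the z-score and fold cutoffs.
    if min(z_ko_wt, z_ko_het) > 1 and min(fold_ko_wt, fold_ko_het) > 1.5:
        return 'up'
    if max(z_ko_wt, z_ko_het) < -1 and max(fold_ko_wt, fold_ko_het) < 0.75:
        return 'down'
    return 'unchanged'


# Hypothetical genes: (z KO/WT, z KO/HET, fold KO/WT, fold KO/HET)
examples = {
    'geneA': (2.3, 1.8, 4.0, 3.1),     # passes both up-regulation cutoffs
    'geneB': (-1.6, -1.2, 0.5, 0.6),   # passes both down-regulation cutoffs
    'geneC': (1.4, 0.7, 2.0, 1.2),     # fails the KO/HET comparison
}
calls = {name: classify_gene(*values) for name, values in examples.items()}
print(calls)
```

Requiring both comparisons to pass is what makes the thresholds stringent: a gene that differs from WT but not from its HET littermate is not called.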
As expected, most genes that significantly change in expression in both male and female Setdb1 KO PGCs were up-regulated (Figure 4-6A and Supplemental Figure 4-11A). Of the 91 and 157 genes up-regulated in male and female PGCs, respectively, only 16 are common to both using the stringent thresholds applied (Figure 4-6B and Supplemental Table 4-4). Gene Ontology analysis revealed that upregulated genes were associated with nucleic acid binding and ion binding in males, and extracellular matrix and definitive hematopoiesis in females (Supplemental Figure 4-11B). Surprisingly, fewer than 25% of up-regulated genes have an H3K9me3 peak overlapping with their annotated TSS. Many of the other up-regulated genes in both sexes are either in close proximity to an LTR element or initiate transcription within an upstream LTR (Figure 4-6C-D). For the latter cases, elevated gene expression is likely the result of transcription initiating in an ERV upstream (<25 kb) of the canonical TSS, and extending through the 5’ end of the gene (Figure 4-6C and Supplemental Figure 4-11C). A minority of such genes (6/29 in males and 3/17 in females) show evidence of “chimeric” transcription, where splicing occurs between the upstream ERV and a canonical genic exon. Interestingly, the LTRs driving aberrant genic transcripts also show clear sex-dependent differences. Almost 40% of the transcripts up-regulated in male PGCs originate from a nearby LTR (Figure 4-6C-D), with approximately 50% originating in ETn/MusD elements and the remaining 40% and 10% originating in IAP and ERVK10C elements, respectively. In contrast, less than 15% of the genes up-regulated in females are LTR-driven, the majority of which originate from IAP or ERVK10C LTRs. These results are consistent with the sex-dependent differences observed in expression of these ERVs (Supplemental Figure 4-9A). Thus, as we previously reported for mESCs (Karimi et al.
2011), derepression of latent ERVs concomitantly influences the expression of nearby genes in PGCs.
Figure 4-7 Gonadal defects in male and female germline Setdb1 deficient mice. (A) Decrease in male and female gonadal size in germline Setdb1 KO mice relative to HET and/or WT littermates in postnatal mice at P10, 5 weeks and 23/24 weeks (adult). Adult testes are also shown relative to ~one-year-old HET (paternal, Setdb1flox/+ Tnapcre/+) and C57BL/6 testes. (B) Hematoxylin and eosin sections in 5-week postnatal ovaries illustrating gonadal defects in germline Setdb1 deficient females. PF: primary follicle, SF: secondary follicle, TF: tertiary follicle. (C) Hematoxylin and eosin sections in 5-week postnatal testis illustrating defects in gonadal morphology in germline Setdb1 deficient males. Sl: Sertoli cells, Sg: spermatogonia, Sc: spermatocytes, Sd: spermatids. (D) Immunofluorescence on sections from postnatal testes at day 10 (P10) and at 5 weeks. White: DAPI (DNA), red: MVH (germ cell marker).
4.2.8 Setdb1 deletion early in germline development leads to gametogenesis defects in postnatal and adult mice
As Setdb1 deletion early in male germline development leads to a significant decrease in the number of PGCs relative to soma at E13.5, we assessed whether germ cell defects persist postnatally. Although Setdb1Δ/-; TnapCre/+ mice displayed normal size and morphology, P10 testes, which normally harbor pre-meiotic germ cells, are clearly reduced in size relative to those of their HET and WT littermates (Figure 4-7A).
Curiously, P10 ovaries, which normally harbor oocytes arrested in meiosis I, also show a decrease in size, despite the fact that female Setdb1 KO PGCs are not reduced in number at E13.5. Male and female gonadal hypotrophy persisted through puberty and into adulthood (Figure 4-7A), indicating that this phenotype is due to a germ cell defect rather than a developmental delay. Sections from ovaries isolated from 5-week-old Setdb1 KO females revealed the presence of both primordial and growing follicles, although in much smaller numbers than in ovaries isolated from HET littermates (Figure 4-7B). Sections of testes isolated from 10-day-old (P10) and 5-week-old Setdb1 KO mice revealed the presence of seminiferous tubules that were devoid of germ cells alongside normal seminiferous tubules displaying all layers of male germ cell development (Figure 4-7C-D), consistent with incomplete Cre-mediated recombination. Although both male and female germline Setdb1 KO animals appeared to be fertile, we have not observed transmission of Setdb1Δ/KO germ cells in litters born from Setdb1 flox/ΔTnap+/cre x B6 WT matings. Thus, offspring are apparently derived from germ cells in which Cre recombination has not taken place. Taken together, these observations reveal that deletion of Setdb1 in proliferating PGCs prior to the onset of de novo DNA methylation leads to prolonged defects in gametogenesis.
SECTION IV DISCUSSION
Chapter 5: Setdb1, ERVs silencing and germ cell development
In Chapter 4, using our new ultra low-input ChIP-seq method (Brind'Amour et al. 2015), we showed that a subset of young ERV families, including IAP and ETn, as well as young LINE1 families are indeed marked by H3K9me3 in E13.5 PGCs. We also found that these elements are marked by H3K27me3 and that enrichment of both marks at these potentially active retroelements is positively correlated with DNA methylation levels.
Germline ablation of Setdb1 in PGCs yields decreased levels of H3K9me3 and H3K27me3 as well as decreased DNA methylation at marked ERVs and LINE1 elements in both male and female PGCs. However, widespread reactivation was observed exclusively for ERVs, with specific families showing sex-dependent expression levels. Gonads isolated from male E13.5 Setdb1 knockout embryos also showed a noticeable reduction in the number of PGCs relative to somatic cells, and germline defects extending into adulthood were observed in both males and females. Taken together, these observations reveal that SETDB1 plays an essential role in the establishment and/or maintenance of H3K9me3, H3K27me3, and DNA methylation at young LTR and LINE1 elements and is required for proviral silencing prior to the onset of de novo DNA methylation in the prenatal germline.
5.1 Co-occurrence of H3K9me3 and H3K27me3 at retrotransposons
Genome-wide analysis of the distribution of H3K9me3 and H3K27me3 at non-repetitive genomic regions revealed that these marks are generally mutually exclusive in E13.5 PGCs, consistent with previous analyses in ESCs (Mikkelsen et al. 2007; Karimi et al. 2011) and MEFs (Pauler et al. 2008). Co-occurrence of high levels of H3K9me3 with intermediate levels of H3K27me3 at specific ERV and LINE1 elements in E13.5 PGCs was unexpected, given that they are marked exclusively with H3K9me3 in ESCs (Mikkelsen et al. 2007). H3K27me3 levels significantly increase in migrating PGCs, perhaps to compensate for the progressive loss of DNA methylation and/or H3K9me2 that takes place during this developmental window (Seki et al. 2007; Hajkova et al. 2008). Co-occurrence of these marks at gene promoters was observed previously in trophectoderm cells (Alder et al. 2010), which, like PGCs, are globally hypomethylated (Oda et al. 2013). Thus, deposition of H3K27me3 at regions that do not generally harbor H3K9me3 may be a common feature of tissues with hypomethylated genomes.
Intriguingly, a recent publication showed that in the absence of FBXL10, polycomb targeted CpG islands are aberrantly methylated (Boulard et al. 2015), indicating that this factor, which binds specifically to unmethylated CpG-rich sequences through its CXXC domain, protects these regions against DNA methylation. It will be interesting to test whether the absence of CXXC domain factors like FBXL10 in PGCs is required for DNA methylation of H3K27me3 and H3K9me3 marked regions. Whether H3K27me3 is required for silencing of ERVs in PGCs remains to be determined. Since H3K27me3 and H3K9me3 levels were concomitantly reduced at ERVs in Setdb1 KO PGCs, deposition of the former may depend on the presence of the latter in such regions. Several members of the Cbx protein family interact with the core Polycomb repressive complex 1 (PRC1), which promotes deposition of H3K27me3 by PRC2/EZH2, and a subset of these chromodomain proteins, CBX4 for example, have relatively high affinity for H3K9me3 (Vermeulen et al. 2010; Kaustov et al. 2010). Similarly, the chromodomain protein CDYL, which binds with high affinity to H3K9me3 (Vermeulen et al. 2010) and directly recruits EZH2 (Zhang et al. 2011), is highly expressed in PGCs. Further experiments are required to determine the role of such H3K9me3 “readers” in targeting of H3K27me3 in PGCs.

5.2 Influence of Setdb1 deletion on DNA methylation

After publishing our findings on the role of Setdb1 in DNA methylation homeostasis in male PGCs (Liu et al. 2014), we also examined genome-wide DNA methylation changes in female PGCs upon Setdb1 KO at E13.5 and analyzed a second KO male embryo by PBAT. Setdb1 KO PGCs from this independently isolated male embryo showed a global ~2-fold increase in DNA methylation relative to control PGCs, similar to our published results with the first male embryo analyzed.
In contrast, the global CpG methylation levels in female HET PGCs are relatively high compared to male HET PGCs, and no global increase in DNA methylation was observed (Figure 5-1).

Figure 5-1 CpG DNA methylation in Setdb1 HET and KO E13.5 PGCs. Global changes of DNA methylation in E13.5 PGCs were calculated from independently isolated female Setdb1 HET and KO embryos and a second KO male embryo. The mean percentage of DNA methylation was calculated based on all annotated CpGs covered by a minimum of 5 reads for each CpG. The table shows the number of reads analyzed in each sample.

[Figure 5-1 bar chart: % CpG methylation (y-axis, 0% to 40%) in ♂ and ♀ PGCs, Setdb1 HET vs. Setdb1 KO; bar labels as extracted: 15.8%, 31.3%, 27%, 32.5%]

SAMPLE    UNIQUE READS    TOTAL READS
♂ HET     110,201,975     197,445,679
♂ KO      193,416,088     351,583,643
♀ HET     201,376,726     362,152,569
♀ KO       98,142,759     180,272,555

DNA methylation at H3K9me3 marked ERVs in particular is reduced upon Setdb1 depletion in female E13.5 PGCs, as shown in Figure 5-2. Note that the same ERV subfamilies, predominantly in the IAP superfamily, show a decrease in DNA methylation in both male and female Setdb1 KO PGCs, suggesting that H3K9me3 protects these regions against loss of DNA methylation.

Figure 5-2 DNA methylation is reduced at H3K9me3 marked ERVs in both male and female Setdb1 KO PGCs. Relationship between H3K9me3 coverage at ERVs (present in >100 copies) in heterozygous (HET) male and female E13.5 PGCs and the difference (%Δ) in the percentage of DNA methylation at such ERVs relative to PGCs isolated from Setdb1 knockout littermates is shown. IAP and related ERVK10C elements are labeled in red and orange, respectively. Unique and multi-aligned reads were included in the analysis.
Note that most ERVs, particularly those not marked with H3K9me3, show a gain in methylation in male but not female PGCs, while H3K9me3 marked IAP elements in particular show a decrease in DNA methylation in the Setdb1 KO in both sexes.

[Figure 5-2 plots: two panels (♂ E13.5 PGCs and ♀ E13.5 PGCs); x-axis, H3K9me3 in SETDB1 HET (RPKM); y-axis, %Δ in DNA methylation (KO-HET); labeled families include IAPLTR1a, IAPLTR1, IAPEz-int, IAPLTR2, IAPLTR2a, IAPLTR4_I, RLTR10C, MuLV-int, RLTR4-int, ETnERV-int, IAPLTR4, RLTR4 and RLTRETN]

As the global increase in DNA methylation in male PGCs includes regions devoid of H3K9me3, an indirect effect, such as premature initiation of the global wave of de novo methylation that normally occurs in male PGCs at E15.5, is the most likely explanation. Intriguingly, an inverse correlation was observed between DNA methylation gain in the Setdb1 KO and the level of H3K9me3 enrichment in control HET PGCs. CpG-rich IAPLTR1 and IAPLTR1a, which show the highest level of H3K9me3 of all LTRs, show a decrease in DNA methylation in the absence of Setdb1 in both male and female PGCs (Figure 5-2), coincident with induction of IAPEz expression. Given that the de novo DNA methyltransferases DNMT3A and DNMT3B are expressed at relatively low levels and DNMT3L is not expressed at this stage (Kurimoto et al. 2008; Seisenberger et al. 2012), these results indicate that H3K9me3 likely protects methylated genomic regions against active and/or passive DNA demethylation. As Tet1 and Tet2 are expressed in both PGCs and preimplantation embryos (Seisenberger et al. 2012), H3K9me3 may specifically inhibit the activity of one or both of these putative DNA demethylation factors. Indeed, we have recently shown that a subset of IAP ERVs also show reduced DNA methylation and increased binding of TET1 in Setdb1 KO ESCs (Leung et al. 2014).
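The global CpG methylation percentages in Figure 5-1 are described as a mean over all annotated CpGs covered by at least 5 reads. A minimal Python sketch of such a coverage-filtered mean is given here for illustration only. The function name, the input layout (one pair of methylated and total read counts per CpG) and the choice to average per-site levels rather than pool reads across sites are assumptions, not a description of the pipeline actually used.

```python
def mean_cpg_methylation(cpg_counts, min_reads=5):
    """Return the mean % methylation over CpGs with sufficient coverage.

    cpg_counts: iterable of (methylated_reads, total_reads), one pair per CpG
                (hypothetical input layout).
    min_reads:  minimum coverage for a CpG to be included (5 in Figure 5-1).
    Returns None when no CpG passes the coverage filter.
    """
    # Per-site methylation level for every CpG that meets the coverage cutoff.
    levels = [meth / total for meth, total in cpg_counts if total >= min_reads]
    if not levels:
        return None
    # Assumption: each CpG contributes equally, regardless of its read depth.
    return 100.0 * sum(levels) / len(levels)


# Toy example: the third site has only 4 reads and is therefore ignored.
example_sites = [(5, 10), (0, 5), (3, 4), (10, 10)]
example_mean = mean_cpg_methylation(example_sites)
```

Averaging per-site levels keeps deeply covered CpGs from dominating the estimate; pooling all reads before dividing would be an equally plausible reading of the figure legend and can give a different value.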
Using WGBS (PBAT) DNA methylation data generated by our collaborators in Japan, we calculated the average change in the percentage of DNA methylation of all reads that align to an annotated IAPLTR1 element in the genome in Setdb1 KO vs. WT E13.5 PGCs. We aggregated bisulfite sequencing reads at all IAPLTR1 subfamily members to form an IAP subfamily mini-genome. All reads aligning to this mini-genome were then divided into 21 groups based on the average methylation percentage across the individual molecules sequenced (methylation percentile). The ratio of the number of reads in each methylation percentile to the total number of reads within the mini-genome was calculated and plotted (on the y axis) against the methylation percentile groups (on the x axis), as shown in Figure 5-3.

Figure 5-3 Distribution of methylation levels of sequenced reads aligning to IAPLTR1 in control Setdb1 HET and KO PGCs. Bisulfite sequencing reads aligning to the mini-IAP genome (all annotated IAPs in the BL6 genome) were divided into groups based on the mean methylation percentile across each 116 bp read (denoted “% of methylated CpGs in each read” on the x-axis). The number of reads within each methylation percentile was then divided by the total number of reads aligned to the mini-genome; this ratio, denoted “% of total reads aligned to IAPLTR1.Mm” on the y-axis, is plotted against the x-axis to indicate the distribution of methylation levels of sequenced reads. Note that there is a noticeable increase in the percentage of sequenced molecules showing no DNA methylation in the KO cells, and a concomitant loss of molecules showing a higher level of methylation.

In male PGCs at E13.5 upon Setdb1 KO, molecules showing complete methylation show almost no change in DNA methylation, while there is a dramatic increase in the percentage of molecules that are completely unmethylated, and a significant drop in the percentage of sequenced molecules showing >30% but <90% DNA methylation.
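The read-level binning described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used to produce Figure 5-3: the function name and input layout are hypothetical, and the 21 groups are assumed to be bins centred at 0%, 5%, ..., 100%, which is one natural way to obtain 21 groups over a 0 to 100% range.

```python
def bin_reads_by_methylation(reads, n_bins=21):
    """Return the % of reads falling in each methylation bin.

    reads:  iterable of (methylated_cpgs, total_cpgs), one pair per read
            aligned to the mini-genome (hypothetical input layout).
    n_bins: number of groups; 21 gives bin centres at 0, 5, ..., 100%
            (assumed spacing, not stated in the text).
    """
    step = 100.0 / (n_bins - 1)          # 5% between bin centres when n_bins is 21
    counts = [0] * n_bins
    total = 0
    for methylated, cpgs in reads:
        if cpgs == 0:
            continue                      # a read without CpGs carries no information
        percent = 100.0 * methylated / cpgs
        index = int(percent / step + 0.5) # nearest bin centre
        counts[index] += 1
        total += 1
    if total == 0:
        return [0.0] * n_bins
    # y-axis value: reads in the bin as a percentage of all informative reads.
    return [100.0 * count / total for count in counts]


# Toy example: two unmethylated reads, one half-methylated, one fully methylated.
toy_reads = [(0, 4), (0, 3), (2, 4), (4, 4)]
toy_distribution = bin_reads_by_methylation(toy_reads)
```

Comparing the first bin (fully unmethylated reads) between HET and KO samples corresponds to the shift toward unmethylated molecules noted in the figure legend.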
These results suggest that a significant fraction of partially methylated IAPLTR1 regions lose all DNA methylation, i.e. become unmethylated, following Setdb1 deletion. Similar results were observed in female KO PGCs at IAPLTR1 subfamily sequences (Figure 5-3, right panel). This pattern was also observed for the closely related IAP families IAPLTR1a (Figure 5-4), IAPLTR2 and IAPLTR2a (data not shown). Those IAP elements showing complete loss of DNA methylation are likely the elements that are de-repressed in the Setdb1 KO.

[Figure 5-3 plots: two panels titled IAPLTR1.Mm (left, male; right, female); x-axis, % of methylated CpGs in each read (0 to 100); y-axis, % of total reads aligned to IAPLTR1.Mm; legend entries: E13.5.male.EsetKO.PGC.embryo#6 (16747), E13.5.male.EsetHet.PGC.embryo#9 (9591), E13.5.female.EsetKO.PGCs.embryo#2 (14986), E13.5.female.EsetHet.PGC.embryo#15 (17354)]

Figure 5-4 Distribution of methylation levels of sequenced reads aligning to IAPLTR1a in control Setdb1 HET and KO PGCs. Bisulfite sequencing reads aligning to the mini-IAP genome (all annotated IAPs in the BL6 genome) were divided into groups based on methylation percentile across the read (116 bp). The ratio of the number of reads in each methylation percentile vs. total reads within the mini-genome was calculated, and plotted (on the y axis) against methylation percentile reads groups (on the x axis). Note that there is a significant increase in the percentage of sequenced molecules showing no DNA methylation in the KO cells, and a concomitant loss of molecules showing a higher level of methylation.

5.3 H3K9me3 and DNA demethylation at ERVs

Recent studies of genome-wide DNA methylation in PGCs suggest that “global” erasure of DNA methylation may take place in two phases (Seisenberger et al. 2012; Vincent et al. 2013; Wu and Zhang 2014).
The first phase involves replication dependent passive dilution of 5mC during the PGC migration period from E7.25 to E9. Once PGCs arrive at the gonadal ridge, they start to proliferate and a second phase of DNA demethylation involving TET1/2-mediated 5mC oxidation occurs, which is completed around E12.5. Intriguingly, despite these two phases of demethylation, with downregulation of UHRF1 and DNMT1 in the first phase and upregulation of TET1/2 in the second phase, a subset of retrotransposons are clearly resistant to such demethylation (Lane et al. 2003).

[Figure 5-4 plots: two panels titled IAPLTR1a.Mm (male and female); x-axis, % of methylated CpGs in each read (0 to 100); y-axis, % of total reads aligned to IAPLTR1a.Mm; legend entries: E13.5.male.EsetKO.PGC.embryo#6 (24102), E13.5.male.EsetHet.PGC.embryo#9 (13655), E13.5.female.EsetKO.PGCs.embryo#2 (21707), E13.5.female.EsetHet.PGC.embryo#15 (25412)]

As the activity of DNMT1 is stimulated by the binding of its co-factor NP95 (also known as UHRF1) to H3K9me3 (Rothbart et al. 2012; 2013), SETDB1-dependent deposition of this mark may influence the efficiency of maintenance DNA methylation. Notably, NP95 expression is down-regulated in PGCs (Kurimoto et al. 2008) and the remaining protein is detected predominantly in the cytoplasm (Seisenberger et al. 2012), likely compromising maintenance methylation at a global level; however, the residual NP95 may be sufficient to bind H3K9me3 at marked retrotransposons and in turn to promote maintenance DNA methylation exclusively at these regions through recruitment of DNMT1. Based on these observations, we propose that H3K9me3 may inhibit loss of DNA methylation in PGCs by directing UHRF1/DNMT1 to marked ERVs.
In addition to H3K9me3-dependent inhibition of TET-dependent DNA demethylation, such “preferential” maintenance DNA methylation at marked regions in PGCs likely minimizes the likelihood of mobilization of young ERVs at later stages in development. Indeed, our results are consistent with previous reports that IAP ERVs are expressed at very low levels in E13 PGCs (Weber et al. 2002). In contrast, IAPEz elements were up-regulated in Setdb1 deficient PGCs, revealing that, as in ESCs, H3K9me3 likely plays a critical role in safeguarding the genome against IAP retrotransposition.

5.4 LTR-initiated chimaeric transcripts

Coincident with derepression of numerous ERV families, many genes are aberrantly expressed in Setdb1 KO PGCs as a consequence of transcriptional activation of nearby related LTRs, indicating that SETDB1 plays an important role not only in safeguarding against expression of intact ERVs, but also in minimizing the expression of LTR-initiated transcripts that extend into genes. Interestingly, however, numerous chimaeric transcripts detected in Setdb1 KO PGCs were not up-regulated in Setdb1 KO ESCs (Karimi et al. 2011) and vice versa, indicating that additional factors likely influence the expression of specific LTR elements in a tissue-specific manner.

5.5 Gender difference in ERV reactivation

We show that a subset of constitutively expressed ERVs, including ERVK10C and ETn elements, are clearly expressed at higher levels in female than male E13.5 PGCs or vice versa. Consistent with the observation that germ cells begin to adopt lineage-specific/sex-specific transcriptional states by E12.5 (Jameson et al. 2012), our data indicate that the intracellular complement of positive and/or negative transcriptional regulators differs in a sex dependent manner.
Notably, a significantly greater number of ERVs, including ETn elements, showed higher levels of derepression in male than female Setdb1 KO PGCs, suggesting that the male germline may be more susceptible to retrotransposition. Indeed, systematic analyses of the genomic landscape of transposable elements across 18 strains suggest that the majority of retrotransposons in the mouse are introduced through the male germline (Nellåker et al. 2012). The Y-chromosome itself may act as a reservoir for specific ERVs, including IAP elements, which are amplified in non-pseudoautosomal regions (Reuss et al. 1996).

5.6 Setdb1 deletion and germ cell viability

As Setdb1 deficient ESCs are also inviable (Dodge et al. 2004) and characterized by activation of the same ERV families (Matsui et al. 2010; Karimi et al. 2011), it is possible that widespread activation of ERVs triggers cell death in PGCs via retrotransposition-dependent or independent pathways, with activation of ERVs on the Y-chromosome rendering male PGCs more susceptible. Consistent with this model, deletion of Setdb1 in MEFs does not lead to derepression of ERVs or decreased cell viability. Furthermore, deletion of KAP1/TIF1β, which interacts directly with SETDB1, also leads to derepression of ERVs in the early embryo (Rowe et al. 2010) and embryonic lethality (Cammas et al. 2000), as well as testicular degeneration when deleted specifically in the germ cell lineage (Cammas et al. 2000). Given that perinatal induction of ERVs in Dnmt3l null embryos at P2 does not affect gonocyte viability (Bourc'his and Bestor 2004), it is possible that proliferating PGCs are more sensitive to activation of ERVs than germ cells at later developmental stages. Alternatively, ERVs may be derepressed to a greater extent in Setdb1 KO PGCs than Dnmt3l KO perinatal gonocytes.
Despite a significant depletion of male Setdb1 KO PGCs at E13.5, we did not observe increased expression of genes involved in cell cycle arrest or apoptosis, although we cannot rule out that such events occur earlier in KO PGCs. Interestingly, work published by Zeng and colleagues (An et al. 2014) demonstrates that SETDB1 depletion in postnatal spermatogonial stem cells results in increased levels of apoptosis and reduced gonadal size in a transplantation model. Taken together, these data suggest that SETDB1 is required for germ cell viability not only during epigenetic remodeling in PGCs, but also at later stages of germ cell development.

5.7 Setdb1 and the DNMT3L-piRNA pathway

Global remethylation in males is initiated in gonocytes at ~E15.5 (Reik 2001; Abramowitz and Bartolomei 2011), and extends through the perinatal stage. Efficient de novo DNA methylation of a subset of ERV and LINE1 retroelements during this developmental window is dependent upon the cofactor DNMT3L and piRNA pathway components, including MIWI2 (Kim et al. 2009). While IAP elements are expressed in Dnmt3l-/- P2 gonocytes, pre-meiotic Dnmt3l-/- germ cells develop normally in the perinatal stage (Bourc'his and Bestor 2004), but show reduced numbers of spermatocytes (which do not express DNMT3L or MIWI2) and undergo meiotic failure. Derepression of LINE1 and to a lesser extent IAP elements has been observed in spermatogonia and/or spermatocytes deficient in the piRNA proteins mentioned above, but like the Dnmt3l mutant, developmental defects are not manifest until the pachytene stage of meiotic prophase I in these mutants. This common postnatal germline phenotype indicates that while the DNA methylation defect is initiated in prenatal gonocytes, the loss of DNA methylation does not affect viability until the spermatogonial stages or later. Consistent with this model, induction of IAP expression was not observed in Dnmt1 KO PGCs (Walsh et al. 1998).
Intriguingly, depletion of Eggless/dSETDB1, the ortholog of SETDB1 in D. melanogaster required for H3K9me3 deposition in the ovary and early oogenesis (Clough et al. 2007), leads to derepression of LTR retrotransposons and reduced transcription of piRNA clusters in the ovary (Rangan et al. 2011), indicating that H3K9me3 acts upstream of the piRNA pathway in flies. Conversely, depletion of Piwi or its binding partner Gtsf1 leads to a rapid loss of H3K9me3 and derepression of specific LTR elements, indicating that the Drosophila piRNA pathway impinges upon this histone mark (Donertas et al. 2013). Similarly, the AGO1 silencing complex RITS interacts with nascent target RNAs to guide H3K9me3 at repetitive regions in S. pombe (Castel and Martienssen 2013). However, in Setdb1 KO male and female E13.5 PGCs, expression levels of members of the piRNA pathway were not significantly altered relative to their HET and WT littermates and MIWI2 is not yet expressed at this stage, indicating that this nuclear piRNA pathway is not required for SETDB1-dependent transcriptional silencing in early mammalian germ cells. By plotting our H3K9me3 ChIPseq data in PGCs against a recently published H3K9me3 ChIPseq dataset generated from P10 spermatocytes (Pezic et al. 2014), we found that H3K9me3 is enriched at the same ERVs, indicating that this mark persists in the male germline until at least P10 (Figure 5-5).

Figure 5-5 H3K9me3 at ERVs persists in the male germline until at least P10. Upper panel, diagram of postnatal male germ cell development. Lower panels, plot of enrichment of H3K9me3 at ERV subfamilies in E13.5 PGCs vs. P10 spermatocytes (left) (spermatocyte H3K9me3 data from Pezic et al. 2014) and mature sperm (right) (data courtesy of Julie Brind’Amour, unpublished). IAP subfamilies are color coded in red.
Based on these observations, we propose that SETDB1-dependent deposition of H3K9me3 at ERVs occurs very early in germline development or prior to the emergence of germ cells and persists until the early spermatocyte stage. Programmed loss of H3K9me3 at this stage would explain why germ cells deficient in perinatal DNA methylation, including mutants in Dnmt3l or the nuclear piRNA pathway, are susceptible to derepression of ERVs only later in spermatogenesis. Whether H3K9me3 plays a role in piRNA-dependent de novo DNA methylation of marked retroelements and/or persistence of DNA methylation at these later stages remains to be determined. Addressing these questions would require detailed analyses of the distribution of H3K9me3 and DNA methylation in prenatal and/or perinatal gonocytes.

[Figure 5-5 panels: upper, diagram of male germ cell development (PGCs, spermatogonia, spermatocytes, round and elongated spermatids, mature sperm; mitosis, meiosis, spermiogenesis; P0, P10, P19, P30); lower, scatter plots of H3K9me3 (RPKM) at 380 annotated ERV subfamilies (>100 copies) in ♂ E13.5 PGCs vs. P10 spermatocytes (Pezic, G&D, 2014) and vs. mature sperm; labeled subfamilies include IAPLTR1a, IAPLTR1, IAPLTR4, IAPLTR4_I, RLTR10C, IAPEY3-int, RLTR6-int, IAPEY2_LTR, IAPEz-int, RLTR4-int, RLTR10A and ETnERV-int]
Unfortunately, the rapid loss of germ cells following Setdb1 deletion in early PGCs necessitates the establishment of an alternative conditional deletion system. Regardless, our results clearly show that SETDB1 functions as an important guardian against transcriptional activation of young ERVs in PGCs and may simultaneously protect marked elements against the waves of global demethylation that take place early in germline development.

References

Abramowitz LK, Bartolomei MS. 2011. Genomic imprinting: recognition and marking of imprinted loci. Current Opinion in Genetics & Development 1–7.
Adli M, Zhu J, Bernstein BE. 2010. Genome-wide chromatin maps derived from limited numbers of hematopoietic progenitors. Nature Publishing Group 7: 615–618.
Alder O, Lavial F, Helness A, Brookes E, Pinho S, Chandrashekran A, Arnaud P, Pombo A, O'Neill L, Azuara V. 2010. Ring1B and Suv39h1 delineate distinct chromatin states at bivalent genes during early mouse lineage commitment. Development 137: 2483–2492.
An J, Zhang X, Qin J, Wan Y, Hu Y, Liu T, Li J, Dong W, Du E, Pan C, et al. 2014. The histone methyltransferase ESET is required for the survival of spermatogonial stem/progenitor cells in mice. 5: e1196–10.
Aravin AA, Sachidanandam R, Bourc'his D, Schaefer C, Pezic D, Toth KF, Bestor T, Hannon GJ. 2008. A piRNA Pathway Primed by Individual Transposons Is Linked to De Novo DNA Methylation in Mice. Molecular Cell 31: 785–799.
Baltus AE, Menke DB, Hu Y-C, Goodheart ML, Carpenter AE, de Rooij DG, Page DC. 2006. In germ cells of mouse embryonic ovaries, the decision to enter meiosis precedes premeiotic DNA replication. Nat Genet 38: 1430–1434.
Barski A, Cuddapah S, Cui K, Roh T-Y, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K. 2007. High-Resolution Profiling of Histone Methylations in the Human Genome. Cell 129: 823–837.
Baust C, Baillie GJ, Mager DL. 2002.
Insertional polymorphisms of ETn retrotransposons include a disruption of the wiz gene in C57BL/6 mice. Mammalian Genome 13: 423–428.
Baust C, Gagnier L, Baillie GJ, Harris MJ, Juriloff DM, Mager DL. 2003. Structure and expression of mobile ETnII retroelements and their coding-competent MusD relatives in the mouse. Journal of Virology 77: 11448–11458.
Beck CR, Garcia-Perez JL, Badge RM, Moran JV. 2011. LINE-1 Elements in Structural Variation and Disease. Annu Rev Genom Human Genet 12: 187–215.
Bilodeau S, Kagey MH, Frampton GM, Rahl PB, Young RA. 2009. SetDB1 contributes to repression of genes encoding developmental regulators and maintenance of ES cell state. Genes & Development 23: 2484–2489.
Boulard M, Edwards JR, Bestor TH. 2015. FBXL10 protects Polycomb-bound genes from hypermethylation. Nat Genet 47: 479–485.
Bourc'his D, Bestor TH. 2004. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature 431: 96–99.
Brind'Amour J, Liu S, Hudson M, Chen C, Karimi MM, Lorincz MC. 2015. An ultra-low-input native ChIP-seq protocol for genome-wide profiling of rare cell populations. Nature Communications 6: 6033.
Brûlet P, Condamine H, Jacob F. 1985. Spatial distribution of transcripts of the long repeated ETn sequence during early mouse embryogenesis. Proc Natl Acad Sci USA 82: 2054–2058.
Bulut-Karslioglu A, La Rosa-Velázquez De IA, Ramirez F, Barenboim M, Onishi-Seebacher M, Arand J, Galán C, Winter GE, Engist B, Gerle B, et al. 2014. Suv39h-Dependent H3K9me3 Marks Intact Retrotransposons and Silences LINE Elements in Mouse Embryonic Stem Cells. Molecular Cell 55: 277–290.
Cammas F, Mark M, Dolle P, Dierich A, Chambon P, Losson R. 2000. Mice lacking the transcriptional corepressor TIF1beta are defective in early postimplantation development.
Development 127: 2955–2963.
Castel SE, Martienssen RA. 2013. RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nat Rev Genet 14: 100–112.
Cavanagh M-H, Landry S, Audet B, Arpin-André C, Hivin P, Paré M-È, Thête J, Wattel É, Marriott SJ, Mesnard J-M, et al. 2006. Retrovirology. Retrovirology 3: 15–15.
Chedin F, Lieber MR, Hsieh CL. 2002. The DNA methyltransferase-like protein DNMT3L stimulates de novo methylation by Dnmt3a. Proceedings of the National Academy of Sciences 99: 16916–16921.
Cheng X, Blumenthal RM. 2008. Mammalian DNA Methyltransferases: A Structural Perspective. Structure 16: 341–350.
Chomczynski P, Sacchi N. 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162: 156–159.
Chomczynski P, Sacchi N. 2006. The single-step method of RNA isolation by acid guanidinium thiocyanate–phenol–chloroform extraction: twenty-something years on. Nat Protoc 1: 581–585.
Clough E, Moon W, Wang S, Smith K, Hazelrigg T. 2007. Histone methylation is required for oogenesis in Drosophila. Development 134: 157–165.
Consortium TEP, Consortium TEP, data analysis coordination OC, data production DPL, data analysis LA, group W, scientific management NPM, steering committee PI, Boise State University and University of North Carolina at Chapel Hill Proteomics groups (data production and analysis), Broad Institute Group (data production and analysis), et al. 2012. An integrated encyclopedia of DNA elements in the human genome. Nature 488: 57–74.
Dahl JA, Collas P. 2008. A rapid micro chromatin immunoprecipitation assay (ChIP). Nat Protoc 3: 1032–1045.
Daley T, Smith AD. 2013.
Predicting the molecular complexity of sequencing libraries. Nat Meth 10: 325–327.
Daujat S, Weiss T, Mohn F, Lange UC, Ziegler-Birling C, Zeissler U, Lappe M, Schübeler D, Torres-Padilla M-E, Schneider R. 2009. H3K64 trimethylation marks heterochromatin and is dynamically remodeled during developmental reprogramming. Nat Struct Mol Biol 16: 777–781.
De Felici M. 2011. Nuclear Reprogramming in Mouse Primordial Germ Cells: Epigenetic Contribution. Stem Cells International 2011: 1–15.
Deng W, Lin H. 2002. miwi, a murine homolog of piwi, encodes a cytoplasmic protein essential for spermatogenesis. Developmental Cell 2: 819–830.
Di Cristofano A, Strazzullo M, Longo L, La Mantia G. 1995. Characterization and genomic mapping of the ZNF80 locus: expression of this zinc-finger gene is driven by a solitary LTR of ERV9 endogenous retroviral family. Nucleic Acids Research 23: 2823–2830.
Dodge JE, Kang YK, Beppu H, Lei H, Li E. 2004. Histone H3-K9 Methyltransferase ESET Is Essential for Early Development. Molecular and Cellular Biology 24: 2478–2486.
Donertas D, Sienski G, Brennecke J. 2013. Drosophila Gtsf1 is an essential component of the Piwi-mediated transcriptional silencing complex. Genes & Development 27: 1693–1705.
Dudley JP. 1987. Discrete high molecular weight RNA transcribed from the long interspersed repetitive element L1Md. Nucleic Acids Research 15: 2581–2592.
Durcova-Hills G, Tokunaga T, Kurosaka S, Yamaguchi M, Takahashi S, Imai H. 1999. Immunomagnetic Isolation of Primordial Germ Cells and the Establishment of Embryonic Germ Cell Lines in the Mouse. Cloning 1: 217–224.
Ekram MB, Kim J. 2014. High-Throughput Targeted Repeat Element Bisulfite Sequencing (HT-TREBS): Genome-Wide DNA Methylation Analysis of IAP LTR Retrotransposon ed. D.J. Hedges. PLoS ONE 9: e101683–9.
Falconer E, Hills M, Naumann U, Poon SSS, Chavez EA, Sanders AD, Zhao Y, Hirst M, Lansdorp PM. 2012. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat Meth 9: 1107–1112.
Farrell RE. 2009. RNA Methodologies, Fourth Edition. 1–742. Chapter17. p395-399.
Feng J, Liu T, Qin B, Zhang Y, Liu XS. 2012. Identifying ChIP-seq enrichment using MACS. Nat Protoc 7: 1728–1740.
Feuchter-Murthy AE, Freeman JD, Mager DL. 1993. Splicing of a human endogenous retrovirus to a novel phospholipase A2 related gene. Nucleic Acids Research 21: 135–143.
Filion GJP, Zhenilo S, Salozhin S, Yamada D, Prokhortchouk E, Defossez PA. 2005. A Family of Human Zinc Finger Proteins That Bind Methylated DNA and Repress Transcription. Molecular and Cellular Biology 26: 169–181.
Gaucher J, Reynoird N, Montellier E, Boussouar F, Rousseaux S, Khochbin S. 2009. From meiosis to postmeiotic events: The secrets of histone disappearance. FEBS Journal 277: 599–604.
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5: R80.
Ginsburg M, Snow MH, McLaren A. 1990. Primordial germ cells in the mouse embryo during gastrulation. Development 110: 521–528.
Grow EJ, Flynn RA, Chavez SL, Bayless NL, Wossidlo M, Wesche DJ, Martin L, Ware CB, Blish CA, Chang HY, et al. 2015. Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells. Nature 522: 221–225.
Gu H, Smith ZD, Bock C, Boyle P, Gnirke A, Meissner A. 2011. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat Protoc 6: 468–481.
Hackett JA, Sengupta R, Zylicz JJ, Murakami K, Lee C, Down TA, Surani MA. 2013.
Germline DNA demethylation dynamics and imprint erasure through 5-hydroxymethylcytosine. Science 339: 448–452. http://www.sciencemag.org/cgi/doi/10.1126/science.1229277.
Hahne F, LeMeur N, Brinkman RR, Ellis B, Haaland P, Sarkar D, Spidlen J, Strain E, Gentleman R. 2009. flowCore: a Bioconductor package for high throughput flow cytometry. BMC Bioinformatics 10: 106–8.
Hajkova P, Ancelin K, Waldmann T, Lacoste N, Lange UC, Cesari F, Lee C, Almouzni G, Schneider R, Surani MA. 2008. Chromatin dynamics during epigenetic reprogramming in the mouse germ line. Nature 452: 877–881.
Hajkova P, Erhardt S, Lane N, Haaf T, El-Maarri O, Reik W, Walter J, Surani MA. 2002. Epigenetic reprogramming in mouse primordial germ cells. Mechanisms of Development 117: 15–23.
Hashimoto H, Horton JR, Zhang X, Bostick M, Jacobsen SE, Cheng X. 2008. The SRA domain of UHRF1 flips 5-methylcytosine out of the DNA helix. Nature 455: 826–829.
Haston KM, Tung JY, Reijo Pera RA. 2009. Dazl Functions in Maintenance of Pluripotency and Genetic and Epigenetic Programs of Differentiation in Mouse Primordial Germ Cells In Vivo and In Vitro ed. C. Creighton. PLoS ONE 4: e5654–15.
Henckel A, Chebli K, Kota SK, Arnaud P, Feil R. 2011. Transcription and histone methylation changes correlate with imprint acquisition in male germ cells. EMBO J 1–10.
Hutnick LK, Huang X, Loo T-C, Ma Z, Fan G. 2010. Repression of Retrotransposal Elements in Mouse Embryonic Stem Cells Is Primarily Mediated by a DNA Methylation-independent Mechanism. Journal of Biological Chemistry 285: 21082–21091.
Jackson-Grusby L, Beard C, Possemato R, Tudor M, Fambrough D, Csankovszki G, Dausman J, Lee P, Wilson C, Lander E, et al. 2001. Loss of genomic methylation causes p53-dependent apoptosis and epigenetic deregulation. Nat Genet 27: 31–39.
Jameson SA, Natarajan A, Cool J, DeFalco T, Maatouk DM, Mork L, Munger SC, Capel B. 2012. Temporal Transcriptional Profiling of Somatic and Germ Cells Reveals Biased Lineage Priming of Sexual Fate in the Fetal Mouse Gonad ed. G.S. Barsh. PLoS Genet 8: e1002575–21.
Jenkins NA, Copeland NG, Taylor BA, Lee BK. 1981. Dilute (d) coat colour mutation of DBA/2J mice is associated with the site of integration of an ecotropic MuLV genome. Nature 293: 370–374.
Jurka J, Kapitonov VV, Kohany O, Jurka MV. 2007. Repetitive Sequences in Complex Genomes: Structure and Evolution. Annu Rev Genom Human Genet 8: 241–259.
Kaneda M, Okano M, Hata K, Sado T, Tsujimoto N, Li E, Sasaki H. 2004. Essential role for de novo DNA methyltransferase Dnmt3a in paternal and maternal imprinting. Nature 429: 900–903.
Karimi MM, Goyal P, Maksakova IA, Bilenky M, Leung D, Tang JX, Shinkai Y, Mager DL, Jones S, Hirst M, et al. 2011. DNA Methylation and SETDB1/H3K9me3 Regulate Predominantly Distinct Sets of Genes, Retroelements, and Chimeric Transcripts in mESCs. Stem Cell 8: 676–687.
Kaustov L, Ouyang H, Amaya M, Lemak A, Nady N, Duan S, Wasney GA, Li Z, Vedadi M, Schapira M, et al. 2010. Recognition and Specificity Determinants of the Human Cbx Chromodomains. Journal of Biological Chemistry 286: 521–529.
Kazazian HH Jr, Moran JV. 1998. The impact of L1 retrotransposons on the human genome. Nat Genet.
Kim VN, Han J, Siomi MC. 2009. Biogenesis of small RNAs in animals. Nat Rev Mol Cell Biol 10: 126–139.
Kimmins S, Sassone-Corsi P. 2005. Chromatin remodelling and epigenetic features of germ cells. Nature 434: 583–589.
Kobayashi H, Sakurai T, Imai M, Takahashi N, Fukuda A, Yayoi O, Sato S, Nakabayashi K, Hata K, Sotomaru Y, et al. 2012. Contribution of Intragenic DNA Methylation in Mouse Gametic DNA Methylomes to Establish Oocyte-Specific Heritable Marks ed. W. Reik.
PLoS Genet 8: e1002440–14.
Kobayashi H, Sakurai T, Miura F, Imai M, Mochiduki K, Yanagisawa E, Sakashita A, Wakai T, Suzuki Y, Ito T, et al. 2013. High-resolution DNA methylome analysis of primordial germ cells identifies gender-specific reprogramming in mice. Genome Research 23: 616–627.
Kotaja N, Sassone-Corsi P. 2007. The chromatoid body: a germ-cell-specific RNA-processing centre. Nat Rev Mol Cell Biol 8: 85–90.
Koubova J, Menke DB, Zhou Q, Capel B, Griswold MD, Page DC. 2006. Retinoic acid regulates sex-specific timing of meiotic initiation in mice. Proceedings of the National Academy of Sciences 103: 2474–2479.
Krueger F, Andrews SR. 2011. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27: 1571–1572.
Kuff EL, Lueders KK. 1988. The intracisternal A-particle gene family: structure and functional aspects. Adv Cancer Res 51: 183–276.
Kuramochi-Miyagawa S. 2004. Mili, a mammalian member of piwi family gene, is essential for spermatogenesis. Development 131: 839–849.
Kurimoto K, Yabuta Y, Ohinata Y, Shigeta M, Yamanaka K, Saitou M. 2008. Complex genome-wide transcription dynamics orchestrated by Blimp1 for the specification of the germ cell lineage in mice. Genes & Development 22: 1617–1635.
La Fuente De R, Baumann C, Fan T, Schmidtmann A, Dobrinski I, Muegge K. 2006. Lsh is required for meiotic chromosome synapsis and retrotransposon silencing in female germ cells. Nat Cell Biol 8: 1448–1454.
La Salle S, Mertineit C, Taketo T, Moens PB, Bestor TH, Trasler JM. 2004. Windows for sex-specific methylation marked by DNA methyltransferase expression profiles in mouse germ cells. Developmental Biology 268: 403–415.
Lane N, Dean W, Erhardt S, Hajkova P, Surani A, Walter JR, Reik W. 2003.
Resistance of IAPs to methylation reprogramming may provide a mechanism for epigenetic inheritance in the mouse. genesis 35: 88–93.
Lawrence M, Huber W, Pagès H, Aboyoun P, Carlson M, Gentleman R, Morgan MT, Carey VJ. 2013. Software for Computing and Annotating Genomic Ranges ed. A. Prlic. PLoS Comp Biol 9: e1003118–10.
Lawrence M, Lang DT. 2010. RGtk2: A graphical user interface toolkit for R. Journal of Statistical Software.
Leonhardt H, Page AW, Weier H-U, Bestor TH. 1992. A targeting sequence directs DNA methyltransferase to sites of DNA replication in mammalian nuclei. Cell 71: 865–873.
Lesch BJ, Dokshin GA, Young RA, McCarrey JR, Page DC. 2013. A set of genes critical to development is epigenetically poised in mouse germ cells from fetal stages through completion of meiosis. Proc Natl Acad Sci USA 110: 16061–16066.
Leung D, Du T, Wagner U, Xie W, Lee AY, Goyal P, Li Y, Szulwach KE, Jin P, Lorincz MC, et al. 2014. Regulation of DNA methylation turnover at LTR retrotransposons and imprinted loci by the histone methyltransferase Setdb1. Proc Natl Acad Sci USA 111: 6690–6695.
Leung DC, Dong KB, Maksakova IA, Goyal P, Appanah R, Lee S, Tachibana M, Shinkai Y, Lehnertz B, Mager DL, et al. 2011. Lysine methyltransferase G9a is required for de novo DNA methylation and the establishment, but not the maintenance, of proviral silencing. Proc Natl Acad Sci USA 108: 5718–5723.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078–2079.
Li Y, Tollefsbol TO. 2011. DNA Methylation Detection: Bisulfite Genomic Sequencing Analysis.
In Methods in Molecular Biology, Vol. 791, pp. 11–21, Humana Press, Totowa, NJ.
Liu S, Brind'Amour J, Karimi MM, Shirane K, Bogutz A, Lefebvre L, Sasaki H, Shinkai Y, Lorincz MC. 2014. Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells. Genes & Development 28: 2041–2055.
Loebel DAF, Tsoi B, Wong N, O'Rourke MP, Tam PPL. 2004. Restricted expression of ETn-related sequences during post-implantation mouse development. Gene Expression Patterns 4: 467–471.
Lomelí H, Ramos-Mejía V, Gertsenstein M, Lobe CG, Nagy A. 2000. Targeted insertion of Cre recombinase into the TNAP gene: excision in primordial germ cells. genesis 26: 116–117.
Lucifero D, La Salle S, Bourc'his D, Martel J, Bestor TH, Trasler JM. 2007. Coordinate regulation of DNA methyltransferase expression during oogenesis. BMC Dev Biol 7: 36–14.
Ma L, Buchold GM, Greenbaum MP, Roy A, Burns KH, Zhu H, Han DY, Harris RA, Coarfa C, Gunaratne PH, et al. 2009. GASZ Is Essential for Male Meiosis and Suppression of Retrotransposon Expression in the Male Germline ed. M.T. McManus. PLoS Genet 5: e1000635–15.
Macfarlan TS, Gifford WD, Driscoll S, Lettieri K, Rowe HM, Bonanomi D, Firth A, Singer O, Trono D, Pfaff SL. 2012. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. 1–9. http://dx.doi.org/10.1038/nature11244.
Mager DL. 1989. Polyadenylation function and sequence variability of the long terminal repeats of the human endogenous retrovirus-like family RTVL-H. Virology 173: 591–599.
Mager DL, Freeman JD. 2000. Novel mouse type D endogenous proviruses and ETn elements share long terminal repeat and internal sequences. Journal of Virology 74: 7221–7229.
Maksakova IA, Romanish MT, Gagnier L, Dunn CA, van de Lagemaat LN, Mager DL. 2006.
Retroviral Elements and Their Hosts: Insertional Mutagenesis in the Mouse Germ Line. PLoS Genet 2: e2–10.
Maksakova IA, Thompson PJ, Goyal P, Jones SJ, Singh PB, Karimi MM, Lorincz MC. 2013. Distinct roles of KAP1, HP1 and G9a/GLP in silencing of the two-cell-specific retrotransposon MERVL in mouse ES cells. Epigenetics & Chromatin 6: 1–1.
Martin C, Zhang Y. 2005. The diverse functions of histone lysine methylation. Nat Rev Mol Cell Biol 6: 838–849.
Massie CE, Mills IG. 2011. Mapping Protein–DNA Interactions Using ChIP-Sequencing. In Methods in Molecular Biology, Vol. 809, pp. 157–173, Springer New York, New York, NY.
Matsui T, Leung D, Miyashita H, Maksakova IA, Miyachi H, Kimura H, Tachibana M, Lorincz MC, Shinkai Y. 2010. Proviral silencing in embryonic stem cells requires the histone methyltransferase ESET. Nature 464: 927–931.
Matsui Y, Tokitake Y. 2009. Primordial germ cells contain subpopulations that have greater ability to develop into pluripotential stem cells. Development, Growth & Differentiation 51: 657–667.
Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D'Souza C, Fouse SD, Johnson BE, Hong C, Nielsen C, Zhao Y, et al. 2010. Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature 466: 253–257.
McCarthy EM, McDonald JF. 2004. Long terminal repeat retrotransposons of Mus musculus. Genome Biol 5: R14.
Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim T-K, Koche RP, et al. 2007. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448: 553–560.
Miura F, Enomoto Y, Dairiki R, Ito T. 2012. Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Research 40: e136–e136.
Moore LD, Le T, Fan G. 2012.
DNA Methylation and Its Basic Function. Neuropsychopharmacology 38: 23–38.
Morgan HD, Sutherland HG, Martin DI, Whitelaw E. 1999. Epigenetic inheritance at the agouti locus in the mouse. Nat Genet 23: 314–318.
Morgan M, Pages H, Obenchain V, Haydon N. 2013. Rsamtools: Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import. R package version.
Mouse Genome Sequencing Consortium, Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520–562.
Nagy A. 2003. Manipulating the Mouse Embryo. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press. http://books.google.ca/books?id=pa8KngEACAAJ&dq=Manipulating+the+Mouse+Embryo+A+Laboratory+Manual&hl=&cd=3&source=gbs_api.
Nakamura T, Liu Y-J, Nakashima H, Umehara H, Inoue K, Matoba S, Tachibana M, Ogura A, Shinkai Y, Nakano T. 2012. PGC7 binds histone H3K9me2 to protect against conversion of 5mC to 5hmC in early embryos. 1–6. http://dx.doi.org/10.1038/nature11093.
Nan X, Meehan RR, Bird A. 1993. Dissection of the methyl-CpG binding domain from the chromosomal protein MeCP2. Nucleic Acids Research 21: 4886–4892.
Nellåker C, Keane TM, Yalcin B, Wong K, Agam A, Belgard TG, Flint J, Adams DJ, Frankel WN, Ponting CP. 2012. The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome Biol 13: R45.
Ng J-H, Kumar V, Muratani M, Kraus P, Yeo J-C, Yaw L-P, Xue K, Lufkin T, Prabhakar S, Ng H-H. 2013. In Vivo Epigenomic Profiling of Germ Cells Reveals Germ Cell Molecular Signatures. Developmental Cell 24: 324–333.
O'Geen H, Echipare L, Farnham PJ. 2011. Using ChIP-Seq Technology to Generate High-Resolution Profiles of Histone Modifications. In Methods in Molecular Biology, Vol. 791, pp.
265–286, Humana Press, Totowa, NJ.
Oda M, Oxley D, Dean W, Reik W. 2013. Regulation of Lineage Specific DNA Hypomethylation in Mouse Trophectoderm ed. J.G. Knott. PLoS ONE 8: e68846–12.
Okano M, Bell DW, Haber DA, Li E. 1999. DNA Methyltransferases Dnmt3a and Dnmt3b Are Essential for De Novo Methylation and Mammalian Development. Cell 99: 247–257.
Pastor WA, Stroud H, Nee K, Liu W, Pezic D, Manakov S, Lee SA, Moissiard G, Zamudio N, Bourc'his D, et al. 2014. MORC1 represses transposable elements in the mouse male germline. Nature Communications 5: 5795.
Pauler FM, Sloane MA, Huang R, Regha K, Koerner MV, Tamir I, Sommer A, Aszodi A, Jenuwein T, Barlow DP. 2008. H3K27me3 forms BLOCs over silent genes and intergenic regions and specifies a histone banding pattern on a mouse autosomal chromosome. Genome Research 19: 221–233.
Pepke S, Wold B, Mortazavi A. 2009. Computation for ChIP-seq and RNA-seq studies. Nat Meth 6: S22–S32.
Pepling ME. 2006. From primordial germ cell to primordial follicle: mammalian female germ cell development. genesis 44: 622–632.
Pepling ME, Spradling AC. 2001. Mouse Ovarian Germ Cell Cysts Undergo Programmed Breakdown to Form Primordial Follicles. Developmental Biology 234: 339–351.
Peters A, O'Carroll D, Scherthan H, Mechtler K. 2001. Loss of the Suv39h histone methyltransferases impairs mammalian heterochromatin and genome stability. Cell 107: 323–337.
Pezic D, Manakov SA, Sachidanandam R, Aravin AA. 2014. piRNA pathway targets active LINE1 elements to establish the repressive H3K9me3 mark in germ cells. Genes & Development 28: 1410–1428.
Prokhortchouk A. 2001. The p120 catenin partner Kaiso is a DNA methylation-dependent transcriptional repressor. Genes & Development 15: 1613–1618.
Rangan P, Malone CD, Navarro C, Newbold SP, Hayes PS, Sachidanandam R, Hannon GJ, Lehmann R. 2011. piRNA Production Requires Heterochromatin Formation in Drosophila. Current Biology 21: 1373–1379.
Reik W. 2001. Epigenetic Reprogramming in Mammalian Development. Science 293: 1089–1093.
Reuss FU, Frankel WN, Moriwaki K, Shiroishi T, Coffin JM. 1996. Genetics of intracisternal-A-particle-related envelope-encoding proviral elements in mice. Journal of Virology 70: 6450–6454.
Reuter M, Chuma S, Tanaka T, Franz T, Stark A, Pillai RS. 2009. Loss of the Mili-interacting Tudor domain–containing protein-1 activates transposons and alters the Mili-associated small RNA profile. Nat Struct Mol Biol 16: 639–646.
Ribet D. 2004. An active murine transposon family pair: Retrotransposition of “master” MusD copies and ETn trans-mobilization. Genome Research 14: 2261–2267.
Robert E Farrell J. 2010. RNA Methodologies. Academic Press.
Rothbart SB, Dickson BM, Ong MS, Krajewski K, Houliston S, Kireev DB, Arrowsmith CH, Strahl BD. 2013. Multivalent histone engagement by the linked tandem Tudor and PHD domains of UHRF1 is required for the epigenetic inheritance of DNA methylation. Genes & Development 27: 1288–1298.
Rothbart SB, Krajewski K, Nady N, Tempel W, Xue S, Badeaux AI, Barsyte-Lovejoy D, Martinez JY, Bedford MT, Fuchs SM, et al. 2012. Association of UHRF1 with methylated H3K9 directs the maintenance of DNA methylation. Nat Struct Mol Biol 19: 1155–1160.
Rowe HM, Jakobsson J, Mesnard D, Rougemont J, Reynard S, Aktas T, Maillard PV, Layard-Liesching H, Verp S, Marquis J, et al. 2010. KAP1 controls endogenous retroviruses in embryonic stem cells. 463: 237–240. http://dx.doi.org/10.1038/nature08674.
Rugg-Gunn PJ, Cox BJ, Ralston A, Rossant J. 2010.
Distinct histone modifications in stem cell lines and tissue lineages from the early mouse embryo. Proc Natl Acad Sci USA 107: 10783–10790.
Saitou M, Kagiwada S, Kurimoto K. 2012. Epigenetic reprogramming in mouse pre-implantation development and primordial germ cells. Development 139: 15–31.
Saitou M, Yamaji M. 2012. Primordial germ cells in mice. Cold Spring Harbor Perspectives in Biology 4.
Samuelson LC, Wiebauer K, Snow CM, Meisler MH. 1990. Retroviral and pseudogene insertion sites reveal the lineage of human salivary and pancreatic amylase genes from a single gene during primate evolution. Molecular and Cellular Biology 10: 2513–2520.
Seisenberger S, Andrews S, Krueger F, Arand J, Walter J, Santos F, Popp C, Thienpont B, Dean W, Reik W. 2012. The Dynamics of Genome-wide DNA Methylation Reprogramming in Mouse Primordial Germ Cells. Molecular Cell 48: 849–862.
Seki Y, Hayashi K, Itoh K, Mizugaki M, Saitou M, Matsui Y. 2005. Extensive and orderly reprogramming of genome-wide chromatin modifications associated with specification and early development of germ cells in mice. Developmental Biology 278: 440–458.
Seki Y, Yamaji M, Yabuta Y, Sano M, Shigeta M, Matsui Y, Saga Y, Tachibana M, Shinkai Y, Saitou M. 2007. Cellular dynamics associated with the genome-wide epigenetic reprogramming in migrating primordial germ cells in mice. Development 134: 2627–2638.
Sharif J, Muto M, Takebayashi S-I, Suetake I, Iwamatsu A, Endo TA, Shinga J, Mizutani-Koseki Y, Toyoda T, Okamura K, et al. 2007. The SRA protein Np95 mediates epigenetic inheritance by recruiting Dnmt1 to methylated DNA. 450: 908–912. http://www.nature.com/doifinder/10.1038/nature06397.
Shirane K, Toh H, Kobayashi H, Miura F, Chiba H, Ito T, Kono T, Sasaki H. 2013.
Mouse Oocyte Methylomes at Base Resolution Reveal Genome-Wide Accumulation of Non-CpG Methylation and Role of DNA Methyltransferases ed. M.S. Bartolomei. PLoS Genet 9: e1003439–10.
Shoji M, Tanaka T, Hosokawa M, Reuter M, Stark A, Kato Y, Kondoh G, Okawa K, Chujo T, Suzuki T, et al. 2009. The TDRD9-MIWI2 Complex Is Essential for piRNA-Mediated Retrotransposon Silencing in the Mouse Male Germline. Developmental Cell 17: 775–787.
Smith ZD, Chan MM, Mikkelsen TS, Gu H, Gnirke A, Regev A, Meissner A. 2012. A unique regulatory phase of DNA methylation in the early mammalian embryo. 1–8. http://dx.doi.org/10.1038/nature10960.
Souquet B, Tourpin S, Messiaen S, Moison D, Habert R, Livera G. 2012. Nodal Signaling Regulates the Entry into Meiosis in Fetal Germ Cells. Endocrinology 153: 2466–2473.
Stocking C, Kozak CA. 2008. Endogenous retroviruses. Cell Mol Life Sci 65: 3383–3398.
Suetake I, Shinozaki F, Miyagawa J, Takeshima H, Tajima S. 2004. DNMT3L Stimulates the DNA Methylation Activity of Dnmt3a and Dnmt3b through a Direct Interaction. Journal of Biological Chemistry 279: 27816–27823.
Tan SL, Nishi M, Ohtsuka T, Matsui T, Takemoto K, Kamio-Miura A, Aburatani H, Shinkai Y, Kageyama R. 2012. Essential roles of the histone methyltransferase ESET in the epigenetic control of neural progenitor cells during development. Development 139: 3806–3816.
Thomson T, Lin H. 2009. The Biogenesis and Function of PIWI Proteins and piRNAs: Progress and Prospect. Annu Rev Cell Dev Biol 25: 355–376.
Thorvaldsdóttir H, Robinson JT, Mesirov JP. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 14: 178–192.
Tollefsbol T. 2010. Handbook of Epigenetics. Academic Press.
Truett GE, Heeger P, Mynatt RL, Truett AA, Walker JA, Warman ML. 2000.
Preparation of PCR-quality mouse genomic DNA with hot sodium hydroxide and tris (HotSHOT). Biotech 29: 52–54. http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&id=10907076&retmode=ref&cmd=prlinks.
van der Heijden GW, Castañeda J, Bortvin A. 2010. Bodies of evidence - compartmentalization of the piRNA pathway in mouse fetal prospermatogonia. Current Opinion in Cell Biology 22: 752–757.
Vermeulen M, Eberl HC, Matarese F, Marks H, Denissov S, Butter F, Lee KK, Olsen JV, Hyman AA, Stunnenberg HG, et al. 2010. Quantitative Interaction Proteomics and Genome-wide Profiling of Epigenetic Histone Marks and Their Readers. Cell 142: 967–980.
Vincent JJ, Huang Y, Chen P-Y, Feng S, Calvopiña JH, Nee K, Lee SA, Le T, Yoon AJ, Faull K, et al. 2013. Stage-Specific Roles for Tet1 and Tet2 in DNA Demethylation in Primordial Germ Cells. Stem Cell 12: 470–478.
Walsh CP, Chaillet JR, Bestor TH. 1998. Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nat Genet 20: 116–117.
Wang Q, Gu L, Adey A, Radlwimmer B, Wang W, Hovestadt V, Bähr M, Wolf S, Shendure J, Eils R, et al. 2013. Tagmentation-based whole-genome bisulfite sequencing. Nat Protoc 8: 2022–2032.
Watanabe T, Totoki Y, Toyoda A, Kaneda M, Kuramochi-Miyagawa S, Obata Y, Chiba H, Kohara Y, Kono T, Nakano T, et al. 2008. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453: 539–543.
Weber P, Cammas F, Gerard C, Metzger D, Chambon P, Losson R, Mark M. 2002. Germ cell expression of the transcriptional co-repressor TIF1β is required for the maintenance of spermatogenesis in the mouse. Development 129: 2329–2337.
Webster KE, O'Bryan MK, Fletcher S, Crewther PE, Aapola U, Craig J, Harrison DK, Aung H, Phutikanit N, Lyle R, et al. 2005.
Meiotic and epigenetic defects in Dnmt3L-knockout mouse spermatogenesis. Proceedings of the National Academy of Sciences 102: 4068–4073.
Wu H, Zhang Y. 2014. Reversing DNA Methylation: Mechanisms, Genomics, and Biological Functions. Cell 156: 45–68.
Yagi T, Tokunaga T, Furuta Y, Nada S, Yoshida M, Tsukada T, Saga Y, Takeda N, Ikawa Y, Aizawa S. 1993. A novel ES cell line, TT2, with high germline-differentiating potency. Anal Biochem 214: 70–76.
Yuan P, Han J, Guo G, Orlov YL, Huss M, Loh YH, Yaw LP, Robson P, Lim B, Ng HH. 2009. Eset partners with Oct4 to restrict extraembryonic trophoblast lineage potential in embryonic stem cells. Genes & Development 23: 2507–2520.
Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, et al. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9: R137.
Zhang Y, Yang X, Gui B, Xie G, Zhang D, Shang Y, Liang J. 2011. Corepressor Protein CDYL Functions as a Molecular Bridge between Polycomb Repressor Complex 2 and Repressive Chromatin Mark Trimethylated Histone Lysine 27. Journal of Biological Chemistry 286: 42414–42425.

Appendices

Supplementary figures for Chapter 2 are listed in Appendix A, supplementary methods for Chapter 2 in Appendix B, supplementary figures for Chapter 4 in Appendix C, and supplementary tables for Chapter 4 in Appendix D. All supplementary materials are also available at my GitHub repository (https://github.com/sheng-liu/sheng-liu.github.io/tree/master/Supplementals).

Appendix A Supplementary figures of Chapter 2

Supplementary figure 2-1. Complexity of ChIP-seq libraries prepared from 10^3 to 10^6 ESCs. (a) Preparation of H3, H3K9me3, H3K27me3 and H3K4me3 NChIP-seq libraries from 10^3 to 10^6 cells. Comparison of library complexity in H3K9me3 (b), H3K27me3 (c) and H3K4me3 (d) NChIP-seq libraries prepared with 10^3 to 10^6 cells.
The number of reads passing Illumina’s chastity filter is presented on top of each stacked bar. Distinct (dark green) and duplicate (light green) uniquely aligned reads. Distinct (dark blue) and duplicate (light blue) multi-aligned reads (MapQ < 5). Unaligned reads (gray).

Supplementary figure 2-2. Correlation between datasets generated from 10^3 to 10^6 ESCs. (a) Two-dimensional plots showing the genome-wide relationship (50,000 random 2 kb bins) between H3K9me3 datasets generated from 10^3 to 10^6 cells as input material. (b) Two-dimensional plot showing the genome-wide relationship (50,000 random 2 kb bins) between H3K27me3 datasets generated from 10^3 to 10^5 cells as input material. (c) Frequency plot illustrating the distribution of coverage density in genome-wide 2 kb bins for H3K9me3 or H3K27me3 libraries. (d) Two-dimensional plot showing the genome-wide relationship (50,000 random 2 kb bins) of H3K9me3 and H3K27me3 datasets generated from 10^3 cells as input material.

Supplementary figure 2-3. Sensitivity of H3K9me3 and H3K27me3 ultra-low-input NChIP-seq libraries. Stacked bar graphs showing (a) the proportion of H3K9me3 peaks detected in NChIP-seq libraries built from 10^6 (top), 10^5 (middle) and 10^3 (bottom) cells that overlap or are in close proximity (< 2.5 kb, dark green or < 5 kb, light green) to peaks detected in libraries built from 10^3 to 10^6 cells or (b) the proportion of H3K27me3 peaks detected in NChIP-seq libraries built from 10^5 (top), 10^4 (middle) and 10^3 (bottom) cells that overlap or are in close proximity (< 2.5 kb, dark green or < 5 kb, light green) to peaks detected in libraries built from 10^3 to 10^6 cells. (c) Venn diagram illustrating the proportion of H3K9me3 peaks detected in “gold standard” (10^6 cells) that overlap with peaks detected in ultra-low input libraries built from 10^3 to 10^5 cells.
(d) Venn diagram illustrating the proportion of H3K27me3 peaks detected in the library prepared from 10^5 cells that are also detected in ultra-low input libraries prepared from 10^3 to 10^5 cells. Peaks were detected with MACS peak-calling software as described in Methods.

Supplementary figure 2-4. Correlation between H3K9me3 datasets generated from 10^3 to 10^6 mouse ESCs. (a) Relative H3K9me3 enrichment measured in RPKM at all annotated ERV subfamilies present at >100 copies in the BL6 genome in H3K9me3 NChIP-seq libraries built from 10^3 to 10^6 cells. ERV classes are sorted alphabetically along the X-axis according to RepeatMasker annotation name within ERV1, ERVK, ERVL, Gypsy and MaLR. (b) Relative coverage (presented here as % of maximum coverage) of H3K9me3 libraries (built from 10^3 to 10^6 cells) or pan-H3 library (built from 10^3 cells) in the 5’ flank (1.5 kb upstream) of all ERVKs present in the BL6 genome. (c) Correlation between H3K9me3 enrichment in the 5’ flank (1 kb upstream) of individual IAP elements in libraries built from 10^3 and 10^6 cells.

Supplementary figure 2-5. H3K27me3 enrichment at transcription start sites in mouse ES cells. (a) Correlation between H3K27me3 enrichment at annotated TSSs (+/-1 kb) as measured by read coverage in libraries built from 10^3 to 10^5 cells. (b) Overlap of the top 15% of H3K27me3 marked genes (TSS +/-1 kb) as measured by read coverage in libraries built from 10^3 to 10^5 cells. (c) Relationship between gene promoter (TSS +/-1 kb) H3K27me3 signal and gene expression in NChIP-seq libraries prepared from 10^3 to 10^5 cells.

Supplementary figure 2-6. Correlation of H3K27me3 enrichment at gene promoters in E13.5 PGCs isolated from single male and female embryos. Two-dimensional plot shows H3K27me3 enrichment at gene promoters (TSS +/-1 kb) measured as read coverage in data generated from 10^3 male versus female PGCs using our low input protocol.
Of note, increased X-chromosome reads are observed in the female PGCs (yellow box).

Supplementary figure 2-7. H3K27me3-marked genes in mouse ES cells and in E13.5 PGCs. Overlap of the H3K27me3 marked genes (TSS +/-1 kb) in mouse ES cells, male and female E13.5 PGCs.

Supplementary figure 2-8. Sex-specific H3K27me3 gene silencing in E13.5 PGCs. Relationship between expression (top) of genes involved in meiosis (a) or the TGFβ receptor pathway (b) and H3K27me3 enrichment (bottom) in their promoter region (TSS +/-1 kb). Blue: male E13.5 PGCs, red: female E13.5 PGCs.
[Figure panels: (a) Genes involved in female meiosis; (b) Genes involved in the TGFβ pathway. Axes: Expression (RPKM) and H3K27me3 (RPKM).]

Appendix B Supplementary methods of Chapter 2

Cell culture and isolation

TT2 mouse ESCs (Yagi et al. 1993) were cultured in DMEM supplemented with 15% FBS (HyClone), 20 mM HEPES, 0.1 mM non-essential amino acids, 0.1 mM 2-mercaptoethanol, 100 U/ml penicillin, 0.05 mM streptomycin, leukemia-inhibitory factor and 2 mM L-glutamine on gelatinized plates.
Trypsinized cells were either FACS-sorted or aliquoted in nuclear isolation buffer (Sigma, N3408) containing protease inhibitor cocktail (Roche), flash frozen and stored at -80°C for a few weeks to a few months.

“Gold-standard” native ChIP

For “gold-standard” native ChIP (Karimi et al. 2011), 10^6 cells were resuspended in douncing buffer (10 mM Tris-HCl, pH 7.5, 4 mM MgCl2, 1 mM CaCl2 and protease inhibitor cocktail) and homogenized through a syringe. Chromatin was digested in 2 U/µl MNase (Worthington Biochemicals) at 37°C for 5 min, and the reaction was quenched by 0.5 M EDTA. Chromatin was resuspended in hypotonic buffer (0.2 mM EDTA, pH 8.0, 0.1 mM benzamidine, 0.1 mM phenylmethylsulfonyl fluoride, 1.5 mM dithiothreitol, 1X PIC) and incubated for 1 hour on ice. Cellular debris was pelleted and the supernatant was recovered. Chromatin was pre-cleared with 20 µl of 1:1 protein A:protein G Dynabeads (Life Technologies) and immunoprecipitation was carried out with antibody-bead complexes (5 µl Active Motif #39161 H3K9me3 antibody and 20 µl 1:1 protein A:protein G Dynabeads) overnight at 4°C. IPed (immunoprecipitated) complexes were washed twice with 400 µl of ChIP Wash Buffer I (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 150 mM NaCl) and twice with 400 µl of ChIP Wash Buffer II (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 2 mM EDTA, 500 mM NaCl). Protein-DNA complexes were eluted in 200 µl of elution buffer (100 mM NaHCO3, 1% SDS) for 2 hours at 68°C. IPed material was purified by phenol chloroform and 5 ng of raw ChIP material was processed for library construction.

ULI-NChIP

We based our chromatin preparation on a previously published protocol for MNase chromatin fragmentation and library construction from single cells (Falconer et al.
2012). TT2 mouse ESCs were either FACS-sorted directly in nuclear isolation buffer (Sigma) (< 20,000 cells) or pelleted and re-suspended in nuclear isolation buffer (Sigma). Depending on input size, chromatin was fragmented for 5-7.5 minutes using MNase at 21 or 37°C, and diluted in NChIP immunoprecipitation buffer (20 mM Tris-HCl pH 8.0, 2 mM EDTA, 150 mM NaCl, 0.1% Triton X-100, 1× EDTA-free protease inhibitor cocktail, 1 mM phenylmethanesulfonyl fluoride (Sigma)). Chromatin was pre-cleared with 5 or 10 µl of 1:1 protein A:protein G Dynabeads (Life Technologies) and IPed (immunoprecipitated) with 0.25 or 1 mg of H3K9me3 (Active Motif #39161), H3K27me3 (Diagenode pAb-069-050) or pan-H3 (Sigma I8140) antibody-bead complexes overnight at 4°C. IPed complexes were washed twice with 400 µl of ChIP Wash Buffer I (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 0.1% deoxycholate, 2 mM EDTA, 150 mM NaCl) and twice with 400 µl of ChIP Wash Buffer II (20 mM Tris-HCl, pH 8.0, 0.1% SDS, 1% Triton X-100, 0.1% deoxycholate, 2 mM EDTA, 500 mM NaCl). Protein-DNA complexes were eluted in 30 µl of ChIP elution buffer (100 mM NaHCO3, 1% SDS) for 2 hours at 68°C. IPed material was purified by phenol chloroform and ethanol-precipitated, and raw ChIP material was re-suspended in 10 mM Tris-HCl pH 8.0. As the material obtained after ChIP is minimal, DNA concentration was not measured in samples prior to library construction. For optimal results, raw ChIP material was re-purified with 1.8× volume of Ampure XP DNA purification beads (Agencourt) prior to library construction. A detailed, step-by-step procedure is presented in Brind'Amour et al. (2015) and in the GitHub repository (https://sheng-liu.github.io/).
RNA extraction and double stranded cDNA preparation

Total RNA was extracted from frozen aliquots of 10^3 cells using TRIzol (Invitrogen, AM9738) according to the manufacturer’s manual. Residual genomic DNA was removed by treatment with DNase I (Promega), and ribosomal RNA was depleted using the RiboMinus Transcriptome Isolation Kit (Invitrogen) according to the manufacturer’s low-input protocol. First strand cDNA synthesis was carried out using Superscript III (Invitrogen 18080-093) with T4 gene 32 protein and a combination of random 15-mers and oligo dT (NEB), followed by second strand cDNA synthesis using the Klenow polymerase (NEB) in the presence of RNaseH. Double stranded cDNA was fragmented using a BioRuptor (Diagenode) for 15 minutes (low power mode, 30 s on and 30 s off).

Library construction

For “gold-standard” H3K9me3 NChIP-seq, 5 ng of raw ChIP material was used for library construction. For ultra-low-input NChIP-seq, 85% of the raw ChIP material was used for library construction. Illumina libraries were constructed using a modified custom paired-end protocol (Falconer et al. 2012). Briefly, samples were end-repaired (1× T4 DNA ligase buffer, 0.4 mM dNTP mix, 2.25 U T4 DNA polymerase, 0.75 U Klenow DNA polymerase and 7.5 U T4 polynucleotide kinase; 30 minutes at 21-25 °C), A-tailed (1× NEB buffer 2, 0.4 mM dNTPs and 3.75 U of Klenow (exo-); 30 minutes at 37 °C), and ligated (1× rapid DNA ligation buffer, 1 mM Illumina PE adapters and 1,600 U DNA ligase; 1-8 hours at 21-25 °C). Ligated fragments were amplified using indexed primers (Illumina) for 8-10 PCR cycles. DNA was purified with 1.8× volume Ampure XP DNA purification beads between each step.

Sequencing and alignment

Amplified indexed libraries were pooled, size selected on a 2% agarose gel and diluted to a final concentration of 10 mM.
Cluster generation and paired-end sequencing (100 bp reads) were performed on the Illumina cluster station and Illumina HiSeq 2000 or Illumina HiSeq 2500 sequencing platforms using Illumina Read 1 and Read 2 primers, and a third custom primer (5’-GATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCG-3’) to sequence the 6-mer unique index. Sequence reads were mapped to mm9 (NCBI 37) using BWA (Li and Durbin 2009), and duplicate reads were marked using Picard-tools (http://picard.sourceforge.net). Reads passing Illumina’s default chastity filter (total reads) were used to generate library statistics using Samtools flagstat (Li et al. 2009), where reads with the exact same sequence are identified as “duplicates”, non-duplicate reads with a MapQ > 5 are identified as distinct uniquely aligned reads, and reads with a MapQ < 5 are identified as distinct multi-aligned reads.

Datasets

ChIP-seq and RNA-seq datasets prepared for this manuscript are available at the Gene Expression Omnibus (GEO) repository under the accession number GSE63523. H3K27me3 ChIP-seq datasets prepared from 10^3 male and female E13.5 PGCs using ULI-NChIP-seq (Liu et al. 2014) and low-input RNA-seq datasets are available under the accession GSE60377. ENCODE (Consortium et al. 2012) H3K4me3 datasets generated from E14 ESCs (SRR568477 and SRR568478) and H3K27me3 datasets generated from E13.5 PGCs were obtained from accessions GSE38165 (Ng et al. 2013) and SRA027978 (Lesch et al. 2013).

Data analysis

For analysis of relative ChIP enrichment at unique loci, duplicate reads (with identical coordinates) and reads with a MapQ < 5 (multi-aligned reads) were removed. Multi-aligned reads were included for calculating the relative ChIP enrichment at agglomerated transposable elements (TEs). Relative ChIP enrichment was normalized as Reads Per Kilobase per Million mapped reads (RPKM) (Pepke et al. 2009).
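The read filtering and RPKM normalization described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the thesis: the function names `keep_for_unique_loci` and `rpkm` and the example numbers are hypothetical, and keeping reads with a MapQ of exactly 5 is an assumption, since the text only states that reads with a MapQ < 5 were removed.

```python
def keep_for_unique_loci(mapq: int, is_duplicate: bool, min_mapq: int = 5) -> bool:
    """Return True if a read is retained for enrichment analysis at unique loci.

    Duplicate reads and reads with MapQ < min_mapq (multi-aligned) are dropped.
    Keeping reads with MapQ exactly equal to min_mapq is an assumption.
    """
    return (not is_duplicate) and mapq >= min_mapq


def rpkm(read_count: float, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of region per Million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical reads as (MapQ, is_duplicate) pairs.
reads = [(37, False), (3, False), (60, True), (12, False)]
kept = [r for r in reads if keep_for_unique_loci(*r)]
print(len(kept))  # number of reads retained for unique-locus analysis

# Hypothetical bin: 150 reads in a 1 kb region, 30 million mapped reads.
print(rpkm(150, 1_000, 30_000_000))
```

For agglomerated transposable elements, where multi-aligned reads were included, the MapQ filter would simply be skipped before counting.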
For mined datasets using short, single-end reads, reads were extended to 300 bp before generating RPKM values. Potential library complexity was determined using the extrapolate function of the PreSeq package (Daley and Smith 2013). For expression analysis, RNA-seq read enrichment was normalized as RPKM over exonic regions only (RefSeq transcripts).

Peak calling

Regions enriched for H3K9me3 or H3K27me3 were determined using the MACS and MACS2 peak callers on non-duplicate, uniquely aligned reads (Zhang et al. 2008; Feng et al. 2012). For H3K9me3 peaks, broad domains were identified using MACS2 broadpeaks (p-value 0.05) and combined with narrow domains identified with MACS (10^5 and 10^6 cells input: p-value 0.01; 10^3 and 10^4 cells input: p-value 0.02). Peaks closer than 2 kb apart were merged, and peaks larger than 0.5 kb were included in our analysis. Similarly, for H3K27me3 peaks, broad regions were called using MACS2 broadpeaks (p-value 0.05) and combined with narrower domains identified with MACS (10^4 and 10^5 cells input: p-value 0.01; 10^3 cells input: p-value 0.02). Peaks closer than 2 kb apart were merged, and peaks larger than 0.5 kb were included in our analysis.

Appendix C Supplementary figures of Chapter 4

Supplementary figure 4-1: Validation of PGC isolation strategy and ERV expression profiling. (A) Strategy for purification of male and female E13.5 PGCs based on the expression of the SSEA-1 cell surface marker. Subsequent downstream analyses conducted on 1000-3000 PGCs from individual embryos are also shown. (B) Expression (RPKM) levels of all ENSEMBL annotated genes were determined by RNA-sequencing of 10^3 purified E13.5 PGCs (SSEA-1+) and soma (SSEA-1-) isolated from the same male embryo. Cell-type specific genes were identified in PGCs (green) and in soma (red) by applying both Z-score (> 2) and fold-difference (> 5) thresholds.
(C) Expression levels of all annotated ERV1, ERVK, MaLR and ERVL elements present at >100 copies in the C57BL/6 genome, as measured by RPKM values, are shown for the same male E13.5 PGCs and soma.

Supplementary figure 4-2: Mating strategy, experimental scheme and genotyping. (A) Mating scheme to obtain germline-specific Setdb1 deficient mice. The datasets generated for each genotype and sex are outlined below. (B) Genotyping of E13.5 embryos. Representative PCR analysis of genomic DNA isolated from the tails of embryos from a single litter is shown. Zfy: present in males only (chrY), Xist: present in males and females (chrX).

[Supplementary Figure 2, figure labels: cross ♀ Setdb1flox/flox × ♂ Setdb1Δ/+ Tnapcre/+; offspring genotypes HET (Setdb1flox/-), WT (Setdb1flox/+), HET (Setdb1Δ/+ Tnap+/cre) and KO (Setdb1Δ/- Tnap+/cre); datasets ChIP-seq (H3K9me3, H3K27me3), RNA-seq, Sanger bisulfite sequencing and PBAT; genotyping bands flox 850 bp, cre 350 bp, Zfy1 350 bp, Xist 250 bp, Δ 800 bp, wt 700 bp.]

[Supplementary Figure 3, figure labels: panel A correlations 0.89 and 0.85 for the H3K9me3 and H3K27me3 panels, axes ♀ (RPKM) versus ♂ (RPKM); panel B axes ♀ H3K9me3 (RPKM) versus ♀ H3K27me3 (RPKM) for genome-wide 1 kb bins, TSSs, gametic DMRs, ERVs (5’ flank) and LINE1 (5’ flank); panel C labels H3K9me3, H3K27me3 and Genome, with values 89.22, 2.13, 3.14 and 5.51.]

Supplementary figure 4-3: Genome-wide enrichment of H3K9me3 versus H3K27me3 in E13.5 PGCs. (A) Genome-wide correlation (50,000 random 1 kb bins) of H3K9me3 (left panel) and H3K27me3 (right panel) between male and female E13.5 PGCs. (B) Enrichment of H3K9me3 and H3K27me3 in female E13.5 PGCs genome-wide (50,000 random 1 kb bins, left panel), around transcription start sites (TSSs, +/- 1 kb), at gametic differentially methylated regions (DMRs), at the 1 kb region in the 5’ flanks of ERVs or the 1 kb region in the 5’ flanks of LINEs. (C) The percentages of the mappable genome marked by H3K9me3 and H3K27me3 in female E13.5 PGCs are shown. Enriched regions were determined by MACS2, with a p-value of 0.05.
[Supplementary Figure 4, figure labels: panel A browser tracks (H3K9me3, H3K27me3 and RNA-seq, ♂ and ♀) at the Pax5/Zcchc7, Ebf2, Map3k4 and Zfp872 loci; panel B axes ♂ H3K9me3 (RPM), ♂ H3K27me3 (RPM) and ♂ % DNA methylation; panel C ♂ E13.5 PGCs, LTR-ERVs and LINEs; panel D axes ChIP-seq (RPKM) and % methylation for H3K9me3, H3K27me3 and DNA methylation across LINE, ERV1, ERVK, ERVL, Gypsy and MaLR families in ♀ E13.5 PGCs; panel E locus L1Md_T, chr10:11,353,795-11,364,029, with tracks PBAT WT, PBAT HET, H3K9me3 HET and H3K27me3 HET.]

Supplementary figure 4-4: Co-occurrence of H3K27me3 and H3K9me3 at TSSs and
ERVs. (A) Genome browser screenshots depicting repressed genes marked by H3K9me3 and H3K27me3 around their transcription start sites. (B) Genome-wide relationship between H3K9me3 enrichment, H3K27me3 enrichment and DNA methylation in male Setdb1 E13.5 PGCs. The yellow gradient in the right panel depicts the relative H3K9me3 enrichment level for all data points shown. * E13.5 DNA methylation data (PBAT WT) is from Kobayashi et al. (2013). (C) Genome browser screenshots illustrating regions marked by H3K9me3, H3K27me3 and DNA methylation in male E13.5 PGCs. * E13.5 DNA methylation data (PBAT WT) is from Kobayashi et al. (2013). (D) Enrichment of H3K9me3 and H3K27me3 versus mean percentage of DNA methylation at LINEs and LTR ERVs (present at >100 copies in the BL6 genome) in female E13.5 PGCs. ERVs are sorted alphabetically according to ERV1, ERVK, ERVL, Gypsy, and MaLR classes along the X axis. Class III (ERVL and MaLR) elements are generally depleted of all three marks. (E) Genome browser screenshot showing the presence of residual DNA methylation and the enrichment of H3K9me3 and H3K27me3 at the 5’ end of an L1Md_T element in male (blue) and female (red) E13.5 PGCs. * E13.5 DNA methylation data (PBAT WT) is from Kobayashi et al. (2013).

Supplementary figure 4-5: Trivalent silencing marks at ERVs in E13.5 PGCs. (A) Correlation between H3K9me3 and H3K27me3 enrichment at individual IAP (top) or ETn (bottom) elements in male (blue) and female (red) E13.5 PGCs. RPKM values were calculated from uniquely mapped reads only.
[Supplementary Figure 5, figure labels: panel A axes H3K9me3 (RPKM) versus H3K27me3 (RPKM), unique reads, for IAPez elements (603) and ETn elements (266), female and male; panel B ♂ E13.5 PGCs, series % DNA meth, H3K9me3, H3K27me3, Pan-H3 and mappability, axes ChIP (RPKM) and % DNA methylation, families ordered from younger to older elements; panel C normalized H3K9me3 coverage and average DNA methylation across unique regions flanking IAPez and MERVL (-6 kb to +6 kb), series Bis-seq PGCs, Bis-seq blastocyst, H3K9me3 ESCs 1, H3K9me3 ESCs 2 and H3K9me3 PGCs; panel D axes H3K9me3 (RPKM) versus expression (RPKM) for ETn and RLTR4 (MLV), female and male.]

(B) Relationship between element age (mappability) and enrichment of silencing histone marks at all retroelements (>100 copies) in male E13.5 PGCs. Black line: H3K9me3, red line: H3K27me3 and shaded area: mean % DNA methylation (data from Kobayashi et al. 2013). Black shading: mappability (ENCODE, 100mers). (C) Relative enrichment of H3K9me3 (in PGCs and ESCs) and DNA methylation (in PGCs and blastocysts) in the flanks of IAP (top) and MERVL (bottom) subfamilies. PGC (Kobayashi et al. 2013) and blastocyst (Kobayashi et al. 2012) PBAT DNA methylation datasets were published previously. (D) Correlation between H3K9me3 and expression of individual ETn (left) and MLV (right) elements in male (blue) and female (red) E13.5 PGCs.
RPKM values were calculated from uniquely mapped reads only.

[Supplementary Figure 6, figure labels: panel A FACS plots (FlowJo v8.7), SSEA1-PE versus propidium iodide, with PGC gate values 6.88 for ESET(-/+) TNAP(+/cre), 3.19 for ESET(-/-) TNAP(+/cre), 6.34 for ESET(cko/+), 6.16 for ESET(cko/-) and 0 for the negative control; panel B Setdb1 exon expression, RPKM versus exon number (6, 15, 16, 19) for ♂ Het, ♂ KO, ♀ Het and ♀ KO, locus map with loxP sites and exons 13-17, PBAT coverage inset, and decreases of 71%, 63%, 70% and 68% marked above the deleted-exon bars.]

Supplementary figure 4-6: Isolation and characterization of germline specific Setdb1 KO E13.5 PGCs. (A) FACS plots illustrating the number of PGCs at E13.5, as measured by SSEA-1 staining, from single gonads isolated from male and female Setdb1 KO and HET littermates. The percentage of SSEA-1 positive cells is indicated. (B) Map of the Setdb1 locus revealing the locations of the loxP sites relative to the genic exons and in silico PCR analysis of RNA-seq data for E13.5 PGCs depicting Setdb1 deletion efficiency using the Tnap-Cre strategy. Two loxP sites are situated in intronic regions, resulting in the deletion of exons 15 and 16 (top). Bar graph shows relative transcript levels at deleted exons (15 and 16) and control exons (6, 19). Relative coverage for the same exons for PBAT data from male HET and KO E13.5 PGCs is also shown (inset). The average percentage decrease of coverage over exons 15 and 16 is indicated above the deleted exon bars.

Supplementary figure 4-7: H3K9me3 and H3K27me3 depletion in Setdb1 KO E13.5 PGCs.
[Supplementary Figure 7, figure labels: panel A axes H3K27me3 ♂ HET (RPKM) versus H3K27me3 ♂ KO (RPKM), coloured by ♂ Het H3K9me3 (RPKM); panel B bars for H3K9me3 and H3K27me3 at IAP and MERVL in ♂ HET, ♂ KO, ♀ HET and ♀ KO; panels C to E axes H3K9me3 or H3K27me3 KO/HET (%) versus HET enrichment (RPKM) for ERV1, ERVK and LINE1 families, with labelled subfamilies RLTR6, RLTR6-int, RLTR1, RLTR1B, RLTR4-int, RLTR4, Etn-int, IAPLTR1, IAPez-int, IAPLTR1a, L1Md_F and L1Md_Gf; panel F expression (RPKM) in ♂ and ♀ E13.5 PGCs, HET versus KO.]

(A) Genome-wide H3K27me3 profiles in HET versus Setdb1 KO E13.5 PGCs isolated from male littermates. Yellow gradient depicts the relative H3K9me3 enrichment level in HET PGCs. (B) qPCR validation of H3K9me3 enrichment at IAP and MERVL ERVs in matched Setdb1 HET and KO male and female PGCs. Error bars show standard deviation of 3 biological replicates. (C) Relationship between H3K9me3 enrichment levels (RPKM) at LINE1, ERVK and ERV1 subfamilies in E13.5 HET female PGCs and the percentage change in H3K9me3 enrichment upon Setdb1 deletion. Unmarked ERVs (not shown) did show a small H3K9me3 level increase in Setdb1 KO PGCs, but remained below the background threshold (0.5 RPKM). (D) Relationship between H3K27me3 enrichment levels (RPKM) at LINE1, ERVK and ERV1 subfamilies in E13.5 HET male PGCs and the percentage change in H3K27me3 enrichment upon Setdb1 deletion. Unmarked ERVs (not shown) did show a small H3K27me3 level increase in Setdb1 KO PGCs, but remained below the background threshold (0.3 RPKM).
(E) Relationship between H3K27me3 enrichment levels (RPKM) at LINE1, ERVK and ERV1 subfamilies in E13.5 HET female PGCs and the percentage change in H3K27me3 enrichment upon Setdb1 deletion. Unmarked ERVs (not shown) did show a small H3K27me3 level increase in Setdb1 KO PGCs, but remained below the background threshold (0.3 RPKM). (F) Relative expression of histone methyltransferases specific for H3K9 or H3K27, Trim28 and readers of H3K9 methylation in male (left) and female (right) HET and Setdb1 KO E13.5 PGCs.

Supplementary figure 4-8: Effect of germline Setdb1 deletion on DNA methylation and gene expression in E13.5 PGCs. (A) Histogram depicting genome-wide mean % DNA methylation (1 kb bins) in male Setdb1 KO and HET littermates. (B) Expression levels (RPKM) of genes involved in DNA methylation homeostasis in male (left) and female (right) HET and Setdb1 KO E13.5 PGCs, as determined by RNA-seq. Error bars show standard deviation between independent experiments. (C) Top panel shows expression levels (RPKM) in HET and Setdb1 KO E13.5 PGCs of the top 10 (male) or 12 (female) genes that are upregulated in male (left) and female (right) E13.5 PGCs (E13.5 > E12.5), as reported by Jameson et al. (2012). Bottom panel shows expression levels (RPKM) of the top 10 (male) or 8 (female) genes that are down-regulated in E13.5 PGCs (E12.5 > E13.5).
[Supplementary Figure 8, figure labels: panel A axes % DNA methylation versus # bins, HET and KO; panel B expression (RPKM) in ♂ and ♀ E13.5 PGCs, HET and KO; panel C expression (RPKM), HET and KO, for ♀ E13.5 > E12.5 genes Stra8, Rec8, Tex101, Tex12, Gm1564, Hfm1, Taf7l, Trank1, Tex16, Ccdc73, Sycp2 and Macrod2, ♀ E12.5 > E13.5 genes L1td1, Nanog, Got1, Pfkp, Gtsf1l, Ankr…, Nlrp9b and Hbb-y, ♂ E13.5 > E12.5 genes Bnc2, Igf1r, Tdrd1, Lipa, Sycp2, Gm1141, Chchd7, Brdt, Cntfr and Slc9a7, and ♂ E12.5 > E13.5 genes Amt, Lrrc34, Cds1, Pfkp, 1700019D03Rik, Hbb-y, BC048679, Gng3, Crygs and Vpreb1; panel D % DNA methylation (0% to 50%) in Het and KO for the categories genome-wide, H3K9me3 only, H3K27me3 only, H3K9me3 + H3K27me3 and no H3K9me3/H3K27me3; panel E axes ♂ HET H3K27me3 (RPKM) versus % DNA methylation (KO-HET), coloured by ♂ HET H3K9me3 (RPKM).]

(D) Relationship between H3K9me3, H3K27me3 and mean % DNA methylation in Setdb1 KO and Het littermates genome-wide (1 kb bins), at unmarked genomic regions, or regions marked with H3K9me3 alone, H3K27me3 alone or both H3K9me3 and H3K27me3 (MACS, p-value 0.05). For % DNA methylation values, only CpGs with > 5X coverage were considered. Note that the increase in DNA methylation in the Setdb1 KO is lowest at H3K9me3 alone and dual marked regions. (E) Relationship between H3K27me3 in control (HET) E13.5 PGCs and the change (Δ) in the % of methylation in these HET PGCs relative to PGCs isolated from their Setdb1 KO littermates. Data from 50,000 random regions (1 kb bins) is shown. DNA methylation values were determined by PBAT.

Supplementary figure 4-9: ERV de-repression in Setdb1 KO E13.5 PGCs.
(A) RNA-seq coverage (multi-match reads included) of the 19 most highly expressed ERV subfamilies (present at > 100 copies in the C57BL/6 genome) in Setdb1 KO (light bars) and HET (dark bars) E13.5 PGCs isolated from male (blue) and female (red) littermates. Fold-change (Setdb1 KO/HET) is indicated next to the bars. Note that specific ERVs show sex-specific expression patterns. (B) Comparison of the level of ERV derepression (subfamilies > 100 copies) in E13.5 Setdb1 KO PGCs, as measured by Z-score values, from two independent experiments conducted on male (left) and female (right) Setdb1 KO and control (HET) littermates.

[Supplementary Figure 9, figure labels: panel A expression (RPKM), KO and HET, ♂ and ♀, for subfamilies MMERVK10C, RLTR1B, MLT1B, MER57A, ORR1D1, MERVL_2A, IAPEz, MER21, ORR1C2, ORR1D, RLTR4_MM, MurERV4, MLT1D, RLTR18, HERVL40, RLTR19, MMERGLN, MMVL30 and MMETn; panel B axes Z-score KO vs HET Expt 1 versus Z-score KO vs HET Expt 2, with labelled subfamilies RLTR4, IAPLTR1, IAPez-int, IAPLTR1a and Etn-int (RNA-seq in ♂ E13.5 PGCs) and ERVK10C, IAPLTR1, IAPez-int, RLTR10C, GLN-int, RLTR4, RLTR4-int and IAPLTR1a (RNA-seq in ♀ E13.5 PGCs).]

Supplementary figure 4-10:
Relationship between H3K9me3, H3K27me3 and expression of individual retroelements in HET versus Setdb1 KO PGCs. (A) Relationship between H3K9me3 loss and derepression of individual MERVL (left) or MLV (right) elements in male and female HET versus Setdb1 KO E13.5 PGCs, as measured by Z-score values. Multi-match reads were excluded from this analysis. (B) Relationship between H3K27me3 loss and derepression of individual IAPez (ERVK), MERVK10C (ERVK), MERVL (ERVL), ETn (ERVK), L1Md_GF (LINE1) and MLV (ERVL) elements in male and female HET versus Setdb1 KO E13.5 PGCs.

[Supplementary Figure 10, figure labels: axes RNA-seq Setdb1 KO vs. HET PGC Z-score versus H3K9me3 KO vs. HET Z-score or H3K27me3 KO vs. HET Z-score; panels MLV, MERVL, ERVK10C, L1Md_Gf, Etn and IAPez; female and male.]
[Supplementary Figure 11, figure labels: panel A RNA-seq in ♀ E13.5 PGCs; panel C browser tracks of H3K9me3 (HET and KO) and RNA-seq in ♂ and ♀ at Nfatc4 (RLTR10C/ERVK10C) and at Klk1b24/Klk1b3 (ERVK10C).]

Supplementary Figure 11, panel B: Genes upregulated in ♂ KO PGCs
GO term name | GO term ID | p-value (corrected) | Gene symbols
nucleic acid binding (MF) | GO:0003676 | 0.007 | D1Pas1; Gm10351; Gm13145; Gm13154; Gm13242; Gm3376; Gm5698; Gm5699; Gm6712; Gm7221; Hltf; Zbtb39; Zfp560; Zfp600; Zfp618; Zfp936
iron ion binding (MF) | GO:0005506 | 0.014 | Alox12e; Alox15; Cyp17a1; Cyp2b23; Ogfod2; Plod2
cellular component (CC) | GO:0005575 | 0.017 | 4933403G14Rik; AI481877; Acap3; Atxn1l; C130073F10Rik; Cdh19; Clca4; Cox7b2; Crispld1; Cyp2b23; Dnahc6; Dph5; Efhc2; Gm10351; Gm10491; Gm10760; Gm1110; Gm12794; Gm13145; Gm13154; Gm13242; Gm5698; Gm5699; Gm7221; Gm7579; Gm9112; Gramd1c; Klk1b24; Lysmd2; Mup6; Ogfod2; Pilrb2; Sectm1a; Tex13; Tex15; Trim52; Ttc18; Usp9y; Zbtb39; Zfp560; Zfp600; Zfp936
lipoxygenase activity (MF) | GO:0016165 | 0.020 | Alox12e; Alox15
biological process (BP) | GO:0008150 | 0.052 | 1700029P11Rik; 4933403G14Rik; 9430020K01Rik; AI481877; Acap3; Atxn1l; C130073F10Rik; Ccnb3; Cdh19; Clca4; Cox7b2; Crispld1; Cyp2b23; Dnahc6; Dph5; Efhc2; Fndc4; Gm10351; Gm10491; Gm10760; Gm1110; Gm12794; Gm13145; Gm13154; Gm13242; Gm5698; Gm5699; Gm7221; Gm7579; Gm9112; Gramd1c; Lysmd2; Mup6; Pilrb2; Tex13; Trim52; Ttc18; Zbtb39; Zfp560; Zfp600; Zfp618; Zfp936; Zscan4d
male gonad development (BP) | GO:0008584 | 0.053 | Cyp17a1; Nkx3-1; Ren1

Supplementary Figure 11, panel B: Genes upregulated in ♀ KO PGCs
GO term name | GO term ID | p-value (corrected) | Gene symbols
extracellular matrix (CC) | GO:0031012; GO:0005578 | 0.00099559 | Adamts9; Col14a1; Egfl6; Emilin1; Htra1; Lama5; Ltbp1; Pxdn; Serpinf1; Wnt7a
definitive hemopoiesis (BP) | GO:0060216 | 0.00625453 | Hoxb3; Hoxb4; Meis1
urogenital system development (BP) | GO:0001655 | 0.01158553 | Pax2; Pax8; Srd5a1
embryonic skeletal system morphogenesis (BP) | GO:0048704 | 0.01305687 | Hoxb3; Hoxb4; Hoxb5; Hoxb7; Sox11
identical protein binding (BP) | GO:0042802 | 0.02523512 | Amotl1; Cldn3; Emilin1; Grb7; Grik5; Inhba; Kctd17; Sh3gl3; Trim8
signal transduction (BP) | GO:0007165 | 0.02684925 | Angptl6; Arhgap33; Arhgap39; Arnt2; Chrnb4; Gm266; Grb7; Nfkb2; Plce1; Plxnb1; Rac3; Rrad; Unc5d; Wnt7a

Supplementary figure 4-11: Genes upregulated in Setdb1 KO E13.5 PGCs. (A) Expression (RPKM) of all annotated ENSEMBL genes in HET versus Setdb1 KO female E13.5 PGCs. 157 genes up-regulated in Setdb1 KO PGCs (z-score KO/WT and KO/Het > 1, RPKM KO/WT and KO/Het > 1.5-fold) are highlighted in red. 31 genes down-regulated in Setdb1 KO (z-score KO/WT and KO/Het < -1, RPKM KO/WT and KO/Het < 0.75-fold) are highlighted in green. (B) Gene Ontology analysis of genes upregulated in male and female Setdb1 KO PGCs. Top 6 ontologies (> 5 genes in ontology) presented for each sex. Gray: below statistical significance. (C) Genome browser views, including RepeatMasker tracks, of H3K9me3 and RNA-seq coverage in male and female HET and Setdb1 KO E13.5 PGCs. Upper panel shows the Nfatc4 gene and the annotated ERVK10C element upstream of this locus. Lower panel shows the Klk1b24 and Klk1b3 genes, and a cluster of ERVs, including ERVK10C and ETnERV2 elements, upstream of these loci.

Appendix D Supplementary Tables for Chapter 4

Upregulated genes are defined by a Z-score > 1 in Setdb1 KO vs WT and Het, and a > 1.5-fold increase in Setdb1 KO vs WT and Het.
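The selection criteria for upregulated genes can be expressed as a short Python sketch. It is illustrative only and rests on assumptions: the function name `is_upregulated` is hypothetical, replicate RPKM values are assumed to be averaged per genotype before the fold-change test, and the Z-scores are taken as precomputed inputs because their formula is not given in this section. The RPKM values in the example are those listed for Klk1b24 in Supplementary Table 4-1; the Z-scores passed in are placeholders.

```python
from statistics import mean


def is_upregulated(ko, wt, het, z_vs_wt, z_vs_het, min_z=1.0, min_fold=1.5):
    """Apply the stated criteria: Z-score > 1 and > 1.5-fold increase,
    both in KO vs WT and in KO vs Het.

    ko, wt, het: replicate RPKM values (averaged here, which is an assumption).
    z_vs_wt, z_vs_het: precomputed Z-scores (formula not specified in the text).
    """
    ko_mean, wt_mean, het_mean = mean(ko), mean(wt), mean(het)
    # Compare by multiplication so a reference of 0 RPKM needs no pseudocount.
    fold_ok = ko_mean > min_fold * wt_mean and ko_mean > min_fold * het_mean
    z_ok = z_vs_wt > min_z and z_vs_het > min_z
    return fold_ok and z_ok


# Klk1b24 RPKM values from the table; Z-scores of 2.0 are placeholders.
klk1b24 = is_upregulated(
    ko=[9.69, 3.70],
    wt=[0.00, 0.00],
    het=[0.00, 0.00, 0.00],
    z_vs_wt=2.0,
    z_vs_het=2.0,
)
print(klk1b24)
```

Requiring both tests against both reference genotypes means that a gene passing the thresholds against WT only, or against Het only, is not reported.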
Tables are sorted by: Averaged gene expression level RPKM < 0.2 (assumed to be low- or non-expressed gene) in Het is sorted on its averaged expression level in KO in descending order; Averaged gene expression level RPKM \u00E2\u0089\u00A5 0.2 (assumed to be expressed gene) in Het is sorted as fold changes of averaged expression of KO over Het in descending order. Genes upregulated in both males and females are highlighted with yellow background. Genes that are LTR-driven are highlighted as bold. Supplementary Table 4-1: Genes Up in Setdb1KO Male PGCs Ensembl( Gene(name(WT(Male(E13.5(PGC(Repeat(1(WT(Male(E13.5(PGC(Repeat(2(Het(Male(E13.5(PGC(Repeat(1(Het(Male(E13.5(PGC(Repeat(2(Het(Male(E13.5(PGC(Repeat(3(Setdb1(KO(Male(E13.5(PGC(Repeat(1(Setdb1(KO(Male(E13.5(PGC(Repeat(2(ENSMUSG00000051949( 2010005H15Rik( 0.00# 0.00# 0.48# 0.00# 0.00# 57.67# 69.50#ENSMUSG00000000248( Clec2g( 1.56# 0.28# 0.06# 0.00# 0.04# 11.14# 14.49#ENSMUSG00000079644( Gm1110( 0.08# 0.00# 0.00# 0.00# 0.20# 12.29# 2.71#ENSMUSG00000063713( Klk1b24( 0.00# 0.00# 0.00# 0.00# 0.00# 9.69# 3.70#ENSMUSG00000070645# Ren1# 0.00# 0.25# 0.10# 0.13# 0.26# 10.62# 2.30#ENSMUSG00000090714# Zscan4d# 0.00# 0.00# 0.00# 0.00# 0.00# 9.23# 3.49#ENSMUSG00000028280( Gabrr1( 0.00# 0.00# 0.03# 0.00# 0.10# 6.36# 4.86#ENSMUSG00000025165( Sectm1a( 0.00# 0.00# 0.05# 0.00# 0.00# 5.45# 4.88#ENSMUSG00000073786# Gm7579# 0.31# 0.00# 0.00# 0.00# 0.00# 6.12# 3.78#ENSMUSG00000047562( Mmp10( 0.04# 0.00# 0.00# 0.00# 0.13# 3.48# 6.28#ENSMUSG00000079049( Serpinb1c( 0.00# 0.00# 0.00# 0.00# 0.00# 6.46# 2.59#ENSMUSG00000069031# Gm10256# 0.00# 0.00# 0.00# 0.00# 0.00# 5.25# 3.68#Up#in#males#and#females##Bold:(LTRNdriven(( 155 Ensembl\t \u00C2\u00A0 Gene\t \u00C2\u00A0name\t \u00C2\u00A0WT\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A01\t \u00C2\u00A0WT\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A02\t \u00C2\u00A0Het\t \u00C2\u00A0Male\t 
\u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A01\t \u00C2\u00A0Het\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A02\t \u00C2\u00A0Het\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A03\t \u00C2\u00A0Setdb1\t \u00C2\u00A0KO\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A01\t \u00C2\u00A0Setdb1\t \u00C2\u00A0KO\t \u00C2\u00A0Male\t \u00C2\u00A0E13.5\t \u00C2\u00A0PGC\t \u00C2\u00A0Repeat\t \u00C2\u00A02\t \u00C2\u00A0ENSMUSG00000000049\t \u00C2\u00A0 Apoh\t \u00C2\u00A0 0.34\t \u00C2\u00A0 0.04\t \u00C2\u00A0 0.02\t \u00C2\u00A0 0.06\t \u00C2\u00A0 0.31\t \u00C2\u00A0 3.15\t \u00C2\u00A0 5.67\t \u00C2\u00A0ENSMUSG00000047216\t \u00C2\u00A0 Cdh19\t \u00C2\u00A0 0.15\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 5.85\t \u00C2\u00A0 2.94\t \u00C2\u00A0ENSMUSG00000086151\t \u00C2\u00A0 Gm5698\t \u00C2\u00A0 0.20\t \u00C2\u00A0 0.21\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 5.42\t \u00C2\u00A0 3.20\t \u00C2\u00A0ENSMUSG00000070890\t \u00C2\u00A0 Gm12794\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 7.05\t \u00C2\u00A0 1.09\t \u00C2\u00A0ENSMUSG00000022181\t \u00C2\u00A0 C6\t \u00C2\u00A0 0.06\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 3.92\t \u00C2\u00A0 3.56\t \u00C2\u00A0ENSMUSG00000087346\t \u00C2\u00A0 Gm5699\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 4.60\t \u00C2\u00A0 2.51\t \u00C2\u00A0ENSMUSG00000091987\t \u00C2\u00A0 Gm3376\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.02\t \u00C2\u00A0 0.00\t \u00C2\u00A0 0.00\t \u00C2\u00A0 3.57\t \u00C2\u00A0 3.52\t \u00C2\u00A0 (Averaged gene expression RPKM \u00E2\u0089\u00A5 0.2 in Het) 
ENSMUSG00000018924  Alox15  0.65  0.58  0.37  0.34  1.68  27.73  36.66
ENSMUSG00000066682  Pilrb2  0.00  0.00  0.00  0.00  0.12  2.57  4.01
ENSMUSG00000052861  Dnahc6  0.17  0.10  0.10  0.04  0.06  4.28  1.84
ENSMUSG00000003555  Cyp17a1  0.13  2.80  0.16  0.87  0.00  13.26  8.66
ENSMUSG00000069044  Usp9y  0.57  0.07  0.25  0.40  0.48  6.69  8.07
ENSMUSG00000040650  Cyp2b23  0.39  2.07  1.50  1.38  0.60  9.97  31.27
ENSMUSG00000036292  Gramd1c  0.72  0.59  0.51  0.63  0.30  8.63  7.32
ENSMUSG00000039543  Ttc18  0.31  0.20  0.31  0.27  0.36  5.87  3.01
ENSMUSG00000033634  Cml2  1.53  1.11  0.99  1.22  0.76  11.65  16.01
ENSMUSG00000079012  Serpina3m  0.00  0.17  0.35  0.13  0.39  5.26  1.90
ENSMUSG00000049387  Cox7b2  2.60  1.04  0.00  0.96  0.80  6.99  7.44
ENSMUSG00000027528  Fabp9  0.00  0.24  0.14  0.36  0.56  3.53  4.66
ENSMUSG00000066515  Klk1b3  0.00  0.00  0.19  0.11  0.74  3.63  3.84
ENSMUSG00000018907  Alox12e  0.57  0.45  0.68  0.44  0.35  4.30  6.01
ENSMUSG00000037033  Clca4  0.48  0.99  0.30  0.34  0.52  3.01  3.71
ENSMUSG00000046133  C130073F10Rik  0.25  0.31  1.28  0.46  1.31  11.21  6.12
ENSMUSG00000079479  Gm9112  6.54  2.02  1.97  1.49  0.52  9.81  11.96
ENSMUSG00000025038  Efhc2  3.04  1.63  2.43  1.39  1.45  11.92  13.22
ENSMUSG00000060441  Trim5  0.59  0.33  0.22  0.83  0.55  2.15  5.46
ENSMUSG00000054196  Cthrc1  24.44  22.65  28.06  21.46  18.88  134.61  165.30
ENSMUSG00000022061  Nkx3-1  1.59  0.27  1.13  0.59  0.48  5.07  3.16
ENSMUSG00000025776  Crispld1  1.01  0.75  0.82  0.44  0.87  3.47  4.19
ENSMUSG00000074798  Gm10760  1.86  0.47  1.31  1.67  0.67  6.08  6.69
ENSMUSG00000032679  Cd59a  3.34  2.73  2.04  2.59  0.79  6.51  12.05
ENSMUSG00000038598  AI481877  2.20  0.95  1.16  1.07  1.18  4.96  6.63
ENSMUSG00000032374  Plod2  18.47  14.93  12.84  12.59  13.61  54.43  67.61
ENSMUSG00000022113  Trim52  2.01  1.70  0.90  0.44  0.78  2.94  3.65
ENSMUSG00000073291  Gm10491  1.86  0.00  0.08  1.27  0.95  4.45  2.70
ENSMUSG00000063903  Klk1  1.70  0.32  1.32  3.21  1.26  9.28  8.46
ENSMUSG00000042985  Upk3b  1.37  3.25  2.11  0.96  1.81  9.18  5.56
ENSMUSG00000078497  Gm13145  6.04  5.32  3.83  3.24  8.76  14.82  31.73
ENSMUSG00000039224  D1Pas1  5.96  3.65  2.88  2.63  2.99  11.84  12.25
ENSMUSG00000051592  Ccnb3  1.71  0.62  0.92  0.46  1.05  3.33  3.52
ENSMUSG00000027995  Tlr2  0.06  0.70  0.81  1.07  0.98  3.67  4.33
ENSMUSG00000022679  Mpv17l  1.63  3.24  2.16  3.38  2.67  7.68  15.17
ENSMUSG00000040584  Abcb1a  0.29  1.04  0.45  0.78  1.09  1.95  4.12
ENSMUSG00000038552  Fndc4  6.90  8.96  7.02  8.81  8.37  25.16  35.94
ENSMUSG00000061633  1700029P11Rik  4.52  3.33  4.71  3.27  2.57  12.70  13.40
ENSMUSG00000067338  Tuba3b  41.68  10.54  33.12  15.49  14.47  87.93  64.33
ENSMUSG00000042686  Jph1  2.92  1.41  2.16  1.84  2.26  7.81  6.96
ENSMUSG00000022221  Ripk3  2.65  1.58  2.73  0.90  1.87  7.57  5.13
ENSMUSG00000078689  Mup6  6.50  9.28  5.01  6.65  7.42  14.85  28.44
ENSMUSG00000065999  Gm13154  4.33  10.06  7.18  8.94  8.39  19.10  35.97
ENSMUSG00000071952  Gm10351  6.63  9.40  6.75  4.94  3.53  12.70  19.98
ENSMUSG00000023707  Ogfod2  3.93  4.99  3.10  2.19  2.41  10.22  6.28
ENSMUSG00000066007  Zfp600  2.00  2.10  1.64  1.74  2.40  5.00  7.08
ENSMUSG00000033960  9430020K01Rik  5.34  5.04  4.58  6.38  5.62  14.78  18.64
ENSMUSG00000050930  4933403G14Rik  3.63  2.97  2.69  2.38  2.33  6.88  7.68
ENSMUSG00000091781  Gm9590  0.91  0.16  2.47  2.11  0.67  4.55  5.10
ENSMUSG00000042386  Tex13  4.19  2.48  6.13  3.65  2.31  9.22  12.58
ENSMUSG00000033554  Dph5  7.05  5.79  4.65  4.28  4.20  13.08  10.08
ENSMUSG00000009628  Tex15  39.51  25.28  16.68  26.91  26.41  49.19  69.13
ENSMUSG00000063889  Crem  2.52  2.44  1.61  2.28  2.02  4.12  5.88
ENSMUSG00000064194  Zfp936  12.08  13.43  12.68  15.20  12.57  22.95  45.40
ENSMUSG00000040213  Ccbl2  8.29  5.92  5.52  5.55  5.26  13.22  13.15
ENSMUSG00000033752  Mnd1  26.71  42.70  26.90  23.83  39.09  81.74  62.55
ENSMUSG00000092335  Gm7221  5.40  4.66  5.21  4.97  3.36  8.35  13.38
ENSMUSG00000045519  Zfp560  2.55  2.33  2.49  2.11  2.17  5.16  5.56
ENSMUSG00000044617  Zbtb39  7.27  7.68  6.05  8.11  7.45  15.26  16.92
ENSMUSG00000029033  Acap3  4.91  3.21  4.92  2.91  3.93  9.11  8.09
ENSMUSG00000072761  Gm6712  4.73  4.14  3.86  5.43  3.96  9.52  9.46
ENSMUSG00000028145  Them4  3.53  2.28  3.54  2.85  2.65  6.31  6.39
ENSMUSG00000036103  Colec12  6.12  7.79  7.45  7.39  5.91  13.20  15.73
ENSMUSG00000055780  Usp26  32.53  17.43  26.27  21.43  21.76  42.05  52.60
ENSMUSG00000028358  Zfp618  5.59  3.44  2.59  4.66  6.57  8.45  10.08
ENSMUSG00000058186  Gm13242  3.14  4.09  4.69  4.37  3.87  6.32  10.07
ENSMUSG00000069895  Atxn1l  7.30  4.37  4.10  5.88  5.13  9.63  9.25
ENSMUSG00000027615  Hps3  11.72  5.94  10.37  8.31  6.81  15.84  15.62
ENSMUSG00000058056  Palld  4.38  3.59  4.58  4.26  4.15  7.58  8.12
ENSMUSG00000002428  Hltf  21.10  26.34  24.72  24.14  23.83  34.76  48.80
ENSMUSG00000005892  Trh  10.02  8.70  11.26  6.10  10.91  15.55  16.00
ENSMUSG00000032184  Lysmd2  7.70  7.77  8.44  7.63  7.32  11.96  13.48
Supplementary Table 4-2: Genes Up in Setdb1KO Female PGCs
Columns (all samples are female E13.5 PGCs): Ensembl ID, Gene name, WT Repeat 1, WT Repeat 2, Het Repeat 1, Het Repeat 2, Het Repeat 3, Setdb1 KO Repeat 1, Setdb1 KO Repeat 2
ENSMUSG00000028280  Gabrr1  0.15  0.00  0.56  0.00  0.08  26.15  18.46
ENSMUSG00000043073  Dub3  0.21  0.00  0.00  0.05  0.22  12.09  7.59
ENSMUSG00000070529  Wfdc10  1.24  2.59  0.00  0.00  0.24  4.52  4.42
ENSMUSG00000000248  Clec2g  0.22  0.10  0.00  0.12  0.00  3.14  4.75
ENSMUSG00000052861  Dnahc6  0.13  0.13  0.06  0.13  0.01  3.10  2.94
(Averaged gene expression RPKM ≥ 0.2 in Het)
ENSMUSG00000033634  Cml2  0.50  0.29  0.26  0.57  0.08  2.94  3.97
ENSMUSG00000018924  Alox15  0.43  0.30  0.17  0.59  0.50  3.20  5.21
ENSMUSG00000022113  Trim52  1.44  0.85  3.17  1.32  1.46  11.63  13.66
ENSMUSG00000041216  Clvs1  0.52  1.00  0.57  0.56  0.55  2.56  4.11
ENSMUSG00000066938  Gm10190  1.53  0.51  0.00  0.52  1.95  5.84  3.39
ENSMUSG00000025038  Efhc2  5.86  4.10  1.39  7.23  1.03  12.27  21.13
ENSMUSG00000035964  Tmem59l  1.57  2.58  1.72  0.45  0.23  3.78  4.37
ENSMUSG00000037033  Clca4  0.24  0.19  0.53  0.56  0.73  2.89  3.24
ENSMUSG00000057969  Sema3b  2.03  1.59  0.88  1.07  0.63  5.12  3.09
ENSMUSG00000060508  Nlrp9b  1.83  0.82  0.73  1.20  1.02  3.38  5.96
ENSMUSG00000052415  Tchh  0.99  0.88  6.01  1.24  1.64  15.75  12.14
ENSMUSG00000092064  Gm9312  2.82  3.57  1.26  1.56  1.91  9.36  4.89
ENSMUSG00000020701  Tmem132e  3.16  2.44  1.95  2.30  1.56  11.29  6.21
ENSMUSG00000079584  Gm364  0.95  0.00  6.66  0.59  0.32  13.80  8.82
ENSMUSG00000031750  Il34  1.26  0.86  0.73  0.56  1.11  3.83  3.21
ENSMUSG00000040584  Abcb1a  0.19  1.09  0.58  1.32  1.12  3.70  4.85
ENSMUSG00000041794  Myrip  1.50  1.15  1.67  1.23  2.00  7.88  5.68
ENSMUSG00000006519  Cyba  6.70  3.86  2.18  4.24  0.00  9.18  8.48
ENSMUSG00000041324  Inhba  1.45  2.02  0.84  0.98  1.17  4.41  3.57
ENSMUSG00000092443  Gm20726  3.87  4.49  1.29  3.65  0.42  6.94  7.34
ENSMUSG00000084085  Gm16140  1.57  4.11  1.89  4.25  2.09  15.35  6.24
ENSMUSG00000029163  Emilin1  4.03  2.88  1.42  1.48  1.99  7.56  5.17
ENSMUSG00000039224  D1Pas1  3.21  3.68  10.29  3.35  3.76  23.41  21.03
ENSMUSG00000047674  Pdha2  0.42  0.81  1.99  0.76  0.70  4.66  4.01
ENSMUSG00000028845  Tekt2  1.55  0.39  2.34  0.49  0.43  5.11  3.09
ENSMUSG00000030742  Lat  3.28  4.71  2.25  3.11  2.31  10.65  8.42
ENSMUSG00000041120  Nbl1  7.84  6.60  4.69  1.91  2.76  11.87  11.33
ENSMUSG00000060572  Mfap2  18.96  16.78  11.37  6.60  8.03  37.57  26.79
ENSMUSG00000035395  Pet2  1.18  1.11  3.63  1.54  0.94  7.87  7.09
ENSMUSG00000022469  Rapgef3  2.20  2.09  1.19  1.16  1.17  4.61  3.84
ENSMUSG00000010529  Gm266  4.29  3.47  3.39  2.38  0.90  9.00  6.85
ENSMUSG00000046447  Camk2n1  8.20  13.47  4.63  5.15  7.56  17.90  22.15
ENSMUSG00000025020  Slit1  1.70  1.55  1.86  1.06  0.91  4.49  4.15
ENSMUSG00000038721  Hoxb7  6.59  8.67  4.81  2.86  4.13  12.24  14.35
ENSMUSG00000064125  BC068157  1.61  3.18  1.52  1.40  1.55  4.43  5.48
ENSMUSG00000025225  Nfkb2  3.05  2.56  2.46  0.94  1.14  5.51  4.57
ENSMUSG00000031616  Ednra  2.06  1.03  1.05  1.04  1.37  4.38  3.23
ENSMUSG00000038700  Hoxb5  2.06  3.87  1.52  3.05  1.70  6.44  6.94
ENSMUSG00000019312  Grb7  10.03  8.09  6.28  3.78  3.92  17.14  12.55
ENSMUSG00000006731  B4galnt1  2.27  3.20  2.83  1.00  1.08  5.75  4.58
ENSMUSG00000036292  Gramd1c  1.09  1.23  1.30  0.93  2.20  4.80  4.43
ENSMUSG00000004231  Pax2  5.32  9.27  3.47  4.04  4.64  12.04  13.15
ENSMUSG00000063632  Sox11  75.92  81.83  57.46  42.65  36.10  157.50  121.07
ENSMUSG00000060969  Irx1  3.15  2.20  1.54  1.69  1.59  5.03  4.80
ENSMUSG00000030093  Wnt7a  29.22  32.19  25.70  15.46  15.27  67.87  46.98
ENSMUSG00000073600  Gm1614  3.17  0.92  3.17  0.93  1.01  4.56  5.73
ENSMUSG00000048763  Hoxb3  4.26  5.85  5.23  2.77  2.86  10.47  11.37
ENSMUSG00000015647  Lama5  14.50  12.95  9.25  7.26  6.97  27.21  19.62
ENSMUSG00000029070  Mxra8  8.26  8.60  3.12  4.30  4.40  11.52  11.90
ENSMUSG00000020160  Meis1  13.22  17.38  14.88  9.51  9.76  37.50  27.36
ENSMUSG00000023707  Ogfod2  4.65  2.94  2.91  3.62  1.82  8.72  7.10
ENSMUSG00000001870  Ltbp1  25.30  24.75  20.71  15.60  14.60  56.70  39.21
ENSMUSG00000034168  6430527G18Rik  7.60  9.89  7.17  4.79  3.86  16.65  13.13
ENSMUSG00000025089  Gfra1  22.04  25.10  21.06  13.63  9.73  48.40  35.21
ENSMUSG00000025790  Slco3a1  2.38  3.40  2.25  1.71  1.86  5.67  5.13
ENSMUSG00000038742  Angptl6  3.36  3.56  4.48  3.39  1.92  8.63  9.48
ENSMUSG00000038692  Hoxb4  1.89  3.78  1.18  2.47  3.16  6.19  6.25
ENSMUSG00000026976  Pax8  13.74  21.50  13.47  10.01  9.35  28.34  31.59
ENSMUSG00000000753  Serpinf1  3.24  3.20  4.26  1.95  1.98  7.58  7.33
ENSMUSG00000067786  Nnat  18.25  18.12  10.65  9.36  8.95  29.82  22.83
ENSMUSG00000024940  Ltbp3  5.19  5.22  4.77  3.81  3.94  13.72  8.89
ENSMUSG00000016024  Lbp  3.49  1.88  3.16  1.91  1.20  6.59  4.67
ENSMUSG00000035200  Chrnb4  0.95  1.37  0.56  1.97  1.88  2.52  5.41
ENSMUSG00000055675  Kbtbd11  2.34  2.05  3.04  1.64  1.96  6.92  4.95
ENSMUSG00000008734  Gprc5b  4.21  5.02  2.91  3.19  2.66  8.88  6.69
ENSMUSG00000024598  Fbn2  14.72  16.92  10.44  11.20  11.32  34.54  23.46
ENSMUSG00000021974  Fgf9  8.79  14.65  8.13  7.81  7.83  23.30  18.49
ENSMUSG00000019539  Rcn3  26.44  21.46  22.96  11.12  13.04  47.57  34.82
ENSMUSG00000026765  Lypd6b  9.21  9.02  6.43  6.13  5.50  19.68  11.84
ENSMUSG00000022371  Col14a1  7.61  8.17  7.79  4.68  4.14  15.86  12.90
ENSMUSG00000047264  Zfp358  9.37  6.67  7.96  2.76  4.93  15.03  12.05
ENSMUSG00000029168  Dpysl5  7.41  6.84  5.46  3.05  2.91  10.13  9.39
ENSMUSG00000039976  Tbc1d16  4.09  4.86  4.67  3.08  3.04  10.83  7.42
ENSMUSG00000025592  Dach2  6.80  11.24  4.96  6.22  6.18  14.96  14.37
ENSMUSG00000029603  Dtx1  3.28  3.81  4.00  2.03  1.75  6.58  6.50
ENSMUSG00000013921  Clip3  4.23  4.24  4.35  2.69  2.67  7.61  8.67
ENSMUSG00000029361  Nos1  3.16  4.62  7.18  3.93  2.79  13.69  9.54
ENSMUSG00000074415  2610203C20Rik  2.36  7.35  5.22  4.10  5.75  11.11  13.93
ENSMUSG00000053646  Plxnb1  10.94  10.41  8.12  5.99  6.35  19.96  13.83
ENSMUSG00000042386  Tex13  9.19  6.79  14.89  14.45  6.83  28.03  31.69
ENSMUSG00000023991  Foxp4  6.04  6.73  5.46  4.11  4.64  10.73  12.60
ENSMUSG00000033436  Armcx2  7.03  6.43  6.08  4.04  3.56  13.30  9.16
ENSMUSG00000032908  Sgpp2  9.88  9.06  8.20  5.97  5.01  18.02  13.39
ENSMUSG00000052151  Ppap2c  3.65  3.02  3.09  1.98  2.69  6.23  6.40
ENSMUSG00000044674  Fzd1  6.84  5.66  5.43  3.83  4.60  12.02  10.52
ENSMUSG00000046997  Spsb4  3.82  3.29  2.27  2.79  3.43  7.79  6.00
ENSMUSG00000031880  Rrad  4.30  5.79  2.28  2.42  5.09  8.52  7.36
ENSMUSG00000068874  Selenbp1  3.67  2.12  3.90  2.22  1.08  6.57  5.04
ENSMUSG00000020674  Pxdn  21.60  20.25  12.84  14.72  13.70  36.17  29.61
ENSMUSG00000023411  Nfatc4  6.61  5.32  8.37  3.92  5.18  15.74  11.82
ENSMUSG00000022565  Plec  4.90  7.54  6.47  5.17  5.32  15.73  10.89
ENSMUSG00000029135  Fosl2  15.15  13.93  16.04  8.65  9.97  31.42  22.25
ENSMUSG00000074798  Gm10760  2.04  0.74  5.30  2.26  4.29  10.39  7.92
ENSMUSG00000026185  Igfbp5  52.35  79.37  53.48  54.09  55.99  122.14  129.23
ENSMUSG00000028358  Zfp618  4.10  5.25  4.89  3.63  3.90  10.29  8.67
ENSMUSG00000036882  Arhgap33  2.30  1.82  4.43  2.06  1.79  5.98  6.25
ENSMUSG00000015709  Arnt2  10.00  9.63  13.07  6.71  6.49  21.28  17.48
ENSMUSG00000048644  Ctxn1  36.42  33.14  28.08  20.78  24.46  57.93  50.13
ENSMUSG00000030220  Arhgdib  6.74  6.57  6.65  4.19  5.61  12.56  11.65
ENSMUSG00000074203  G430095P16Rik  0.96  0.50  1.30  1.59  1.97  2.94  4.10
ENSMUSG00000013076  Amotl1  5.81  6.73  5.65  4.18  5.08  11.41  10.15
ENSMUSG00000024998  Plce1  3.52  3.59  4.43  3.07  3.51  9.56  6.33
ENSMUSG00000070473  Cldn3  8.87  6.28  7.57  4.46  6.40  15.11  11.40
ENSMUSG00000063626  Unc5d  2.69  3.90  4.19  2.57  2.83  6.58  7.16
ENSMUSG00000042978  Sbk1  28.05  26.38  21.36  19.30  18.29  45.49  38.92
ENSMUSG00000021594  Srd5a1  5.50  5.83  5.98  3.54  3.28  10.00  8.33
ENSMUSG00000018012  Rac3  7.13  6.71  8.46  5.52  5.93  15.54  12.95
ENSMUSG00000055421  Pcdh9  3.12  4.37  4.29  3.66  3.17  8.82  7.04
ENSMUSG00000046314  Stxbp6  6.63  8.57  6.80  4.39  5.06  10.85  12.32
ENSMUSG00000039081  Zfp503  14.32  14.88  13.76  10.21  8.89  26.17  20.42
ENSMUSG00000034675  Dbn1  25.34  21.92  26.39  11.07  16.53  41.71  34.22
ENSMUSG00000023232  Serinc2  4.91  5.91  6.20  3.03  2.76  8.46  8.38
ENSMUSG00000031125  3830403N18Rik  3.17  3.42  8.32  6.27  6.65  11.85  17.42
ENSMUSG00000058317  Ube2e2  7.27  5.96  5.43  7.19  6.65  14.56  11.96
ENSMUSG00000051257  Trap1a  42.16  30.60  33.65  34.63  17.28  63.15  53.63
ENSMUSG00000074472  Zfp872  1.32  2.95  4.49  3.14  3.90  6.38  9.24
ENSMUSG00000034832  Tet3  3.32  6.15  6.79  4.50  4.42  12.03  9.18
ENSMUSG00000092206  NA  3.49  7.22  4.48  3.81  5.62  9.54  9.19
ENSMUSG00000037016  Frem2  19.07  18.56  15.86  16.42  17.06  37.47  28.68
ENSMUSG00000027239  Mdk  174.11  177.08  155.84  123.23  125.73  261.48  280.56
ENSMUSG00000075592  Nynrin  6.71  7.06  7.65  4.68  5.26  13.52  9.85
ENSMUSG00000024352  Spata24  5.54  3.38  5.08  4.85  5.23  12.46  7.68
ENSMUSG00000040415  Dtx3  8.16  8.52  8.84  5.99  7.79  14.82  15.04
ENSMUSG00000054793  Cadm4  9.09  9.15  7.45  6.29  6.41  13.22  13.35
ENSMUSG00000070923  Klhl9  39.38  29.92  34.77  26.92  22.90  59.81  51.65
ENSMUSG00000037907  Ankrd13b  7.15  5.76  6.71  5.38  4.43  10.52  11.24
ENSMUSG00000025145  Lrrc45  9.20  9.17  8.95  8.55  9.59  19.04  16.55
ENSMUSG00000034295  Fhod3  11.11  12.21  9.37  9.00  9.87  18.67  18.24
ENSMUSG00000089715  Cbx6  8.56  8.42  7.18  7.07  7.23  15.54  12.23
ENSMUSG00000042992  Loh12cr1  3.85  3.90  5.25  3.31  3.75  8.45  7.23
ENSMUSG00000045391  1700120B22Rik  8.23  5.23  10.95  9.39  4.12  16.75  14.40
ENSMUSG00000026604  Ptpn14  10.11  13.52  10.75  9.61  10.74  20.83  18.22
ENSMUSG00000030022  Adamts9  10.11  13.41  10.15  9.73  10.61  20.29  17.75
ENSMUSG00000031661  Nkd1  13.20  11.23  13.11  8.76  8.44  19.42  17.95
ENSMUSG00000049823  Zbtb12  9.75  10.08  8.25  8.34  7.61  15.47  14.28
ENSMUSG00000025034  Trim8  8.18  7.74  9.06  5.93  6.87  15.12  11.75
ENSMUSG00000036155  Mgat5  12.07  14.30  15.16  12.28  8.35  25.05  18.90
ENSMUSG00000041596  Vmn1r90  4.83  3.52  6.48  3.25  5.52  9.89  8.75
ENSMUSG00000006205  Htra1  7.22  7.38  8.99  6.46  7.39  12.93  14.73
ENSMUSG00000000402  Egfl6  18.78  19.14  18.66  12.42  13.49  27.79  25.78
ENSMUSG00000030638  Sh3gl3  9.56  10.99  9.56  9.97  5.28  15.54  14.11
ENSMUSG00000033368  Trim69  1.17  1.37  3.11  1.19  3.47  4.49  4.74
ENSMUSG00000003378  Grik5  9.66  7.66  12.41  7.44  9.57  18.35  16.21
ENSMUSG00000033287  Kctd17  7.26  8.43  8.23  6.63  7.74  11.66  14.75
ENSMUSG00000091207  Purb  21.87  21.80  18.64  20.17  17.30  32.06  32.89
ENSMUSG00000049086  Bmyc  7.25  4.90  6.94  5.31  6.86  10.65  11.44
ENSMUSG00000021714  Cenpk  6.18  7.45  8.47  8.29  8.90  13.25  14.39
ENSMUSG00000033697  Arhgap39  6.73  8.51  9.53  6.70  7.93  12.65  13.04
ENSMUSG00000021495  Fam193b  14.65  15.04  15.52  15.75  14.19  24.66  22.73
ENSMUSG00000042675  Ypel3  10.91  7.47  8.49  9.46  9.40  13.80  14.53
ENSMUSG00000020747  2310067B10Rik  20.21  17.70  22.31  18.18  20.84  30.37  31.26
ENSMUSG00000032374  Plod2  20.67  16.66  23.50  19.34  20.48  32.29  31.06
Supplementary Table 4-3: Genes Down in Setdb1KO Male PGCs.
Columns (all samples are male E13.5 PGCs): Ensembl ID, Gene name, WT Repeat 1, WT Repeat 2, Het Repeat 1, Het Repeat 2, Het Repeat 3, Setdb1 KO Repeat 1, Setdb1 KO Repeat 2
ENSMUSG00000028766  Alpl  17.96  23.49  30.87  23.48  10.46  8.43  6.81
ENSMUSG00000021337  Scgn  3.48  7.72  5.40  5.76  7.30  2.74  2.10
ENSMUSG00000029524  Sirt4  8.14  14.44  14.47  13.43  12.28  5.98  5.47
ENSMUSG00000004038  Gstm3  15.28  23.34  21.33  24.87  25.09  12.10  12.35
ENSMUSG00000032202  Rab27a  17.48  31.30  22.17  30.19  25.71  12.50  14.32
ENSMUSG00000042210  Abhd14a  24.43  22.28  21.28  19.13  26.82  15.49  9.97
ENSMUSG00000051811  Cox6b2  39.40  32.26  24.45  33.80  31.98  19.45  15.50
ENSMUSG00000034330  Plcg2  29.27  42.12  47.78  44.93  37.29  25.24  25.58
ENSMUSG00000020689  Itgb3  42.29  26.40  41.21  31.07  32.50  22.81  18.49
ENSMUSG00000021953  Tdh  36.64  45.18  44.13  39.25  43.02  27.71  22.75
ENSMUSG00000045382  Cxcr4  15.34  25.09  18.02  17.68  16.85  9.38  12.01
ENSMUSG00000035914  Cd276  27.24  31.50  37.24  40.04  30.74  23.74  20.75
ENSMUSG00000056234  Ncoa4  16.03  12.77  8.48  9.66  11.54  6.28  6.00
ENSMUSG00000025076  Casp7  25.18  34.44  29.65  35.85  35.23  17.64  24.50
ENSMUSG00000034708  Grn  42.11  47.19  56.55  51.07  45.31  34.02  31.42
ENSMUSG00000030605  Mfge8  26.75  47.88  37.19  37.86  36.75  24.34  23.58
ENSMUSG00000067889  Spnb3  83.85  81.65  91.80  95.38  94.26  63.29  57.47
ENSMUSG00000015697  Setdb1  48.97  55.43  45.83  52.05  37.53  27.11  31.20
ENSMUSG00000040289  Hey1  29.85  48.15  40.36  38.76  35.39  24.16  25.39
ENSMUSG00000024892  Pcx  23.30  20.90  26.50  20.19  18.03  17.39  12.21
ENSMUSG00000011884  Gltp  29.12  35.00  30.05  34.04  37.51  24.34  22.97
ENSMUSG00000036315  Znrd1  16.17  24.96  13.91  18.54  15.30  10.33  12.04
ENSMUSG00000024835  Coro1b  50.96  36.21  50.47  34.83  36.28  32.78  28.55
Supplementary Table 4-4: Genes Down in Setdb1KO Female PGCs.
Columns (all samples are female E13.5 PGCs): Ensembl ID, Gene name, WT Repeat 1, WT Repeat 2, Het Repeat 1, Het Repeat 2, Het Repeat 3, Setdb1 KO Repeat 1, Setdb1 KO Repeat 2
ENSMUSG00000090783  Gm17485  10.68  11.41  13.44  15.33  11.76  5.64  3.78
ENSMUSG00000040495  Chrm4  4.19  2.98  2.85  5.27  3.60  1.69  1.45
ENSMUSG00000019817  Plagl1  32.98  48.41  24.94  38.31  54.14  15.98  17.16
ENSMUSG00000070637  Gm694  2.52  4.29  4.46  4.56  3.02  1.95  1.60
ENSMUSG00000026726  Cubn  8.57  10.07  6.21  8.91  8.29  3.17  3.93
ENSMUSG00000056436  Cyct  50.31  56.68  55.33  88.92  79.13  37.86  34.64
ENSMUSG00000026288  Inpp5d  3.11  4.36  1.90  3.61  3.30  1.52  1.41
ENSMUSG00000045287  Rtn4rl1  7.53  7.57  5.48  7.88  7.44  3.21  3.75
ENSMUSG00000022853  Ehhadh  5.53  5.77  3.02  6.92  3.72  2.41  2.43
ENSMUSG00000023961  Enpp4  23.01  24.83  15.21  26.46  24.06  11.90  12.17
ENSMUSG00000032224  Fam81a  7.16  7.45  6.56  11.85  8.22  5.02  4.90
ENSMUSG00000038141  Tmem181a  3.28  3.36  4.62  3.08  4.18  2.12  2.45
ENSMUSG00000026380  Tcfcp2l1  16.28  16.02  13.99  18.95  14.45  9.82  8.83
ENSMUSG00000005892  Trh  5.58  5.93  5.36  4.72  6.52  3.66  3.03
ENSMUSG00000034336  Ina  14.47  14.07  14.22  12.05  10.84  7.24  7.94
ENSMUSG00000025420  Katnal2  3.49  3.50  3.67  3.52  3.21  2.34  1.94
ENSMUSG00000049971  Glt1d1  8.74  10.57  7.45  10.63  8.15  4.89  5.98
ENSMUSG00000075706  Gpx4  16.75  10.30  10.88  11.81  11.12  6.13  8.11
ENSMUSG00000014243  Zswim7  25.95  28.49  26.25  25.80  24.64  16.71  16.35
ENSMUSG00000037722  Gnpnat1  26.03  30.44  17.02  24.74  20.33  13.38  13.46
ENSMUSG00000008489  Elavl2  58.63  60.20  61.19  68.08  55.99  38.58  41.58
ENSMUSG00000060733  Ipmk  32.58  31.33  33.07  37.07  34.24  23.93  23.57
ENSMUSG00000032350  Gclc  97.12  81.49  78.07  90.31  81.43  57.17  56.74
ENSMUSG00000026723  Trdmt1  5.40  8.07  4.88  6.95  4.99  3.90  3.83
ENSMUSG00000030494  Rhpn2  37.84  35.44  38.21  39.96  34.37  26.59  25.23
ENSMUSG00000007646  Rad51c  16.73  14.25  15.40  16.14  15.55  10.70  11.39
ENSMUSG00000072980  Oip5  38.42  38.85  41.45  40.89  46.17  30.70  30.51
ENSMUSG00000029646  Cdx2  32.57  28.18  26.70  27.87  31.36  20.05  20.97
ENSMUSG00000005667  Mthfd2  20.83  20.06  20.58  20.56  19.72  14.51  14.81
ENSMUSG00000004535  Tax1bp1  40.03  37.53  38.89  37.29  37.78  28.74  27.60
ENSMUSG00000006599  Gtf2h1  47.91  51.98  50.99  48.07  49.85  37.72  37.63
Supplementary Table 5: Primer list
Columns: Amplicon, Primer name, Sequence
IAP, IAP-retro-ChIP-as, CTTCCTTGCGCCAGTCCCGAG
IAP, IAP-LTR-ChIP-S, GCTCCTGAAGATGTAAGCAATAAAG
MERVL, MERVL_int-365-fw, CTT CCA TTC ACA GCT GCG ACT G
MERVL, MERVL_int-519-rv, CTA GAA CCA CTC CTG GTA CCA AC
MERV10KC, MMERVK1OKC_LTR_344-FW, TTC GCC TCT GCA ATC AAG CTC TC
MERV10KC, MMERVK10C_INT_481-RV, TCG CTC RTG CCT GAA GAT GTT TC
MTA, MTA-R, AGCCCCAGCTAACCAGAAC
MTA, MTA-F, ATGTTTTGGGGAGGACTGTG
ESET preKO, 5' screening, CAGCTTGGAGGAATTGGTTC
ESET preKO, Lox 3C, TCCCAAACCTCATAGGGTAAAA
ESET KO, 5' Screening F, CAGCTTGGAGGAATTGGTTC
ESET KO, 3' Screening R, TTTCTTTGCCTTTGAGATGGA
MCK cre, MCKCre5, ATGTCCAATTTACTGACCGTAC
MCK cre, MCKCre3, CGCCGCATAACCAGTGAAAC
Zfy1, Zfy1-F, GACTAGACATGTCTTAACATCTGTCC
Zfy1, Zfy1-R, CCTATTGCATGGACAGCAGCTTATG
mXist, mXist-F, GAC CTT CAC AAC AGA CAG G
mXist, mXist-R, TGA AGC AGC CAT TAG ACT TG "@en . "Thesis/Dissertation"@en . "2016-02"@en . "10.14288/1.0223122"@en . "eng"@en . "Medical Genetics"@en . "Vancouver : University of British Columbia Library"@en . 
"University of British Columbia"@en . "Attribution-NonCommercial-NoDerivs 2.5 Canada"@* . "http://creativecommons.org/licenses/by-nc-nd/2.5/ca/"@* . "Graduate"@en . "Transcriptional repression of retrotransposons in mouse germline"@en . "Text"@en . "http://hdl.handle.net/2429/56191"@en .