"Medicine, Faculty of"@en . "DSpace"@en . "UBCV"@en . "Beach, Michael"@en . "2009-11-27T19:10:56Z"@en . "2009"@en . "Master of Science - MSc"@en . "University of British Columbia"@en . "The selected expression of the genome determines distinct cell types, properties,\nand conditions. In the pancreatic \u00CE\u00B2-cell, our knowledge of how this is regulated and\nmaintained is incomplete. Deciphering the molecular physiology of the \u00CE\u00B2-cell is critical to develop improvements for expanding pools of donor islets for transplantation, the most promising curative option for sufferers of diabetes. Genomic regulation is controlled primarily by transcription factors, of which pancreatic duodenal homeobox 1 (Pdxl) plays a critical role in both the developing and mature pancreas. As such, I begin to unlock the molecular physiology of the \u00CE\u00B2-cell by\nidentifying the binding sites of Pdxl in pancreatic islets on a genome-wide scale through the use of chromatin immunoprecipitation followed by sequencing (ChIP-Seq). This provides the best picture of Pdxl binding that has ever been assembled. Moreover, I\nidentify a highly co-occurring relationship between Pdxl and pre-B-cell leukemia\nhomeobox 1 (Pbxl) in adult islets. The coupling of this data with other genome-wide analyses will prove invaluable to discovering novel transcriptional complexes and the genes they regulate. It will also\ncontribute to the creation of an islet transcriptional network, thereby greatly enhancing our knowledge of \u00CE\u00B2-cell regulation."@en . "https://circle.library.ubc.ca/rest/handle/2429/15879?expand=metadata"@en . "1766237 bytes"@en . "application/pdf"@en . "UNRAVELING THE MOLECULAR PHYSIOLOGY OF THE n-CELL: GENOME WIDE ANALYSIS OF BINDING SITES FOR THE TRANSCRIPTION FACTOR PDX1 by Michael Beach Honours B.Sc. 
Trinity Western University, 2006
A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in The Faculty of Graduate Studies (Interdisciplinary Oncology)
The University of British Columbia (Vancouver)
July 2009
\u00A9 Michael Beach 2009

ABSTRACT

The selected expression of the genome determines distinct cell types, properties, and conditions. In the pancreatic \u03B2-cell, our knowledge of how this is regulated and maintained is incomplete. Deciphering the molecular physiology of the \u03B2-cell is critical to develop improvements for expanding pools of donor islets for transplantation, the most promising curative option for sufferers of diabetes. Genomic regulation is controlled primarily by transcription factors, of which pancreatic duodenal homeobox 1 (Pdx1) plays a critical role in both the developing and mature pancreas. As such, I begin to unlock the molecular physiology of the \u03B2-cell by identifying the binding sites of Pdx1 in pancreatic islets on a genome-wide scale through the use of chromatin immunoprecipitation followed by sequencing (ChIP-Seq). This provides the best picture of Pdx1 binding that has ever been assembled. Moreover, I identify a highly co-occurring relationship between Pdx1 and pre-B-cell leukemia homeobox 1 (Pbx1) in adult islets. The coupling of this data with other genome-wide analyses will prove invaluable to discovering novel transcriptional complexes and the genes they regulate. It will also contribute to the creation of an islet transcriptional network, thereby greatly enhancing our knowledge of \u03B2-cell regulation.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
CHAPTER 1 - INTRODUCTION
1.1 Pancreas Development, Structure, and Function
1.2 Islet Structure and Function
1.3 Insulin Release and Glucose Regulation
1.4 Diabetes Mellitus
1.5 Expanding Islet Pools and Islet Transplant
1.6 Transcription Factor Biology
1.7 Key Transcription Factors of the Endocrine Pancreas
1.8 Pdx1 and the Endocrine Pancreas
1.9 Chromatin Immunoprecipitation and Platforms for Sequencing
Hypothesis, Aims, and Objectives
CHAPTER 2 - MATERIALS AND METHODS
2.1 Tissue Culture
2.2 Mouse Colony
2.3 Western Blotting
2.4 Islet Isolations
2.5 Chromatin Immunoprecipitation
2.6 Phenol-Chloroform Extractions
2.7 Illumina Sequencing of DNA and Peak Building
2.8 Quantitative Real Time Polymerase Chain Reaction
2.9 Islet siRNA Transfection
2.10 Fluorescence Activated Cell Sorting
2.11 RNA Isolation and RT
2.12 Tag-Seq-Lite
2.13 Tag-Seq-Lite Library Bioinformatics
2.14 Seeded Motif Discovery
CHAPTER 3 - RESULTS
3.1 Pdx1 ChIP-Seq Library Construction
3.1.1 Identification of ChIP Quality Antibody and Targets
3.1.2 Collection of Islet Pdx1 ChIP DNA
3.2 Pdx1 ChIP-Seq Library Results and Validation
3.2.1 Statistics and Visualizations of Pdx1 ChIP-Seq Peaks
3.2.2 Validation of the Pdx1 ChIP-Seq Library
3.2.3 Validation Through siPdx1 Tag-Seq Library Construction
3.2.4 KEGG Pathways of Pdx1 Genes
3.3 Pdx1 ChIP-Seq Library Analysis
3.3.1 Pdx1 and Pbx1 Binding Motif Identification
3.3.2 Validation and Analysis of Pbx1 Containing Peaks
CHAPTER 4 - DISCUSSION
CONCLUSION
REFERENCES
APPENDIX
CERTIFICATES

LIST OF TABLES

Table 1 - Summary of MODY Genes
Table 2 - Significantly Over-Represented KEGG Pathways of all Genes with a Pdx1 ChIP-Seq Peak
Table 3 - Significantly
Over-Represented KEGG Pathways of siPdx1 Tag-Seq Down-Regulated Genes with a ChIP-Seq Peak
Table 4 - Monomer and Heterodimer Gene Categories

LIST OF FIGURES

Figure 1 - Chromatin Immunoprecipitation
Figure 2 - Islet Isolations
Figure 3 - Illumina Flow Cell Sequencing by Synthesis
Figure 4 - Constructing Peaks from ChIP-Seq Data
Figure 5 - Tag-Seq-Lite Library Construction
Figure 6 - Identification of a ChIP Quality Pdx1 Antibody
Figure 7 - Validating the Islet Pdx1 ChIP DNA
Figure 8 - UCSC Screenshots of Pdx1 ChIP-Seq at Known Sites
Figure 9 - Distribution of Pdx1 ChIP-Seq Peaks
Figure 10 - ChIP-Seq Versus ChIP-Chip & Known Binding Sites
Figure 11 - ChIP-Seq Peaks are Validated Via ChIP-qPCR
Figure 12 - Down-Regulated siPdx1 Tag-Seq Genes are Significantly Represented in ChIP-Seq Data and Include Expected Genes
Figure 13 - Seeded Motif Discovery of Pdx1 ChIP-Seq Data Returns Pdx1-like and Pbx1-like Motifs
Figure 14 - Pbx1 has no Greater Effect on Pdx1 Binding at Heterodimer Sites Compared to Monomer Sites
Figure 15 - Analysis of Heterodimer and Monomer Containing Peaks
Figure A1 - UCSC Screenshots of Interest of Pdx1 ChIP-Seq Binding Sites
Figure A2 - FACSorted siCyclo Islets
Figure A3 - FACSorted siPdx1 Islets

LIST OF ABBREVIATIONS

ATP Adenosine Triphosphate
ChIP Chromatin Immunoprecipitation
EDTA Ethylenediaminetetraacetic Acid
EM Expectation Maximization
ER\u03B1 Estrogen Receptor Alpha
ES cell Embryonic Stem Cell
FACS Fluorescence Activated Cell Sorting
GADEM A Genetic Algorithm Guided Formation of Spaced Dyads Coupled with an EM Algorithm for Motif Discovery
GCK Glucokinase
GLUT Glucose
Transporter
GMAT Genome-wide Mapping Technique
HBSS Hanks' Balanced Salt Solution
HNF Hepatocyte Nuclear Factor
IAPP Islet Amyloid Polypeptide
ISL1 Islet-1
KD Knockdown
KEGG Kyoto Encyclopedia of Genes and Genomes
MAFA V-maf Musculoaponeurotic Fibrosarcoma Oncogene Homolog A
MIN6 Mouse Insulinoma 6
MODY Maturity Onset Diabetes of the Young
NEUROD1 Neurogenic Differentiation 1
NGN3 Neurogenin 3
NKX NK Homeobox
PAX Paired Box
PBS Phosphate Buffered Saline
PBX1 Pre-B-cell Leukemia Homeobox 1
PCR Polymerase Chain Reaction
PDX1 Pancreatic Duodenal Homeobox 1
PET Paired End DiTag
PWM Position Weight Matrix
RT Reverse Transcription
SABE Serial Analysis of Binding Enrichment
SACO Serial Analysis of Chromatin Occupancy
SAGE Serial Analysis of Gene Expression
STAGE Sequence Tag Analysis of Genomic Enrichment
TBST Tris-Buffered Saline Tween-20
TE Trypsin-EDTA
TSS Transcriptional Start Site
UCSC University of California, Santa Cruz
WHO World Health Organization

ACKNOWLEDGEMENTS

Special thanks to Dr. Brad Hoffman for his mentorship and training, as well as other members of the Helgason lab: Bo Zavaglia and Joy Witzsche, and my supervisory committee: Dr. Pamela Hoodless, Dr. Cheryl Helgason, Dr. Dixie Mager, and Dr. Sylvia Ng. Islet isolations at the Verchere lab were performed by Galina Soukhatcheva. ChIPs at the Genome Sciences Centre were performed by Balgit Kamoh. Motif Discovery analysis courtesy of Gordon Robertson and Leping Li.

CHAPTER 1. INTRODUCTION

1.1 Pancreas Development, Structure, and Function

Germ layer formation at gastrulation establishes the endoderm, the germ layer from which the pancreas develops. Subsequently, distinct morphological events accompanied by specific onsets of gene expression culminate in pancreas formation. The point of pancreas determination is termed the primary transition, which occurs shortly after the onset of FoxA2 expression in the endoderm.
As the embryo begins to rotate, FoxA2 induces Pdx1 expression, driving cells towards the pancreatic fate [1]. Dorsal and ventral pancreatic buds begin to form, and Nkx6.1 and NeuroD1 become expressed in the epithelium. Expansion of the epithelium occurs before the secondary transition, when terminal differentiation of islet and exocrine cells occurs. At this point, insulin or exocrine genes experience a 100-fold activation, with Pdx1, Nkx6.1, and Nkx2.2 becoming \u03B2-cell restricted [1]. Finally, at isletogenesis, endocrine cells group into the islets and exocrine acini form. The role of the pancreas is twofold: its exocrine cells produce and secrete digestive enzymes into the intestine, while its endocrine cells release hormones into the bloodstream that are crucial to maintaining homeostatic body metabolism. Exocrine cells compose the majority of the pancreas, and the enzymes they produce enter the duodenum through the ampulla of Vater [2] (also termed the major duodenal papilla [3]), where the common bile duct and main pancreatic duct join. With a pyramidal shape and basal nuclei, exocrine cells possess an abundance of rough ER and many secretory vesicles to release their digestive enzymes [4]. The exocrine pancreas also produces hydrogen carbonate to neutralize the hydrochloric acid produced in the stomach [5]. Thus, the exocrine pancreas has a critical role in nutrient digestion in the small intestine. While this exocrine function of the pancreas is of utmost importance, the endocrine role of the pancreas in metabolic homeostasis, carried out by the cells grouped in its islets, has been a major research focus.

1.2 Islet Structure and Function

Endocrine cells comprise only about two percent of the total pancreas mass [6]. Nevertheless, this relatively small population of cells is absolutely critical for normal metabolic maintenance.
Embedded within the exocrine tissue, endocrine cells are found in clusters termed islets of Langerhans. The islet is composed of several types of endocrine cells, and while their exact percentage contribution to each islet is variable, general proportions are agreed upon. The majority of the cells in islets are \u03B2-cells. These account for 60-80% [7, 8, 9] of the cell mass and release the hormone insulin. Typically, \u03B1-cells are the next most abundant. These cells release glucagon and comprise anywhere from 10-28% of the islet [7, 8, 9]. The remaining cell types are typically less abundant and are as follows: somatostatin-producing \u03B4-cells, 2-10% [7, 8, 9, 10]; pancreatic polypeptide-producing PP-cells, 3-19% [7, 8, 9, 10]; and ghrelin-producing \u03B5-cells, 1% [7, 11]. Released directly into the bloodstream, these hormones primarily target the liver, muscle, and fat cells [5], where they play major roles in metabolic homeostasis. Consequently, islets receive a rich arterial blood supply via a unique capillary system that delivers ten times more blood per unit mass than exocrine cells receive [12]. Islet capillaries are also larger and contain fenestrae that increase permeability [13] and aid in insulin uptake following its release from the islet [14]. As the most dominant cell type in the islet, and the source of insulin, the \u03B2-cell plays the most significant role in metabolic maintenance through careful regulation of blood glucose levels.

1.3 Insulin Release and Glucose Regulation

Insulin release has long been understood to occur in two phases [15]. The first phase is a rapid response to increasing blood glucose levels and lasts approximately 2-4 minutes before decreasing to a plateau at 10-15 minutes. A more gradual process, the second phase of insulin release lasts 2-3 hours, during which a steady state of insulin levels is achieved.
Sensing of glucose by the \u03B2-cell does not occur via a glucose receptor, but rather through the metabolic products of glucose that trigger a molecular response culminating in insulin secretion [5]. This process begins with glucose entering the \u03B2-cell through the channel protein GLUT2 [16]. In the cytosol, glucokinase phosphorylates glucose [17], preventing it from exiting the cell through GLUT2. Highly efficient oxidative metabolism breaks down glucose to CO2 and H2O, resulting in an increase in ATP levels through oxidative phosphorylation via the mitochondrial electron transport chain. At this point, signal transduction moves from metabolic to electric, as membrane-bound potassium channels close in response to the increased levels of cytosolic ATP [18]. Closure of these channels causes depolarization of the plasma membrane, eliciting the opening of voltage-gated calcium channels and allowing calcium ions to flood the cytosol [19]. This rise in internal calcium levels stimulates the cortical actin network to disband, permitting insulin-containing granules to fuse with the cell membrane and release insulin [20]. Moving through the circulation, insulin stimulates glucose uptake, acting primarily at striated muscle tissue and adipose tissue. Upon binding of insulin to the insulin receptor, the glucose transporter GLUT4 moves to the cell surface and facilitates entry of glucose into the cell [21]. Metabolism of glucose produces ATP to meet the energy demands of the cell, or results in production of high potential energy storage molecules such as glycogen.

1.4 Diabetes Mellitus

In the absence of proper insulin-controlled regulation, blood glucose levels become abnormally high, indicative of the disease diabetes mellitus. In the year 2000, the WHO reported 171 million cases of diabetes worldwide, with the incidence continuing to rise, particularly in developed countries [22].
The core symptoms of diabetes include frequent urination, increased fluid intake due to thirst, and increased appetite. If allowed to progress untreated, severe conditions can include diabetic coma, blindness, loss of limbs, renal failure, and death. Both hereditary and environmental factors significantly contribute to the progression of diabetes. In type I diabetes, the \u03B2-cells of the pancreas are destroyed by T-cell mediated autoimmune attack [23]. While individuals remain responsive to insulin, the severe reduction in \u03B2-cell numbers results in insufficient production of the hormone for the demands of the body. Conversely, type II diabetes stems from diminished insulin sensitivity leading to insulin resistance. Central obesity is a major risk factor for development of type II diabetes, and for this reason exercise is often prescribed as treatment and can restore insulin sensitivity. While environmental factors play a significant role in the development of diabetes, there are also major contributing genetic factors. This has been particularly well characterized in a third form of diabetes, MODY (maturity onset diabetes of the young). While not all contributing genes are known, several have been well established, most of which have been termed MODY factors. MODY genes typically have an autosomal dominant mode of inheritance, and their mutation disrupts insulin production. Depending on the gene mutation, MODY is categorized as MODY 1 through 8. The genes belonging to each category are shown below:

Table 1 - Summary of MODY Genes
MODY 1    Hnf4a
MODY 2    Gck
MODY 3    Hnf1a
MODY 4    Pdx1
MODY 5    Hnf1b
MODY 6    NeuroD1
MODY 7    Kruppel-like factor 11
MODY 8    Bile salt dependent lipase

Compared to type I and type II diabetes, the MODY forms are extremely rare. However, regardless of the type, no form of diabetes has a cure. While the disease can be well managed through careful monitoring of blood glucose and insulin administration, this only treats the disease and does not cure it.
The most likely curative option for individuals suffering from diabetes is islet transplantation. As such, a significant amount of research is focused on how to maximize islet transplant success and how to expand islet pools in vitro for transplant purposes.

1.5 Expanding Islet Pools and Islet Transplant

Islet transplantation accounts for only a small percentage of the total transplant procedures being performed in British Columbia. In 2008, out of 266 transplants in BC, only 15 were pancreatic islets [24]. Despite this being the only curative option for persons suffering from diabetes, there are two main reasons why so few transplants are being done: 1) graft survival in islet transplants is not long lasting, with only 33% of recipients reporting insulin independence after 2 years [25], and 2) there is a huge shortage of available tissue for transplant. This has motivated the majority of islet research to focus on how to improve islet graft survival, or how to increase islet survival and proliferation in culture and/or differentiate stem cells into insulin-producing cells suitable for transplant. Both types of research are of critical importance for islet transplantation to evolve into a true curative therapy for diabetes. The latter of these research focuses, expansion of islet pools and stem cell differentiation, is of vital importance because currently, islets from several deceased donors must be harvested to perform a single transplant. Moreover, once in culture, survival of islets is poor, and proliferation of the cells does not readily occur [26]. To date, there has been some success in overcoming this barrier, but further refinements are required [27]. Consequently, advances in expanding isolated islet populations are invaluable to provide more tissue for transplant. Similarly, stem cell research addresses this same problem through the generation of \u03B2-cells from early lineage precursors.
This could also address the problem of graft rejection, given that tissue could be differentiated directly from stem cells of the patient. While such endeavours have yielded insulin-producing cells [28], they are not true \u03B2-cells and are not yet suitable for transplant use [29]. Whether the goal is to enhance existing islet survival and proliferation, or to produce \u03B2-cells from a stem cell antecedent, it is clear that a better understanding of the molecular physiology of the \u03B2-cell is needed to augment these efforts. This is because a cell's properties are determined by the information carried in its genome, the selected expression of which serves to define distinct cell types and conditions [30]. This controlled expression of genomic information is regulated by transcription factors. Therefore, a comprehension of transcription factor binding and networks can aid in better understanding how a given cell type employs its genome to arrive at and maintain its final functions.

1.6 Transcription Factor Biology

Transcription factors are proteins that possess DNA binding domains, allowing them to bind directly to DNA and regulate transcription through activation and/or repression [31]. These factors are significant contributors to controlled expression of the genome, along with microRNAs [30]. In addition to the DNA binding domain, transcription factors can also contain trans-activating domains that serve as binding sites for other proteins acting as coregulators. This allows multiple transcription factors to associate and form complexes for highly controlled genomic regulation. Transcription factors are grouped into families based on the structure of their DNA binding domain. Pdx1, for example, is grouped in the homeodomain protein family.

1.7 Key Transcription Factors of the Endocrine Pancreas

Most transcription factors known to have essential roles in the pancreatic \u03B2-cell have been identified based on their developmental roles or their direct influence on insulin regulation. In pancreatic islets, critical transcription factors include but are not limited to: FoxA2 (Hnf3\u03B2), Hnf4\u03B1, Hnf1\u03B1, Hnf1\u03B2, Nkx2.2, Nkx6.1, NeuroD1, Ngn3, Pax4, Pax6, Isl1, Mafa, Pbx1, and Pdx1. The importance of FoxA2 rests primarily on its developmental role as an activator of Pdx1 [32], whose activity is often mediated by Pbx1 [33, 34]. Similar to FoxA2, Hnf1\u03B1 [35] and Hnf1\u03B2 [36] are also regulators of Pdx1, in addition to themselves being MODY genes. It has been suggested that these transcription factors, as well as Hnf4\u03B1, act cooperatively with Pdx1 in the adult \u03B2-cell to drive expression of essential \u03B2-cell specific genes. While the above Hnf family members serve to both regulate and act with Pdx1, the Nkx family members are suspected targets of Pdx1 that are also crucial transcription factors in \u03B2-cells. Knockout studies of Nkx2.2 reveal that while endocrine cells differentiate normally, the \u03B2-cells are unable to activate the insulin gene and expression of Nkx6.1 is also lost [38]. Nkx6.1 is specific to the \u03B2-cells of the pancreas and is essential for \u03B2-cell formation. Loss of Nkx6.1 expression results in pancreases showing normal development of islet cells with the exception of the mature \u03B2-cell, which is completely absent [39]. This observation has led to speculation that Nkx6.1 serves to repress genes that confer \u03B1-cell fate, thereby stabilizing the \u03B2-cell phenotype. Before the stabilization of endocrine cell type can be conferred, the overall endocrine fate must first be selected; this is accomplished through Ngn3. Forced expression of Ngn3 in pancreatic ductal cells has been shown to activate an endocrine program [40].
Similarly, transfection of endodermal ES cells with Ngn3 induces insulin gene transcription as well as expression of other endocrine type factors [41]. It is believed that this occurs as a result of Ngn3 activation of another key pancreas transcription factor, NeuroD1. NeuroD1 is a basic helix-loop-helix factor that binds to E-box elements of the \u03B2-cell insulin gene promoter in a complex with Pdx1 and Mafa, although it is also expressed in all other endocrine cell types of the pancreas [42]. Despite its ubiquitous islet expression, it does not appear to be necessary for endocrine differentiation, as NeuroD1 knockout mice successfully produce all islet cell types. However, upon islet formation, these same mice develop diabetes due to \u03B2-cell apoptosis and a reduction of islet cell numbers [43]. Several other transcription factors have been identified as critical to pancreas development and/or function, as their altered expression gives rise to definitive phenotypes. Isl1 was one of the first genes identified as having a role in pancreas development. The dorsal pancreatic bud fails to develop in Isl1 deficient mice, and in the ventral bud glucagon-expressing cells are absent [44]. Distinct phenotypes are also observed in Pax4 and Pax6 inactivated mice. In both cases, mice die shortly after birth, with related but opposite endocrine defects. In Pax4 knockout mice, both \u03B2-cells and \u03B4-cells are completely absent while \u03B1-cells persist. Conversely, Pax6 knockout mice show the opposite trend in endocrine cell type presence. When both Pax4 and Pax6 are inactivated, no pancreas endocrine cell types are observed [45]. The factors that regulate Isl1, Pax4, and Pax6 expression in the pancreas are not well understood. In almost every case, the abovementioned transcription factors have been identified as crucial to the pancreas as a result of clear presentation of phenotypes. In addition, a uniting thread of a relationship with the pancreatic master regulator Pdx1 is apparent.
Therefore, due to its overarching role, Pdx1 represents an ideal starting candidate to decipher the molecular physiology of the \u03B2-cell.

1.8 Pdx1 and the Endocrine Pancreas

In addition to the suspected relationship of Pdx1 to many other pancreas critical transcription factors, Pdx1 was also the first gene identified to be independently required for pancreas development in mice and humans [46, 47]. Its expression begins at E8.5 in the definitive endoderm, where it drives pancreatic fate, and more specifically \u03B2-cell differentiation [1]. Consequently, knockout of Pdx1 results in embryonic lethality, as pancreas formation does not progress past the initial budding stages. In the absence of Pdx1, the undifferentiated cells of the pancreas fail to expand after dorsal and ventral budding occurs. Therefore, the onset of Pdx1 expression at this stage is typically regarded as the beginning of pancreagenesis. Pdx1 protein distribution remains homogeneous until the secondary transition, when exocrine cells down-regulate Pdx1 while endocrine cells up-regulate Pdx1, resulting in a 100-fold difference in expression. In the mature islet, Pdx1 is restricted to \u03B2-cells and a small set of \u03B4-cells [48]. In addition to its essential role in development, Pdx1 also maintains vital importance in the adult. The most recognizable gene that Pdx1 regulates is insulin. For this reason, Pdx1 is a MODY factor. Binding of Pdx1 at the insulin promoter has been reported to occur at two distinct E-box elements upstream of the transcriptional start site [49]. Here, the protein is thought to form a transcriptional complex with NeuroD1 and Mafa [42]. While the insulin promoter possesses potential binding sites for a variety of transcription factors [50], the binding of Pdx1 to these elements is not only confirmed but indispensable, as the loss of even one of the elements results in insulin deficiency.
Recently, a short-range DNA looping model of Pdx1 regulation at the insulin gene has been proposed, in which distal enhancer regions are brought into close proximity to the transcriptional start site [42]. In this model, only a single true Pdx1 binding site exists, with the second binding site indirectly linked through NeuroD1. Insulin, though arguably the most important, is not the only critical \u03B2-cell gene regulated by Pdx1. Pdx1 has also been shown to activate Gck [51], Glut2 [52], IAPP [53], Mafa [54], and its own promoter [35]. From this, it seems that Pdx1 functions not only as a master regulator of pancreas development, but also as a master regulator of \u03B2-cell function in the adult. Mouse models confirm the importance of Pdx1 in mature islets. Since the Pdx1 knockout is embryonic lethal, our best insight into how the absence of Pdx1 affects the adult \u03B2-cell comes from conditional knockout studies. These mice show reduced insulin secretion as well as reduced expression of Glut2, thereby substantiating in vivo the importance of Pdx1 in adulthood [55]. In both the embryo and the adult, the transcriptional activity of Pdx1 is moderated, at least in part, by Pbx1. The formation of Pdx/Pbx heterodimers has been shown to occur in vitro, and has been hypothesized to play a role in refining Pdx1 activity in exocrine versus endocrine cell types. Developmentally, the importance of the Pdx/Pbx interaction has been demonstrated through generation of Pdx1 mice with a mutated Pbx1 interaction domain. In these mice, the quantity and organization of endocrine cells is severely impaired, suggesting a critical role for Pdx/Pbx complexes in expansion of precursor cell populations [33]. The significant nature of this interaction seems to continue into the mature \u03B2-cell, as mice heterozygous for Pdx1 and Pbx1 mutant alleles develop more severe diabetes and hypoinsulinemia than single mutants of either gene [34].
The importance of Pdx1 to both pancreas development and adult function is unquestionable. However, only a handful of Pdx1 target genes, though absolutely critical, are known. Consequently, on a genome-wide scale there is still very little known about how Pdx1 operates to maintain \u03B2-cell function.

1.9 Chromatin Immunoprecipitation and Platforms for Sequencing

The chromatin immunoprecipitation (ChIP) procedure is a valuable tool for identifying transcription factor binding at target sites. In isolated cells, transcription factors are cross-linked to DNA and the cell membranes are lysed to release the chromatin. Sonication pulses are used to shear the DNA into small fragments that are subsequently incubated with an antibody directed against the transcription factor of interest. To isolate antibody-bound DNA, protein G beads are added, which bind the antibody-transcription factor-DNA complex, allowing for isolation, elution, and retrieval of only those DNA fragments bound by the transcription factor of interest. A schematic of the ChIP procedure is displayed in Figure 1. Classically, ChIP DNA has been assessed through polymerase chain reaction (PCR) on a site-by-site basis, which requires prior suspicion of a site of interest to warrant testing for binding enrichment. However, advancements in array and sequencing technologies have made identifying large numbers of novel binding sites more feasible and increasingly cost effective.

Figure 1 - Chromatin Immunoprecipitation. The steps of the ChIP procedure are depicted, leading to the isolation of transcription factor bound DNA to be sequenced. Panel labels: Proteins are cross-linked to DNA by fixing cells with formaldehyde. Cells are lysed and DNA sonicated, resulting in 100-300 bp fragments. Incubate DNA with antibody directed against protein of interest. Isolate immunoprecipitated DNA with protein G beads. Reverse cross-link protein-DNA complex. Precipitate DNA and use for sequencing or PCR.

Over the last several years, there has been much variation in the precise technology employed to identify DNA fragments isolated by ChIP. Initial hybridization of ChIP DNA to promoter microarrays (ChIP-Chip) has proven extremely cost effective in identifying transcription factor binding regions [56, 57]. However, these studies are limited insofar as they bias their results solely to promoter regions, thereby failing to account for the majority of genomic sequence. An attempt to address this shortcoming was first made through the use of Sanger sequencing in ChIP-SACO [58], ChIP-SABE [59], ChIP-STAGE [60], and GMAT [61] studies. Nevertheless, these methods proved to be extremely cost limiting, and as such failed to present as reasonable options for identification of anything other than the most noteworthy of binding sites. More recently, ChIP-PET has made use of Roche 454 parallel pyrosequencing to identify binding sites of p53 [62], Oct4 [63], Nanog [63], and ER\u03B1 [64]. While these studies marked significant improvement over previous techniques, ChIP-PET still cannot reach the cost-effective sequencing depth necessary to scrutinize an entire genome. It has only been with the emergence of flow cell sequencing technologies that our ability to confidently and cost-effectively identify binding sites at a genome-wide level has truly emerged through the ChIP-Seq method [65]. The use of flow cell sequencing in ChIP-Seq allows for tens of millions of DNA fragments to be sequenced in a single run on parallel lanes. Currently, Roche 454 and Illumina represent the two most commonly used flow cell sequencers. For the purposes of ChIP-Seq, the Illumina device is superior to that offered by Roche 454 due to its ability to generate ten times the number of DNA sequences at approximately one tenth the cost. These flow cell technologies require as little as 10 ng of input DNA for sequencing. Comparatively, ChIP-Chip procedures require 4-5 \u00B5g of material [65].
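The platform comparison above reduces to two simple ratios. The following is a minimal sketch, not taken from the thesis: it assumes that the "one tenth the cost" figure refers to the cost of a run, so that the cost per sequenced fragment scales as run cost divided by the number of sequences; the function names are hypothetical and chosen only for illustration.

```python
def per_read_cost_ratio(read_ratio: float, run_cost_ratio: float) -> float:
    """Cost per sequenced fragment on platform A relative to platform B.

    read_ratio: sequences per run on A divided by sequences per run on B.
    run_cost_ratio: run cost on A divided by run cost on B (an assumption:
    the source does not say whether the cost is per run or per sequence).
    """
    if read_ratio <= 0 or run_cost_ratio <= 0:
        raise ValueError("ratios must be positive")
    return run_cost_ratio / read_ratio


def input_fold_reduction(chip_chip_ug: float, chip_seq_ng: float) -> float:
    """Fold difference in input DNA between ChIP-Chip (ug) and ChIP-Seq (ng)."""
    if chip_chip_ug <= 0 or chip_seq_ng <= 0:
        raise ValueError("amounts must be positive")
    return (chip_chip_ug * 1000.0) / chip_seq_ng  # convert ug to ng first


# Figures quoted in the text: ten times the sequences at one tenth the cost,
# and 10 ng of input DNA versus 4-5 ug for ChIP-Chip.
illumina_vs_454 = per_read_cost_ratio(read_ratio=10.0, run_cost_ratio=0.1)
fold_low = input_fold_reduction(chip_chip_ug=4.0, chip_seq_ng=10.0)
fold_high = input_fold_reduction(chip_chip_ug=5.0, chip_seq_ng=10.0)

print(f"Relative cost per sequence (Illumina / 454): {illumina_vs_454:.2f}")
print(f"Input DNA reduction: {fold_low:.0f}- to {fold_high:.0f}-fold")
```

Under these assumptions, each sequence costs roughly one hundredth as much on the Illumina platform, and ChIP-Seq needs about 400- to 500-fold less input DNA than ChIP-Chip.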
Moreover, because the technology is constantly advancing, the cost of flow cell sequencing continues to fall. The improving cost-effectiveness of ChIP-Seq is noteworthy because the main competing methodology, despite its aforementioned bias and limitations, continues to be ChIP-Chip owing to its low cost. In fact, a ChIP-Chip study of Pdx1 binding in the insulinoma NIT-1 cell line has been published previously [56]. However, because that study was cell line based and carries the limitations of ChIP-Chip relative to ChIP-Seq, our work sought to provide a far superior representation of genome-wide Pdx1 binding through ChIP-Seq in primary tissue, pancreatic islets.

Hypothesis, Aims, and Objectives

Curative options for diabetes, an increasingly prevalent worldwide disease characterized by an inability to regulate blood glucose levels, find their most substantial promise in islet transplantation. A major limitation to islet transplantation is the scarcity of tissue, a shortcoming that can be addressed if islet pools can be either expanded or derived from stem cell precursors. To manipulate these cells, a much clearer understanding of the molecular physiology of the β-cell is required. A cell's properties are defined by the selective expression of its genome, which is controlled largely by transcription factors, and in the β-cell the foremost of these is Pdx1. Therefore, to begin to develop an in-depth knowledge of the molecular workings of the β-cell, the purpose of this work is to characterize the genome-wide nature of Pdx1 binding in the pancreatic islet through the use of ChIP-Seq. I hypothesize that Pdx1 plays a major role in the β-cell transcriptional network, that a substantial percentage of its binding occurs at DNA regions distal to transcriptional start sites, and that much of its binding is facilitated by cooperative partners.

CHAPTER 2.
MATERIALS AND METHODS

2.1 Tissue Culture

The mouse insulinoma adherent cell line MIN6 was maintained in 10cm tissue culture dishes (BD Biosystems) at a minimum of 40% confluency and incubated at 37°C and 5.2% CO2 in high glucose Dulbecco's Modified Eagle's Medium (DMEM) (StemCell Technologies) containing 10% Fetal Bovine Serum (FBS) (Invitrogen) and 1% L-Glutamine (Invitrogen). Cells were passaged once per week at a confluency of 80-100% using Trypsin-EDTA (TE) (Invitrogen) and centrifugation at 1200rpm for 5 minutes.

2.2 Mouse Colony

C57BL/6J and ICR mice were maintained in the Animal Resource Centre at the BC Cancer Research Centre in Vancouver according to the guidelines of the Canadian Council on Animal Care and protocols approved by the Animal Care Committee of UBC.

2.3 Western Blotting

A single well of a 24-well plate (BD Biosystems) of adherent MIN6 cells was harvested using TE, centrifuged at 1200rpm for 5 minutes to pellet cells, washed with 1mL of ice-cold 1X Phosphate Buffered Saline (PBS) (StemCell Technologies), and centrifuged again to obtain a clean cell pellet. 100µL of Radio Immuno Precipitation Assay (RIPA) lysis buffer (75mM NaCl, 1mM ethylenediaminetetraacetic acid [EDTA], 50mM Tris-HCl pH 7.25, 0.5% Triton X-100, Protease Inhibitor (PI) @ 1/100) was added and the tube incubated on ice for a minimum of 10 minutes. The lysate was heated at 96°C for 5 minutes and placed on ice. A pre-cast polyacrylamide gel (Invitrogen) was loaded into the running dock and 3-(N-morpholino)propanesulfonic acid sodium dodecyl sulfate (MOPS-SDS) running buffer (Invitrogen) was added. Precision Plus Protein Ladder (BioRad) and MIN6 lysate were added to separate wells and the gel was run at 150V for 1 hour. A transfer membrane was submerged in methanol (Sigma Aldrich) for 30 seconds, removed, and submerged in NuPAGE Transfer Buffer (Invitrogen) containing 10% methanol.
The gel was removed from its casting tray and the transfer apparatus assembled using the soaked transfer membrane. Protein transfer to the membrane was carried out by running at 35V for 1 hour in NuPAGE Transfer Buffer. Following transfer, the membrane was removed and blocked with 5mL Tris Buffered Saline Tween 20 (TBST) containing 5% milk powder for 1 hour at 4°C. Blocking solution was removed and 5mL of new blocking solution containing primary Pdx1 antibody (Upstate Chemicon) at 1µL/1000µL was added to the membrane and incubated overnight at 4°C on a rocking platform. The next day, the membrane was washed 3X for 10 minutes with TBST, after which 5mL blocking solution containing secondary antibody at 1µL/10,000µL was added and the membrane incubated for 1 hour on a rocking platform at room temperature. Washes with TBST were done 3X for 10 minutes each, after which a 1:1 mix of Detection Reagent 1 and Detection Reagent 2 (Amersham) was added to the membrane, which was subsequently taken for exposure and film (Kodak Chemiluminescent BioMax Light) development in a dark room.

2.4 Islet Isolations

To isolate mouse pancreatic islets, C57BL/6J (Jackson labs) and ICR mice aged 6 to 8 weeks were sacrificed via CO2 asphyxiation and a midline incision was made to expose the inner abdominal and thoracic cavities. Liver lobes were folded upwards to reveal the gall bladder and common bile duct running to the duodenum. A clamp was placed at the major duodenal papilla, the point of connection between the common bile duct and the duodenum, preventing fluid flow to the intestine and limiting it exclusively to the pancreas. Using a 26-gauge needle, 3mL of chilled collagenase (Sigma Aldrich) at 1000 units/mL in 1X Hanks Balanced Salt Solution (HBSS) (Invitrogen) was injected through the common bile duct to perfuse the pancreas. The swollen pancreas was scraped away from the intestine and placed in a 50mL Falcon tube.
As multiple pancreases were collected, they were distributed such that each 50mL Falcon tube contained two pancreases, and an additional 6mL of collagenase solution was added to each tube. The tubes were immediately placed in a 37°C water bath for 15-20 minutes to facilitate tissue digestion. Next, a transfer pipette was used to mechanically disrupt the contents of each tube until the mixture became homogeneous. To stop digestive activity, 20mL of ice-cold 1X HBSS containing 0.25% Bovine Serum Albumin (BSA) (Roche) and 0.1M CaCl2 was added and the tubes placed on ice. The tubes were centrifuged for 1 minute at 1120rpm, the supernatant was poured off, the remaining pellet was washed with 20mL of HBSS, and the tubes were again centrifuged at 1120rpm for 1 minute. This wash was repeated at least three times, or until the supernatant appeared clear. Pellets were resuspended in 20mL HBSS and exocrine tissue was removed by filtering the solution through a pre-wetted 70µm nylon mesh filter (Fisher Scientific). The contents of the filter were washed with HBSS into a 10cm petri dish (BD Biosystems) and placed under a stereomicroscope, where islets were handpicked into a microcentrifuge tube using a 20µL pipette. Once a clean prep of islets was obtained, a single-cell suspension was created by adding 400µL of Enzyme free Cell Dissociation Buffer (Gibco) and incubating the tube at room temperature for 12-15 minutes. During this time, islets were gently pipetted up and down every 3 minutes to facilitate dissociation. After a single-cell suspension was acquired, cells were centrifuged at 1200rpm for 1 minute, the supernatant was removed, and the cell pellet was washed with 1mL 1X PBS and centrifuged again at 1200rpm. The resultant cell pellet was then ready to be used for subsequent experiments. Islet isolation images are shown in Figure 2.

Figure 2 - Islet Isolations. Panels A through F show images of pancreatic islet isolation.
In panel A, the major duodenal papilla is labelled, marking the site of clamp placement. The bile duct is labelled in panel B, marking the location of syringe insertion seen in panel C. The perfused pancreas is clearly seen in D and magnified in E. Following digestion, washes, and filtration, a clean preparation of islets is obtained through pipette picking (F).

2.5 Chromatin Immunoprecipitation

ChIP experiments were carried out in a manner similar to that published previously [66]. MIN6 cells or a single-cell suspension of islets were collected, washed in 1mL 1X PBS, and centrifuged at 1200rpm for 2 minutes. The cell pellet was resuspended in 1360µL of 1X PBS and 38µL of 37% formaldehyde (Fisher Scientific) was added to crosslink the cells, giving a final formaldehyde concentration of approximately 1%. This fixation was carried out for 10 minutes on a rotating platform at room temperature, after which 175µL of 1M glycine (Invitrogen) was added and the suspension rotated for another 5 minutes to stop the fixation. Cells were centrifuged at 4000rpm for 2 minutes, washed in 1mL 1X PBS, and again centrifuged at 4000rpm for 2 minutes. To lyse the cellular membrane, 500µL of cold ChIP cellular lysis buffer (10mM Tris-Cl pH8.0, 10mM NaCl, 3mM MgCl2, 0.5% NP-40, PI @ 1/100) was added to the cell pellet and the solution Dounce homogenized for 10 strokes. The resulting suspension was then incubated on ice for at least 5 minutes and centrifuged at 13,200rpm for 3 minutes. The nuclear membrane was lysed to release the chromatin by adding 100µL of cold ChIP nuclear lysis buffer (1% SDS, 5mM EDTA, 50mM Tris-Cl pH8.0, PI @ 1/100) to the pellet and resuspending it by passing it through a 26-gauge needle for 5 strokes. Shearing of the resulting chromatin was accomplished through sonication (S3000 Ultrasonic Cell Disruptor Processor, Fisher) of the solution as follows: 10 minutes total sonication time, 1 minute on followed by 30 seconds off, in an ice water bath at 50% output power.
Undissolved debris was pelleted and removed by centrifuging at 13,200rpm for 10 minutes and moving the supernatant to a new tube. 1/20th of this supernatant was removed to a new tube and ChIP nuclear lysis buffer added to a final volume of 200µL. To this, 8µL of 5M NaCl was added and the tube incubated at 65°C overnight to reverse crosslink the sample as an input control. To the 95µL of remaining supernatant, 42.5µL of ChIP nuclear lysis buffer and 7.5µL of ChIP spike buffer (10X concentrate of ChIP dilution buffer: 0.01% SDS, 1.1% Triton X-100, 167mM NaCl, 16.7mM Tris-Cl pH8.0, PI @ 1/100) were added, making the total volume 150µL. 20µL of Protein G agarose beads (Pierce) were then added to pre-clear the solution by mixing on a rotating platform at 4°C for 1 hour. Beads were spun down at 13,200rpm for 30 seconds and the supernatant transferred to siliconized tubes. 3µg of Pdx1 antibody (Upstate-Chemicon) was added to the supernatant. For each ChIP reaction, in a separate tube, 20µL of protein G beads was added to 1mL of ChIP dilution buffer supplemented with 1mg/mL BSA and 0.1mg/mL salmon sperm DNA (Invitrogen) to block the beads. Both the supernatant and the beads were incubated overnight at 4°C on a rotating platform. The next day, the beads were centrifuged at 13,200rpm for 30 seconds and the supernatant was removed. The antibody mixture was added to an aliquot of blocked beads and placed back on the rotating platform at 4°C for 3 hours.
Beads were centrifuged at 13,200rpm for 30 seconds, the supernatant was removed, and the beads were washed as follows: 5 minutes in low salt buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA, 20mM Tris-Cl pH8.0, 150mM NaCl), 5 minutes in high salt buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA, 20mM Tris-Cl pH8.0, 500mM NaCl), 5 minutes in LiCl buffer (0.25M LiCl, 1% NP-40, 1% Deoxycholate, 1mM EDTA, 10mM Tris-Cl pH8.0), and 2 washes for 5 minutes each in TE Buffer (10mM EDTA, 10mM Tris-Cl pH8.0). Following these washes, 150µL of elution buffer (1% SDS, 0.1M NaHCO3) was added to the beads, and the solution was transferred to a fresh tube and incubated at 50°C on a rotating platform for 1 hour. Beads were centrifuged at 13,200rpm for 30 seconds and the supernatant containing eluted chromatin was transferred to a new tube. An additional 50µL of elution buffer was added to the beads, which were again centrifuged, and the supernatant was removed and combined with the initial 150µL. To reverse crosslink the eluted chromatin in the ChIP sample, 8µL of 5M NaCl was added and the tube incubated overnight at 65°C on a rotating platform. The DNA from the input sample, reverse crosslinked the previous day, was extracted via phenol-chloroform extraction. Similarly, DNA from the ChIP sample was phenol-chloroform extracted the following day.

2.6 Phenol-Chloroform Extractions

To extract the DNA from input and ChIP samples, Buffer Saturated Phenol (Invitrogen) was combined with chloroform (Fisher Scientific) in a 1:1 ratio. An equivalent volume of this mixture was added to the sample to be extracted and the tube shaken vigorously to mix. After standing for 5 minutes to allow phase separation to begin, the sample was centrifuged at 13,200rpm for 10 minutes. The uppermost aqueous phase was removed and transferred to a new tube, where a 3X volume of ice-cold 100% ethanol was added and the tube left to stand for 30 minutes to precipitate the DNA.
Following precipitation, the tube was centrifuged at 13,200rpm for 10 minutes and the supernatant aspirated, leaving the invisible DNA pellet. This pellet was resuspended in 20µL DNase RNase free water (Invitrogen).

2.7 Illumina Sequencing of ChIP DNA and Peak Building

Chromatin of 100-300bp was selected by Baljit Kamoh at the Genome Sciences Centre (GSC) by running the sample on a 12% PAGE gel, excising all material found in that size range, and purifying it using a Spin-X filter column (Costar) and ethanol precipitation. Subsequently, the isolated DNA was sequenced using the Illumina Genome Analyzer [67] located at the GSC. Briefly, adapters were ligated to the size-selected fragments and used as primer sites for PCR amplification of the DNA. The resultant PCR products were affixed to a flow cell, where “bridge” amplification was employed to produce clonal clusters of identical DNA fragments. To sequence these fragments, a primer homologous to the ligated adapters was annealed and sequencing by synthesis was performed using reversibly terminated, fluorescently labelled nucleotides. Following each cycle of nucleotide addition, the flow cell image was captured using fluorescence microscopy. At the end of the sequencing run, the combined images were used to make base calls, providing sequence information for the affixed fragments. The Illumina sequencing method is displayed in Figure 3.

[Figure 3 panel labels: repeat denature, anneal, and synthesis to form clusters; sequence by synthesis, with addition of DNA polymerase, primer, and fluorescently labelled nucleotides; laser scanning of flow cell and image capture after each round of nucleotide addition.]
[Figure 3 panel labels: adapters ligated to DNA; PCR amplification; DNA denatured; ssDNA attached to flow cell; annealing of free DNA ends to complementary primers on flow cell.]

Figure 3 - Illumina Flow Cell Sequencing by Synthesis.

Peaks were constructed from sequenced DNA using the computational tool FindPeaks 3.1 [68]. The FindPeaks algorithm analyzes short-read sequencing experiments to identify areas of enrichment and produces a “wig” file that can be uploaded to the UCSC genome browser website. Sequence reads are aligned to the genome, and regions of protein-DNA interaction show an enriched concentration of reads compared to an islet input control background model. Sites of enrichment between the protein of interest and the genomic DNA are defined as peaks. A representation of the peak building process is depicted in Figure 4.

Figure 4 - Constructing Peaks from ChIP-Seq Data. Sequenced reads from ChIP DNA are aligned to the genome. Read density is compared against a control background sample to determine areas of read density enrichment. Where read density is greater than the background control employed, a “peak” is defined. Panel labels: reads from antibody ChIP sample; reads from control sample; aligned read density for ChIP sample; aligned read density for control sample; final “peak” determined from aligned reads enriched above control sample.

2.8 qPCR

Reactions were set up with the following components: 4µL SYBRFast (Applied Biosystems), 0.5µL ChIP DNA, 1µL primer mix at 10µM of both forward and reverse, and 4.5µL dH2O. Reaction plates (Applied Biosystems) were run on a 7500 Fast Real Time PCR System (Applied Biosystems) with cycle conditions of 95°C for 20 seconds, followed by 40 cycles of 95°C for 3 seconds and 60°C for 30 seconds.
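The fold enrichments reported later for ChIP-qPCR can be related to raw cycle threshold (Ct) values with a short calculation. The text does not state which formula was used, so the sketch below assumes the common convention that each PCR cycle doubles the product, giving a fold enrichment of 2 raised to the Ct difference between the control (IgG or input) and the ChIP sample. The function names and all Ct values are hypothetical and are not taken from this work.

```python
def mean(values):
    # Arithmetic mean of replicate Ct values.
    return sum(values) / len(values)


def fold_enrichment(ct_chip, ct_control):
    # Assumed convention (not stated in the thesis): one cycle = one doubling,
    # so a target crossing threshold n cycles earlier in the ChIP sample than
    # in the control is taken to be 2**n times more abundant.
    return 2.0 ** (ct_control - ct_chip)


# Hypothetical quadruplicate Ct values for one target.
chip_cts = [24.0, 24.2, 23.8, 24.0]
control_cts = [29.0, 29.1, 28.9, 29.0]

enrichment = fold_enrichment(mean(chip_cts), mean(control_cts))
print(round(enrichment, 1))
```

Under this assumption, a target with identical Ct values in the ChIP and control samples has a fold enrichment of 1, which is the baseline against which positive targets are judged.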
2.9 Islet siRNA Transfection

Islets were extracted from C57BL/6J mice and a single-cell suspension was created to plate islet cells in 24-well plates at an average density of 100,000 cells per well. Cells were cultured overnight at 37°C, 5.2% CO2 in Roswell Park Memorial Institute (RPMI) (StemCell Technologies) media containing 10% FBS and 1% L-Glutamine. Pdx1 and control siRNAs (Dharmacon) were prepared as 2µM solutions in 1X siRNA Buffer (Dharmacon). For each well, 20µL of targeted siRNA was combined with 5µL siGLO indicator (Dharmacon) and 25µL OPTI-MEM serum free media (Invitrogen). In a separate tube, 2µL of DharmaFECT4 transfection reagent was combined with 48µL OPTI-MEM, and the tubes were incubated at room temperature for 5 minutes. The contents of both tubes were combined, incubated at room temperature for an additional 20 minutes, and added to each well along with fresh RPMI media. Transfected cells were cultured for 48 hours and harvested for Fluorescence Activated Cell Sorting (FACS).

2.10 FACS

Islet cells were harvested into PBS and dead cells stained with 7-aminoactinomycin D (7AAD) at 1/100. Sorting was performed on the BD FACS Vantage SE DiVa in the Terry Fox Lab Flow Cytometry Unit at the BCCRC. Cells were gated to remove 7AAD positives and doublets, while cells positive for siGLO were sorted directly into Trizol (Invitrogen).

2.11 RNA Isolation and RT

Cells from FACS were placed into Trizol, a 1/5 volume of chloroform was added, and the tube was shaken vigorously. Following a 2-minute incubation at room temperature, samples were centrifuged at 13,200rpm for 10 minutes, supernatants were removed, and RNA was extracted according to the manufacturer's protocol using an RNEasy Kit (Qiagen). RNA pellets were suspended in 20µL DNase RNase free water and a small portion was used for subsequent reverse transcription (RT), with the remainder being used for Tag-Seq-lite library construction (section 2.12). RT was performed as follows.
1µL of islet RNA was added to 1µL 10X DNase I reaction buffer (Invitrogen), 1µL Amp grade DNase I @ 1U/µL (Invitrogen), and DNase RNase free water to a final volume of 10µL. Tubes were incubated for 15 minutes at room temperature and DNase I inactivated by addition of 1µL of 25mM ethylenediaminetetraacetic acid (EDTA) (Invitrogen). Following a 10 minute incubation at 65°C, 250ng of random primers (Invitrogen) and 1µL of 10mM dNTP mix (Invitrogen) were added and the mixture heated for an additional 5 minutes at 65°C. After letting the tube sit on ice for 1 minute, the following were added: 4µL 5X First Strand Buffer (Invitrogen), 1µL 0.1M dithiothreitol (DTT) (Invitrogen), 1µL RNaseOUT Recombinant RNase Inhibitor (Invitrogen), and 1µL SuperScript III RT @ 200U/µL (Invitrogen). Contents were pipetted up and down and incubated at 25°C for 5 minutes. The incubation temperature was increased to 50°C for an additional 60 minutes, after which the reaction was inactivated by increasing the temperature to 70°C for another 15 minutes. The resultant cDNA was subsequently used for qPCR analysis as outlined in 2.8.

2.12 Tag-Seq-lite

Tag-Seq-lite library construction was performed by the Genome Sciences Centre as described previously [69]. First strand cDNA was synthesized from 40ng of DNase I treated islet RNA (control or siPdx1 treated) with Superscript III Reverse Transcriptase (Invitrogen) and amplified by 20 cycles of PCR based on SMART (Switching Mechanism At the 5' end of RNA Transcripts) cDNA synthesis to generate full-length cDNA (Clontech). Subsequently, 500ng of cDNA was digested with the anchoring enzyme NlaIII and ligated to an Illumina specific adapter containing a recognition site for the type IIS tagging enzyme MmeI as well as sequencing and PCR primers.
After digestion with MmeI and SAP (Shrimp Alkaline Phosphatase) treatment to dephosphorylate the DNA, a second Illumina adapter containing a 2bp 3' overhang was ligated. The resultant “tags” flanked by adapters were amplified via PCR using Phusion polymerase with the following cycling conditions: 98°C for 30 seconds, followed by 13 cycles of 98°C for 10 seconds, 60°C for 30 seconds, and 72°C for 15 seconds, and then 72°C for 5 minutes. PCR products were purified by running samples on a 12% PAGE gel, excising the 85bp band, and purifying it using a Spin-X filter column and ethanol precipitation. Quality and DNA amount were determined using an Agilent DNA 1000 series II assay (Agilent), and the DNA was then diluted to 10nM. DNA was sequenced using the Illumina Genome Analyzer and 17bp Serial Analysis of Gene Expression (SAGE) tags were extracted from the resulting reads. The process of Tag-Seq-lite is depicted in Figure 5.

[Figure 5 panel labels: poly A+ RNA with SMART II A oligo and 3' CDS primer; first strand synthesis by RT; dC tailing by RT; template switching and extension by RT; cDNA amplified by LD PCR; digestion with anchoring enzyme NlaIII; ligation of adapter A containing MmeI site and primer sequences; digestion with MmeI; ligation of adapter B containing PCR primer sequence.]

Figure 5 - Tag-Seq Lite Library Construction. The process of Tag-Seq is depicted.
Following the final step, 13-17 cycles of PCR are performed to amplify the DNA, which is then purified on a PAGE gel and sequenced via Illumina.

2.13 Tag-Seq Library Bioinformatics

Tags were mapped to RefSeq genes using DiscoverySpace 4.0 [70]. Tags with a count greater than 5 were included in the analysis. To account for multiple tags mapping to the same RefSeq accession, the counts for all tags mapping to the same RefSeq were combined, providing a RefSeq and an associated count. Counts were normalized based on library size and expressed as counts per million. Normalized gene counts were subsequently compared between the control and Pdx1 siRNA libraries to determine which genes were significantly down- and up-regulated in the Pdx1 siRNA library compared to the control.

2.14 Seeded Motif Discovery

Seeded motif discovery was performed by Gordon Robertson and Leping Li. GADEM [71] (A Genetic Algorithm Guided Formation of Spaced Dyads Coupled with an EM Algorithm for Motif Discovery) addresses large sequence sets and identifies highly prevalent motifs based on a user-specified threshold. A modified version of GADEM was used that employed an initial “seed” position weight matrix (PWM) provided by the user. A motif is deemed significantly present if its E-value, which is derived from its p-value and the number of all possible motif-length segments in the search space, falls below the user-specified threshold. Pdx1 and Pbx1 binding motifs were identified from ChIP-Seq data using the seed PWMs IPF1_Q4_01 (TRANSFAC M01013) for Pdx1 and PBX1_02 (TRANSFAC M00124) for Pbx1. The threshold was established by setting the p-value limit to 5e-4, and the GADEM run provided Pdx1-like and Pbx1-like motifs.

CHAPTER 3. RESULTS

3.1 Pdx1 ChIP-Seq Library Construction

3.1.1 Identification of a ChIP Quality Pdx1 Antibody and Pdx1 Targets

The first step in a successful ChIP experiment is to identify a suitable antibody.
Therefore, to identify the best Pdx1 antibody candidate for use in ChIP, antibodies directed against Pdx1 were purchased from Developmental Studies Hybridoma Bank, Chemicon (Upstate), and Santa Cruz. The initial comparative ChIP trials were performed using MIN6 cells, and enrichment at several Pdx1 targets identified by ChIP-Chip [56] was tested using qPCR. Pdx1 ChIPs were performed using a fully confluent 10cm plate of cells, and enrichments were established by comparison against a control IgG ChIP. Figure 6a shows that while all antibodies produced enrichment of Pdx1 targets, in every case the best performance was observed using the Chemicon antibody, followed by Santa Cruz and Developmental Studies. The most highly enriched target, Epb4.1l3, was selected as a positive control to assess the success of future ChIPs performed in islets. To confirm the specificity of the Chemicon Pdx1 antibody, a Western Blot was performed. Figure 6b shows a single clear band at the expected size of roughly 35kDa, corresponding to Pdx1 protein. Therefore, based on these results, the Chemicon Pdx1 antibody was selected for use in all future experiments.

Figure 6 - Identification of a ChIP Quality Pdx1 Antibody. (A) ChIP-qPCR fold enrichments compared to IgG of Pdx1 targets following ChIP with the Pdx1 antibodies from Developmental Studies, Santa Cruz, and Chemicon. The high levels of enrichment of targets over IgG controls indicate that all ChIPs were successful. For all targets, the best ChIP enrichments were obtained using the Pdx1 antibody from Chemicon. Maximal enrichment at the Epb4.1l3 target identifies it as a good positive control for future ChIPs. (B) Western Blot for Pdx1 performed with the Chemicon Pdx1 antibody using MIN6 cell lysate. The presence of a strong and clean band at the expected size of ~35kDa shows the high specificity of the Chemicon antibody.
[Figure 6 panel labels: (A) bar chart with axis ticks from 0 to 500, x-axis Target, series Pdx1 Dev Studies, Pdx1 Santa Cruz, and Pdx1 Chemicon; (B) Pdx1 band at 35kDa.]

3.1.2 Collection of Islet Pdx1 ChIP DNA

To obtain primary tissue for Pdx1 ChIPs, seven islet isolations were performed by Galina Soukhatcheva at the Verchere lab at the Child and Family Research Institute, from a total of fifty-six C57BL/6J mice, over the course of three months. Each islet isolation yielded a minimum of one thousand islets, which were immediately taken as far as the stop fixation step of the ChIP protocol. To maintain procedural consistency with other ChIP libraries, the fixed cells were delivered to the Genome Sciences Centre (GSC), where ChIPs were performed in the GSC gene expression pipeline by Baljit Kamoh. Islets from the first three isolations were used to optimize the ChIP protocol for the creation of a single-cell suspension as well as ideal chromatin shearing conditions. Following this optimization, islets from the remaining four isolations produced chromatin in the correct size range of 100-300bp and showed enrichment at the Epb4.1l3 positive target for Pdx1. Figure 7a shows the Agilent size range profiles of the sheared DNA from each of the four replicates, and Figure 7b displays qPCR enrichment for the abovementioned Epb4.1l3 target. Taken together, these results indicate that islet chromatin was sheared sufficiently and that the Pdx1 ChIPs were successful. Subsequently, the chromatin from these four successful ChIPs was pooled and used for Illumina sequencing and library creation.

Figure 7 - Validating the Islet Pdx1 ChIP DNA. (A) Agilent traces displaying the size range profiles of the sonicated islet DNA going into each ChIP are shown. The 100-300bp size ranges are labeled between the black bars, confirming the presence of chromatin in the desired size range. (B) ChIP-qPCR fold enrichments of the positive Epb4.1l3 target are shown for each of the four successful ChIPs performed in islets.
ChIP 1 enrichment was calculated against an IgG control. ChIPs 2 to 4 were compared against an input DNA control to maximize the DNA amount going into the Pdx1 ChIPs. [Figure 7 panel labels: size profiles for ChIP 1 to ChIP 4 with x-axis in bp; bar chart of Fold Enrichment for ChIP 1 (/IgG), ChIP 2 (/input), ChIP 3 (/input), and ChIP 4 (/input).]

3.2 Pdx1 ChIP-Seq Library Results and Validation

3.2.1 Statistics and Visualization of Pdx1 ChIP-Seq Peaks

In total, 7 lanes of Pdx1 ChIP material were sequenced using Illumina flow cell technology at the GSC, resulting in 62.1 million reads and a mapping efficiency of 24% to the mm8 mouse genome assembly. Following peak building using FindPeaks 3.1 and the establishment of a peak height threshold of 11, the number of Pdx1 peaks was 13,448. To visually assess the data, the generated “wig” file was loaded into the UCSC genome browser to scan for peaks at known Pdx1 binding sites. Figure 8 shows the UCSC screenshots of several previously identified Pdx1 binding sites: Ins1, Ins2, Pdx1, Gck, IAPP, and Glut2. It illustrates that Pdx1 ChIP-Seq peaks are located at most expected sites. Additionally, scanning of the ChIP-Seq data in UCSC revealed Pdx1 sites at suspected, but previously unidentified, target genes including Isl1, Nkx2.2, Nkx6.1, and Pax6 (Appendix). Peak to gene associations were performed in the Galaxy Genome Browser (http://main.g2.bx.psu.edu/) by mapping peaks to the closest RefSeq transcriptional start site, either upstream or downstream. A peak was defined as being associated with a gene if the closest transcriptional start site was within 50kb of the peak. Using this method, 5560 genes possessed a Pdx1 peak.
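The peak-to-gene association rule described above (closest transcriptional start site, accepted only within 50kb) can be illustrated with a minimal sketch. This is not the Galaxy workflow used in this work; the function name, the data layout, and the toy coordinates and gene names are hypothetical, and a peak is reduced to a single position for simplicity.

```python
import bisect

MAX_DISTANCE = 50_000  # 50kb cutoff stated in the text


def associate_peaks(peaks, tss_list, max_distance=MAX_DISTANCE):
    # peaks: list of (chrom, position)
    # tss_list: list of (chrom, position, gene)
    # Returns (chrom, peak_position, gene, tss_offset) per peak; gene and
    # offset are None when no TSS lies within max_distance.
    by_chrom = {}
    for chrom, pos, gene in tss_list:
        by_chrom.setdefault(chrom, []).append((pos, gene))
    for sites in by_chrom.values():
        sites.sort()

    result = []
    for chrom, pos in peaks:
        sites = by_chrom.get(chrom, [])
        # Index of the first TSS at or after the peak; the closest TSS is
        # either that one or its left-hand neighbour.
        i = bisect.bisect_left(sites, (pos,))
        candidates = sites[max(i - 1, 0):i + 1]
        best = min(candidates, key=lambda s: abs(s[0] - pos), default=None)
        if best is not None and abs(best[0] - pos) <= max_distance:
            result.append((chrom, pos, best[1], best[0] - pos))
        else:
            result.append((chrom, pos, None, None))
    return result


# Toy annotation and peaks (hypothetical values).
tss = [('chr1', 1000, 'GeneA'), ('chr1', 200000, 'GeneB'), ('chr2', 5000, 'GeneC')]
peaks = [('chr1', 30000), ('chr1', 120000), ('chr2', 4000), ('chr3', 10)]

associations = associate_peaks(peaks, tss)
genes_with_peak = {gene for _, _, gene, _ in associations if gene is not None}
print(len(genes_with_peak))
```

Counting the distinct genes that receive at least one peak, as in the last two lines, corresponds to the kind of total reported above for genes possessing a Pdx1 peak.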
[Figure 8 track labels: Known Site(s) and PDX1 ChIP peak height tracks at INS1, INS2, PDX1, GCK, IAPP, and GLUT2.]

Figure 8 - UCSC Screenshots of Pdx1 ChIP-Seq at Known Sites. The previously identified Pdx1 binding sites at the Ins1, Ins2, Pdx1, Gck, IAPP, and Glut2 genes are shown. The known binding site(s) are labeled as black vertical dashes; ChIP-Seq peaks are shown in blue. ChIP-Seq peaks are present at all known sites with the exception of Glut2.

Because previous ChIP-Seq data had revealed an abundance of transcription factor binding sites located distally from transcriptional start sites, I examined the distribution of Pdx1 peaks to determine whether a similar trend was present in this data. The fraction of Pdx1 peaks was plotted against distance to the closest transcriptional start site (TSS) (Figure 9a). Compared to a random distribution of sites, Pdx1 peaks were highly centred at transcriptional start sites. Using the Galaxy Genome Browser, peaks were overlapped with various genomic regions: promoters, enhancers, exons, introns, and regions >10kb from the TSS. This yielded a distribution of Pdx1 peaks as follows: 11% promoter (0-1kb upstream of the TSS), 8% enhancer (1-10kb upstream of the TSS), 4% exons, 27% introns, and 49% >10kb (Figure 9b). This type of distribution is consistent with previous ChIP-Seq studies [72], and illustrates that an abundance of sites were overlooked in ChIP-Chip experiments due to their bias towards promoter and enhancer regions only.
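A distance-to-TSS profile of the kind plotted in Figure 9a can be sketched as a simple binned histogram of signed peak-to-TSS distances. The bin width, the plotting window, the function name, and the example distances below are assumptions chosen for illustration; the text does not state the binning that was actually used.

```python
from collections import Counter


def distance_histogram(distances, bin_width=1000, max_distance=50_000):
    # distances: signed peak-to-TSS distances in bp (negative = upstream).
    # Returns {bin_start: fraction of retained peaks}, ordered by bin start.
    kept = [d for d in distances if abs(d) <= max_distance]
    # Floor division keeps negative distances in the correct bin,
    # for example -500 falls in the bin starting at -1000.
    counts = Counter((d // bin_width) * bin_width for d in kept)
    total = len(kept)
    return {start: n / total for start, n in sorted(counts.items())}


# Hypothetical signed distances for six peaks.
example = [-500, 200, 300, 1500, -12000, 80000]
histogram = distance_histogram(example)
for start, fraction in histogram.items():
    print(start, round(fraction, 2))
```

A profile that is centred on the TSS shows its largest fractions in the bins adjacent to zero, whereas a random distribution of sites would be expected to look comparatively flat across the window.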
Figure 9 - Distribution of Pdx1 ChIP-Seq Peaks. (A) Histogram of the fraction of sites occurring relative to the position of the TSS. Pdx1 ChIP-Seq peaks are centred around the TSS. (B) Distribution of peaks into gene regions. Peaks are found in each region as follows: >10kb away - 49%, promoters - 11%, enhancers - 8%, exonic - 4%, and intronic - 27%.

3.2.2 Validation of the Pdx1 ChIP-Seq Library

To validate the Pdx1 ChIP-Seq data, Pdx1 peaks were compared against known Pdx1 binding sites and previously published genome-wide binding data from ChIP-Chip studies performed in NIT-1 insulinoma cells⁵⁶. Figure 10 shows this comparison. A 35% overlap between Pdx1 ChIP-Seq data and ChIP-Chip data was observed, and 75% of known Pdx1 binding sites are accounted for in the ChIP-Seq dataset, while the previous ChIP-ChIP data fails to identify these well-established binding sites. A table detailing and referencing the known sites is also displayed in Figure 10. Additional validation was carried out using ChIP-qPCR to assess enrichment of 35 peaks identified from the ChIP-Seq data, as well as four negative targets. For these, four replicate Pdx1 ChIPs, as well as control IgG ChIPs, were performed on islets that I isolated from ICR mice. Islets were isolated from ten mice, yielding at least one thousand islets for each of four replicate ChIPs, and enrichment of the positive Epb4.1l3 target was confirmed. The four ChIPs were pooled and qPCR reactions were set up in quadruplicate to determine the enrichments shown in Figure 11a. All tested ChIP-Seq target sites were enriched over the negative controls. Importantly, a positive correlation was observed between ChIP-Seq peak height and ChIP-qPCR fold enrichment (Figure 11b).
[Figure 10 Venn diagram: Pdx1 ChIP-Seq in islets (13,448) versus Keller et al. Pdx1 ChIP-ChIP in NIT-1 cells (817); the diagram carries a 65.0% label.]

Literature Pdx1 Binding Sites

Gene        Reference             ChIP Peak
Ins1 A1     German et al. 1995    Yes
Ins1 A3/4   German et al. 1995    Yes
Ins2 A1     German et al. 1995    Yes
Ins2 A3/4   German et al. 1995    Yes
Pdx1        Gerrish et al. 2001   Yes
Gck         Shelton et al. 1992   Yes
Glut2       Waeber et al. 1996    No*
IAPP A1     Carty et al. 1997     Yes
IAPP A2     Carty et al. 1997     Yes
Sst TSE1    Leonard et al. 1993   No**
Sst TSE2    Leonard et al. 1993   No**
Mafa        Raum et al. 2006      Yes

* We do identify Pdx1 binding peaks at Glut2, but not at the exact location outlined by Waeber.
** The absence of Pdx1 binding peaks at Sst in our data is not unexpected, as Sst is expressed in delta cells, which compose a minimal % of islet mass.

Figure 10 - Comparison of ChIP-Seq Data with ChIP-ChIP and Known Binding Sites. Islet Pdx1 ChIP-Seq data was compared against Pdx1 ChIP-ChIP data as well as a list of known binding sites. The Venn diagram shows that 35% of the binding regions identified in the ChIP-ChIP study are also accounted for in the ChIP-Seq data. It also shows that the total number of sites identified was much greater in the ChIP-Seq study (13,448) than in the ChIP-ChIP study (817). Of the known Pdx1 binding sites identified from a literature survey, 75% have ChIP-Seq peaks at the exact location. Conversely, the ChIP-ChIP data fails to directly identify any of these sites. A summary of the known sites is provided in the expanded table, as well as explanations for the absence of Pdx1 ChIP-Seq peaks at those sites not identified.

Figure 11 - ChIP-Seq Peaks are Validated Via ChIP-qPCR. (A) Pdx1 ChIP-qPCR results for Pdx1 targets identified in ChIP-Seq data. 35/35 peaks were validated compared to 0/4 negative control targets using isolated ICR islets. qPCR reactions were performed in quadruplicate on pooled material from 4 independent ChIPs.
(B) The line of best fit on the scatter plot shows the positive correlation observed between ChIP-Seq peak height and ChIP-qPCR fold enrichment (y = 2.708x - 37.81, R = 0.493).

3.2.3 Validation Through siPdx1 Tag-Seq Library Construction

To determine the genes most highly impacted by direct Pdx1 binding, gene expression libraries were created from islets treated with either control or Pdx1 siRNA. Following siRNA knockdown of Pdx1 or the cyclophilin control, islet cells were harvested and sorted via FACS. Cells positive for knockdown were labelled green due to the presence of siGLO and were collected directly into Trizol. FACS gating and results from the collection are shown in the appendix. A total of 30,000 cells were collected for each condition. This corresponded to a transfection efficiency of approximately 10% for both collections. RNA was extracted from the sorted cells and a portion used for RT-PCR to confirm Pdx1 knockdown (Figure 12a), while the remainder was sent to the GSC for Tag-Seq expression library construction as outlined in section 2.12. Tag mapping and comparison of the cyclophilin and Pdx1 siRNA libraries were performed using Discovery Space⁷⁰, and up and down regulated genes were determined. In the siPdx1 library, 655 genes were up regulated and 488 were down regulated compared to the control. Genes known to be positively regulated by Pdx1, such as Ins1, Ins2, Pdx1, Glut2, IAPP, and Gck, displayed reduced expression in the knockdown library, corroborating that Pdx1 knockdown was having a quantifiable effect on its targets (Figure 12b). Hence, gene lists were compared against the Pdx1 ChIP-Seq data to determine what portion possessed a Pdx1 peak.
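The significance of this gene list comparison is assessed with Fisher's exact test (Figure 12). As a minimal sketch of what that test computes for a 2x2 table of gene set membership against peak possession, the following Python function evaluates the one-sided hypergeometric tail using only the standard library. Whether the analysis in this work was one-sided or two-sided is not stated, so the one-sided form is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: gene set of interest / comparison set.
    Columns: genes with a peak / genes without a peak.
    Returns P(X >= a) under the hypergeometric null, where X is the
    top-left count with all row and column totals held fixed.
    """
    row1 = a + b          # size of the gene set of interest
    col1 = a + c          # total genes with a peak
    total = a + b + c + d
    upper = min(row1, col1)
    denom = comb(total, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Small worked table: P(X >= 3) = (16 + 1) / 70
p_small = fisher_exact_greater(3, 1, 1, 3)
```

For the data described here, the first row would hold the 488 down regulated genes split into those with and without a peak, and the second row the unaltered genes. The number of unaltered genes is not given in the text, so no value is computed for it here.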
Figure 12c shows that 36% of unaltered genes, 39% of up regulated genes, and 45% of down regulated genes had a Pdx1 ChIP-Seq peak association.

[Figure 12 panels: (A) siCyclo versus siPdx1; (B) relative expression of Ins1, Ins2, Glut2, Pdx1, IAPP, and Gck in siCyclo and siPdx1; (C) genes with and without a Pdx1 peak, annotated with 4.16e-5.]

Figure 12 - Down Regulated siPdx1 Tag-Seq Genes Are Significantly Represented in ChIP-Seq Data and Include Expected Genes. (A) RT-PCR from RNA collected from FACSorted siCyclo and siPdx1 islets confirms that Pdx1 expression is decreased. (B) The relative expression levels of known Pdx1 positively regulated genes reveal the expected decreased expression in the siPdx1 Tag-Seq library. (C) Unaltered, down, and up regulated genes were compared against ChIP-Seq genes. Using Fisher's exact test, down regulated genes in the siPdx1 Tag-Seq library are significantly more likely to possess a Pdx1 ChIP-Seq peak.

3.2.4 KEGG Pathways of Pdx1 Genes

All genes that were associated with a Pdx1 ChIP-Seq peak were analyzed against all RefSeq genes using WebGestalt (http://bioinfo.vanderbilt.edu/webgestalt/index.php) to determine significantly represented KEGG pathways. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for analyzing gene functions in terms of gene networks and molecules. Significant KEGG pathways are shown in Table 2, several of which are expected and critical in the β-cell, including Insulin Signalling Pathway, MODY, and Type II Diabetes Mellitus. To elucidate the gene pathways on which Pdx1 had the most impact, KEGG pathways were determined for the down regulated genes of the siPdx1 Tag-Seq library that possessed a ChIP-Seq peak. The obtained pathways are shown in Table 3 and again include expected pathways such as MODY and Type II Diabetes Mellitus. Based on the validation studies performed on the Pdx1 ChIP-Seq data, it was clear that the library was a quality representation of Pdx1 binding in islets.
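The R value reported in Tables 2 and 3 is the ratio of observed to expected gene counts. The short Python sketch below recomputes it for three rows of Table 2. The `expected_count` helper shows the usual way an expected count is derived from a background gene set; that this is exactly how WebGestalt computed its expected values is an assumption, and both function names are hypothetical.

```python
def enrichment_ratio(observed, expected):
    """R value: ratio of observed to expected gene counts for a pathway."""
    return observed / expected


def expected_count(n_selected, n_pathway_in_background, n_background):
    """Expected pathway hits if the selected genes were a random draw
    from the background (assumed form; not confirmed by the text)."""
    return n_selected * n_pathway_in_background / n_background


# (pathway, observed, expected, reported R) copied from Table 2
table2_rows = [
    ("MAPK Signalling Pathway", 96, 72.9778, 1.3155),
    ("MODY", 14, 6.1989, 2.2585),
    ("Type II Diabetes Mellitus", 20, 12.3978, 1.6132),
]

for pathway, observed, expected, reported in table2_rows:
    r = enrichment_ratio(observed, expected)
    # Each recomputed ratio agrees with the reported R value to ~4 decimals.
    print(f"{pathway}: R = {r:.4f} (reported {reported})")
```

A ratio above 1 indicates that more genes from the pathway were found than would be expected from the background alone; the accompanying P value then indicates whether that excess is statistically significant.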
Table 2 - Significantly Over-Represented KEGG Pathways of all Genes with a Pdx1 ChIP-Seq Peak

KEGG Pathway                        Observed  Expected  R Value  P Value
Regulation of actin cytoskeleton    65        53.5359   1.2141   0.0395
MAPK Signalling Pathway             96        72.9778   1.3155   0.0011
Focal adhesion                      61        50.4364   1.2094   0.0483
Wnt signalling pathway              54        39.7292   1.3592   0.00577
Insulin Signalling Pathway          48        37.1933   1.2906   0.0246
MODY                                14        6.1989    2.2585   0.000558
VEGF Signalling Pathway             32        19.442    1.6459   0.000959
Apoptosis                           32        21.6961   1.4749   0.00786
Colorectal Cancer                   32        22.8232   1.4021   0.018
Pancreatic Cancer                   30        20.2873   1.4788   0.00945
Type II Diabetes Mellitus           20        12.3978   1.6132   0.0107
Cell Cycle                          38        29.8674   1.2723   0.0515

Table 3 - Significantly Over-Represented KEGG Pathways of siPdx1 Tag-Seq Down Regulated Genes that have a ChIP-Seq Peak

KEGG Pathway                            Observed  Expected  R Value  P Value
Regulation of actin cytoskeleton        13        1.8372    7.076    7.47e-7
Cytokine-cytokine receptor interaction  5         0.5846    8.5529   0.00131
MODY                                    4         0.501     7.984    0.00497
Focal adhesion                          7         1.9207    3.6445   0.00641
Colorectal cancer                       5         1.0021    4.9895   0.00744
Ca signalling pathway                   5         1.1691    4.2768   0.0123
mTOR signalling pathway                 3         0.334     8.982    0.0125
Dorso-ventral axis formation            3         0.4175    7.1856   0.0189
Pancreatic cancer                       4         0.9186    4.3545   0.0238
DRPLA                                   2         0.167     11.976   0.0320
Type II Diabetes Mellitus               3         0.5846    5.1317   0.0361
Glioma                                  3         0.6681    4.4903   0.0469

Observed - Number of genes found in dataset of interest
Expected - Number of genes expected to be found based on background
R Value - Ratio of observed to expected
P Value - Probability of result

3.3 Pdx1 ChIP-Seq Library Analysis

3.3.1 Pdx1 and Pbx1 Binding Motif Identification

Because transcription factors frequently bind DNA in complexes, I wanted to examine the sequences contained under Pdx1 ChIP-Seq peaks for nucleotide sequence binding motifs to deduce possible co-regulators acting with Pdx1.
Pdx1 ChIP-Seq peaks were scanned for binding motifs similar to the classic Transfac Pdx1 DNA binding motif as well as the Transfac Pbx1 DNA binding motif to see if peaks were enriched for these nucleotide sequences. Pbx1 was selected due to its well-documented embryonic co-regulatory role with Pdx1. To perform this analysis, Gordon Robertson and Leping Li used the GADEM motif discovery tool (outlined in section 2.14) on Pdx1 ChIP-Seq sequences based on 11bp Pdx1 and 15bp Pbx1 sequences from Transfac. This returned a Pdx1-like motif that occurred in roughly 45% of peaks (Figure 13a), and a Pbx1-like motif that occurred in roughly 43% of peaks (Figure 13b). Taken together, at least one of the identified motifs was present in 63.8% of peaks. Interestingly, the Pbx1-like motif appeared to be a heterodimer comprising core Pbx1 and Pdx1 binding sequences. Because the Pdx1 and Pbx1 Transfac motifs contained similar core base pair sequences, some sequences were identified by both independent motif discovery runs. To determine which were duplicates, a histogram of the distance between site types was created (Figure 13c). Sites separated by a distance of -6bp (Pbx1-like relative to Pdx1-like) were identified by both types of motif discovery runs. Consequently, Pbx1-like sites located at a distance of -6bp from a Pdx1-like site were removed from the Pbx1-like list.

Figure 13 - Seeded Motif Discovery of Pdx1 ChIP-Seq Data Returns Pdx1-like and Pbx1-like Motifs. (A) Using a Pdx1 seed, a Pdx1-like motif (monomer) is found in 45% of peaks. (B) Using a Pbx1 seed, a Pbx1-like motif (heterodimer) is found in 43% of peaks. Taken together, 64% of peaks contain at least one of the two site types.
(C) The relative distances of heterodimers from monomers show primary distributions at +3, -3, and -6 base pairs. Alignment of the motifs reveals that at -6bp, the same sequences were called sites by both seeded motif runs.

[Figure 13 panels: (A) PDX1-like monomer, NNTAATGNNNN; (B) PBX:PDX heterodimer, XXXNTGATTAATXXX; (C) histogram of heterodimer distance from monomer (bp).]

3.3.2 Validation and Analysis of Pbx1 Containing Peaks

In order to explore the relationship of Pdx1 and Pbx1, experimental confirmation of Pbx1 binding at sites identified via motif discovery was needed. The terms monomer and heterodimer were used to describe site type. Monomer sites were those identified from the Pdx1 based motif discovery run, and were indicative of Pdx1 binding alone to DNA. Heterodimer sites were those identified by Pbx1 based motif discovery, and were indicative of a Pdx1 and Pbx1 complex binding to DNA. To confirm Pbx1 binding at peaks containing a heterodimer site, ChIP-qPCR was performed in MIN6 cells with an antibody directed against Pbx1 (SantaCruz), and heterodimer as well as monomer targets were tested for enrichment. The results shown in Figure 14a reveal that Pbx1 is enriched at heterodimer containing peaks, while at monomer containing peaks it is not. To determine if Pbx1 is necessary for Pdx1 binding at heterodimer sites, I used siRNA to knock down Pbx1 expression in MIN6 cells, after which Pdx1 binding was tested by ChIP. As a control, siCyclo was transfected alongside siPbx1 and Pdx1 ChIPs were also performed. The results of this experiment, shown in Figure 14b, revealed that while Pdx1 binding at target sites was reduced through knockdown of Pbx1, the degree of binding reduction was not greater in peaks containing heterodimer sites than in peaks containing monomer sites. Taken with the Pbx1 ChIP result, this indicates that while Pbx1 is binding at heterodimer sites, its impact on Pdx1 binding is no greater at heterodimers than at monomers.
Figure 14 - Pbx1 has no greater effect on Pdx1 binding at heterodimer sites compared to monomer sites. (A) ChIP-qPCR was performed in MIN6 cells with Pbx1 antibody. Primers for heterodimer sites and monomer sites were tested, and enrichments confirm Pbx1 binding at heterodimers but its absence at monomers, with the exception of the Slc7a14 site. (B) ChIP-qPCR was performed using Pdx1 antibody in MIN6 cells subjected to Pbx1 knockdown or a control knockdown to determine if Pbx1 was necessary for Pdx1 binding at heterodimer sites.

[Figure 14 panels: (A) and (B) each show heterodimer sites and monomer sites; panel (B) compares Cyclo KD with Pbx1 KD.]

We next investigated whether the presence of a monomer or heterodimer affected gene expression and/or gene specificity of the nearest gene. To do so, genes were placed into the following categories:

Table 4 - Monomer and Heterodimer Gene Categories

Monomer     Gene has peak(s) with Pdx1-like motif only
Dimer       Gene has peak(s) with Pbx1-like motif only
Mono + Di   Gene has peaks with both motifs, but motifs never occur in same peak
Mono : Di   Gene has at least one peak where motifs co-occur

For expression analysis, an existing Tag-Seq library constructed using wild-type islets was used as the basis of gene expression. A gene was defined as expressed if its count in the Tag-Seq library was greater than five. The presence of monomer and heterodimer sites was not found to have a significant impact on whether a gene was expressed or not (Figure 15a). Moreover, the absolute expression level of those genes that were expressed was also not affected (Figure 15b).
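The category rules of Table 4 and the expression threshold can be written out as a small Python sketch. The representation of each peak as a set of motif labels, the label strings "mono" and "di", and the function names are hypothetical choices made for illustration; the "No Site" label for genes whose peaks carry neither motif follows the legend of Figure 15.

```python
def gene_category(peaks):
    """Place a gene into one of the Table 4 categories.

    peaks: list of sets, one per peak associated with the gene, holding the
    motif labels found in that peak: "mono" (Pdx1-like), "di" (Pbx1-like).
    """
    has_mono = any("mono" in peak for peak in peaks)
    has_di = any("di" in peak for peak in peaks)
    co_occur = any({"mono", "di"} <= peak for peak in peaks)

    if co_occur:
        return "Mono : Di"   # at least one peak where the motifs co-occur
    if has_mono and has_di:
        return "Mono + Di"   # both motifs, never in the same peak
    if has_mono:
        return "Monomer"
    if has_di:
        return "Dimer"
    return "No Site"


def is_expressed(tag_count, threshold=5):
    """A gene counts as expressed if its Tag-Seq count is greater than five."""
    return tag_count > threshold
```

For instance, a gene with one peak containing only the Pdx1-like motif and a second peak containing only the Pbx1-like motif falls into "Mono + Di", whereas a gene with a single peak containing both falls into "Mono : Di".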
An examination of the specificity of the genes in each category was also performed using SAGE libraries generated through the Mouse Atlas of Gene Expression project (www.mouseatlas.org). Genes were assigned a score quantifying their specificity to islets based on the following formula⁷³:

Specificity = Mr x3Log(Ac) 3Log(Lc)

Where Mr is the ratio of the counts of the tag in the library of interest (islet) over the mean of the counts of the tag in all other libraries, Ac is the absolute count of the tag in the library of interest, and Lc is the number of libraries the tag is found in. The relative specificity scores for the genes represented in each category are depicted in Figure 15c. Interestingly, when genes possess both a monomer and a dimer they are far more likely to be islet specific, likely because a far greater proportion of high specificity genes are found in these categories (Figure 15d), where high specificity is defined as a score greater than 2, moderate as a score between 0.2 and 2, and low as a score less than 0.2.

Figure 15 - Analysis of Heterodimer and Monomer Containing Peaks. Expression and specificity analysis was performed comparing genes with varying types of monomer and heterodimer site distributions. Site type was not seen to have an effect on whether or not a gene was expressed (A), or on its relative level of expression (B). However, genes that contained both a monomer and a heterodimer site were significantly more likely to be islet specific (C) due to a greater percentage of high specificity genes belonging to this group (D).

[Figure 15 panels: (A) expressed versus unexpressed genes; (B) to (D) compare the categories No Site, Monomer, Dimer, Mono+Di, and Mono:Di.]

CHAPTER 4.
DISCUSSION

The aim of this study was to identify the genome-wide binding of Pdx1 in pancreatic islets. This entire body of work depended on the first step of identifying a ChIP grade Pdx1 antibody. The reason for this was twofold: first, the ChIP-Seq procedure requires isolation of sufficient amounts of DNA for sequencing to be successful, and second, all isolated DNA is used for sequencing. An antibody used for ChIP-Seq must therefore bind its target protein with high affinity to provide sufficient DNA, and must also be highly specific for the target protein so that sequenced DNA is a reliable representation of regions bound by the transcription factor. These criteria were fulfilled by the Pdx1 antibody purchased from Chemicon (Figure 6), which was largely expected given that this antibody had also been used for ChIP-ChIP experiments⁵⁶.

Once constructed, the Pdx1 ChIP-Seq library was quality checked using several approaches. The most basic tactic, but also the most reliable, was to scan the generated data for peaks at binding sites that had been well documented in the literature. The UCSC visualizations in Figure 8, as well as the table shown in Figure 10, exemplify the reliability of the library based on this approach. Of the 12 known sites that were surveyed, only 3 did not show Pdx1 peaks in our data. However, 2 of these sites were for somatostatin, a gene expressed in the delta cells of the islet, which make up only 2-10% of islet mass. Therefore, the DNA contribution from these cells to the ChIPs would have been extremely small, and would not have provided enough input for the binding site to be enriched. The other site not identified was for the glucose transporter Glut2. However, the UCSC gene depiction of Glut2 shown in Figure 8 clearly reveals that several Pdx1 peaks are actually present for the gene.
This suggests that Pdx1 binding may actually be occurring elsewhere in the Glut2 gene region and perhaps not at the previously reported site⁵². Hence, all well-known Pdx1 binding sites were either present in the ChIP-Seq data or, if not, readily explainable. In addition to those binding sites shown in Figure 8, peak scanning in UCSC also revealed Pdx1 peaks at several distal enhancer regions of genes suspected, but not yet confirmed, to be regulated directly by Pdx1. Most notably, a novel binding site at the Ins1 gene was identified, as well as distal sites upstream of the Nkx2.2 and Nkx6.1 promoters. Binding sites at the Isl1 and Pax6 promoters were also observed, which is significant as the factors regulating these β-cell critical transcription factors remain largely unknown (appendix figure A1). These observations provided confidence that while the mapping efficiency of the Pdx1 library (24%) was lower than has been previously reported for ChIP-Seq libraries (FoxA2 in liver, 33%⁷²), the data is a valid representation of Pdx1 binding. Furthermore, the ChIP-qPCR (Figure 11) that was performed to validate ChIP-Seq peaks provides additional strong evidence that the binding sites are real. Nevertheless, improvements in the quality of our ChIP-Seq data would likely be possible if a greater amount of starting DNA was contributed to each ChIP replicate, because DNA input amount is a major limiting factor to ChIP success. With islets contributing so few cells in comparison to other studied tissues such as liver, cell number is the most significant limitation facing islet ChIP studies.

As an additional measure of quality, our ChIP-Seq data was compared directly against ChIP-ChIP data previously published for Pdx1⁵⁶. While the 35% overlap that we observe (Figure 10) is lower than previously reported comparisons of ChIP-Seq and ChIP-Chip⁷², there are several reasons why it is not unexpected.
The previous ChIP-Chip study had been performed in the NIT-1 cell line, whereas our data was generated using primary tissue. In addition, only putative promoter and enhancer elements were included in the ChIP-Chip study. This is a major caveat as it ignores large portions of the genome which, as evidenced in the analysis of peak distribution in gene regions (Figure 9), account for a significant portion of Pdx1 binding and suggest that enhancer elements may in fact be functioning further upstream of transcriptional start sites than previously thought. Moreover, a glaring concern exists with the ChIP-Chip data in that it identifies none of the well-known Pdx1 binding sites. This calls into question the reliability of this previous work, given that extremely significant targets such as Ins1, Ins2, IAPP, and Gck fail to be recognized. From all of this, it is clear that we have constructed a more comprehensive, accurate, and biologically relevant documentation of genome-wide Pdx1 binding than previously shown.

The generated Tag-Seq libraries of gene expression (control and Pdx1 knockdown) were meant to serve both as a validation tool for the ChIP library and as a first source of insight into those genes that are most highly influenced by altered Pdx1 expression. In the analysis of these libraries, multiple tag types that mapped to the same gene had to be combined. This was done because, although the libraries are cDNA based, multiple tag types can result from alternative transcripts and from errors in enzyme cutting during library construction. While the altered expression levels of known Pdx1 targets such as insulin and Glut2 demonstrate the changes one would expect from diminished Pdx1 expression (Figure 12), genes such as IAPP and Gck (also known Pdx1 targets) show more moderate changes in expression levels.
This coincides with what has been shown in Pdx1 conditional knockout studies, where the most notable changes include severe impairment of insulin production as well as diminished Glut2 expression⁵⁵. Therefore, significantly altered genes in the siPdx1 library are likely highly responsive to changes in Pdx1 expression. Based on this, one would expect a substantial portion of these genes to be represented in the Pdx1 ChIP-Seq library. Though this is observed for the down regulated gene set compared to unaltered genes, the same cannot be said for the up regulated genes. Moreover, though the difference from unaltered genes is statistically significant, only 45% of the down regulated genes have an associated ChIP-Seq peak. This is lower than desired, and suggests that several changes in islet cell gene expression could be based on the stresses of culture and FACSorting rather than knockdown of Pdx1, or that the changes stem from indirect effects of Pdx1 knockdown. Additionally, while substantial knockdown of Pdx1 mRNA was observed (roughly 70% reduction), the possibility exists that even low levels of protein are sufficient to maintain regulation at several of its targets, or that at 48 hours post siRNA transfection, original protein levels had not had sufficient time to drop. Consequently, a true knockout library of Pdx1 may be necessary to holistically address its effects.

Nevertheless, despite these caveats, the statistically significant difference between down regulated and unaltered genes in their association with Pdx1 ChIP-Seq peaks does allow for several insights. The most obvious of these is that Pdx1 is functioning most often as an activator. Were it having significant repressive effects, one would have expected to see a greater portion of up regulated genes with ChIP-Seq peaks; this is not the case.
Moreover, to determine in which pathways Pdx1 was having the largest activational role, KEGG pathway analysis was performed on all ChIP-Seq genes, as well as on those that were down regulated and in possession of a Pdx1 ChIP-Seq peak. Tables 2 and 3 show that the significantly over-represented KEGG pathways include those where Pdx1 is expected to have major influence, such as MODY and Type II diabetes. These are present in both KEGG analyses, as one would anticipate given how strongly Pdx1 influences them. Of most interest in addition to these expected pathways, we observe that several pathways related to cell cycle are also represented, such as those pertaining to pancreatic or colorectal cancer, apoptosis, glioma, and cell cycle itself. While not all of these are present in both KEGG analyses, their varied presence across the two points to a highly probable involvement of Pdx1 in aspects of the β-cell cycle.

A relationship between Pdx1 and Pbx1 is known to exist in both the embryo and the adult³³. This association has a profound role in expansion and organization of the developing pancreas, while in the adult a more precise control of insulin regulation has been reported through the Pdx1/Pbx1 affiliation. Given the role of these heterodimers in proliferative function during development, coupled with the observed over-representation of KEGG pathways related to cell cycle among Pdx1 ChIP-Seq genes, the Pdx1/Pbx1 adult relationship may similarly drive cell cycle related processes. This suspicion arose as a result of the unexpectedly high percentage of Pdx1 ChIP-Seq peaks that, upon motif discovery analysis, showed the presence of a Pdx1/Pbx1 heterodimer binding site (Figure 13). To begin to address this, we first examined Pbx1 binding at these sites and the dependency of Pdx1 on such binding.
Since these heterodimers have been reported to bind DNA with up to ten times the affinity of Pdx1 alone³³, it was hypothesized that reduction of Pbx1 would significantly alter Pdx1 binding at heterodimer type sites, while singly bound Pdx1 sites would be relatively unaffected. Despite the confirmation of Pbx1 binding at heterodimer sites and not at monomer sites (Figure 14a), the dependency of Pdx1 on Pbx1 was not found to be any greater at heterodimer sites than at monomers (Figure 14b). This seems to contradict the notion that Pdx1/Pbx1 binds DNA with 10x affinity at target sites. In addition, expression and specificity analysis of genes with various distributions of site type did not reveal any changes in expression, while a positive correlation between specificity and possession of multiple sites was observed (Figure 15). The most likely explanation for these results is that, since heterodimer sites still contain the core TAAT motif required for Pdx1-DNA binding, Pdx1 is still capable of binding these regions without dimerization with Pbx1. However, this does not negate the possibility that Pdx1 binding at these sites may not be able to fully drive gene expression without Pbx1. The increased specificity of genes with multiple sites may be explained by the fact that a greater number of transcription factor binding events at a given gene within a given tissue is indicative of that gene being highly specific for that tissue. For example, insulin, a highly specific β-cell gene, would have more transcription factor binding events in β-cells than in any other tissue. Hence, specificity and the number of transcription factors binding are positively related.

The work surrounding Pdx1/Pbx1 performed in this study focused on confirming the presence of, and requirement for, Pbx1 at heterodimer sites. While the former was shown, this work revealed that Pbx1 is not essential for Pdx1 binding.
As a next step, expression analysis could be performed on heterodimer versus monomer regulated genes following Pbx1 knockdown (siRNA KD-qPCR) to determine whether, without Pbx1, Pdx1 cannot drive expression at heterodimer sites. If so, Pbx1, though not essential for Pdx1 binding, would be essential for Pdx1 activation of genes at heterodimer sites. Coming full circle, this bears significance in that the majority of the cell cycle related genes identified from our KEGG analyses possess Pdx1 ChIP-Seq peaks containing heterodimer as opposed to monomer sites. Genes with Pdx1 peaks known to have roles in β-cell proliferation include: Ccnd1, Ccnd2, p15, p21, E2F1, Men1, Rb, p13, Insulin, FoxO, NFAT, Stat5, and Pdx1⁷⁴. Of these 13 genes, 8 have heterodimer sites associated with them. Therefore, an analysis of the necessity for Pbx1 at heterodimer sites for Pdx1 mediated expression would be a logical direction in which to take this work in order to extract more insight into how Pdx1 may be functioning at cell cycle related genes.

As a genome-wide dataset, there is also much value in coupling this Pdx1 ChIP-Seq information with future genome-wide studies of both other transcription factors and markers of DNA methylation. Since a transcription factor complex involving NeuroD1, Mafa, and Pdx1 is already known to form at the insulin gene, ChIP-Seq studies of NeuroD1 and Mafa would be extremely beneficial to compile with this Pdx1 dataset to determine genome-wide sites of transcriptional complex formation in islets. Additionally, considering this binding information in the context of DNA methylation status would enable us to determine the functionality of sites. Furthermore, construction of an embryonic Pdx1 library at E8.5 (the onset of Pdx1 expression) would allow for comparison of Pdx1 gene regulation developmentally on a genome-wide scale.

CONCLUSION

The purpose of this study was to characterize Pdx1 binding in pancreatic islets on a genome-wide scale.
This was accomplished by utilizing the ChIP-Seq strategy to produce the most extensive dataset of Pdx1 binding generated to date. Novel binding sites at genes of high interest include Ins1, Nkx2.2, Nkx6.1 and Isl1. Additionally, a highly occurring relationship with Pbx1 is identified and the binding of Pbx1 confirmed at suspected sites. Given the reported role of Pdx/Pbx in cell expansion and organization in islet precursor populations, as well as the fact that Pdx1 ChIP-Seq and Tag-Seq genes show significant representation in cell cycle pathways, future investigation into the adult role of Pdx/Pbx in proliferation is warranted. The expansion of islet populations for use in transplant for diabetic patients will find substantial improvement as the molecular physiology of the β-cell continues to be exposed. This work begins to address this need and, coupled with future studies of a similar nature, will prove significant in unlocking the workings of β-cell function through the identification of genome-wide islet transcriptional complexes, transcriptional networks, and changes in transcription factor action in the embryo versus the adult.

REFERENCES

1) Jensen, Jan. 2004 Gene Regulatory Factors in Pancreatic Development. Developmental Dynamics. 229: 176-200.
2) Avisse, C, Flament, JB, and Delattre, JF. 2000 Ampulla of Vater. Anatomic, embryologic, and surgical aspects. Surg Clin North Am. 80: 201-212.
3) Fukuda, Akihisa et al. 2006 Loss of the Major Duodenal Papilla results in brown pigment biliary stone formation in Pdx1 Null mice. Gastroenterology. 130: 855-867.
4) Slack, JMW. 1995 Developmental biology of the pancreas. Development. 121: 1569-1580.
5) Suckale, Jakob and Solimena, Michele. 2008 Pancreas islets in metabolic signaling - focus on the B-cell. Nature Precedings. 2: 12 pgs.
6) Rahier, J, Goebbels, RM, and Henquin, JC. 1983 Cellular Composition of the Human Diabetic Pancreas. Diabetologia. (5)24: 366-371.
7) Adeghate, E and Donath, T. 1991 Morphometric and immunohistochemical study on the endocrine cells of pancreatic tissue transplants. Experimental and Clinical Endocrinology. 98: 193-199. 8) Stefan, Y et al. 1982 Quantitation of endocrine cell content in the pancreas of nondiabetic and diabetic humans. Diabetes. 31: 694-700. 9) Elayat, AA, el-Naggar, MM, and Tahir, M. 1995 An immunocytochemical and morphometric study of the rat pancreatic islets. Journal ofAnatomy. 186: 629-637 10) Adeghate, E. 1999 Distribution calcitonin-gene-related peptide, neuropeptide-Y, vasoactive intestinal polypeptide, cholecystokinin-8, substance P and islet peptides in the pancreas of normal and diabetic rat. Neuropeptides. 33: 227-235. ii) Wierup, N et al. 2002 The ghrelin cell: a novel developmentally regulated islet cell in the human pancreas. Regul Pept. 107: 63-69. 12) Henderson, JR, and Moss, MC. 1985 A Morphometric study of the endocrine and exocrine capillaries of the pancreas. Experimental Physiology. 70: 347-3 56. 13) Levick, IR, and Smaje, LH. 1987 An anlysis of the permeability of a fenestra. Microvascular research. (2)33: 233-256. 65 14) Bendayan, M. 1993 Pathway of Insulin in pancreatic tissue on its release by the B- cell. American Journal ofPhysiology. 264: G187-G194. 15) Curry, DL, Bennett, LL, and Grodsky, GM. 1968 Dynamics of insulin secretion by the perfused rat pancreas. Endocrinology. (3)83: 572-584 16) Thorens, B, et al. 1988 Cloning and functional expression in bacteria of a novel glucose transporter present in liver, intestine, kidney, and beta-pancreatic islet cells. Cell. (2)55: 28 1-290. 17) lynedjian, P.B. 1993 Mammalian glucokinase and its gene. Journal of Biochemistry. 293: 1-13. 18) Ashcroft, F.M. and Gribble, F.M. 2000 New windows on the mechanism of action of KATP channel openers. Trends in Pharmacological Sciences. (21)11: 439-445. 19) Yang, S.N. and Berggren, P.O. 
2006 The Role of Voltage Gated Calcium Channels in Pancreatic 13-cell Physiology and Pathophysiology. Endocrine Reviews. (6)27: 621- 676. 20) Rutter, G.A. et al. 2006 Insulin secretion in health and disease: genomics, proteomics and single vesicle dynamics. Biochemical Society Transactions. 34: 247-250 21) Pessin, J.E. and Saltiel, A.R. 2000 Signalling pathways in insulin action: molecular targets of insulin resistance. Journal ofClinical Investigation. (2)106: 165-169. 22) Statistics courtesy World Health Organization. 23) Rother, K.I. 2007 Diabetes Treatment \u00E2\u0080\u0094 Bridging the divide. New England Journal ofMedicine. (15)356: 1517-1526. 24) Statistics courtesy BC Transplant 25) Collaborative Islet Transplant Registry 2006 Annual Report 26) Nielson, J.H. et al. 1999 Beta Cell Proliferation and Growth Factors. Journal of Molecular Medicine. 77: 62-66. 27) Hayek, A. and Beattie, G.M. 2002 Alternatives to unmodified human islets for transplantation. Curr. Diab. Rep. 2: 371-376. 28) Moriscot, C. et al. 2005 Human bone marrow mesenchymal stem cells can express insulin and key transcription factors of the endocrine pancreas developmental pathway upon genetic and/or microenvironmental manipulation in vitro. Stem Cells. 23: 594-604 66 29) Rother, K.I. and Harlan, D.M. 2004 Challenges facing islet transplantation for the treatment of type I diabetes mellitus. Journal ofClinical Investigation. (7)114: 877-883. 30) Hobert, Oliver. 2008 Gene Regulation by Transcription Factors and MicroRNAs. Science. 319: 1785-1786. 31) Latchman, D.S. 1997 Transcription Factors: An Overview. Internationl Journal of Biochemistry and Cell Biology. (12)29: 1305-13 12. 32) Wu, K.L. et al. 1997 Hepatocyte nuclear factor 3 beta is involved in pancreatic beta cell specific transcription of the pdxl gene. Molecular Cellular Biology. 17: 6002-6013. 33) Dutta, S. et a!. 2001 Pdx:Pbx Complexes are required for normal proliferation of pancreatic cells during development. 
Proc NatlAcad Sci USA. 98: 1065-1070 34) Kim, S.K. et al. 2002 Pbxl Inactivation disrupts pancreas development and in Ipf-1 deficient mice promotes diabetes mellitus. National Genetics. 30: 430-435 35) Gerrish, K et al. 2001 The role of hepatic nuclear factor 1 alpha and pdx-1 in transcriptional regulation of the pdx-1 gene. Journal of Biological Chemistry. 276: 47775-47784. 36) Harries, L.W. 2006 Alternate mRNA processing of the hepatocyte nuclear factor genes and its role in monogenic diabetes. Expert Review of Endocrinology and Metabolism. 1: 715-726. 37) Watada, H. et al. 2000 Transcriptional and Translational Regulation of beta cell differentiation factor Nkx6.1. Journal ofBiological Chemistry. 275: 34224-34230 38) Sussel, L. et al. 1998 Mice lacking the homeodomain transcription factor Nkx2.2 have diabetes due to arrested differentiation of pancreatic beta cells. Development. 125: 2213-222 1. 39) Sander, M. et a!. 2000 Homeobox gene Nkx6. 1 lies downstream of Nkx2.2 in the major pathway of beta cell formation in the pancreas. Development. 127: 5533-5540. 40) Heremans, Y. et al. 2002 Recapitulation of embryonic neuroendocrine differentiation in adult human pancreatic duct cells expressing neurogenin 3. Journal of Cell Biology. 159: 303-3 12. 41) Vetere, A. et al. 2003 Neurogenin3 triggers beta cell differentiation of retinoic acid derived endoderm cells. Journal ofBiochemistry. 371: 83 1-841. 42) Babu, D.A. et al. 2008 Pdxl and Beta2/NeuroDl participate in a transcriptional complex that mediates short range DNA looping at the insulin gene. Journal of Biological Chemistry. (13)283: 8164-8172. 67 43) Kristinsson, S.Y. et al. 2001 MODY in Iceland is associated with mutations in Hnfla and a novel mutation in NeuroDi. Diabetologia. 44: 2098-2103. 44) Ahigren, U. et al. 1997 Independent requirement of Isl 1 in formation of pancreatic mesenchyme and islet cells. Nature. 385: 257-260. 45) Sosa-Pineda, B. et al. 
1997 The Pax4 gene is essential for differentiation of insulin producing beta cells in the mammalian pancreas. Nature. 386: 399-402. 46) Jonsson, J. et al. 1994 Insulin Promoter factor 1 is required for pancreas development in mice. Nature. 371: 606-609 47) Stoffers, D.A. et al. 1997 Pancreatic agenesis attributable to a single nucleotide deletion in the human IPF 1 gene coding sequence. Nature Genetics. 15: 106-110. 48) Oster, A. et al. 1998 Rat endocrine pancreatic development in relation to two homeobox gene products (Pdx- 1 and Nkx6. 1). Journal of Immunohistochemistry and Cytochemistry. 46: 707-7 15. 49) German, M. et al. 1995 The insulin gene promoter. A Simplified nomenclature. Diabetes. (8)44: 1002-1004. 50) Ohneda, K. et al. 2000 Regulation of Insulin Gene Transcription. Cell and Developmental Biology. 11: 227-23 3 51) Shelton, K.D. et al. 1992 Multiple elements in the upstream glucokinase promoter contribute to transcription in insulinoma cells. Molecular and Cellular Biology. (10)12: 4578-4589. 52) Waeber, G. et al. 1996 Transcriptional Activation of the Glut2 gene by the IPF1/STF1/IDX1 homeobox factor. Molecular Endocrinology. 10: 1327-1334. 53) Catty, M.D. et al. 1997 Identification of cis and trans active factors regulating human islet amyloid polypeptide gene expression in pancreatic beta cells. Journal of Biological Chemistry. 272: 11986-11993.. 54) Raum et al. 2006 FoxA2, Nkx2.2, and Pdxl Regulate Islet B-Cell Specific mafa expression through conserved sequences located between base pairs -8118 and -7750 upstream from the transcription start site. Molecular and Cellular Biology. 26: 5735- 5743. 55) Brissova, M. et al. 2002 Reduction in pancreatic transcription factor Pdxl impairs glucose stimulated insulin secretion. Journal of Biological Chemistry. 277: 11225- 11232. 68 56) Keller, D.M. et al. 2007 Characterization of pancreatic transcription factor Pdxl Binding sites using promoter microarray and serial analysis of chromatin occupancy. 
The Journal ofBiological Chemistry. 282: 32084-32092. 57) Wu, J. et al. 2006 ChIP-chip comes of age for genome wide functional analysis. Cancer Research. 66: 6899-6902. 58) Impey, S. et al. 2004 Defining the CREB regulation: a genome-wide analysis of transcription factor regulatory regions. Cell. 119: 1041-1054. 59) Chen, J. and Sadowski, I. 2005 Identification of the mismatch repair genes PMS2 and MLH1 as p53 target genes by using serial analysis of binding elements. Prc Nati AcadSci USA. 102: 4813-48 18. 60) Bhinge, A.A. et a!. 2007 Mapping the chromosomal targets of STAT 1 by Sequence Tag Analysis of Genomic Enrichment (STAGE). Genome Research. 17: 910-916. 61) Roh, T.Y. and Zhao, K. 2008 High resolution genome wide mapping of chromatin modifications by GMAT. Methods in Molecular Biology. 387: 95-108 62) Wei, C.L. et al. 2006 A global map of p53 transcription factor binding sites in the human genome. Cell. 124: 207-219. 63) Lob, Y.H. et a!. 2006 The Oct4 and Nanog Transcription network regulates pluripotency in mouse embryonic stem cells. Nature Genetics. 38: 431-440. 64) Lin, C.Y. et al. 2007 Whole genome cartography of estrogen receptor alpha binding sites. PLoS Genetics. 3: E87 65) Hoffman, B.G. and Jones, S.J.M. 2009 Genome-wide identification of DNA protein interactions using chromatin immunoprecipitation coupled with flow cell sequencing (ChIP-Seq). Journal ofEndocrinology. 201: 1 66) Robertson, G. et al. 2007 Genome-wide profiles of STAT1 DNA association using chromatin Immunoprecipitation and massively parallel sequencing. Nature Methods. 4: 651-657 67) Mardis, E.R. 2008 The impact of next-generation sequencing technology on genetics. Trends in Genetics. 24(3): 133-141. 68) Fejes, A.P. et al. 2008 FindPeaks3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics. 24(15): 1729-1730. 69) Morrissy, A.S. et a!. 2009 Next-generation tag sequencing for cancer gene expression profiling. 
Genome Research. 19(6).
70)-74) [Entries illegible in the scanned source.]

APPENDIX

Figure A1: Additional UCSC screenshots of interest of Pdx1 ChIP-Seq binding sites. [Panels show the Pdx1 ChIP-Seq track and the conservation track for regions on chr13, chr2 (Nkx2-2), chr5, and chr2 (Pax6).]

Figure A2: FACSorted siCYCLO islets. [FACS plots for Specimen 001-Cyclo with Singlets, 7AAD Negative, and siGLO Positive gates; axes: SSC-A (x1,000), FSC-A (x1,000), FL1-A.]

Figure A3: FACSorted siPdx1 islets. [FACS plots for Specimen 001-pdx with Singlets, 7AAD Negative, and siGLO Positive gates; axes: SSC-A (x1,000), FSC-A (x1,000), FITC FL1-A.]

THE UNIVERSITY OF BRITISH COLUMBIA
ANIMAL CARE CERTIFICATE
Application Number: A05-1741
Investigator or Course Director: Cheryl D. Helgason
Department: Surgery
Animals: Mice icrTac:ICR 900; Mice Pdx1-GFP Transgenic 150; Mice NGN3GFP transgenic 150
Start Date: January 1, 2006
Approval Date: April 1, 2009
Funding Sources: Funding Agency: Genome Canada; Funding Title: Dissecting Gene Regulatory Networks in Mammalian Organogenesis
Unfunded title: N/A
The Animal Care Committee has examined and approved the use of animals for the above experimental project. This certificate is valid for one year from the above start or approval date (whichever is later) provided there is no change in the experimental procedures.
Annual review is required by the CCAC and some granting agencies. A copy of this certificate must be displayed in your animal facility. Office of Research Services and Administration 102, 6190 Agronomy Road, Vancouver, BC V6T 1Z3 Phone: 604-827-5111 Fax: 604-822-5093"@en . "Thesis/Dissertation"@en . "2009-11"@en . "10.14288/1.0068433"@en . "eng"@en . "Interdisciplinary Oncology"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "Attribution-NonCommercial-NoDerivatives 4.0 International"@en . "http://creativecommons.org/licenses/by-nc-nd/4.0/"@en . "Graduate"@en . "Unraveling the molecular physiology of the β-cell: genome wide analysis of binding sites for the transcription factor PDX1"@en . "Text"@en . "http://hdl.handle.net/2429/15879"@en .