UBC Theses and Dissertations

Evolutionarily conserved regulatory programs

Kwon, Tae-Jun Andrew

Abstract

Despite the diversity of metazoans, common biochemical systems and structures can be found in distinct taxonomic groups. The development and formation of metazoan tissues and structures have been well researched, but their regulatory mechanisms are not well understood. To this end, we implemented bioinformatics tools for regulatory mechanism analysis and applied them to study regulatory program conservation, with an emphasis on muscle development.

We first performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for their capacity to direct selective reporter gene expression to differentiated myotubes. A subset of 19 CRMs (roughly 6% of those tested) was validated as functional in the assay. This rate of predictive success reveals striking limitations of computational CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition and ChIP-Seq evidence as important characteristics to incorporate in future methods for improved predictive specificity.

In studying transcriptional regulation, motif enrichment analysis of co-expressed genes is often employed to determine the mediating transcription factors (TFs). We built oPOSSUM-3, a web-based software system for identifying enriched transcription factor binding sites (TFBS) and TFBS families in the DNA sequences of co-expressed genes and in sequences generated by high-throughput methods such as ChIP-Seq. Validation of the system with known sets of published data demonstrates the capacity of oPOSSUM-3 to identify mediating TFs for co-regulated genes.

Studies have shown that TF binding profiles tend to be highly conserved over long evolutionary distances. In large-scale public genome annotation projects, such as modENCODE, transcriptional regulation data are compiled for comparative genomics research. Using the oPOSSUM-3 system and published data, we performed comparative analyses of regulatory programs across evolutionarily divergent species, including human, fruit fly, and nematode, and examined the extent of conservation in major regulatory programs. The thesis research provides new approaches to the computational analysis of DNA sequences and insights into the analysis of transcriptional regulation across the phylogenetic spectrum.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International