UBC Theses and Dissertations
Transcriptional regulation and Caenorhabditis elegans in silico
Thorne, Michael
With both the complete sequences of multicellular organisms and the emerging results of genome-scale expression experiments at our disposal, the incentive to construct models detailing the interactions of gene products is greater than ever. An understanding of the combinatorics of the transcription factors that result in gene expression is an integral input to any such model. Footprinting experiments have led to the assumption that a specific transcription factor binds exclusively to a strongly conserved sequence of nucleotides. Consequently, the search for transcription factors has been simplified to the problem of determining the motif to which each transcription factor may bind. Caenorhabditis elegans, with its essentially complete genome sequence and comprehensive annotation, is currently the best in silico model for studying transcriptional regulation in a multicellular organism. A number of different approaches to finding candidate motifs have been applied to a dataset comprising between 200 and 2000 bp of the upstream region of every gene unlikely to lie within a polycistronic transcript. These include the study of oligonucleotides over-represented in this dataset relative to the whole genome, the examination of signal distribution in the dataset, the comparative genomic approach of phylogenetic footprinting with Caenorhabditis briggsae, and analyses based on the results of gene expression technologies. Cataloguing the motifs, by whatever means they are found, necessitates a method of organization as well as a ranking according to their possible biological relevance. Arranging all the potential motifs along one axis of a large matrix and the genes they lie proximal to along the other eliminates positional and ordering information, but enables one to draw on techniques from graph theory and multivariate statistics and to cluster genes based on common transcriptional profiles. Such methods allow one to extract information about composite motifs and point to their potential use in determining the control mechanisms underlying sets of coregulated genes.
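The motif-by-gene matrix described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The gene names, upstream sequences and motifs are invented toy data, exact single-strand substring matching stands in for whatever motif-matching rule was actually used, and Jaccard similarity is only one assumed way of comparing genes by shared motif content.

```python
from itertools import combinations


def motif_gene_matrix(upstream, motifs):
    """Build a binary matrix: one row per motif, one column per gene.

    An entry is 1 if the motif occurs anywhere in the gene's upstream
    sequence. Position and order of occurrences are deliberately discarded.
    """
    genes = sorted(upstream)
    matrix = [[1 if motif in upstream[g] else 0 for g in genes]
              for motif in motifs]
    return genes, matrix


def jaccard(col_a, col_b):
    """Jaccard similarity of two binary motif profiles (assumed measure)."""
    both = sum(1 for a, b in zip(col_a, col_b) if a and b)
    either = sum(1 for a, b in zip(col_a, col_b) if a or b)
    return both / either if either else 0.0


def gene_similarity(genes, matrix):
    """Compare every pair of genes by the motifs found upstream of them."""
    columns = {g: [row[i] for row in matrix] for i, g in enumerate(genes)}
    return {(a, b): jaccard(columns[a], columns[b])
            for a, b in combinations(genes, 2)}


# Hypothetical toy data: names and sequences are illustrative only.
upstream = {
    "geneA": "TTGATAAGCACGTGTT",
    "geneB": "CCGATAAGGGCACGTG",
    "geneC": "AAAAACCCCCTTTTT",
}
motifs = ["GATAAG", "CACGTG", "TTTTT"]

genes, matrix = motif_gene_matrix(upstream, motifs)
similarity = gene_similarity(genes, matrix)

for motif, row in zip(motifs, matrix):
    print(motif, row)
for pair, score in similarity.items():
    print(pair, round(score, 2))
```

In this toy case geneA and geneB carry the same two motifs and so share an identical profile, while geneC shares none. On a real dataset the pairwise scores could be passed to a standard clustering routine, and pairs of motif rows that frequently co-occur would be candidates for composite motifs.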