Constraints on the organization and information properties of DNA sequences

UBC Theses and Dissertations

Constraints on the organization and information properties of DNA sequences Sibbald, Peter Ramsay

Abstract

In an investigation which concentrated primarily on the two completely sequenced chloroplast genomes, one from tobacco and one from a liverwort, an attempt has been made to discover some of the factors which produce order in DNA sequences. This was done by (1) looking in detail at doublet organization throughout the genomes, (2) examining the ability of different methods to predict the existence of genes based only on sequence organization, and (3) employing information theory to explore various levels of ordering in these sequences. The doublet analysis was performed on seven categories of DNA: tDNA, rDNA, ribosomal proteins, open reading frames not known to be genes (URF), other protein genes, noncoding regions and introns. The rDNA has the most unusual doublet properties of all categories, although all categories have, to a considerable extent, similar doublet properties. I suggest that these particular doublet properties facilitate accurate replication of the genome. In addition, it appears that doublets which have certain thermodynamic properties are more abundant than others, suggesting that there is a selection pressure at the level of doublets for certain thermodynamic properties. Nussinov's hypothesis, that complementary doublets have similar relative abundances due to inverted duplication events, has been tested and would not seem to explain the phenomenon. Fickett's method to predict whether URFs are genes was more successful than Shepherd's method. Fickett's method was modified for use on the chloroplast genomes and its rate of successful prediction increased substantially. This modified method will be useful for other chloroplast genomes as they are sequenced, and it also supports Fickett's contention that the method could be improved for use on specific groups.
The ability to predict genes based only on sequence data shows that the requirement to code for protein exerts a detectable amount of order on the gene sequence and that this order is distinguishable from the order in noncoding regions. Nearly all URFs greater than 200 base pairs in both plants are predicted to be genes. Informational analysis showed that most order is at the level of single and double bases, with a significant but lesser amount of order at the triplet and 4-plet level. This was true for both coding and noncoding regions in both plants. This is in contrast to earlier work (Rowe and Trainor), which found that in viruses there was a significant difference between 4-plet ordering in coding and noncoding regions. It is suggested that DNA may be optimized for replication rather than protein production. Several new problems and experiments have been suggested.
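
As a rough illustration of the two kinds of measurement the abstract refers to, the following Python sketch may help. It is not the procedure used in the thesis. It assumes that "doublet properties" can be illustrated by the ratio of a doublet's observed frequency to the frequency expected from its two bases, and that "order at the k-plet level" can be illustrated by how much the conditional entropy falls as more preceding bases are taken into account. The function names `doublet_odds`, `block_entropy` and `conditional_entropies` are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def doublet_odds(seq):
    """Observed/expected frequency ratio for each overlapping doublet.

    Expected frequency is the product of the two single-base frequencies
    (an assumption of this sketch, not a detail taken from the thesis).
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total = n - 1
    return {
        d: (count / total) / (base_freq[d[0]] * base_freq[d[1]])
        for d, count in pairs.items()
    }


def block_entropy(seq, k):
    """Shannon entropy, in bits, of the overlapping k-plets in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def conditional_entropies(seq, max_k=4):
    """Estimate H_k - H_(k-1) for k = 1..max_k.

    Each value estimates the uncertainty of a base given the k-1 bases
    before it. On short sequences the estimate is noisy and can even be
    slightly negative because of edge effects.
    """
    seq = seq.upper()
    h = [0.0] + [block_entropy(seq, k) for k in range(1, max_k + 1)]
    return [h[k] - h[k - 1] for k in range(1, max_k + 1)]


if __name__ == "__main__":
    toy = "ACGT" * 10  # toy sequence, not real chloroplast data
    print(doublet_odds(toy))
    print(conditional_entropies(toy))
```

A ratio well above or below 1 from `doublet_odds` marks a doublet that is over- or under-represented relative to base composition. A large drop between successive values from `conditional_entropies` indicates ordering at that k-plet level.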

Item Metadata

Title	Constraints on the organization and information properties of DNA sequences
Creator	Sibbald, Peter Ramsay
Publisher	University of British Columbia
Date Issued	1988
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-10-18
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0098293
URI	http://hdl.handle.net/2429/29284
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Botany
Affiliation	Science, Faculty of; Botany, Department of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1989_A1 S52.pdf -- 5.62MB
