Algorithms for predicting the secondary structure of pairs and combinatorial sets of nucleic acid strands

UBC Theses and Dissertations

Andronescu, Mirela Ştefania

Abstract

Secondary structure prediction of nucleic acid molecules is an important problem in computational molecular biology. In this thesis we introduce two new algorithms: (1) PairFold, which predicts the secondary structure of pairs of nucleic acid molecules, and (2) CombFold, which finds which sequences, formed from a combinatorial set of nucleic acid strands, have the most stable secondary structures. Both algorithms run in time polynomial in the sequence lengths and extend the free energy minimization algorithm [72] for secondary structure prediction without pseudoknots, using the nearest neighbour thermodynamic model. Predicting the hybridization of pairs of molecules is motivated by applications such as ribozyme-mRNA target duplexes, primer binding prediction and DNA code design. Finding the most stable concatenations in combinatorial sets of strands is useful for SELEX experiments and for testing whether sets used in DNA computing or tag libraries concatenate without secondary structure. PairFold predictions show over 80% accuracy for sequences of up to 100 nucleotides. Accuracy decreases as the sequences grow longer and as the number of non-canonical base pairs, pseudoknots and tertiary interactions increases, none of which are considered here. The accuracy of CombFold is similar to that of the free energy minimization algorithm for single strands, because CombFold is simply a polynomial-time method for structure prediction over a combinatorial set of strands. We show that, although the algorithm is complex, CombFold can quickly make predictions for large concatenations of sets drawn from the literature. In the future, the two algorithms can be combined to predict the most stable duplexes formed by two combinatorial sets.
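
The idea of predicting a pseudoknot-free structure for a pair of strands can be illustrated with a toy sketch. This is not PairFold and does not reproduce the thesis's method: it replaces the nearest neighbour thermodynamic model with simple base-pair counting (a Nussinov-style recursion), and it assumes, purely for illustration, that the two strands are handled as one concatenated sequence in which pairs spanning the strand break need no hairpin loop. The names max_pairs_duplex, PAIRS and MIN_HAIRPIN are hypothetical.

```python
from functools import lru_cache

# Toy scoring: one point per Watson-Crick or G-U wobble pair (RNA alphabet).
# This is an assumption of the sketch, not the thermodynamic model of the thesis.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}
MIN_HAIRPIN = 3  # minimum number of bases enclosed by a pair within one strand


def max_pairs_duplex(strand1: str, strand2: str) -> int:
    """Return the maximum number of nested (pseudoknot-free) base pairs
    over the concatenation of two strands."""
    seq = strand1 + strand2
    cut = len(strand1)  # indices below `cut` belong to strand1

    def can_pair(i: int, j: int) -> bool:
        if (seq[i], seq[j]) not in PAIRS:
            return False
        same_strand = (i < cut) == (j < cut)
        # The hairpin length limit only applies when both bases lie on one strand.
        return (not same_strand) or (j - i > MIN_HAIRPIN)

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        if i >= j:
            return 0
        # Case 1: base i stays unpaired.
        result = best(i + 1, j)
        # Case 2: base i pairs with some base k, splitting the interval in two.
        for k in range(i + 1, j + 1):
            if can_pair(i, k):
                result = max(result, 1 + best(i + 1, k - 1) + best(k + 1, j))
        return result

    return best(0, len(seq) - 1)


if __name__ == "__main__":
    print(max_pairs_duplex("GGGG", "CCCC"))
```

The recursion considers O(n^2) intervals with O(n) work each, so this toy version runs in cubic time in the total length n. The recursive form is only suitable for short inputs because of Python's recursion limit.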

Item Metadata

Title	Algorithms for predicting the secondary structure of pairs and combinatorial sets of nucleic acid strands
Creator	Andronescu, Mirela Ştefania
Publisher	University of British Columbia
Date Issued	2003
Extent	6901485 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-11-02
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051269
URI	http://hdl.handle.net/2429/14551
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2003-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_2003-0621.pdf -- 6.58MB
