UBC Theses and Dissertations
Algorithms for predicting the secondary structure of pairs and combinatorial sets of nucleic acid strands
Andronescu, Mirela Ştefania
Secondary structure prediction of nucleic acid molecules is an important problem in computational molecular biology. In this thesis we introduce two new algorithms: (1) PairFold, for secondary structure prediction of pairs of nucleic acid molecules, and (2) CombFold, for finding which sequences, formed from a combinatorial set of nucleic acid strands, have the most stable secondary structures. Both algorithms run in time polynomial in the sequence lengths and extend the free energy minimization algorithm for secondary structure prediction without pseudoknots, using the nearest neighbour thermodynamic model. Predicting hybridization of pairs of molecules is motivated by important applications such as ribozyme-mRNA target duplexes, primer binding prediction and DNA code design. Finding the most stable concatenations in combinatorial sets of strands is useful for SELEX experiments and for testing whether sets in DNA computing or tag libraries concatenate without secondary structure. Our results for PairFold predictions show over 80% accuracy for sequences of up to 100 nucleotides. Accuracy decreases as the sequences increase in length and as the number of non-canonical base pairs, pseudoknots and tertiary interactions increases; none of these features is considered here. The accuracy of CombFold is similar to that of the free energy minimization algorithm for single strands, since CombFold is just a polynomial-time method for structure prediction over a combinatorial set of strands. We show that, although complex, CombFold can quickly predict large concatenations of sets drawn from the literature. In the future, these two algorithms could be combined to predict the most stable duplexes formed by two combinatorial sets.