Algorithms for prediction of RNA pseudoknotted secondary structures Jabbari, Hosna 2015

Algorithms for prediction of RNA pseudoknotted secondary structures

by

Hosna Jabbari

B.Sc., The University of Victoria, 2005
M.Sc., The University of British Columbia, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate and Postdoctoral Studies (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

March 2015

© Hosna Jabbari 2015

Abstract

RNA molecules are crucial at different levels of cellular function, and their functions largely depend on their structures. Since experimental methods for determining RNA structure are expensive, computational methods for prediction of RNA secondary structure (the set of base pairs) are valuable. One such approach is based on the Minimum Free Energy (MFE) folding hypothesis, which states that an RNA molecule folds into the structure with the minimum free energy. Another approach is based on the hierarchical folding hypothesis, which posits that an RNA molecule first folds into a pseudoknot-free structure (i.e., one with no crossing base pairs); then additional base pairs are added that may form pseudoknots (structures with crossing base pairs) to lower the structure's free energy.

The overarching goal of this thesis is to overcome some limitations of previous methods, namely (1) slow running time, (2) poor prediction accuracy (i.e., a low number of correctly predicted base pairs), (3) the limited types of pseudoknots they can predict, and (4) limited opportunity for the user to provide structural information that can guide prediction. We propose two algorithms and study different variations of each method. We propose a relaxed version of the hierarchical folding hypothesis and present Iterative HFold, which is based on this hypothesis. Furthermore, we propose the CCJ algorithm, an MFE-based algorithm that significantly expands the class of structures handled in O(n^5) time.
Employing a sparsification technique, we propose a sparse version of CCJ that improves its space usage.

While CCJ's and Iterative HFold's performances are not significantly different on our large data set, Iterative HFold is considerably less expensive to run than CCJ. In addition, Iterative HFold provides the user with the ability to incorporate structural information as an input constraint. Both CCJ and Iterative HFold predict significantly more accurate structures on key data sets than two of the best-performing computational methods currently available (i.e., HotKnots V2.0 and IPknot). Of the folding hypotheses that we have studied, it seems that the relaxed hierarchical folding hypothesis has better potential to predict RNA secondary structure, at least with respect to the energy model used in this work.

Preface

The candidate contributed to all major ideas and writing of the published and unpublished manuscripts that are the basis of this thesis. The candidate was the lead author in all published and unpublished manuscripts, and in no instance was a co-author a student. The candidate collaborated with the candidate's supervisor and mentors in all aspects of research, including design, development and evaluation of the algorithms. The introductory chapters, Chapters 1–3, were written by the candidate, but used selected content from publications that she co-authored [22, 24, 52, 53]. Chapter 4 is a collaboration between the candidate and the candidate's supervisor, Dr. Anne Condon. A version of Chapter 4 appeared in BMC Bioinformatics [53]. Chapter 5 is a collaboration between the candidate, the candidate's supervisor, Dr. Anne Condon, and Dr. Ho-Lin Chen. A version of Chapter 5 appeared in the Journal of Computational Biology [22]. Most of the results presented in Chapter 6 stem from collaborative work between the candidate and the candidate's supervisor, which appeared in BMC Bioinformatics [53]. Chapter 7 is a collaboration between the candidate, the candidate's supervisor, Dr.
Anne Condon, and Dr. Sebastian Will.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
   1.1 Background on RNA secondary structure
   1.2 Accuracy measures
   1.3 RNA secondary structure prediction problem and thesis objectives
   1.4 Thesis contributions
   1.5 Thesis outline
2 Background and related work
   2.1 Computational RNA secondary structure prediction including pseudoknots
      2.1.1 Minimum-free-energy-based methods
      2.1.2 Heuristic methods
      2.1.3 Methods based on maximum expected accuracy
      2.1.4 Methods that incorporate input constraints
      2.1.5 Methods based on the hierarchical folding hypothesis
   2.2 Data sets
   2.3 Bootstrap percentile confidence intervals
   2.4 Permutation test
   2.5 Summary
3 Energy model
   3.1 RNA secondary structure
   3.2 Energy model
4 Hierarchical folding-based pseudoknotted secondary structure prediction
   4.1 SimFold
   4.2 HFold
   4.3 HFold-PKonly
   4.4 Iterative HFold
   4.5 Summary
5 MFE-based RNA pseudoknotted secondary structure prediction
   5.1 CCJ energy model
   5.2 The CCJ class of structures
   5.3 Recurrences
      5.3.1 Pseudoknots
      5.3.2 Transitioning between band groups in pseudoknots
      5.3.3 Nested substructures
      5.3.4 Internal loops and multiloops
   5.4 Time and space complexity
   5.5 Summary
6 Comparison of Iterative HFold and CCJ
   6.1 Experimental settings
      6.1.1 Definition of Gbig and Gsmall
      6.1.2 Robustness test
      6.1.3 Accuracy comparison tests
      6.1.4 Running time
      6.1.5 Memory usage
   6.2 Robustness comparison
   6.3 Accuracy comparison of different versions of HFold
   6.4 Iterative HFold with SimFold's suboptimal structures
   6.5 Accuracy comparison with existing methods
   6.6 Running time comparison
   6.7 Memory usage comparison
   6.8 Comparison with ShapeKnots
   6.9 Discussion
      6.9.1 HotKnots
      6.9.2 IPknot
      6.9.3 Comparison of CCJ, HFold and Iterative HFold
      6.9.4 Energy model
   6.10 Summary
7 Sparsification of the CCJ algorithm for space reduction
   7.1 Background on sparsification
      7.1.1 Sparsification of the MFE pseudoknot-free algorithm
      7.1.2 Sparsification of pseudoknotted algorithms
   7.2 Changes to the CCJ recurrences
   7.3 Sparsification of the CCJ algorithm
   7.4 Space complexity analysis
   7.5 Summary
8 Conclusions and future work
   8.1 Accuracy measures
   8.2 Robustness comparison of HFold and Iterative HFold
   8.3 Choosing best input structure
   8.4 Accuracy comparison
   8.5 Time and memory usage comparison
   8.6 Merits and shortcomings of each method
      8.6.1 HotKnots
      8.6.2 IPknot
      8.6.3 HFold
      8.6.4 Iterative HFold
      8.6.5 CCJ
   8.7 Comparing the folding hypotheses
   8.8 Energy model limitation
   8.9 Data and accuracy measure limitation
   8.10 Possible future directions
Bibliography
Appendices
A IPknot detailed performance
B Iterative HFold's detailed performance comparison
   B.1 Robustness comparison
   B.2 Comparison of correlation between false positive rates
C Time and memory comparison
   C.1 Time comparison
   C.2 Memory comparison
D Pseudocode for the sparse CCJ algorithm

List of Tables

2.1 Summary of the methods with which we will later compare the performance of our algorithms.
2.2 Summary of data sets used in this thesis.
3.1 Energy parameters. This table provides the names, descriptions and values of the energy parameters and functions that we used in our methods. The names and definitions are the same as in our original HFold [52], and the values were updated based on the work of Andronescu et al. [11]. These parameters were derived for a temperature of 37 °C and 1 M salt (NaCl) concentration.
5.1 CCJ energy functions that are not explicitly in the DP09 model. The values of such functions typically depend on the bases in positions i, l, i+1 and l−1 of S, and on whether i < l. We have set the values of these functions to 0, in order to make the models as similar as possible.
6.1 Comparison of bootstrap 95% percentile confidence intervals of average F-measure of different versions of HFold when given the SimFold structure as input vs. when given HotKnots hotspot structures as input.
6.2 Comparison of bootstrap 95% percentile confidence intervals of average F-measure with existing methods.
6.3 Comparison of bootstrap 95% percentile confidence intervals of average F-measure with existing methods on the DK-pk16, DK-pk16* and IP-pk168 data sets.
6.4 Comparison of average F-measure, sensitivity and PPV with existing methods on the HK-PK, HK-PK-free, DK-pk16, DK-pk16* and IP-pk168 data sets. The values outside parentheses show the average F-measure, while the values inside parentheses show average sensitivity and PPV, respectively.
6.5 Comparison of Iterative HFold F-measure with ShapeKnots on SHAPE data.
6.6 Comparison of the number of times each of the proposed algorithms (i.e., CCJ, Iterative HFold and HFold) predicted the structure with the minimum free energy, or the structure with the highest F-measure, in each of the four data sets used in this thesis. The table also provides the number of times Iterative HFold and HFold predicted structures with higher accuracy than the structure predicted by the CCJ algorithm.
6.7 Comparison of bootstrap 95% percentile confidence intervals of average F-measure between the minimum energy structures and the maximum accuracy structures of the HK-PK and HK-PK-free data sets.
A.1 Comparison of bootstrap 95% percentile confidence intervals of average F-measure of IPknot with different default settings and no refinement.
A.2 Comparison of bootstrap 95% percentile confidence intervals of average F-measure of IPknot with different default settings and 1 iteration of refinement.
B.1 Comparison of robustness of different versions of HFold on the HK-PK data set.
B.2 Comparison of robustness of different versions of HFold on the HK-PK-free data set.
B.3 Comparison of robustness of different versions of HFold over all structures of the HK-PK and HK-PK-free data sets.
B.4 Comparison of correlation between false positive rates in input structures and output structures.
C.1 Running time comparison.
C.2 Running time comparison (continued).
C.3 Running time comparison (continued).
C.4 Memory usage comparison.
C.5 Memory usage comparison (continued).
C.6 Memory usage comparison (continued).

List of Figures

1.1 A pseudoknot-free structure in both graphical (a) and arc diagram (b) formats. In (a) some of the loops are annotated by their type: loops with one emanating branch are hairpin loops; loops with two emanating branches are either internal or bulge loops; and loops with three or more emanating branches are multi-loops.
1.2 Arc diagram representation of an H-type pseudoknot (a), a kissing hairpin (b) and a chain of four interleaved stems (c). Dots along the horizontal line represent bases from the 5' (left) to the 3' (right) ends. Arcs represent base pairs. Some arcs cross, thereby introducing pseudoknots.
2.1 Arc diagram representation of a simplified version of the structure of the aptamer core of a SAM-IV riboswitch [98], but preserving patterns of crossing base pairs.
2.2 Two examples of density-2 structures. The top figure shows a pseudoknotted structure with a chain of an arbitrary number of bands, and the bottom figure shows an H-type pseudoknot with an arbitrary depth of nested H-type pseudoknots.
2.3 A bi-secondary structure that is not density-2. This structure is an example of a level-2 structure based on the definition provided by Sato et al. [83].
3.1 An H-type pseudoknotted structure (left) and a pseudoknot-free structure (right) in graphical (top) and arc diagram (bottom) formats. In the top left structure, the start and end of the pseudoloop, the unpaired bases of a pseudoknotted structure and one of the two bands are labelled. In the top right structure, some of the loops are annotated by their type: loops with one emanating branch are hairpin loops; loops with two emanating branches are either internal or bulge loops; and loops with three or more emanating branches are multi-loops (not shown here).
3.2 (a) Arc diagram representation of a simplified version of the structure of the aptamer core of a SAM-IV riboswitch [98], but preserving patterns of crossing base pairs. Below both the left and right endpoints of each stem is a letter which identifies that stem; concatenating the letters from left to right yields the motif for this structure, namely ABCBDEDCAFFE. (b) A chain of four overlapping stems. This structure has motif ABACBDCD. If either the leftmost or rightmost stem were removed from this structure, the result would be a kissing hairpin structure, with motif ABACBC.
4.1 Illustration of how the WMB recurrence unwinds, to calculate WMB_{i,j}. Arcs above the horizontal line from i to j represent base pairs of G_{ij}, and arcs below the line represent base pairs of G′_{ij}. Note that G_{i,j} is the part of the given pseudoknot-free structure G whose base pairs lie between positions i and j, assuming that no arc of G covers i or j. G′_{i,j} is the part of the pseudoknot-free structure G′ added by HFold whose base pairs lie between positions i and j, such that G_{ij} ∪ G′_{ij} is the MFE structure for strand s_i ... s_j given G_{i,j}. (1) The overall structure is handled by the WMB_{i,j} recurrence. (2) A case of the WMB recurrence handles the overall structure whose energy is WMB_{i,j}, with terms to account for the energies of the right upper band (BE) and the right lower closed subregion (WI), as well as the remaining structure (WMB′_{i,l1}). (3) The term WMB′_{i,l1} is handled by a case of the WMB′ recurrence, with terms to account for the lower right substructure labelled VP_{l2,l1}, the upper left band (BE), and the remaining structure (WMB′_{i,(l2−1)}). (4) WMB′_{i,(l2−1)} is then handled by a case of the WMB′ recurrence, with terms to account for WI_{(l3+1),(l2−1)} and WMB′_{i,l3}. (5) Finally, the WMB′_{i,l3} term is handled by VP_{i,l3}.
5.1 TGB (three-groups-of-bands) and CCJ pseudoknotted secondary structures. (a) In this TGB structure for a gapped region, each shaded arc represents a band, that is, a set of pseudoknotted base pairs which span the arc. The left and right groups each have two bands, and the middle group has four bands. (The outermost middle "band" does not appear to be a band, since it does not cross any other bands, but as later parts of the figure show, once this TGB structure is overlaid with another TGB structure, all bands do indeed cross other bands.) (b) A structure with two groups of bands, and with pseudoknotted structures nested within one of these bands. In general, not all bands illustrated in part (a) need be present in a TGB structure, as long as at least one band is in the middle group (labelled with a star in part (c)). (c) A CCJ structure is obtained by overlaying two TGB structures. This example is obtained by overlaying the structures of parts (a) and (b). Embedded in this structure is a chain of four overlapping stems, which are labelled by four stars.
5.2 Illustration of how the W recurrence unwinds, when the MFE structure for a sequence of length 12 is the simple kissing hairpin structure illustrated at the top of the figure. Since the overall structure is pseudoknotted, it is handled by the second case of the W recurrence. Here we have W(1, 0) = 0, therefore the energy is accounted for by P(1, 12) + P_s. In the P recurrence, the structure is divided into two TGB structures, namely PK(1, 2, 5, 6) and PK(3, 4, 7, 12). The term PK(1, 2, 5, 6) takes care of the internal loop of the gapped region [1, 2] ∪ [5, 6], and PK(3, 4, 7, 12) calculates the energy of the rest of the structure by transitioning between the P_R, P_{R,iloop}, P_{fromR}, P_M and P_{M,iloop} recurrences.
5.3 Illustration of how the W recurrence unwinds, when the MFE structure for a sequence of length 20 is the structure illustrated at the top of the figure. Note that this structure has five interleaved bands. As in Figure 5.2, the overall structure is pseudoknotted, and thus is handled by P(1, 20) + P_s. Here P(1, 20) is divided into PK(1, 6, 9, 14) and PK(7, 8, 15, 20). The TGB structure of PK(1, 6, 9, 14) is similar to that of Figure 5.2, and is handled similarly, with the only difference being that the leftmost band is handled by a PL recurrence instead of a PK recurrence. The TGB structure of PK(7, 8, 15, 20) is similar to the right part of Figure 5.2.
6.1 H-type pseudoknot. The blue base pairs (i.e., the stems of 1.32 to 12.21, and 40.61 to 46.55) belong to the Gbig structure and the green base pairs (i.e., the band of 16.53 to 20.49) belong to the Gsmall structure, as defined in Section 6.1.1. This figure was produced using the VARNA software [26].
6.2 Comparison of robustness of HFold and Iterative HFold.
Robustness results for pseudoknotted structures of the HK-PK data set (a), pseudoknot-free structures of the HK-PK-free data set (b) and all structures (c). The X axes show the percentage of the Gbig structure that is provided as input, and the Y axes show bootstrap 95% percentile confidence intervals for average F-measure. Dashed lines show the borders of the bootstrap 95% percentile confidence intervals for average F-measure, and solid lines show the average F-measure itself.
6.3 Time comparison. Comparison of running times of Iterative HFold and HotKnots in a log plot. The X axis shows log10(time) for HotKnots data points and the Y axis shows log10(time) for Iterative HFold.
6.4 Example structure predicted by the CCJ algorithm vs. Iterative HFold.
6.5 An example where Iterative HFold finds a structure with lower energy than the structure predicted by HFold.
6.6 Example structure predicted by the CCJ algorithm vs. Iterative HFold.
6.7 An example of an overcrowded input structure provided to Iterative HFold.
7.1 Examples of recurrence types in the CCJ algorithm.

Acknowledgements

I would like to express my sincere gratitude to my supervisor, Anne Condon, for her endless support through my graduate studies. I have been blessed to have Anne as my supervisor for both my master's and Ph.D. Due to family needs I only spent the first year of my Ph.D. at UBC, after which I worked remotely on my thesis. I have two wonderful kids and three moves in my Ph.D. resume. I cannot imagine a more supportive and accommodating supervisor. Anne is more than a supervisor to me; she is my mentor, my idol and my role model.
She has guided me countless times throughout my life with both research and life decisions.

I would like to thank my supervisory committee, Holger Hoos and David Kirkpatrick, for their support and insightful feedback. I would also like to thank my dissertation committee members, Inanc Birol, Joel Friedman and Eric Jan, and my external examiner, Ivo Hofacker, for the time and effort they put into reading my thesis and for their insightful and valuable comments and feedback. I am thankful to Peter Clote for his hospitality and support during my stay in Boston. I would also like to thank Eleni Stroulia, Russ Greiner and David Wishart for welcoming me to their research groups at the University of Alberta. Many thanks to the support staff at the Computer Science Department at UBC, in particular Joyce Poon, Holly Kwan, Kath Imhiran, Lara Hall, Hermie Lam, and Michele Ng. I am also thankful to all the anonymous reviewers who reviewed my work and provided valuable and constructive feedback. Special thanks to my collaborators, Ho-Lin Chen and Sebastian Will. Thank you for the helpful discussions and for your suggestions, which have benefited my research greatly.

I would like to thank my friends and colleagues in the BETA lab. Thank you for being part of my life. I miss our conversations and gatherings greatly! More specifically, I would like to thank Dan Tulplan, Sanja Rogic, Mirela Andronescu, Rosalia Aguirre-Hernandez, Baharak Rastegari, Chris Thachuk, Monir Hajiaghayi, Dave Tompkins, Viann Chan and Frank Hutter.

I am thankful to Frances Ricks, Micaela Serra, Ali Shoja, Toni Garret and my undergraduate supervisor, Jon Muzio; without their help I could not have finished my undergraduate studies. I would also like to thank Jon Muzio for his continuous support as my mentor throughout these years.

I would like to thank Frances and Jim Ricks for their priceless friendship and their kindness towards my family and me.
They opened the doors to their hearts and their house to us and accepted us as part of their own family. They helped us feel amongst family while we were hundreds of miles away from our own families.

I owe a big part of this work to my family. My parents suffered being apart while my mom helped take care of my babies when I was busy. My sister, Farzaneh, agreed to being apart from her mom so I could benefit from her help. I am mostly indebted to my parents and Farzaneh for my most recent paper [53], which might not have been concluded if it had not been for their support. I am thankful to my sister, Zohreh, for checking on me every single day and filling my days with rays of hope and sunshine.

Mostly, I am thankful to my fabulous husband, Majid. He has been my best friend for the past 15 years! He is my definition of joyful life! He has been there for me in the gloomiest times, when I had no clue what fate was planning, never judging and never doubting my abilities. He has always offered me his love, advice, and his shoulder to cry on. He has gone to extreme lengths to help me succeed. Honey, I can't imagine my life without you!

Last but not least, I am thankful to my amazing kids, Viyana and Kayson. I am thankful to them for understanding my passion for my work. The sparkles in their eyes and smiles on their faces mean the world to me!

Dedication

To my daughter, Viyana, and my son, Kayson, without whom this thesis would have been completed at least 2 years earlier!

And to my love, Majid.

Chapter 1
Introduction

RNA molecules are crucial at different levels of cellular function, ranging from translation and regulation of genes to coding for proteins [25, 29, 30, 42, 50, 65].
Apart from naturally-occurring RNAs, other RNA molecules have been designed which have novel catalytic functions not found in nature [54]. Additionally, DNA and RNA sequences are designed to form novel structures for a wide range of applications, such as algorithmic DNA self-assembly [46, 78], detection of low concentrations of other molecules of interest [34], exhibiting motion [85], and helping detect circulating cancer cells in a patient's blood [43]. Understanding the structure of an RNA molecule is important in inferring its function [2, 17, 56, 82]. Determining RNA structure also helps with designing RNA molecules for applications in biotechnology [79].

Since experimental methods for determining RNA structure, such as X-ray crystallography and NMR, are time consuming, expensive and in some cases infeasible [37], computational methods for prediction of RNA structure are valuable. Currently, computational RNA structure prediction methods mainly focus on predicting RNA secondary structure: the set of base pairs that form when RNA molecules fold (see Section 1.1 and Figure 3.1). RNA's three-dimensional structure, as defined by the atoms' coordinates, is called the tertiary structure. Although very challenging, there are approaches for prediction of RNA tertiary structure [20, 71, 104]. Prediction of RNA secondary structure is easier than prediction of its tertiary structure, and RNA secondary structure sheds light on the tertiary structure of RNA [94]. The focus of this thesis is on RNA secondary structure.

When multiple homologous RNA sequences are available, the structure of the target RNA can be determined by comparative approaches utilizing the common secondary structure of the homologous sequences. Examples of comparative-based approaches include RNAalifold [16], Pfold [55], PETfold [84], RSpredict [90], Multilign [102], CentroidAlifold [44], TurboFold [45] and LocARNA [101].
To predict the correct secondary structure, comparative-based methods require a reliable alignment of RNA sequences, which may not be available in all cases.

There is evidence that RNA molecules in their natural environments usually fold to their minimum free energy structure [94]. This is the basis for many existing algorithms that aim to predict the minimum free energy (MFE) RNA secondary structure from the set of all possible structures, where each structural feature is assigned a free energy value and the energy of a structure is calculated as the sum of the features' energies. Current energy-based algorithms take as input a single RNA sequence, and find a structure (or a few structures) optimized based on their underlying energy model. We note that not all energy-based approaches are MFE-based. There are many non-MFE but energy-based algorithms that, guided by their underlying energy model, predict the secondary structure of RNA molecules. These algorithms include heuristic approaches [11, 40, 51, 76, 86, 87] as well as approaches that rely on finding base pairs with maximum expected accuracy for the given RNA sequence [15, 35, 57, 83] (see Section 2.1.3).

While energy-based approaches can be less accurate than comparative approaches, especially for long RNA sequences, energy-based approaches are applicable in cases of novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium [1]. Energy-based approaches can also be easier to apply to predict the structure of interacting RNA molecules, for example, in gene knockdown studies. In this thesis we focus on energy-based methods.

There has been significant success in prediction of pseudoknot-free secondary structures (structures with no crossing base pairs) [4, 32, 35, 38, 48, 49, 57, 63, 70, 72, 92, 103].
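As a concrete, simplified illustration of this kind of sum-of-features energy minimization over pseudoknot-free structures (our own sketch, not an algorithm from this thesis), the following Python fragment assigns a toy energy of −1 to every canonical or wobble base pair, so that minimizing energy reduces to Nussinov-style base-pair maximization; real MFE predictors use the same recurrence shape with a far richer loop-based energy model:

```python
# Dynamic programming over pseudoknot-free structures with a toy energy
# of -1 per base pair (a stand-in for a full thermodynamic model).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov_energy(seq, min_loop=3):
    """Minimum toy energy over all pseudoknot-free structures of seq,
    requiring at least min_loop unpaired bases inside a hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    E = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):      # increasing subsequence length
        for i in range(n - span):
            j = i + span
            best = E[i][j - 1]               # case 1: j is unpaired
            for k in range(i, j - min_loop): # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = E[i][k - 1] if k > i else 0
                    best = min(best, left + E[k + 1][j - 1] - 1)
            E[i][j] = best
    return E[0][n - 1]
```

For example, `nussinov_energy("GGAAACC")` finds two nested G–C pairs enclosing the hairpin loop, for a toy energy of −2.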
Figure 1.1 shows a pseudoknot-free RNA secondary structure and its corresponding arc diagram, in which each arc denotes a base pair.

While many small RNA secondary structures are pseudoknot-free, many biologically important RNA structures, both in the cell [91, 96] and in viral RNA [28], are pseudoknotted (see Figure 1.2). Examples include simple H-type pseudoknots (see Figure 1.2a), with two interleaved stems, which are essential for certain catalytic functions and for ribosomal frameshifting [6], as well as kissing hairpins (see Figure 1.2b), which are essential for replication in the coxsackie B virus [66].

Unfortunately, MFE pseudoknotted secondary structure prediction is NP-hard [5, 58, 59], even for a simple energy model that depends on base pairs but not on unpaired bases. Polynomial-time MFE-based approaches to pseudoknotted structure prediction have been proposed [5, 33, 75, 77, 95], with respect to various sum-of-loops energy models for pseudoknotted structures; these find the MFE structure for a given input sequence from a restricted class of structures. A class of structures can be defined by specifying allowable patterns of interleaving among base pairs.

Existing MFE-based methods for prediction of RNA pseudoknotted secondary structures are expensive in terms of both time and memory. For example, the algorithm of Rivas and Eddy [77], which runs in O(n^6) time and O(n^4) space, where n is the length of the RNA sequence, and handles the most general class of pseudoknots (including kissing hairpins and H-type pseudoknots), is not practical for prediction of the secondary structure of RNA molecules longer than 106 nucleotides [76]. While heuristic methods do not guarantee prediction of the minimum free energy structure, they are usually faster than MFE-based methods.
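The crossing condition that defines pseudoknots, and by extension the interleaving patterns that define structure classes, can be stated in a few lines. The sketch below (function names are ours, chosen for illustration) reports every pair of base pairs (i, j) and (k, l) that interleave, i.e., satisfy i < k < j < l:

```python
def crossing_pairs(pairs):
    """Return all crossing (pseudoknot-forming) pairs of base pairs.

    Two pairs (i, j) and (k, l), with i < j and k < l, cross iff
    i < k < j < l: they interleave rather than nest or sit side by side.
    """
    norm = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for a in range(len(norm)):
        i, j = norm[a]
        for b in range(a + 1, len(norm)):
            k, l = norm[b]
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings

def is_pseudoknotted(pairs):
    """A structure is pseudoknotted iff it contains at least one crossing."""
    return bool(crossing_pairs(pairs))
```

On an H-type pseudoknot such as the pairs {(1, 10), (5, 14)} the check reports a crossing, while nested pairs such as {(1, 10), (2, 9)} pass as pseudoknot-free; restricted classes like those of Rivas and Eddy or the density-2 class then constrain which crossing patterns are allowed.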
For example, HotKnots V2.0 [11, 76], which is among the best heuristic methods for prediction of RNA pseudoknotted secondary structures, requires about 1.7 hours to predict the structure of an RNA sequence of length 400 [53]. Other methods for prediction of pseudoknotted structures, referred to as methods based on maximum expected accuracy (MEA), such as the IPknot method of Sato et al. [83], are motivated by the finding of Mathews [61] that base pairs with high base pairing probabilities are more likely to be in the known structure. MEA-based methods by nature do not determine the energy of the predicted structure, and have accuracy comparable to that of MFE-based algorithms [41]. Incorporating known structural information can improve the accuracy of structure prediction [64]. However, there is both limited reliable structural information available to the user and limited opportunity for the user to provide structural information, or constraints, to guide prediction.

Motivated by two limitations of MFE-based algorithms for pseudoknotted secondary structure prediction, their high time complexity and the potential for improving predictions by accounting for the folding pathway from unfolded sequence to stable structure, we previously presented a novel method, called HFold [52], based on the hierarchical folding hypothesis. This hypothesis posits that an RNA molecule first folds into a pseudoknot-free structure; then additional base pairs are added that may form pseudoknots with the first structure so as to lower the structure's free energy [94]. HFold handles a restricted class of structures, but this class is quite general, and includes H-type pseudoknots and kissing hairpins with arbitrarily nested substructures. Rastegari and Condon [74] showed that, out of a set of over 1100 biological structures, all but nine are density-2 (when isolated base pairs are removed), and six of these nine are also not in the class handled by Rivas and Eddy's algorithm.
Given a pseudoknot-free structure as input, HFold predicts a possibly pseudoknotted structure that contains the given input structure and, relative to that constraint, has minimum free energy. HFold's running time is O(n^3), significantly faster than other methods for predicting pseudoknotted structures, and matches that of MFE-based methods for predicting pseudoknot-free structures.

While HFold is fast, our earlier implementation of HFold had its own shortcomings. First, due to a high pseudoknot initiation penalty and low band penalty in its underlying energy model, many of its predicted structures did not have pseudoknots. Second, if the first structure input to HFold contains base pairs that are not in the true pseudoknot-free structure for the given RNA sequence, or is not the complete pseudoknot-free structure (i.e., it does not include all the base pairs in the pseudoknot-free structure), HFold is often unable to predict the known pseudoknotted structure as output.

Overall, existing energy-based approaches for prediction of pseudoknotted structures suffer from one or more of the following shortcomings:

1. slow running time,
2. poor prediction accuracy (particularly for long RNA sequences),
3. handling a very limited class of pseudoknots,
4. limited opportunity for the user to provide structural information, or constraints, that can guide prediction.

In the case of a prediction method that incorporates user-defined constraints, it is also useful to understand the degree to which the method's accuracy persists as the input information degrades. We use the term robustness with respect to partial information, or simply robustness, to refer to this property of a method. (We note that in our definition of robustness we do not mean robustness with respect to noise.) When investigating a method's robustness, we aim to understand the degree to which the method provides a reliable prediction even when limited information about the true pseudoknot-free structure is available.
To the best of our knowledge, the concept of robustness in secondary structure prediction methods has not been studied before.

The main goal of this doctoral thesis is to present novel energy-based methods, built on a thermodynamic energy model, that alleviate the above-mentioned shortcomings in prediction of pseudoknotted RNA secondary structures. We note that assessing the merits of prediction methods based on their accuracy is a challenging problem, as differences in accuracy may not be reliable when the size of the data sets is small. To alleviate this problem we use statistically sound methods (see Sections 2.3 and 2.4) in comparing the performance of our methods with each other and with some of the existing methods.

In the remainder of this chapter we first provide useful background information pertaining to RNA secondary structure prediction, explain the accuracy measures, formulate the RNA secondary structure prediction problem, outline our contributions and describe the organization of this thesis.

1.1 Background on RNA secondary structure

We start this section by explaining how to model RNA secondary structure and how to represent the structure using an arc diagram. Then we provide a simplified definition of some of the features of RNA secondary structures. We refer the reader to Chapter 3 for a detailed description of loops in RNA secondary structure and an explanation of the energy model.

An RNA molecule is a sequence of nucleotides, or bases, of which there are four types: Adenine (A), Guanine (G), Cytosine (C), and Uracil (U). The molecule has chemically distinct ends, called the 5′ and 3′ ends. We model an RNA molecule as a sequence over the alphabet {A, C, G, U}, with the left end of the sequence being the 5′ end. We index the bases consecutively from the 5′ end starting from 1, and refer to a base by its index. The bases A and U are said to be complementary, as are the bases C and G.
The pair of bases U and G is called a wobble pair. When an RNA molecule folds, bonds may form between complementary and wobble pairs of bases. From now on we expand our definition of complementary pairs to include wobble pairs as well. The resulting structure depends on environmental conditions, such as temperature and salt concentration of the solution in which the molecule resides.

In what follows, let S be an RNA sequence of length n. A base pair for S is an ordered pair i.l with 1 ≤ i < l ≤ n, such that the ith and lth bases of S are complementary. We call i and l the endpoints of the base pair. The span of base pair i.l is l − i. If i.l and d.e are two base pairs with i ≤ d, then one of the following cases applies: (i) i < d < e < l, in which case d.e is nested in i.l; (ii) i < l < d < e, in which case i.l and d.e are disjoint; (iii) i < d < l < e, in which case the base pairs cross; or (iv) i = d, l = d, or l = e, in which case the base pairs collide.

A secondary structure R for S is a set of base pairs for S, none of which collide. We can illustrate the secondary structure of an RNA molecule by an arc diagram, in which the bases of the RNA molecule are presented on a horizontal line and the base pairs are represented by arcs connecting the two pairing bases. With respect to R, base pair i.l is pseudoknotted if it crosses some base pair of R. R is pseudoknotted if it contains pseudoknotted base pairs, and is pseudoknot-free otherwise. A pseudoknotted secondary structure has crossing arcs in its arc diagram. Figure 1.1 shows a pseudoknot-free secondary structure in both graphical and arc diagram format. Figure 1.2 shows three different pseudoknotted structures in arc diagram format.

Base pairs of a secondary structure R partition the unpaired bases of sequence S into loops [74]. Loops, and their associated unpaired bases and closing base pairs, are defined as follows.
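The four-way case analysis above can be sketched directly in code (a minimal illustration; the function name and return labels are ours, not the thesis's notation):

```python
def classify(p, q):
    """Classify the relationship between two base pairs, following the
    case analysis above. Pairs are tuples (i, l) with i < l."""
    (i, l), (d, e) = sorted([p, q])  # order so that i <= d
    if i == d or l == d or l == e:
        return "collide"             # the pairs share an endpoint
    if i < d < e < l:
        return "nested"              # d.e is nested in i.l
    if i < l < d < e:
        return "disjoint"
    return "cross"                   # i < d < l < e
```

For example, the pairs 1.10 and 5.14 cross, so a structure containing both would be pseudoknotted.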
Some of the loops are illustrated in Figure 1.1.

• A hairpin loop has one closing base pair, and so one emanating branch. All other bases inside a hairpin loop are unpaired.

• An internal loop has two emanating branches, and thus two closing base pairs. All bases between the two closing base pairs are unpaired. Bulge loops and stacked pairs (or simply stacks) are two special cases of internal loops. In a bulge loop there are no unpaired bases on one side of the loop, while in a stacked pair there are no unpaired bases at all.

• A multiloop has three or more emanating branches. A special case of a multiloop is when one of these branches is a pseudoloop, in which case the multiloop may have only two emanating branches.

• A pseudoloop is associated with each pseudoknot. Nested base pairs (i.e., branches) belong to the pseudoloop if they are not part of any nested substructure.

A stem is a maximal sequence of stacked pairs, where successive base pairs are the closing base pairs of an internal loop. In Figure 1.1, base pairs 16.48, 17.47, 18.46, 19.45, 20.44, and 21.43 represent a stem.

A band is a maximal set of pseudoknotted base pairs, all of which cross exactly the same set of base pairs. In Figure 1.2a, base pairs 1.25, 2.24, 3.23, 4.22, 5.20, 6.19, 7.18 and 8.17 represent a band.

1.2 Accuracy measures

Following common practice [3, 11, 41], we measure the accuracy of a predicted RNA secondary structure relative to a reference secondary structure by F-measure, which is the harmonic mean of sensitivity (also called true positive rate or recall) and positive predictive value (PPV, also referred to as precision). We define these values as follows:
[Figure 1.1: A pseudoknot-free structure in both graphical (a) and arc diagram (b) formats. In (a), some of the loops are annotated by their type: loops with one emanating branch are hairpin loops; loops with two emanating branches are either internal or bulge loops; and loops with three or more emanating branches are multiloops.]

[Figure 1.2: Arc diagram representation of an H-type pseudoknot (a), a kissing hairpin (b) and a chain of four interleaved stems (c). Dots along the horizontal line represent bases from the 5′ (left) to the 3′ (right) ends. Arcs represent base pairs. Some arcs cross, thereby introducing pseudoknots.]
Sensitivity = (number of correctly predicted base pairs) / (number of base pairs in the reference structure)

PPV = (number of correctly predicted base pairs) / (number of predicted base pairs)

and

F-measure = (2 × sensitivity × PPV) / (sensitivity + PPV)

We also define these values as 0 when their denominators are 0. When a prediction agrees with the reference structure, the value of F-measure is equal to 1 (as are the values of sensitivity and PPV). When the value of sensitivity or PPV is equal to 0, the predicted structure does not have any base pairs in common with the reference structure.

We note that there are other accuracy measures, including true negative rate (or specificity), negative predictive value, false positive rate, false discovery rate and the Fβ-measure (in which β determines the relative importance of recall to precision), that are used in specific applications [36]. However, in evaluating the prediction accuracy of methods for RNA secondary structure prediction, sensitivity, PPV and F-measure are widely used.

1.3 RNA secondary structure prediction problem and thesis objectives

The overarching goal of this thesis is to overcome some limitations of previous methods for prediction of RNA pseudoknotted secondary structures, namely (1) slow running time, (2) poor prediction accuracy, (3) limits on the types of pseudoknots they can predict, and (4) limited opportunity for the user to provide structural information that can guide prediction. To this end we propose two main methods (Iterative HFold and CCJ) and compare them with some of the best existing methods (HotKnots V2.0, IPknot, and HFold), both in terms of time and memory efficiency and in terms of prediction accuracy. We focus on methods that are based on variants of the hierarchical folding hypothesis and on minimum free energy prediction.
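Returning briefly to the accuracy measures of Section 1.2: they reduce to simple set arithmetic over base pairs, as in the following minimal sketch (function and variable names are ours):

```python
def accuracy(reference, predicted):
    """Sensitivity, PPV and F-measure of a predicted structure relative
    to a reference structure, each defined as 0 when its denominator
    is 0. Structures are given as sets of base pairs (i, l)."""
    correct = len(reference & predicted)  # correctly predicted base pairs
    sens = correct / len(reference) if reference else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    f = 2 * sens * ppv / (sens + ppv) if sens + ppv > 0 else 0.0
    return sens, ppv, f
```

For instance, a prediction sharing two of three reference base pairs while predicting three pairs in total has sensitivity, PPV, and F-measure all equal to 2/3.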
We use the energy model of HotKnots V2.0, which has already yielded good predictions not only for the HotKnots method but also for methods based on maximum expected accuracy [83].

With respect to methods based on the hierarchical folding hypothesis, our goals are, first, to improve the efficiency and accuracy of our previous implementation of HFold, and second, to investigate variants of the hypothesis. The simplest version of the hypothesis, as implemented by HFold, is that a sequence first folds into an MFE pseudoknot-free structure, and subsequently pseudoknotted base pairs are added. In this thesis we consider other variants of the hypothesis: that the initially formed pseudoknot-free structure is not necessarily the MFE structure, and/or that some base pairs from this initial structure are removed as pseudoknotted base pairs are added. These variants are consistent with the findings of Wilkinson et al. [100] and Ding et al. [31] that allowing for disruption of some base pairs in the initially formed pseudoknot-free secondary structure can improve prediction accuracy.

Since HFold and the new variants that we introduce in this thesis take as input an initial pseudoknot-free structure, another objective of this thesis is to investigate how the prediction accuracy of methods based on hierarchical folding improves as a function of the percentage of the reference structure that is provided as input.

With respect to MFE-based methods, our goal is simply to obtain more efficient and general methods for prediction of MFE pseudoknotted structures.

In addition to the above, another objective of this thesis is to compare our two approaches to RNA secondary structure prediction, based on hierarchical folding and on MFE prediction, and to describe the merits and shortcomings of each approach. We also compare our approaches with approaches already in the literature.

1.4 Thesis contributions

The contributions of this thesis are as follows:

1. We present an implementation of HFold with new energy parameters (see Chapter 4). We previously published the theoretical contribution of our HFold algorithm, which is based on the hierarchical folding hypothesis [52]. However, the implementation of the algorithm was not publicly available due to its limited accuracy. Here we present an updated version of HFold with the energy parameters of HotKnots V2.0 [11]. We used basic optimization techniques on the HFold source code to make it use memory more efficiently.

2. We propose the HFold-PKonly algorithm, a version of the HFold algorithm that only adds pseudoknotted base pairs to lower the energy of the input constraints (see Chapter 4). HFold-PKonly is especially useful when the user either has complete information about the true pseudoknot-free structure or wants to check whether a single stem of the input structure can be part of a pseudoknot: if the input structure contains only the specific stem in question, the output structure of HFold-PKonly determines whether the given stem can be part of a pseudoknot.

3. Motivated by the literature on the hierarchical folding hypothesis, and by the over-prediction of base pairs both in minimum-free-energy-based prediction methods and in methods based on strict hierarchical folding, we define a new folding hypothesis, namely "relaxed hierarchical folding". We propose the Iterative HFold algorithm, which uses structure modification consistent with the relaxed hierarchical folding hypothesis while staying within O(n^3) time and O(n^2) space complexity. The Iterative HFold algorithm takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure (see Chapter 4).
Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0 [11]. Iterative HFold has additional features that result in improved prediction accuracy over HFold: in particular, not all base pairs of the input structure need be part of the output structure. Iterative HFold can predict a wide range of structures, including H-type pseudoknots and kissing hairpins. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the base pairs of the input structure, and with more modest but still positive improvement in accuracy when further structural information is provided.

4. We propose CCJ, a novel MFE-based algorithm that significantly expands the class of structures that can be handled in O(n^5) time (see Chapter 5). We describe a more general method of formulating the dynamic programming recurrences for prediction of pseudoknotted RNA secondary structures that cover gapped regions. We introduce two new ideas into the dynamic programming recurrences that improve the time complexity to O(n^5): (i) a new class of structures called TGB structures, with at most three groups of bands, and (ii) recurrences that handle TGB structures by transferring to the left, right, middle or outer bands. Our algorithm can handle H-type pseudoknotted structures, kissing hairpins, and chains of four interleaved stems, as well as nested substructures of these types. There are several algorithms for predicting MFE pseudoknotted secondary structures that run in Θ(n^5) time and Θ(n^4) space [59, 95].
All of these algorithms handle a more limited class than does the Rivas and Eddy algorithm, a very general MFE-based algorithm with Θ(n^6) running time and Θ(n^4) space. All can handle H-type pseudoknots, and some can handle kissing hairpin structures when these do not have arbitrary nested substructures [95]. There are also some algorithms that run in Θ(n^4) time [59, 75]; these handle classes of structures that are even more restricted than those of the Θ(n^5) algorithms. However, none of the algorithms to date handles kissing hairpin structures with arbitrary nested substructures, and this is a real limitation given the biological importance of such structures (see, e.g., [14, 21, 28]). For example, neither of the structures of Figure 3.2 can be handled by any of the algorithms with O(n^5) running time.

5. We perform a thorough empirical evaluation of CCJ, HFold, HFold-PKonly and Iterative HFold on large data sets (see Chapter 6). We use four data sets, namely HK-PK with 88 pseudoknotted structures, HK-PK-free with 337 pseudoknot-free structures, IP-pk168 with 168 pseudoknotted structures, and DK-pk16 with 16 pseudoknotted structures with strong experimental support (see Section 2.2). We thoroughly compare the performance of our algorithms with some of the best existing methods for prediction of pseudoknotted structures (HotKnots V2.0 and IPknot). We also compare the performance of our Iterative HFold algorithm with that of ShapeKnots, a method that incorporates SHAPE reactivity scores in its prediction algorithm, on the data set of Hajdin et al. [40].

Our experimental evaluation shows that Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy on three of our four data sets, namely HK-PK, HK-PK-free and DK-pk16. Iterative HFold has significantly better accuracy than HotKnots on the IP-pk168 data set, and than IPknot on the HK-PK and IP-pk168 data sets.
Moreover, Iterative HFold is more accurate than ShapeKnots on a small test set that was previously used to evaluate ShapeKnots, even though Iterative HFold does not use SHAPE reactivity scores.

Due to its MFE-based nature, the CCJ algorithm is the slowest and uses the most memory. However, its prediction accuracy is comparable to that of Iterative HFold on all data sets. Similar to Iterative HFold, CCJ's prediction accuracy is significantly better than that of HotKnots on the IP-pk168 data set, and than that of IPknot on the HK-PK* and IP-pk168 data sets.

6. Since space complexity is a bottleneck for usability of the CCJ algorithm, we describe a new way of using candidate lists to reduce the space complexity of the CCJ algorithm to O(n^2 Z) instead of O(n^4), and provide the recurrences for the sparse CCJ algorithm (see Chapter 7). Previously published methods using the sparsification technique [12, 67, 81, 99] either focused on pseudoknot-free methods (due to the complexity of pseudoknotted methods) or used pseudoknotted methods with simplified energy models (base pair maximization only), and they mainly aimed to reduce the time complexity of the methods. To the best of our knowledge we are the first researchers to apply the sparsification technique to a pseudoknotted method with a realistic energy model and parameters, focusing on reducing its space complexity.

7. Based on our experimental analysis of the HFold, Iterative HFold and CCJ algorithms, we present merits and shortcomings of the hierarchical folding hypothesis and the minimum free energy folding hypothesis (see Chapter 8).
While we did not observe any significant difference in the performance of the HFold, Iterative HFold, and CCJ algorithms, we found that (i) Iterative HFold better identifies the pseudoknotted base pairs in reference structures than HFold, and (ii) predicting the structure with the minimum free energy does not seem to be directly correlated with finding the structure with the highest accuracy.

The work presented in this thesis benefits the general RNA community by providing state-of-the-art algorithms for predicting RNA secondary structure including pseudoknots, which improve the accuracy of structure prediction (key to determining RNA function) and open up the possibility of predicting the structure of larger RNA molecules, previously impossible cases.

Published results of this work have attracted significant attention from the community. The recently published Iterative HFold algorithm has been accessed more than 2400 times as of January 7, 2015, based on the BMC Article Metrics. The publication outlining the CCJ algorithm has been cited 16 times as of January 7, 2015, based on Google Scholar Citations.

1.5 Thesis outline

The remainder of this thesis is organized as follows. In Chapter 2, we provide a review of the most relevant thermodynamic-based algorithms for RNA pseudoknotted secondary structure prediction, describe the best performing existing algorithms, and describe the data sets and statistical analysis methods used to evaluate the performance of the algorithms in this work. In Chapter 3, we provide a formal definition of secondary structure features and details of the energy model used in this thesis. In Chapter 4, we provide an overview of the algorithms that the Iterative HFold algorithm builds on, and then explain the details of our HFold-PKonly and Iterative HFold algorithms. In Chapter 5, we explain our CCJ algorithm and the class of structures it can handle.
In Chapter 6, we present a thorough comparison of the proposed algorithms, Iterative HFold and CCJ, with some of the best existing methods for prediction of RNA secondary structure. In Chapter 7, we provide an overview of a sparsification technique that was previously used to improve the time and space complexity of MFE-based RNA pseudoknot-free structure prediction algorithms, and then explain how we used candidate lists to reduce the space complexity of the CCJ algorithm. Finally, in Chapter 8, we provide conclusions of this work and present directions for future work.

Chapter 2

Background and related work

In this chapter, we first review the most relevant thermodynamic-based algorithms for RNA pseudoknotted secondary structure prediction. Then we describe the data sets and statistical analysis methods used to evaluate the performance of the algorithms in this work. Finally, we present a summary of the chapter.

2.1 Computational RNA secondary structure prediction including pseudoknots

In this section, we first give an overview of computational methods for predicting the minimum free energy secondary structure for an RNA sequence. Then, we briefly discuss heuristic approaches for secondary structure prediction, and in particular, we give an overview of HotKnots. We then explain the basics of prediction methods based on maximum expected accuracy, and provide a brief overview of IPknot. In Section 2.1.4 we discuss methods that accept input constraints for prediction of pseudoknotted structures; we then give a brief explanation of ShapeKnots, currently the best method for incorporating SHAPE reactivity scores into structure prediction. Finally, in Section 2.1.5, we discuss the hierarchical folding hypothesis and introduce HFold.

HFold has two potential advantages over MFE-based secondary structure prediction. First, HFold's hierarchical folding principle may model biological folding just as well as, or better than, the MFE structure formation hypothesis, at least on biological structures.
Second, HFold has O(n^3) running time, where n is the length of the RNA sequence, making it significantly more efficient than MFE-based methods, which require Ω(n^5) time or more to predict biologically important pseudoknotted structures.

Another potential advantage of HFold over heuristic methods such as HotKnots or ShapeKnots is that, unlike these methods, HFold minimizes the free energy of the possibly pseudoknotted output structure relative to the given input structure. Therefore HFold's method of adding pseudoknotted stems is better motivated energetically than that of HotKnots or ShapeKnots.

The reason for highlighting these four methods (summarized in Table 2.1) is that we later compare the performance of our algorithms with them.

Table 2.1: Summary of the methods with which we will later compare the performance of our algorithms.

Name          | Type                 | Structures handled  | Reference
HotKnots V2.0 | heuristic            | general pseudoknots | [11, 76]
IPknot        | MEA                  | up to level 3       | [83]
ShapeKnots    | heuristic            | general pseudoknots | [40]
HFold         | hierarchical folding | density-2           | [52]

2.1.1 Minimum-free-energy-based methods

Algorithms for MFE prediction of pseudoknot-free secondary structures exploit two useful properties. First, a pseudoknot-free secondary structure for a sequence is either closed by a base pair connecting the first and last base in the sequence, or can be broken down into two independent substructures on a prefix and suffix of the sequence. Second, the total energy of a structure which is composed of two independent substructures is the sum of the energies of the loops of the substructures.
As a result, a divide-and-conquer approach, based on dynamic programming, can be used to find the MFE pseudoknot-free secondary structure; the run-time of such algorithms is Θ(n^3) for standard loop energy models [69].

Since MFE pseudoknotted secondary structure prediction is NP-hard [5, 58, 59], algorithms for MFE pseudoknotted secondary structure prediction trade off run-time complexity and generality. We say that a structure R can be handled by a given algorithm if R is in the class of structures over which the algorithm optimizes. We note that, even when the true structure R for a sequence is handled by an algorithm, the algorithm still may not correctly predict R, because correctness depends not only on the generality of the algorithm but also on the energy model and energy parameters.

The most general MFE-based algorithm for prediction of pseudoknotted structures was proposed by Rivas and Eddy and is called Pknots [77]. Pknots is a complex dynamic programming algorithm with a time complexity of Θ(n^6) and a space complexity of Θ(n^4). There are algorithms of Uemura et al. [95] and Lyngso and Pedersen [59] for predicting MFE pseudoknotted secondary structures that run in Θ(n^5) time and Θ(n^4) space. Both of these algorithms handle a more limited class than does the Rivas and Eddy algorithm. All can handle H-type pseudoknots, and some can handle kissing hairpin structures when these do not have arbitrary nested substructures [95]. There are also some algorithms that run in Θ(n^4) time [59, 75]; these handle classes of structures that are even more restricted than those of the Θ(n^5) algorithms. However, none of these algorithms handles kissing hairpin structures with arbitrary nested substructures, and this is a real limitation given the biological importance of such structures.
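The prefix/suffix decomposition described at the start of this subsection can be illustrated by a simplified Θ(n^3) dynamic program that maximizes the number of base pairs rather than summing loop energies (a Nussinov-style stand-in for the loop-based MFE algorithms; the function name and the minimum hairpin size are our choices):

```python
def max_base_pairs(seq, min_loop=3):
    """Base-pair maximization: base i is either unpaired (a prefix of
    the problem splits off) or pairs with some k, splitting the problem
    into independent inner and outer subproblems."""
    ok = {("A", "U"), ("U", "A"), ("G", "C"),
          ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    N = [[0] * n for _ in range(n)]
    for span in range(min_loop + 2, n + 1):        # subsequence length
        for i in range(n - span + 1):
            j = i + span - 1
            best = N[i + 1][j]                     # base i unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in ok:         # i pairs with k
                    inner = N[i + 1][k - 1]
                    outer = N[k + 1][j] if k < j else 0
                    best = max(best, 1 + inner + outer)
            N[i][j] = best
    return N[0][n - 1] if n else 0
```

Replacing the pair count with loop free energies (and max with min) gives the standard Θ(n^3) MFE recurrences; the pseudoknotted algorithms discussed above extend this decomposition to gapped regions at much higher cost.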
For example, neither the structure of Figure 2.1 nor that of Figure 1.2c can be handled by any of the algorithms with O(n^5) running time.

[Figure 2.1: Arc diagram representation of a simplified version of the structure of the aptamer core of a SAM-IV riboswitch [98], preserving the patterns of crossing base pairs.]

2.1.2 Heuristic methods

The high running time and/or space usage of MFE-based methods for prediction of pseudoknotted structures of large RNA sequences has been the main reason for the development of heuristic methods [11, 51, 76, 86–88]. Similar to the dynamic programming approaches, heuristic methods are guided by energy models and output one or more low energy structures, but not necessarily the MFE structure. Heuristic methods usually run faster than the MFE-based methods that handle the same class of structures. For example, HotKnots V2.0 [11, 76] is a heuristic approach that is guided by energy minimization and can handle kissing hairpin structures. However, HotKnots is still slow on long sequences.

Recent heuristic procedures include iterated loop matching (ILM) [80], HotKnots [11, 76], HPknotter [51], KnotSeeker [86], and DotKnot [87, 88]. ILM uses a dynamic programming algorithm for prediction of pseudoknot-free structures to identify a promising helix, then adds this helix to the structure, removes the bases forming this helix from the sequence, and iterates to find additional helices.

Pseudoknot detection approaches, such as HPknotter and KnotSeeker, work on the assumption that if the part of the RNA sequence containing a pseudoknot is detected with high accuracy, the remaining sequence can be efficiently folded using state-of-the-art MFE secondary structure prediction programs.

HPknotter is a heuristic approach for detecting H-type pseudoknots in an RNA sequence by incorporating several existing algorithms: RNAMotif [60], Pknots [77], NUPACK [33] and pknotsRG [75]. RNAMotif is an RNA structural motif search tool and is used in HPknotter to filter out the subsequences that do not have the requirements to fold into an H-type pseudoknot, avoiding the long running time of the dynamic programming components (i.e., Pknots, NUPACK or pknotsRG). Huang et al. use the following steps to predict a pseudoknotted structure. First, running RNAMotif's structural matcher on a given RNA sequence, they find a large number of possible pseudoknot fragments (called hits); then, using the NUPACK energy calculation tool, they remove those possible pseudoknot fragments that can fold into lower energy pseudoknot-free structures. Second, they run Pknots, NUPACK, or pknotsRG on each filtered hit (a short RNA sequence) and remove the hit from the list if it does not fold into a pseudoknotted structure. Finally, they produce a minimum weight independent set of the filtered hits, where the weight corresponds to the free energy value, and return a mutually disjoint pseudoknot set as the result.

KnotSeeker takes a heuristic approach similar to that of HPknotter. Using the GUUGle tool [39] and RNAeval (Vienna package 1.7) [48], it finds stable structures, including hairpin and bulge loops, with free energy less than +2.0 kcal/mol.
Given the list of filtered structures from step one, and restricting the size of possible pseudoloops to less than 90 bases and their free energy to less than −2.5 kcal/mol, KnotSeeker finds possible H-type pseudoknots and verifies them using pknotsRG [75]. A minimum weight independent set of all structures found in steps one and two is then computed (this includes nested structures), and the final result is the structure with the minimum free energy.

DotKnot is an extension of KnotSeeker with two main differences: 1) a pseudoknot-free partition function is used to find a set of promising structure elements, and 2) no dynamic programming algorithm is used for pseudoknot candidate verification. Given an RNA sequence, DotKnot returns only the (possibly empty) set of detected pseudoknots. The structure of the rest of the sequence, which is pseudoknot-free, can then be found using a pseudoknot-free MFE-based method. In this way DotKnot can predict H-type pseudoknots and kissing hairpins.

HotKnots

HotKnots [11, 76] is a heuristic program that, given an RNA sequence, first finds up to 20 lowest energy stems (from the set of all stems for the given RNA sequence), called hotspots. We note that for short RNA sequences the number of hotspots may be fewer than 20, but for long RNA sequences, independent of their length, 20 hotspots will be generated. Then, keeping all these stems, it adds other non-overlapping low energy stems to the stems found in the first step, so as to minimize the energy of the overall structure, eventually producing up to 20 output structures.

We note that most recent advances in heuristic methods have focused on achieving high accuracy and thus have targeted specific classes of pseudoknotted structures. However, Andronescu et al.
[11] estimated the energy parameters of HotKnots using a large set of pseudoknotted structures (with over 400 RNA molecules), refined machine learning techniques, and their constraint generation program, resulting in an average prediction accuracy improvement of 11% over its previous energy model (that of Dirks and Pierce [33]). Therefore, considering the accuracy measures and the size of the tested data set, HotKnots V2.0 is still among the best performing heuristic approaches for general classes of pseudoknotted structures. We later compare the performance of our algorithms with HotKnots.

2.1.3 Methods based on maximum expected accuracy

Methods based on maximum expected accuracy for prediction of pseudoknotted structures, such as the IPknot method of Sato et al. [83], are motivated by the finding of Mathews [61] that base pairs with high base pairing probabilities in the thermodynamic ensemble are more likely to be in the known structure.

IPknot

IPknot is a secondary structure prediction method based on Maximum Expected Accuracy (MEA) of base pairs, employing the free energy parameters of HotKnots V2.0 [11]. In addition to an RNA sequence, IPknot takes several parameters as input. We next describe each of these parameters and settings briefly.

• level: If structure G can be decomposed into k disjoint pseudoknot-free structures, G1, G2, ..., Gk, such that every base pair in Gi crosses the base pairs of Gj, 1 ≤ j < i ≤ k, Sato et al. say that structure G has k levels. For example, a pseudoknot-free structure has level 1, and an H-type pseudoknot has level 2, as does a kissing hairpin. Equivalently, when representing the secondary structure in dot-bracket format, the number of different bracket types needed to represent the structure is the level of the structure.
IPknot can handle structures up to level 3.

• scoring model: The energy model used to produce posterior probabilities for each base pair is called the "scoring model". IPknot has three different scoring models, namely "CONTRAfold", "McCaskill" and "NUPACK".

• refining parameters: The procedure of recalculating the base pair probabilities based on the original prediction results is referred to as "refining parameters".

• base pair weights for each level: Positive numbers representing the rate of true base pairs in each level.

In a comprehensive comparison performed by Puton et al. [73] of the performance of publicly available non-comparative RNA secondary structure prediction methods that can handle pseudoknotted structures, IPknot ranks first when considering both short and long RNA sequences. We later compare the performance of our algorithms with IPknot.

2.1.4 Methods that incorporate input constraints

Incorporating known structural information can improve the accuracy of structure prediction. For example, Mathews et al. [64] used SHAPE reactivity data to improve the prediction accuracy from 26.3% to 86.8% for the 5S rRNA of E. coli. Roughly, the larger the SHAPE reactivity value for a given nucleotide, the more likely it is that the nucleotide is unpaired in the structure. However, limited SHAPE reactivity data is available, and the data does not unambiguously determine whether a base is paired or not or, if it is paired, to what other nucleotide.

ShapeKnots

Deigan et al. [27] created pseudo energy terms from SHAPE reactivity data, as a means of integrating such data into prediction software. They reported prediction accuracy of 96% to 100% for three moderate-sized RNAs (< 200 nucleotides) and for 16S rRNA (about 1500 nucleotides).
ShapeKnots [40] is a more recent method for incorporating SHAPE reactivity data for pseudoknotted structures; it incorporates the pseudo energy terms into a heuristic method similar to HotKnots [76].

2.1.5 Methods based on the hierarchical folding hypothesis

We previously presented HFold [52], an approach for prediction of pseudoknotted structures, motivated by two goals, namely to avoid the high running time complexity of other methods for pseudoknotted secondary structure prediction and to leverage the hierarchical folding hypothesis. Given a pseudoknot-free structure as input, HFold predicts a possibly pseudoknotted structure from a broad class that contains the given input structure and, relative to that constraint, has minimum free energy. HFold's running time is O(n^3), significantly faster than other methods for predicting pseudoknotted structures. Several experts have provided evidence for the hierarchical folding hypothesis [13, 23, 62, 94].

The class of structures that HFold can handle, density-2 structures, includes many important pseudoknots including H-type pseudoknots, kissing hairpins and infinite chains of interleaved bands, with arbitrary nested (pseudoknotted) substructures.

A density-2 secondary structure is a bi-secondary structure R (i.e., a secondary structure that can be partitioned into two pseudoknot-free secondary structures, or a level 2 structure based on the definition of level provided by Sato et al. [83]) with an additional constraint, which is easy to describe intuitively in terms of the structure's arc diagram. Take any region [i, j], and remove all proper nested substructures. Choose any base l ∈ [i, j]. Draw a vertical line through base l in the arc diagram. Then, the vertical line should intersect at most two bands. We refer the reader to Chapter 3 for a formal definition of density-2 structures. Figure 2.2 illustrates density-2 secondary structures.
Figure 2.3 shows a structure that is a bi-secondary structure but not a density-2 structure.

Figure 2.2: Two examples of density-2 structures. The top figure (a) shows a pseudoknotted structure with a chain of an arbitrary number of bands, and the bottom figure (b) shows an H-type pseudoknot with an arbitrary depth of nested H-type pseudoknots.

Figure 2.3: A bi-secondary structure that is not density-2. This structure is an example of a level 2 structure based on the definition provided by Sato et al. [83].

HFold secondary structure prediction problem

HFold considers the secondary structure prediction problem as follows: given a sequence S and a pseudoknot-free secondary structure G (a set of base pairs), find a pseudoknot-free secondary structure G′ (a set of base pairs disjoint from G) for S, such that the free energy of G ∪ G′ is less than or equal to the free energy of G ∪ G′′ for all pseudoknot-free structures G′′ ≠ G′.

2.2 Data sets

We use three data sets to analyze the performance of our algorithms. Our first data set is the test data set of Andronescu et al. [11], which was used to evaluate the performance of HotKnots V2.0. This data set contains 446 distinct RNA sequences and their reference structures, of which 348 are pseudoknot-free and 98 are pseudoknotted, and was compiled from the RNA STRAND database [7].

There are eight cases in this data set for which the original sequence and structure were shortened to accommodate restrictions in length. We removed them from our data set. From now on we use "HK-PK" to refer to the pseudoknotted structures in this set (with 88 structures) and "HK-PK-free" to refer to the pseudoknot-free structures in this set (with 337 structures). RNA sequences in HK-PK and HK-PK-free have length between 10 and 400 nucleotides.
We note that structures in the HK-PK data set are all density-2 structures.

Our second data set is the pk168 data set of Sato et al. [83]. This set contains 168 pseudoknotted structures from 16 categories of pseudoknots, and was compiled from the PseudoBase database [96]. The sequences in this set have at most 85% similarity and length of at most 140 nucleotides. We refer to this data set as "IP-pk168". We note that there is no overlap between the IP-pk168 and the HK-PK data sets.

Our third data set is the test data set of Sperschneider et al. [89]. This set contains 16 pseudoknotted structures with strong experimental support. RNA sequences in this set have length between 34 and 377 nucleotides. We refer to this data set as "DK-pk16".

One of our proposed algorithms, CCJ, has high memory requirements, so we were unable to run it on RNA strands of length > 195 bases. To accommodate CCJ's length restriction we compiled two extra data sets, "HK-PK*" and "DK-pk16*", respectively corresponding to the HK-PK and DK-pk16 data sets without sequences longer than 195 bases. Table 2.2 summarizes the data sets used in this thesis.

Table 2.2: Summary of data sets used in this thesis.

Name        # of sequences  Sequence lengths  Reference
HK-PK       88              26 – 400          [11]
HK-PK*      85              26 – 195          [11]
HK-PK-free  337             10 – 194          [11]
IP-pk168    168             21 – 137          [83]
DK-pk16     16              34 – 377          [89]
DK-pk16*    13              34 – 194          [89]

2.3 Bootstrap percentile confidence intervals

To formally assess how the measured prediction accuracy of a method depends on the given set of RNAs, we use bootstrap confidence intervals, a well-known statistical resampling technique [47, 97]. Following the recent work of Aghaeepour and Hoos [3] and Hajiaghayi et al. [41], we calculate the bootstrap 95% percentile confidence interval of the average F-measure as follows.
For each vector f of F-measures (where, for example, f may be the F-measures of predictions obtained by Iterative HFold on pseudoknotted structures) we first take 10^4 resamples with replacement, where each resample has the same length as the original sample vector f (i.e., |f|), and then calculate their average F-measures. These 10^4 calculated average F-measures represent the bootstrap distribution for the vector f. We then report the 2.5th and 97.5th percentiles of this distribution (i.e., the bootstrap distribution of the 10^4 average F-measures calculated above) as the lower and upper bounds of the confidence interval respectively, and call it the bootstrap 95% percentile confidence interval. By reporting the bootstrap 95% percentile confidence interval for the average F-measure of a method, A, on a data set, D, we say that we are 95% confident that the average F-measure of method A on the data set D is in the reported interval. All calculations are performed using the "boot" package of the R statistics software environment [93].

2.4 Permutation test

Following the recent work of Hajiaghayi et al. [41], we use a two-sided permutation test to assess the statistical significance of the observed performance differences between two methods. The test proceeds as follows, given a data set and two structure prediction procedures, A and B. First, we calculate the difference mean(fA) − mean(fB) in means between the sets of F-measure values obtained by A and B. Then we combine the two sets fA and fB and record the difference in sample means for 10^4 randomly chosen ways of partitioning the combined set into two sets of sizes |fA| and |fB|. The p-value is the proportion of the sampled permutations in which the absolute difference was greater than or equal to the absolute difference of the means of the sets fA and fB.
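Both resampling procedures can be sketched compactly. The following is an illustrative Python version (the thesis uses the R "boot" and "perm" packages; the F-measure vectors here are made up for demonstration):

```python
import random
import statistics

def bootstrap_percentile_ci(f, n_resamples=10_000, seed=0):
    """Bootstrap 95% percentile confidence interval for the mean of f."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(f, k=len(f))) for _ in range(n_resamples)
    )
    # 2.5th and 97.5th percentiles of the bootstrap distribution of the mean.
    return means[int(0.025 * n_resamples)], means[int(0.975 * n_resamples)]

def permutation_p_value(fa, fb, n_perms=10_000, seed=0):
    """Two-sided permutation test p-value for mean(fa) - mean(fb)."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(fa) - statistics.mean(fb))
    combined = fa + fb
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(combined)
        a, b = combined[:len(fa)], combined[len(fa):]
        if abs(statistics.mean(a) - statistics.mean(b)) >= observed:
            hits += 1
    return hits / n_perms

# Hypothetical F-measure vectors for two methods on the same data set.
fa = [0.42, 0.55, 0.61, 0.48, 0.70, 0.66, 0.39, 0.58]
fb = [0.35, 0.50, 0.52, 0.44, 0.63, 0.57, 0.33, 0.49]
lo, hi = bootstrap_percentile_ci(fa)
p = permutation_p_value(fa, fb)
print(f"95% CI for mean F-measure of A: [{lo:.3f}, {hi:.3f}], p-value: {p:.3f}")
```

The seeded generator makes the sketch deterministic; in practice one would report the interval and p-value over fresh random resamples, exactly as the R packages do.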
Then, if the p-value of this test is less than the 5% significance level, we reject the null hypothesis that methods A and B have equal accuracy and thus accept the alternative hypothesis that the difference in accuracy of methods A and B is significant. Otherwise, we cannot reject the null hypothesis. All calculations are performed using the "permTS" method in the "perm" package of the R statistics software environment.

2.5 Summary

In this chapter we reviewed many thermodynamics-based methods for prediction of pseudoknotted RNA secondary structures. We explained that MFE pseudoknotted RNA secondary structure prediction is NP-hard and that MFE-based algorithms therefore handle restricted classes of pseudoknotted structures. While MFE-based algorithms are guaranteed to find the minimum free energy secondary structure within the class considered, they usually have large space and time complexities that limit their usability. Moreover, the prediction accuracy of MFE-based algorithms is poor for long RNA sequences.

We then provided examples of heuristic methods for RNA pseudoknotted secondary structure prediction. Heuristic algorithms are not guaranteed to find the optimal structure including pseudoknots, but generally run faster than the MFE-based algorithms that handle the same class of structures. However, HotKnots is still slow on long sequences.

Methods based on maximum expected accuracy, such as IPknot, predict the base pairings with maximum pairing probabilities in the thermodynamic ensemble. A disadvantage of methods based on maximum expected accuracy is that they do not provide the energy of the predicted structure.
Note that this is easy to calculate, however.

We explained that incorporating input constraints in prediction algorithms improves prediction accuracy, and we discussed ShapeKnots as an example of a method that incorporates SHAPE reactivity scores as input constraints.

Furthermore, we provided a high level description of HFold, a method based on the hierarchical folding hypothesis. HFold's output structure is the minimum free energy density-2 structure conditioned on the given input structure.

In this chapter we also described the data sets used in this thesis, and explained the statistical methods used in this work to compare the performance of our algorithms with one another and with some of the best existing methods for prediction of RNA pseudoknotted structures.

To summarize, existing methods for prediction of pseudoknotted structures suffer from one or more of the following shortcomings: 1) slow running time, 2) poor prediction accuracy, or 3) limited classes of pseudoknots handled. Moreover, there is limited opportunity for the user to provide structural information, or constraints, that can guide prediction. In the case of a prediction method that incorporates user-defined constraints, it is also useful to understand the degree to which the method's accuracy persists as the input information degrades.

The contributions described in this thesis aim to address the above mentioned shortcomings, leading to improved pseudoknotted RNA secondary structure algorithms.

Chapter 3
Energy model

In this chapter we provide formal definitions of secondary structure features and details of the energy model used in this thesis.

3.1 RNA secondary structure

Let S be an RNA sequence of length n. When i ≤ l, a region [i, l] is the set of indices between i and l, inclusive, and when i > l, the region [i, l] is the empty set. A gapped region is the union of two regions [i, j] and [k, l] with i < j + 1 < k ≤ l. Base pair d.e spans gapped region [i, j] ∪ [k, l] if d ∈ [i, j] and e ∈ [k, l].
With respect to secondary structure R, we say that region [i, l] is structure-free if no base in the region is paired in R. Region [i, l] is weakly closed if no base pair connects a base in the region to a base outside the region. A weakly closed region [i, j], with at least two bases, is closed if it cannot be partitioned into two smaller weakly closed regions. Region [i, l] is a pseudoknot if it is weakly closed, both i and l are paired but not to each other, and the region cannot be partitioned into distinct weakly closed regions. Base pairs of secondary structure R partition the unpaired bases of sequence S into loops [74]. Loops, and their associated unpaired bases and closing base pairs, are defined as follows. Some of the loops are illustrated in Figure 3.1.

Loops in pseudoknot-free structures are comprised of regions of unpaired bases, separated by "closing" base pairs, from which stems of base pairs emanate.

• A hairpin loop has one closing base pair i.l; all bases in [i + 1, l − 1] are unpaired and belong to the loop.

• An internal loop has an external closing base pair i.l and a second closing base pair d.e, which is nested in i.l (i.e., i < d < e < l). All bases in regions [i + 1, d − 1] and [e + 1, l − 1] are unpaired and belong to the loop. In a bulge loop either [i + 1, d − 1] or [e + 1, l − 1] is an empty region, but not both. A special case is a stacked pair, in which d = i + 1 and e = l − 1.

Figure 3.1: An H-type pseudoknotted structure (left) and a pseudoknot-free structure (right) in graphical (top) and arc diagram (bottom) formats. In the top left structure, the start and end of the pseudoloop, unpaired bases of a pseudoknotted structure and one of the two bands are labelled. In the top right structure, some of the loops are annotated by their type: loops with one emanating branch are hairpin loops; loops with two emanating branches are either internal or bulge loops; and loops with three or more emanating branches are multiloops (not shown here).

• A multiloop with external base pair i.l arises in two cases. In the first, some base pair d.e is nested in i.l; also regions [i + 1, d − 1] and [e + 1, l − 1] are weakly closed but are not both structure-free. In the second, [d, e] is a pseudoknot and regions [i + 1, d − 1] and [e + 1, l − 1] are weakly closed (and may both be structure-free). An unpaired base u in [i, l] belongs to the multiloop if u can see the base pair i.l, that is, there is no base pair x.y in R with i < x < u < y < l. The closing base pairs of the multiloop are i.l, the base pair d.e in the first case, plus any non-pseudoknotted base pair x.y in [i + 1, d − 1] ∪ [e + 1, l − 1], where both x and y can see i.l.

• A pseudoloop is associated with each pseudoknot [i, l]. Its closing base pairs are of two types. Base pairs of the first type are the borders of the pseudoknot's bands. Here, a band is a maximal set of pseudoknotted base pairs, all of which cross exactly the same set of base pairs, and its borders are the base pair(s) with the largest and smallest spans. We refer to the border with the largest span as the external border. A band belongs to pseudoknot [i, l] if its base pairs are within [i, l] and are not within any pseudoknot nested within [i, l]. Closing base pairs of the second type are any pseudoknot-free base pairs whose endpoints can see at least two bands of the pseudoknot.
Finally, unpaired bases of the pseudoloop are those unpaired bases in [i, l] that do not belong to any other loop within [i, l].

• An external loop contains all of those unpaired bases in [1, n] that are not in any other loop.

A stem is a maximal sequence bp1, bp2, ..., bpk of base pairs, where successive base pairs bpi and bpi+1 are the closing base pairs of an internal loop. Note that a band may be composed of several stems, separated by multiloops.

We define the density of a structure as follows. Let L be a pseudoloop and i.i′ and j′.j be the closing base pairs of L. We say a band [i1, i′1] ∪ [j′1, j1] crosses k if i1 ≤ k ≤ j1. Let #B(L, k) be the number of bands associated with L that cross k. Then the density of L is:

density(L) = max{#B(L, k) | i, j define the borders of L; i ≤ k ≤ j}

The density of a structure, R, is the maximum density of L over all pseudoloops L of R. We say R is a density-2 structure if the density of R is at most 2.

We can obtain a motif for a secondary structure, which describes the pattern of overlaps of its stems, in the following way. Label each stem with a distinct symbol, write each symbol under both the left and right ends of the stem in the arc diagram for the structure (so each symbol appears twice), and concatenate the symbols in order from left to right. A structure is an H-type pseudoknot if its motif is ABAB, a kissing hairpin structure if its motif is ABACBC, and a chain of four overlapping stems if its motif is ABACBDCD (some of these types are illustrated in Figure 3.2).
If the motif for a structure has a substring AxByAzB, where A and B are symbols and x, y, and z are arbitrary strings, not all of which are empty, then we say that the structure has an H-type pseudoknot with nested substructures. Similarly, a structure contains a kissing hairpin with nested substructures if its motif has a substring of the form AvBwAxCyBzC, where A, B and C are symbols and v, w, x, y, and z are arbitrary strings, not all of which are empty.

3.2 Energy model

Many computational methods for predicting the secondary structure of an RNA (or DNA) molecule are based on models of the free energy of loops [11, 33, 48, 52, 63, 72, 75, 77]. Roughly speaking, base pairs tend to stabilize an RNA structure, whereas unpaired bases form destabilizing loops. In a thermodynamics-based energy model the free energy of a sequence S with respect to a fixed secondary structure R is the sum of the free energies of the loops of R. We define the free energy of a strand S to be the minimum free energy of the strand with respect to all structures R, which are usually taken over some well-defined class of structures, such as pseudoknot-free structures. The parameters of this model are driven in part by current understanding of experimentally determined free energies, and in part by what can be incorporated into an efficient algorithm. The free energy of a loop depends on temperature; throughout this work we assume the temperature is fixed at 37 °C.

Figure 3.2: (a) Arc diagram representation of a simplified version of the structure of the aptamer core of a SAM-IV riboswitch [98], but preserving patterns of crossing base pairs. Below both the left and right endpoints of each stem is a letter which identifies that stem; concatenating the letters from left to right yields the motif for this structure, namely ABCBDEDCAFFE. (b) A chain of four overlapping stems. This structure has motif ABACBDCD. If either the leftmost or rightmost stem were removed from this structure, the result would be a kissing hairpin structure, with motif ABACBC.

Pseudoknot-free energy model

We first summarize the notation used to refer to the free energy of pseudoknot-free loops, along with some standard assumptions that are incorporated into pseudoknot-free energy models. We refer to a model that satisfies all of our assumptions as a standard free energy model.

• eH(i, j): gives the free energy of a hairpin loop closed by i.j. We assume that for all but a small number of cases, eH(i, j) depends only on the length of the loop and the two paired bases i and j.

• eS(i, j): gives the free energy of a stacked pair that consists of i.j and (i + 1).(j − 1).

• eint(i, d, e, j): gives the free energy of an internal loop or bulge with exterior pair i.j and interior pair d.e.

• The free energy of a multiloop with k branches and u unpaired bases is a + bk + cu, where a, b and c are constants.

Pseudoknotted energy model

• The total energy of a band is the sum of the energies of its loops. If a band has no loops, i.e., consists of just one base pair, we define its energy to be 0.

• estP(i, j): defines the energy of a stacked pair in a band.

• eintP(i, d, e, j): defines the energy of an internal loop that spans a band.

• The free energy of a multiloop that spans a band with k branches and u unpaired bases is a′ + b′k + c′u, where a′, b′ and c′ are constants.

The energy of an exterior pseudoloop is calculated as the energy of its bands plus Pb × m + Pps × k + Pup × u + Ps, where m is the number of bands, k is the number of closed subregions and u is the number of unpaired bases. If the pseudoknot is inside a multiloop or a pseudoloop, Ps is replaced by Psm or Psp, respectively.
To illustrate the model, we calculate the free energy of the pseudoknotted structure of region [1, 32] of Figure 3.1 (left) as follows:

Ps + 2 × Pb + 14 × Pup + estP(2, 3, 19, 20) + eintP(3, 5, 18, 19) + estP(5, 6, 17, 18) + estP(14, 15, 29, 30) + estP(15, 16, 28, 29).

Table 3.1 summarizes the energy constants and functions used in our energy model for pseudoknotted structures. The values of these energy parameters are those of the DP09 parameter set of Andronescu et al. [11], used by the HotKnots V2.0 prediction software.

We note that there are minor differences between the energy models used by the algorithms presented in this thesis. In particular, in HFold, Iterative HFold and CCJ, some specific functions, including eint and eintP (and similarly the multiloop energy calculation), are calculated in slightly different ways in the pseudoknot-free and pseudoknotted energy models, because one takes dangling ends into account while the other does not. HotKnots V2.0 is consistent in including dangling penalties when calculating the energy of both pseudoknot-free and pseudoknotted structures.

Table 3.1: Energy parameters. This table provides the names, descriptions and values of the energy parameters and functions that we used in our methods. The names and definitions are the same as in our original HFold [52], and the values were updated based on the work of Andronescu et al. [11].
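The pseudoloop energy expression above can also be evaluated mechanically. The sketch below uses the pseudoloop constants of the DP09 parameter set from Table 3.1, but substitutes made-up stacking and internal loop energies for eS and eint, since the sequence-dependent values are not reproduced here:

```python
# Pseudoloop constants from the DP09 parameter set (Table 3.1), kcal/mol.
PS = -1.38   # exterior pseudoloop initiation penalty
PB = 2.46    # penalty for initiating a band
PUP = 0.06   # penalty per unpaired base of a pseudoloop

def est_p(e_s):
    """Stacked pair that spans a band: 0.89 * eS(i, j)."""
    return 0.89 * e_s

def eint_p(e_int):
    """Internal loop that spans a band: 0.74 * eint(i, d, e, j)."""
    return 0.74 * e_int

# Hypothetical pseudoknot-free energies for the loops of Figure 3.1 (left);
# real values depend on the nucleotides involved.
stacks = [-2.1, -1.5, -2.4, -1.8]   # eS for the four stacked pairs
internal = -1.0                     # eint for the one internal loop

# Ps + 2*Pb + 14*Pup + four banded stacks + one banded internal loop.
energy = (PS + 2 * PB + 14 * PUP
          + sum(est_p(s) for s in stacks)
          + eint_p(internal))
print(f"pseudoloop energy: {energy:.2f} kcal/mol")
```

Note how the positive initiation and unpaired-base penalties (here 4.38 kcal/mol in total) must be overcome by the stabilizing banded stacks for the pseudoknot to be favourable.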
These parameters were derived for a temperature of 37 °C and 1 M salt (NaCl) concentration.

Name              Description                                                     Value (kcal/mol)
Ps                exterior pseudoloop initiation penalty                          −1.38
Psm               penalty for initiation of a pseudoloop in a multiloop           10.07
Psp               penalty for initiation of a pseudoloop in a pseudoloop          15.00
Pb                penalty for initiating a band                                   2.46
Pup               penalty for unpaired base of a pseudoloop                       0.06
Pps               penalty for closed subregion inside a pseudoloop                0.96
eH(i, j)          energy of a hairpin loop closed by i.j
eS(i, j)          energy of stacked pair closed by i.j
estP(i, j)        energy of stacked pair that spans a band                        0.89 × eS(i, j)
eint(i, d, e, j)  energy of a pseudoknot-free internal loop
eintP(i, d, e, j) energy of internal loop that spans a band                       0.74 × eint(i, d, e, j)
a                 penalty for initiation of an ordinary multiloop                 3.39
b                 multiloop base pair penalty                                     0.03
c                 penalty for unpaired base of an ordinary multiloop              0.02
a′                penalty for initiation of a multiloop that spans a band         3.41
b′                branch penalty in a multiloop that spans a band                 0.56
c′                penalty for unpaired base in a multiloop that spans a band      0.12

Chapter 4
Hierarchical folding-based pseudoknotted secondary structure prediction

In this chapter we describe our Iterative HFold algorithm for RNA pseudoknotted structure prediction, which is given an initial pseudoknot-free input structure as guidance. Motivated by the literature on the hierarchical folding hypothesis, and to overcome some of the shortcomings of MFE-based methods and of those based on hierarchical folding (see Chapter 2), we define a new folding hypothesis, namely "relaxed hierarchical folding", and describe the Iterative HFold algorithm, which uses structure modification consistent with this hypothesis while staying within O(n^3) time and O(n^2) space complexity. Iterative HFold incorporates four different methods and reports the structure with the lowest energy, among all structures produced by these methods, as its final structure.
In the first two subsections we give a high-level description of two earlier methods that Iterative HFold builds on, namely SimFold (Section 4.1) and HFold (Section 4.2). In Section 4.3 we explain a third component of Iterative HFold, namely our HFold-PKonly algorithm, and discuss its differences from HFold. We then present the details of Iterative HFold in Section 4.4, along with examples and motivation for this approach. In Section 4.5 we present the summary of this chapter.

4.1 SimFold

Iterative HFold uses the SimFold RNA secondary structure prediction method [8, 9], which predicts the minimum free energy pseudoknot-free secondary structure for a given RNA sequence. SimFold uses a dynamic programming method similar to Zuker's MFold method [105], but achieves higher prediction accuracy than MFold. The pseudoknot-free energy parameters proposed by Andronescu et al. [9] were estimated using their constraint generation program, starting from the current values of the standard energy model [63] and taking into account most of the underlying experimental data values and a large set of known structures with various topologies. Andronescu et al. achieved on average about a 7% prediction accuracy improvement over the standard energy model. In addition to an RNA sequence S, SimFold can also take a pseudoknot-free secondary structure, G, as input and predict the MFE pseudoknot-free secondary structure that contains all base pairs of G.

In the following we briefly review key ideas of the dynamic programming algorithm, which computes the energy of the MFE pseudoknot-free secondary structure for a fixed sequence S = s1 s2 ... sn [63]. Let Wi,j be the energy of the MFE pseudoknot-free secondary structure for the subsequence si si+1 ... sj. If i ≥ j, Wi,j = 0, since the subsequence is empty. Otherwise, it must either be that i.j is a base pair in the MFE structure for si ... sj, or the MFE structure can be decomposed into two independent subparts. These two cases correspond to the two rows of the following recurrence for Wi,j:

Wi,j = min { Vi,j, min_{i ≤ r < j} (Wi,r + W(r+1),j) },

where Vi,j is the free energy of the MFE structure for si ... sj that contains i.j. If i ≥ j, Vi,j is set to ∞. Otherwise, i.j closes a hairpin, an internal loop, or a multiloop in the MFE structure for si ... sj.

We note that in min_{i ≤ r < j} (Wi,r + W(r+1),j), i and j are fixed, and the minimum is taken over all values of r such that i ≤ r < j.

4.2 HFold

Here, we provide a high level description of the HFold algorithm [52].

Similar to the definition of Wi,j for the pseudoknot-free case, here we define Wi,j as the energy of the MFE pseudoknotted secondary structure for the subsequence si si+1 ... sj, where the structure is constrained as follows. Let G be a given pseudoknot-free structure for a given sequence S, and let Gij be the substructure of G for the strand si ... sj. Let G′ be the pseudoknot-free structure for the sequence S such that G′ ≠ G and G ∪ G′ is the lowest energy structure for S given G. We define G′ij as the substructure of G′ for the strand si ... sj. If some arc of G covers i or j, then Wi,j = ∞. If i ≥ j, then Wi,j = 0. Otherwise we define Wi,j to be the energy of the MFE secondary structure Gij ∪ G′ij for the strand si ... sj. In this case, Wi,j satisfies the following recurrence:

Wi,j = min { Vi,j, min_{i ≤ r < j} (Wi,r + W(r+1),j), WMBi,j + Ps },

where the first two cases are the same as in the pseudoknot-free case and the last case is specific to pseudoknotted structures. Ps is the pseudoknot initiation penalty, given in the energy table presented in Section 3.2.

The third row of this recurrence accounts for the case when the optimal secondary structure Gij ∪ G′ij includes pseudoknotted base pairs and cannot be partitioned into two independent substructures for two regions [i, r] and [r + 1, j], for any r.
Such a structure must contain a chain of two or more successively-overlapping bands, which must alternate between G_{ij} and G′_{ij}, possibly with nested substructures interspersed throughout. Figure 4.1 provides an example, and shows how the recurrence for WMB unwinds when the example structure is the MFE structure.

In order to calculate the energies of substructures in such a structure, the recurrences use three additional terms: BE, VP, and WI. Roughly, these account for the energies of bands spanned by base pairs of G_{ij}, regions enclosed by pseudoknotted base pairs of G′_{ij} (excluding parts of those regions that are within a band of G_{ij}), and weakly closed subregions, respectively. WMB′ complements a special case of the WMB recurrence in which the rightmost band is not in G, but is part of the structure G′.

We refer the reader to our HFold paper [52] for further details of the algorithm.

4.3 HFold-PKonly

The third method on which Iterative HFold builds, called HFold-PKonly, is identical to HFold except that G′ may only contain base pairs that cross base pairs in G. Similar to HFold, HFold-PKonly is limited to density-2 structures. The prediction provided by HFold-PKonly can be useful in cases where HFold does not produce a pseudoknotted structure.

HFold-PKonly solves the following problem: given an RNA sequence S and a pseudoknot-free input structure G, find a pseudoknot-free structure G′ such that every base pair in G′ crosses some base pair of G and such that G ∪ G′ is the lowest energy structure that contains G among all such G′. Note that G′ may contain no base pairs.

Figure 4.1: Illustration of how the WMB recurrence unwinds to calculate WMB_{i,j}. Arcs above the horizontal line from i to j represent base pairs of G_{ij}, and arcs below the line represent base pairs of G′_{ij}.
Note that G_{ij} is the part of the given pseudoknot-free structure G whose base pairs lie between positions i and j, assuming that no arc of G covers i or j. G′_{ij} is the part of the pseudoknot-free structure G′ added by HFold whose base pairs lie between positions i and j, such that G_{ij} ∪ G′_{ij} is the MFE structure for strand s_i ... s_j given G_{ij}. (1) The overall structure is handled by the WMB_{i,j} recurrence. (2) A case of the WMB recurrence handles the overall structure whose energy is WMB_{i,j}, with terms to account for the energies of the right upper band (BE) and the right lower closed subregion (WI), as well as the remaining structure (WMB′_{i,l1}). (3) The term WMB′_{i,l1} is handled by a case of the WMB′ recurrence, with terms to account for the lower right substructure labelled VP_{l2,l1}, the upper left band (BE), and the remaining structure (WMB′_{i,l2-1}). (4) WMB′_{i,l2-1} is then handled by a case of the WMB′ recurrence, with terms to account for WI_{l3+1,l2-1} and WMB′_{i,l3}. (5) Finally, the WMB′_{i,l3} term is handled by VP_{i,l3}.

4.4 Iterative HFold

Iterative HFold is distinguished from the above three methods, namely SimFold, HFold and HFold-PKonly, in two important ways. First, the output of the HFold, HFold-PKonly and SimFold methods must contain the given pseudoknot-free input structure G, whereas Iterative HFold may modify the input structure. This can be useful when the given input structure is not a high-accuracy estimate of the true pseudoknot-free substructure of the reference structure, G_big. Given a pseudoknotted structure G, G_big is the pseudoknot-free structure obtained from G by removing the minimum number of pseudoknotted base pairs. We refer to the base pairs removed from G as the G_small structure.
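For illustration, G_big can be computed from a pseudoknotted structure by keeping a maximum-cardinality crossing-free subset of its base pairs, which is equivalent to removing the minimum number of pseudoknotted pairs. The following Python sketch (ours, not code from the thesis) does this with a standard interval dynamic program over base pairs given as (i, j) index tuples:

```python
from functools import lru_cache

def g_big(pairs, n):
    """Return a maximum crossing-free subset of base pairs (G_big).

    pairs: iterable of (i, j) with i < j, each index in at most one pair.
    n: sequence length. The pairs removed from the input form G_small.
    """
    by_right = {j: i for i, j in pairs}  # right endpoint -> left endpoint

    @lru_cache(maxsize=None)
    def best(i, j):  # max number of non-crossing pairs fully inside [i, j]
        if i >= j:
            return 0
        score = best(i, j - 1)  # j unpaired, or its pair is discarded
        k = by_right.get(j)
        if k is not None and k >= i:  # keep pair k.j; it splits [i, j]
            score = max(score, best(i, k - 1) + 1 + best(k + 1, j - 1))
        return score

    # Backtrack to recover one optimal subset.
    keep, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        k = by_right.get(j)
        if (k is not None and k >= i
                and best(i, j) == best(i, k - 1) + 1 + best(k + 1, j - 1)):
            keep.append((k, j))
            stack.append((i, k - 1))
            stack.append((k + 1, j - 1))
        else:
            stack.append((i, j - 1))
    return sorted(keep)
```

When two mutually crossing groups of pairs have the same size, either one may be returned as G_big.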
Blue arcs in Figure 1.2 represent base pairs of the G_big structure and green arcs represent the base pairs of the G_small structure.

Second, while HFold and HFold-PKonly can add base pairs that cross those in G, they cannot add base pairs that cross each other, and neither can SimFold. In contrast, Iterative HFold can add base pairs that cross each other. This is particularly useful when the input structure contains limited information about G_big, so that it is necessary to predict base pairs both in G_big and in G_small in order to get a good prediction.

Iterative HFold is comprised of four different iterative methods. Following the description of each method, we motivate why we chose to include it as part of our overall algorithm. Iterative HFold takes as input both an RNA sequence S and a pseudoknot-free secondary structure G; later we show that the structure G can be produced by computational methods, for example HotKnots hotspots or SimFold suboptimal structures, when only the sequence S is initially available.

Iterative HFold considers the problem of predicting the secondary structure as follows: given an RNA sequence S and a pseudoknot-free input structure G, run the following four methods and pick the structure with the lowest free energy among the four as the output structure. We refer to this folding hypothesis as the "relaxed" hierarchical folding hypothesis. Iterative HFold runs in O(n^3) time, as it runs four methods sequentially, each of which takes O(n^3) time.

Method 1: Run HFold on S and G, and store the resulting G ∪ G′.

Motivation: This is the core HFold method, motivated by the hierarchical folding hypothesis.

Method 2: First run HFold-PKonly on S and G. If HFold-PKonly results in a structure G ∪ G′ such that G′ is not the empty structure, then run HFold with sequence S and structure G′, and store the result. Otherwise, simply store G as the result. See the following example.
(We note that running HFold with S and G′ results in a structure G′ ∪ G′′, where it may be the case that G′′ ≠ G; i.e., G may not be part of the result of method 2.)

Motivation: When the input structure G does not agree with the reference G_big structure, it may still be the case that HFold-PKonly finds the pseudoknotted structure G_small (or a good approximation to G_small). A call to HFold with input G_small may then find a better approximation to G_big.

Example 1: Results of method 1 and method 2 of Iterative HFold.

S = GGGCUCUGGAGCCCCCCCCGAGCCCCGGCUAACUCUAUCUGUAGGGGGGC
G = ...........(((((...........................)))))..

Method 1: HFold on S and G
[[[[[[.[[..(((((]].]]]]]]..................)))))..  -12.75

Method 2:
HFold-PKonly on S and G
[[[[[[.[[..(((((]].]]]]]]..................)))))..
G1
[[[[[[.[[.......]].]]]]]].........................
HFold on S and G1
[[[[[[.[[.((((((]].]]]]]]...................))))))  -14.67

In this example, method 2 of Iterative HFold outperforms method 1: although both HFold and HFold-PKonly produce the same result on sequence S and input structure G, namely the structure G ∪ G′, the additional iteration in method 2, in which HFold is run with S and G′, finds a structure with lower energy than that of G ∪ G′.

Method 3: First run SimFold on S and G to obtain a result G′, a pseudoknot-free structure that contains G. Then let G_updated be the secondary structure of S containing the relaxed stems of G′ that include the base pairs of G. By a relaxed stem, we mean a secondary structure containing stacked base pairs, bulges of size 1, and internal loops of maximum size 3 (i.e., either the symmetric 1×1 loop or the non-symmetric 1×2 or 2×1 loop, but no other loop types; this is motivated by common practice [15]). Then run method 2 on S and G_updated, and store the result.
See Example 2.

Motivation: This method can work well when the given input structure has a small number of base pairs from G_big, because G_updated contains stems that include these base pairs, but avoids "overcrowding" with further base pairs that might prevent HFold-PKonly from finding pseudoknotted stems.

Example 2: Result of method 3 compared to all four methods of Iterative HFold.

S = GUUUGUUAGUGGCGUGUCCGUCCGCAGCUGGCAAGCGAAUGUAAAGACUGAC
G = ............(..........)............................

Method 1: HFold on S and G
((((((((((.(((........))).))))))))))................  -11.01

Methods 2 and 4:
(((.......))).[[[[.[[.(((.]].]]]].)))...............  -5.08

Method 3:
SimFold on S and G
((((((((((.(((........))).))))))))))................
G_updated
((((((((((.(((........))).))))))))))................
Method 2 on S and G_updated
((((((((((.(((.[[[.[[[))).)))))))))).........]]].]]]  -11.26

In this example, method 3 of Iterative HFold outperforms the other methods. Because the input structure G consists of just one base pair, method 1 (HFold) outputs a pseudoknot-free structure containing G. The outputs of methods 2 and 4 are pseudoknotted but do not contain the base pair of the input structure G. In contrast, method 3 first adds base pairs to G, resulting in the pseudoknot-free structure G_updated, and then adds additional pseudoknotted base pairs via method 2.

Method 4: Let S1 be the subsequence of S obtained by removing bases that are external unpaired bases with respect to the input structure G. Run SimFold on S1 and G1 (with base indices renumbered to agree with S1) to obtain pseudoknot-free structure G′. Then continue exactly as in method 3. See Example 3.

Motivation: This method is very similar to method 3, but further constrains G′, since the base pairs in G′ cannot involve bases that are removed from S to obtain S1. This potentially increases the possibilities for pseudoknotted base pairs to be added by method 2.
Example 3: Result of method 4 compared to all four methods of Iterative HFold.

S = CCGAGCUCUGUAGCGAGUGCUUGUAACCCGAGCGGGGGC
G = .(..................................)..

Methods 1 and 3:
((..((((.((.((((....)))).))..))))...)).  -5.10

Method 2:
.(..................................)..  +3.04

Method 4:
S1 and G1
S1 = CGAGCUCUGUAGCGAGUGCUUGUAACCCGAGCGGGG
G1 = (..................................)
SimFold on S1 and G1:
(..((((.((.((((....)))).))..))))...)
G_updated
.(..((((.((.((((....)))).))..))))...)..
Method 2 on S and G_updated
((..((((.((.((((..[[)))).))..))))..))]]  -6.05

In this example, method 4 of Iterative HFold outperforms the other methods. The input structure G has a high energy value, and neither method 1 (HFold) nor method 2 (HFold-PKonly) can expand the pseudoknot-free structure to add the pseudoknotted stem. Also, by adding too many pseudoknot-free base pairs, method 3 fails to find the pseudoknotted base pairs. Thus, method 4 performs better than methods 1, 2 and 3.

The pseudocode of Iterative HFold's algorithm follows.

Algorithm 4.4.1: IterativeHFold(Sequence S, Structure G)
input: sequence S and structure G
output: structure G_final and its energy E_final
comment: run methods 1-4 and choose the lowest energy structure

(G1, E1) ← method1(S, G)
G_final ← G1
E_final ← E1
(G2, E2) ← method2(S, G)
if E2 < E_final then { G_final ← G2; E_final ← E2 }
(G3, E3) ← method3(S, G)
if E3 < E_final then { G_final ← G3; E_final ← E3 }
(G4, E4) ← method4(S, G)
if E4 < E_final then { G_final ← G4; E_final ← E4 }
return (G_final, E_final)

Algorithm 4.4.2: method1(Sequence S, Structure G)
input: sequence S and structure G
output: structure G_method1 and its energy E_method1
comment: run HFold and return its result

return (HFold(S, G))

Algorithm 4.4.3: method2(Sequence S, Structure G)
input: sequence S and structure G
output: structure G_method2 and its energy E_method2

(G1, E1) ← HFold-PKonly(S, G)
if G1 is the empty structure
then return (G)
else
  G′ ← (G1 − G)
  return (method1(S, G′))
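The top-level selection of Algorithm 4.4.1 is simply a minimization over the four methods. The following Python sketch (our illustration; the four methods are passed in as callables, since the HFold-based routines themselves are not reproduced here) captures that driver:

```python
def iterative_hfold(S, G, methods):
    """Run each method on (S, G) and keep the lowest-energy result.

    methods: iterable of callables, each mapping (S, G) to a
    (structure, energy) pair -- stand-ins for methods 1-4 of the thesis.
    """
    best_structure, best_energy = None, float("inf")
    for method in methods:
        structure, energy = method(S, G)
        if energy < best_energy:  # strict inequality: earlier methods win ties
            best_structure, best_energy = structure, energy
    return best_structure, best_energy
```

Passing method1 through method4 as closures keeps the driver independent of how each underlying prediction method is implemented.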
Algorithm 4.4.4: method3(Sequence S, Structure G)
input: sequence S and structure G
output: structure G_method3 and its energy E_method3

procedure ObtainRelaxedStems(G1, G2)
G_result ← G1
for each i.j ∈ G2 do
  if i.j ∉ G1 then
    comment: extend stems of G1
    if ((i-1).(j+1) or (i+1).(j-1)) ∈ G1
      then G_result ← G_result ∪ i.j
    comment: include bulges of size 1
    else if ((i-2).(j+1) or (i-1).(j+2) or (i+1).(j-2) or (i+2).(j-1)) ∈ G1
      then G_result ← G_result ∪ i.j
    comment: include loops of size 1×1
    else if ((i-2).(j+2) or (i+2).(j-2)) ∈ G1
      then G_result ← G_result ∪ i.j
    comment: include loops of size 1×2 or 2×1
    else if ((i-3).(j+2) or (i-2).(j+3) or (i+2).(j-3) or (i+3).(j-2)) ∈ G1
      then G_result ← G_result ∪ i.j
return (G_result)

main
G′ ← SimFold(S, G)
G_updated ← ObtainRelaxedStems(G, G′)
return (method2(S, G_updated))

Algorithm 4.4.5: method4(Sequence S, Structure G)
input: sequence S and structure G
output: structure G_method4 and its energy E_method4

G_updated ← G
for each disjoint substructure G_k in G do
  S_k ← subsequence of S corresponding to G_k
  G′_k ← SimFold(S_k, G_k)
  G′_{k,updated} ← ObtainRelaxedStems(G_k, G′_k)
  G_updated ← G_updated ∪ G′_{k,updated}
return (method2(S, G_updated))

4.5 Summary

In this chapter we presented a new method that addresses shortcomings of the pseudoknotted RNA secondary structure prediction methods discussed in Section 2.5, namely slow running time, the generality problem, and limited opportunity for providing input constraints. Iterative HFold has time complexity O(n^3) and space complexity O(n^2), because it runs four methods sequentially, each of which takes O(n^3) time and O(n^2) space. Iterative HFold handles a general class of pseudoknotted structures, namely density-2 structures.
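Representing a secondary structure as a set of (i, j) base pairs, the ObtainRelaxedStems procedure of Algorithm 4.4.4 can be sketched in Python as follows. This is our reading of the procedure: the offsets encode the stacking, size-1 bulge, and 1×1 / 1×2 / 2×1 internal loop cases, and we iterate to a fixed point so that whole relaxed stems are collected (the pseudocode above makes a single pass checking only against G1):

```python
def obtain_relaxed_stems(g1, g2):
    """Keep the base pairs of g2 lying on relaxed stems that contain pairs of g1.

    g1, g2: sets of (i, j) base pairs with i < j. A pair joins the result if
    it continues an already-kept pair through a stack, a bulge of size 1,
    a 1x1 internal loop, or a 1x2 / 2x1 internal loop.
    """
    # (di, dj) offsets such that (i + di, j + dj) being kept lets i.j join.
    offsets = [
        (-1, 1), (1, -1),                    # stacked pair
        (-2, 1), (-1, 2), (1, -2), (2, -1),  # bulge of size 1
        (-2, 2), (2, -2),                    # 1x1 internal loop
        (-3, 2), (-2, 3), (2, -3), (3, -2),  # 1x2 or 2x1 internal loop
    ]
    result = set(g1)
    grew = True
    while grew:  # fixed point, so stems can grow both inward and outward
        grew = False
        for i, j in g2:
            if (i, j) not in result and any(
                    (i + di, j + dj) in result for di, dj in offsets):
                result.add((i, j))
                grew = True
    return result
```

Pairs of g2 that do not connect to any kept stem, such as an isolated distant pair, are left out of G_updated.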
In Chapter 6 we show that Iterative HFold has better prediction accuracy than IPknot on two key data sets, and that it is significantly faster than HotKnots V2.0 and has significantly better accuracy than HotKnots V2.0 on the IP-pk168 data set.

Our method, Iterative HFold, takes a pseudoknot-free input structure and produces a possibly pseudoknotted structure whose energy is at least as low as that of any density-2 pseudoknotted structure containing the input structure. Iterative HFold incorporates four different methods and reports as its final structure the lowest-energy structure among all structures produced by these methods. While one of its underlying methods, HFold, strictly adheres to the hierarchical folding hypothesis, the other three use iterations to extend or remove base pairs of the input structure, with the goal of finding a structure that has lower energy than the structure found by HFold. Thus, unlike HFold, Iterative HFold is able to modify the input structure (while the class of structures handled by both methods is the same). This is valuable since (1) computationally produced input structures may not be completely accurate, and (2) while the hierarchical folding hypothesis is a useful guiding principle, there is evidence that allowing for disruption of some base pairs in the initially formed pseudoknot-free secondary structure can improve prediction [31, 100].

All of Iterative HFold's underlying methods use the DP09 energy model of HotKnots V2.0 [11]; with this model, HFold obtained predictions with higher accuracy than those obtained with our earlier implementation of HFold.

Chapter 5

MFE-based RNA pseudoknotted secondary structure prediction

In this chapter we discuss our MFE-based algorithm for prediction of RNA pseudoknotted secondary structures, which significantly expands the class of structures that can be predicted in O(n^5) time and O(n^4) space.
The CCJ algorithm can handle a more general class of motifs than any other O(n^5) MFE-based algorithm, including some biologically important structures such as kissing hairpins and chains of four overlapping stems (see Figures 1.2 and 2.1). CCJ can also handle arbitrary nested substructures of these types.

We start by explaining the energy model in Section 5.1, followed by the class of structures handled by the CCJ algorithm in Section 5.2. We then discuss the CCJ recurrences in Section 5.3. In Section 5.4 we discuss the time and space complexity of the CCJ algorithm. Finally, in Section 5.5 we provide a summary of this chapter.

5.1 CCJ energy model

The energy model used in the CCJ algorithm is a loop-based energy model, in which the energy of a secondary structure is calculated as the sum of the energies of the structure's loops. Several parameters and functions inform the energy model. We used the DP09 energy model of HotKnots V2.0 [11], summarized in Table 3.1. The published version of the CCJ algorithm [22] uses a slightly more general model, but in this thesis we use the DP09 energy model, so that the comparison of the accuracy of Iterative HFold and CCJ is based on a consistent energy model. There are a few energy functions in the CCJ energy model that are not explicitly in the DP09 model. The values of such functions typically depend on the bases in positions i, l, i+1 and l-1 of S, and on whether i < l. We have set the values of these functions to 0, in order to make the models as similar as possible. Table 5.1 summarizes these functions.
We assume that any function specified in the energy model of the CCJ algorithm can be calculated in constant time, given sequence S.

Name        Description
β2(i, l)    penalty for closing pair i.l or l.i of an ordinary multiloop
βp2(i, l)   penalty for closing pair i.l or l.i of a multiloop that spans a band
γ2(i, l)    penalty for closing pair i.l or l.i of a pseudoloop

Table 5.1: CCJ energy functions that are not explicitly in the DP09 model. The values of such functions typically depend on the bases in positions i, l, i+1 and l-1 of S, and on whether i < l. We have set the values of these functions to 0, in order to make the models as similar as possible.

As explained in Section 3.2, the energy of an internal loop or multiloop depends on whether or not the external base pair of the loop is pseudoknotted. If it is not, we call the loop ordinary, and otherwise we say that the loop spans a band. If the external base pair of an ordinary internal loop (not including stacks) is i.l and the other closing base pair is d.e, then, as in the DP09 energy model, the energy of the internal loop is calculated by a call to the function eint(i, d, e, l). If the internal loop spans a band, we call the function eintP(i, d, e, l) instead.

The energy associated with an ordinary multiloop that has U unpaired bases, external base pair i.l, and set of other closing base pairs C is

a + c·U + β2(l, i) + Σ_{d.e∈C} β2(d, e).

Note that here the order of the parameters to the function β2 differs depending on whether or not the parameters identify the external base pair. That is, for the external base pair i.l, the larger index, l, is the first parameter to β2, whereas for any other closing base pair d.e, the smaller index, d, is the first parameter to β2. This is because in some energy models the penalty for a closing base pair of an ordinary multiloop may also depend on the bases within the loop that are adjacent to the closing base pair. If the closing base pair is the external base pair i.l, then the adjacent bases are i+1 and l-1, and if the closing base pair d.e is not the external base pair, then the adjacent bases are d-1 and e+1. In either case, the order of the parameters to β2(x, y) ensures that the base adjacent to the first parameter, x, is x-1, and the base adjacent to the second parameter, y, is y+1. The energy associated with a multiloop that spans a band is similar, with βp2 replacing β2.

As shown next, the energy of a pseudoloop with U unpaired bases and set of closing base pairs C is also similar, with the γ parameters replacing the β parameters. Just as the previous formulas distinguish between external closing pairs and other closing pairs, in the next formulas we distinguish between external band borders and other closing pairs. Let CE be the set of closing pairs of a pseudoloop which are external band borders, and let C be the set of all other closing pairs of the pseudoloop, plus the band borders of bands which have only one base pair. For an external pseudoloop, that is, a pseudoloop which is not nested in any other type of loop, the energy is

P_s + P_up·U + Σ_{i.l∈CE} γ2(l, i) + Σ_{i.l∈C} γ2(i, l).

If the pseudoloop is within a multiloop, P_s is replaced by P_sm, and if it is within another pseudoloop, P_s is replaced by P_sp. Finally, the energy associated with an external loop is 0.

To compare the performance of the CCJ algorithm with that of Iterative HFold, we assigned values to this energy model based on the DP09 energy model of HotKnots V2.0 [11]. We refer the reader to Section 3.2 for a detailed description of the energy model and of the differences between the energy model of the CCJ algorithm and DP09.

5.2 The CCJ class of structures

We call the class of structures that can be handled by our algorithm CCJ structures (using the initials of the last names of the co-authors of this algorithm).
To explain what CCJ structures are, we also introduce TGB (three-groups-of-bands) structures.

Figure 5.1 illustrates these structures. Part (a) shows a TGB structure for a gapped region, which is comprised of three groups of bands. Bands in the middle group cross bands in the left and right groups according to a regular pattern. More generally, a TGB structure always has at most three groups of bands. Also, a TGB structure always has at least one band in the middle group which spans the gap, and for each base pair i.l in this middle band, the base at position i is always nested in every left band of the group (if any), and the base at position l is always nested in every right band of the group (if any). A TGB structure may have nested substructures, either within a band (as illustrated in part (b)) or between bands. A CCJ pseudoknot is an "overlay" of two disjoint TGB structures, as shown in part (c). An H-type pseudoknot is a CCJ pseudoknot, as is a kissing hairpin and even a chain of four overlapping stems. Finally, CCJ structures can be pseudoknot-free, can have CCJ pseudoknots, and can have nested CCJ structures interspersed in arbitrary places.

Figure 5.1: TGB (three-groups-of-bands) and CCJ pseudoknotted secondary structures. (a) In this TGB structure for a gapped region, each shaded arc represents a band, a set of pseudoknotted base pairs which span the arc. The left and right groups each have two bands, and the middle group has four bands. (The outermost middle "band" does not appear to be a band, since it does not cross any other bands, but as later parts of the figure show, once this TGB structure is overlaid with another TGB structure, all bands do indeed cross other bands.) (b) A structure with two groups of bands, and with pseudoknotted structures nested within one of these bands. In general, not all bands illustrated in part (a) need be present in a TGB structure, as long as at least one band is in the middle group (labeled with a star in part (c)). (c) A CCJ structure is obtained by overlaying two TGB structures. This example is obtained by overlaying the structures of parts (a) and (b). Embedded in this structure is a chain of four overlapping stems, which are labeled by four stars.

In the remainder of this section, we define CCJ and TGB structures inductively. If R is a secondary structure for a sequence of length n, let R_[i,l] be the subset of base pairs of R whose endpoints are in region [i, l], and let R_[i,j]∪[k,l] be the subset of R whose endpoints are in [i, j] ∪ [k, l]. (Thus, R = R_[1,n].)

1. R_[i,j]∪[k,l] is a TGB (three-groups-of-bands) structure if either

(a) [i, j] ∪ [k, l] is a gapped region, R_[i,j]∪[k,l] contains base pair(s) i.l and j.k, any base pair of R_[i,j]∪[k,l] which spans the gapped region [i, j] ∪ [k, l] does not cross any other base pair of R_[i,j]∪[k,l], and any nested substructures of R_[i,j]∪[k,l] in weakly closed regions [i, j] or [k, l] are CCJ structures; or

(b) R_[i,j]∪[k,l] can be decomposed into a CCJ structure and a TGB structure in one of the following ways, for some d:

range of d    CCJ structure    TGB structure
i < d ≤ j     R_[i,d-1]        R_[d,j]∪[k,l]
i ≤ d < j     R_[d+1,j]        R_[i,d]∪[k,l]
k < d ≤ l     R_[k,d-1]        R_[i,j]∪[d,l]
k ≤ d < l     R_[d+1,l]        R_[i,j]∪[k,d]

(c) R_[i,j]∪[k,l] can be decomposed into a band and a TGB structure in one of the following ways, for some d and e:

range of d    range of e    band borders    TGB structure
i ≤ d < j     i < e ≤ j     i.j, d.e        R_[d+1,e-1]∪[k,l]
i ≤ d < j     k < e ≤ l     i.l, d.e        R_[d+1,j]∪[k,e-1]
i < d ≤ j     k ≤ e < l     d.e, j.k        R_[i,d-1]∪[e+1,l]
k ≤ d < l     k < e ≤ l     k.l, d.e        R_[i,j]∪[d+1,e-1]

2. R_[i,l] is a CCJ pseudoknot if it is the union of two TGB structures R_[i,j]∪[d+1,k] and R_[j+1,d]∪[k+1,l].

3. Finally, R_[i,l] is a CCJ structure if

(a) R_[i,l] is empty, or
(b) i.l ∈ R_[i,l] and R_[i+1,l-1] is a CCJ structure, or
(c) for some k ∈ [i, l-1], R_[i,l] = R_[i,k] ∪ R_[k+1,l] and both R_[i,k] and R_[k+1,l] are CCJ structures, or
(d) R_[i,l] is a CCJ pseudoknot.

5.3 Recurrences

Our algorithm finds the minimum free energy CCJ structure for a given input sequence S. We express the energy values for the various MFE substructure types as recurrences. A dynamic programming algorithm can compute these energies and store them in arrays, and a standard backtracking approach can then deduce R from the arrays. In what follows, we focus on providing the needed recurrences. Here, we extend the recurrences of the pseudoknot-free MFE-based algorithm explained in Section 4.2 to include a case for the calculation of minimum free energy pseudoknotted structures. We describe a more general method of formulating the dynamic programming recurrences for prediction of pseudoknotted RNA secondary structures that cover gapped regions. We set W(i, l) = 0 if i ≥ l. Otherwise,

W(i, l) = min {
    W(i, l-1);
    min_{i<d≤l} [ W(i, d-1) + W(d, l) ];
    V(i, l);
    P(i, l) + P_s
}

V(i, l) = min {
    V_hairpin(i, l);
    V_stacked(i, l);
    V_iloop(i, l);
    V_mloop(i+1, l-1) + b + β2(l, i) + a
}

In the recurrence for W(i, l), the first case arises when l is unpaired. In the second case, the overall structure consists of two substructures, one in the region [i, d-1] and one in the region [d, l] (both handled by the W recurrence). The third case arises when i.l closes a pseudoknot-free loop (handled by the V recurrence). The fourth case arises when the region [i, l] is a pseudoknot (handled by the P recurrence); this is an external pseudoknot, therefore we add a P_s penalty. The recurrence for V(i, l) minimizes over terms which handle hairpin loops, stacked pairs, internal loops, and multiloops. Details for these types of loops are given in a later
section; the term which handles multiloops takes into account the possibility that pseudoknots are nested in base pair i.l.

5.3.1 Pseudoknots

P(i, l) is the minimum free energy of a CCJ pseudoknot for region [i, l], not counting the initiation penalty. If i ≥ l, P(i, l) = +∞. Otherwise, the following recurrence expresses P(i, l) using three intermediate points j, d, and k. These points, together with i and l, define two gapped regions, namely [i, j-1] ∪ [d+1, k-1] and [j, d] ∪ [k, l]. P(i, l) is the sum of the contributions from the two gapped regions:

P(i, l) = min_{i<j<d<k<l} [ PK(i, j-1, d+1, k-1) + PK(j, d, k, l) ].

PK(i, j, k, l) is the minimum free energy of a TGB structure R_[i,j]∪[k,l] for the gapped region [i, j] ∪ [k, l], given that both i and l are paired (not necessarily to each other) and the pairs involving i and l are not part of nested substructures, assuming also that some base pair (which is not in R_[i,j]∪[k,l]) crosses the gap [j+1, k-1].

The recurrence for PK uses the terms PL, PM, PO, and PR. Informally, PL and PR handle bands in the left and right groups of the TGB structure, respectively, while both PM and PO are needed to handle bands in the middle group. More precisely, PL(i, j, k, l) is the minimum free energy of a TGB structure in gapped region [i, j] ∪ [k, l], excluding from this energy the term γ2(i, j) (which is accounted for in PK), given that i is paired with j, k is paired and the pair involving k is not in a nested substructure, and l is paired (not necessarily to k) and the pair involving l is not in a nested substructure. PR(i, j, k, l), PO(i, j, k, l), and PM(i, j, k, l) are defined similarly over the gapped region [i, j] ∪ [k, l]. For PR(i, j, k, l), k must pair with l, and i and j are paired (not necessarily to each other) and are not in nested substructures. For PO(i, j, k, l), i must pair with l, and j and k are paired (not necessarily to each other) and are not in nested substructures. Finally, for PM(i, j, k, l),
Finally, for PM (i, j, k, l),j must pair with k, and i and l are paired (not necessarily to each other)and are not in nested substructures.Note that PK(i, j, k, l) only requires that i and l are paired, whereasPL, PM , PO, and PR require that all four indices are paired. The firsttwo lines of the recurrence below for PK allow the indices j and k to beshifted to a position at which they are paired. The WP terms handle nestedsubstructures in a pseudoloop. The remaining lines handle bands on theleft, right, or middle groups of the MFE structure. These three lines have a555.3. Recurrencesγ2 term, to account for the energy contribution of a border of a band (whichis a closing pair of a pseudoloop).If it is not the case that i ≤ j < k − 1 < l, thenPK(i, j, k, l) = PL(i, j, k, l) = PM (i, j, k, l) = PR(i, j, k, l) = PO(i, j, k, l) = +∞.Otherwise,PK(i, j, k, l) = minmini≤d<j PK(i, d, k, l) + WP(d+ 1, j)mink<d≤l PK(i, j, d, l) + WP(k, d− 1)PL(i, j, k, l) + γ2(j, i) + PbPM (i, j, k, l) + γ2(j, k) + PbPR(i, j, k, l) + γ2(l, k) + PbPO(i, j, k, l) + γ2(l, i) + PbPL(i, j, k, l) = minPL,iloop(i, j, k, l)PL,mloop(i, j, k, l) + b′PfromL(i+ 1, j − 1, k, l) + γ2(j, i)PR(i, j, k, l) = minPR,iloop(i, j, k, l)PR,mloop(i, j, k, l) + b′PfromR(i, j, k + 1, l − 1) + γ2(l, k)PM (i, j, k, l) = minPM,iloop(i, j, k, l)PM,mloop(i, j, k, l) + b′PfromM (i, j − 1, k + 1, l) + γ2(j, k)γ2(i, l), if i = j and k = lPO(i, j, k, l) = minPO,iloop(i, j, k, l)PO,mloop(i, j, k, l) + b′PfromO(i+ 1, j, k, l − 1) + γ2(l, i)The first two rows of the PL recurrence handle the cases where i.l is theouter closing base pair of an internal loop or multiloop that span a band.The third row handles the case where i.l is the inner border of a band.Figure 5.2 shows how these recurrences unwind to calculate the energyof a kissing hairpin structure. 
Figure 5.3 illustrates a more general case,which includes a chain of four bands.5.3.2 Transitioning between band groups in pseudoknotsIn the above recurrences for PL, PR, PM , and PO, terms PfromL etc. areused, to handle transitions from base pairs in one group to base pairs in565.3. RecurrencesPM,iloop(3, 4, 9, 10) = PM(3, 3, 10, 10) + estP (3, 10)PK(1, 2, 5, 6) = PM(1, 2, 5, 6) + γ2(5, 2)PM(1, 2, 5, 6) = PM,iloop(1, 2, 5, 6)PM(1, 1, 6, 6) = γ2(1, 6)P (1, 12) = PK(1, 2, 5, 6) + PK(3, 4, 7, 12)1 2 3 4 5 6 7 8 9 10 11 12W (1, 12) = W (1, 0) + P (1, 12) + PsPM,iloop(1, 2, 5, 6) = PM(1, 1, 6, 6) + estP (1, 6)1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12PK(3, 4, 7, 12) = PR(3, 4, 7, 12) + γ2(7, 12)PR(3, 4, 7, 12) = PR,iloop(3, 4, 7, 12)PR(3, 4, 8, 11) = PfromR(3, 4, 9, 10) + γ2(11, 8)PfromR(3, 4, 9, 10) = PM(3, 4, 9, 10) + γ2(9, 4)1 2 3 4 5 6 7 8 9 10 11 12PM(3, 4, 9, 10) = PM,iloop(3, 4, 9, 10)PM(3, 3, 10, 10) = γ2(3, 10)PR,iloop(3, 4, 7, 12) = PR(3, 4, 8, 11) + estP (7, 12)Figure 5.2: Illustration of how the W recurrence unwinds, when the MFEstructure for a sequence of length 12 is the simple kissing hairpin structureillustrated at the top of the figure. Since the overall structure is pseudo-knotted, it is handled by the second case of the W recurrence. Here wehave W (1, 0) = 0, therefore the energy is accounted for by P(1, 12) +Ps. Inthe P recurrence, the structure is divided into two TGB structures, namelyPK(1, 2, 5, 6) and PK(3, 4, 7, 12). The term PK(1, 2, 5, 6) takes care of theinternal loop of the gapped region [1, 2]∪ [5, 6], and PK(3, 4, 7, 12) calculatesthe energy of the rest of the structure by transitioning between the PR,PR,iloop, PfromR, PM and PM,iloop recurrences.575.3. 
Figure 5.3: Illustration of how the W recurrence unwinds when the MFE structure for a sequence of length 20 is the structure illustrated at the top of the figure. Note that this structure has five interleaved bands. As in Figure 5.2, the overall structure is pseudoknotted and is thus handled by P(1, 20) + P_s. Here P(1, 20) is divided into PK(1, 6, 9, 14) and PK(7, 8, 15, 20). The TGB structure of PK(1, 6, 9, 14) is similar to Figure 5.2 and is handled similarly, the only difference being that the leftmost band is handled by a PL recurrence instead of a PK recurrence. The TGB structure of PK(7, 8, 15, 20) is similar to the right part of Figure 5.2.

If transitioning from PL via PfromL(i, j, k, l), then we need to allow for nested substructures either to the left of index j or to the right of index i, or both.
The first two lines of the next recurrence allow for these possibilities, and ensure that when PR(i, j, k, l), PM(i, j, k, l), or PO(i, j, k, l) are called, i and j are at positions where they are paired (not necessarily with each other) and the pairs are not part of nested substructures.

PfromL(i, j, k, l) = min {
    min_{i<d≤j} WP(i, d−1) + PfromL(d, j, k, l),
    min_{i≤d<j} PfromL(i, d, k, l) + WP(d+1, j),
    PR(i, j, k, l) + γ2(l, k) + Pb,
    PM(i, j, k, l) + γ2(j, k) + Pb,
    PO(i, j, k, l) + γ2(l, i) + Pb
}

PfromR(i, j, k, l) = min {
    min_{k<d≤l} PfromR(i, j, d, l) + WP(k, d−1),
    min_{k≤d<l} PfromR(i, j, k, d) + WP(d+1, l),
    PM(i, j, k, l) + γ2(j, k) + Pb,
    PO(i, j, k, l) + γ2(l, i) + Pb
}

PfromM(i, j, k, l) = min {
    min_{i≤d<j} PfromM(i, d, k, l) + WP(d+1, j),
    min_{k<d≤l} PfromM(i, j, d, l) + WP(k, d−1),
    PL(i, j, k, l) + γ2(j, i) + Pb,
    PR(i, j, k, l) + γ2(l, k) + Pb
}

PfromO(i, j, k, l) = min {
    min_{i<d≤j} WP(i, d−1) + PfromO(d, j, k, l),
    min_{k≤d<l} PfromO(i, j, k, d) + WP(d+1, l),
    PL(i, j, k, l) + γ2(j, i) + Pb,
    PR(i, j, k, l) + γ2(l, k) + Pb
}

Note that in PfromR, it is not possible to transition to PL. This is because the recurrences are designed so that bands are handled in rounds. Within a round, bands on the left are handled first, if any, then those on the right, if any, and then those in the middle, with bands handled by PM (if any) handled before those handled by PO. A middle band must be handled in each round; otherwise, for example, two "bands" in the left group, added in different rounds, would collapse into one, causing the recurrences to incorrectly add γ2 terms for band "borders" that are not actually borders. For this reason, no row in PfromR has a PL term, and so a band in the left group cannot be handled directly after a band in the right group. Also, PfromO does not have a row with a PM term, to ensure that PM cannot be used twice in the same round.

We need a base case, when i = j and k = l.
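The round discipline just described can be summarized as a small transition table. The following sketch is illustrative only (the names are ours, not from the CCJ implementation); it simply records which target matrices each "from" matrix may hand off to:

```python
# Which band-group matrices each "from" matrix may transition to, per the
# PfromL/PfromR/PfromM/PfromO recurrences above.  Note the two deliberate
# omissions: PfromR cannot reach PL (a left band never directly follows a
# right band), and PfromO cannot reach PM (PM is used at most once per round).
ALLOWED_TRANSITIONS = {
    "PfromL": {"PR", "PM", "PO"},
    "PfromR": {"PM", "PO"},
    "PfromM": {"PL", "PR"},
    "PfromO": {"PL", "PR"},
}

def can_transition(src: str, dst: str) -> bool:
    """True if a band handled by matrix `dst` may directly follow `src`."""
    return dst in ALLOWED_TRANSITIONS.get(src, set())
```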
In this base case (i = j and k = l), we need to provide a way to exit the recurrences, with the last added base pair being i.l (= j.k). If the recurrences are exited from PM via PfromM, then the energy costs associated with the last band (which includes the base pair i.l) have already been accounted for. Therefore, PfromM(i, j, k, l) = 0 when i = j and k = l. Similarly, PfromO(i, j, k, l) = 0. If the recurrences are exited from PL or PR (via PfromL or PfromR), we need to add the energy costs for the base pair i.l, which forms a band of its own. We do this as follows: when i = j and k = l,

PfromR(i, j, k, l) = PfromL(i, j, k, l) = γ2(j, k) + γ2(k, j).

5.3.3 Nested substructures

Substructures can be nested in different types of loops. We let WM(i, l), WB(i, l), and WP(i, l) denote the minimum free energies of all structures Ri,l for region [i, l] that are nested in a multiloop that does not span a band, a multiloop that spans a band, and a pseudoloop, respectively. WM′(i, l), WB′(i, l), and WP′(i, l) are similar to WM(i, l), WB(i, l), and WP(i, l), respectively, except that the region [i, l] must contain at least one base pair. If i > l then

WM(i, l) = WB(i, l) = WP(i, l) = 0,

and

WM′(i, l) = WB′(i, l) = WP′(i, l) = +∞.

Otherwise,

WM(i, l) = min{ WM′(i, l), c × (l − i + 1) }

WM′(i, l) = min {
    WM′(i, l−1) + c,
    min_{i<d≤l} WM(i, d−1) + WM′(d, l),
    V(i, l) + β2(i, l) + b,
    P(i, l) + Psm + b
}

WB(i, l) = min{ WB′(i, l), c′ × (l − i + 1) }

WB′(i, l) = min {
    WB′(i, l−1) + c′,
    min_{i<d≤l} WB(i, d−1) + WB′(d, l),
    V(i, l) + βp2(i, l) + b′ + Pps,
    P(i, l) + Psm + b′ + Pps
}

WP(i, l) = min{ WP′(i, l), Pup × (l − i + 1) }

WP′(i, l) = min {
    WP′(i, l−1) + Pup,
    min_{i<d≤l} WP(i, d−1) + WP′(d, l),
    V(i, l) + γ2(i, l) + Pps,
    P(i, l) + Psp + Pps
}

In the above recurrences, the terms involving c, c′, or Pup account for unpaired bases in ordinary multiloops, multiloops that span a band, and pseudoloops, respectively.
The terms involving β2 and βp2 account for closing base pairs in multiloops, and the Psm and Psp initiation terms account for new pseudoknots that are nested in a multiloop or pseudoloop, respectively.

5.3.4 Internal loops and multiloops

First, we give the recurrences for hairpins, and for internal loops and multiloops that do not span a band. If i > l − 3 then no such loop can be within base pair i.l, and so

Vhairpin(i, l) = Viloop(i, l) = Vmloop(i, l) = +∞.

Otherwise,

Vhairpin(i, l) = eH(i, l)

Vstacked(i, l) = eS(i, l)

Viloop(i, l) = min_{i<d<e<l} ( eint(i, d, e, l) + V(d, e) )

Vmloop(i, l) = min {
    Vmloop(i, l−1) + c,
    Vmloop(i+1, l) + c,
    P(i, l) + Psm + b,
    min_{i<d≤l} WM(i, d−1) + WM′(d, l)
}

The first and second rows for Vmloop handle the cases where base l and base i, respectively, are unpaired. The third row handles the case where region [i, l] is a pseudoloop, and the fourth row handles the case where the multiloop structure can be split in two at point d.

Next are the recurrences for internal loops and multiloops that span a band. Only the recurrences associated with the left group are explained; the explanation of the recurrences for the middle, outer and right groups is similar. PL,iloop(i, j, k, l) is the minimum free energy of a TGB structure in gapped region [i, j] ∪ [k, l] (excluding the term γ2(i, j)), given that i.j is the closing base pair of an internal loop that spans a band. The first row of the recurrence handles the case that this internal loop is a stacked pair, and the second row handles all other types of internal loops.

Similarly, PL,mloop(i, j, k, l) is the minimum free energy of a TGB structure in gapped region [i, j] ∪ [k, l] (excluding the term γ2(i, j)), given that i.j is the closing base pair of a multiloop that spans a band. Suppose d is the base with the smallest index such that i < d and d.e spans the same band as i.j for some e. Then, the first row of the recurrence for PL,mloop handles the case that there is a nested substructure in region [i, d].
The second case allows for the possibility that there is no nested substructure in region [i, d], in which case there must be a nested substructure in region [e, j].

PL,iloop(i, j, k, l) = min {
    PL(i+1, j−1, k, l) + estP(i, j),
    min_{i<d<e<j} ( eintP(i, d, e, j) + PL(d, e, k, l) )
}

PL,mloop(i, j, k, l) = min_{i<d<j−1} {
    PL,mloop0(d, j−1, k, l) + WB′(i+1, d−1) + a′ + βp2(j, i),
    PL,mloop1(d, j−1, k, l) + WB(i+1, d−1) + a′ + βp2(j, i)
}

PL,mloop0(i, j, k, l) = min_{i<d<j} ( PL(i, d, k, l) + WB(d+1, j) + βp2(d, i) )

PL,mloop1(i, j, k, l) = min_{i<d<j} ( PL(i, d, k, l) + WB′(d+1, j) + βp2(d, i) )

PR,iloop(i, j, k, l) = min {
    PR(i, j, k+1, l−1) + estP(k, l),
    min_{k<d<e<l} ( eintP(k, d, e, l) + PR(i, j, d, e) )
}

PR,mloop(i, j, k, l) = min_{k<d<l−1} {
    PR,mloop0(i, j, d, l−1) + WB′(k+1, d−1) + a′ + βp2(l, k),
    PR,mloop1(i, j, d, l−1) + WB(k+1, d−1) + a′ + βp2(l, k)
}

PR,mloop0(i, j, k, l) = min_{k<d<l} ( PR(i, j, k, d) + WB(d+1, l) + βp2(k, d) )

PR,mloop1(i, j, k, l) = min_{k<d<l} ( PR(i, j, k, d) + WB′(d+1, l) + βp2(k, d) )

PM,iloop(i, j, k, l) = min {
    PM(i, j−1, k+1, l) + estP(j−1, k+1),
    min_{i<d<j<k<e<l} ( eintP(d, j, k, e) + PM(i, d, e, l) )
}

PM,mloop(i, j, k, l) = min_{i<d<j−1} {
    PM,mloop0(i, d, k+1, l) + WB′(d+1, j−1) + a′ + βp2(j, k),
    PM,mloop1(i, d, k+1, l) + WB(d+1, j−1) + a′ + βp2(j, k)
}

PM,mloop0(i, j, k, l) = min_{k<d<l} ( PM(i, j, d, l) + WB(k, d−1) + βp2(j, d) )

PM,mloop1(i, j, k, l) = min_{k<d<l} ( PM(i, j, d, l) + WB′(k, d−1) + βp2(j, d) )

PO,iloop(i, j, k, l) = min {
    PO(i+1, j, k, l−1) + estP(i, l),
    min_{i<d<j<k<e<l} ( eintP(i, d, e, l) + PO(d, j, e, k) )
}

PO,mloop(i, j, k, l) = min_{i<d<j−1} {
    PO,mloop0(d, j, k, l−1) + WB′(i+1, d−1) + a′ + βp2(l, i),
    PO,mloop1(d, j, k, l−1) + WB(i+1, d−1) + a′ + βp2(l, i)
}

PO,mloop0(i, j, k, l) = min_{k<d<l} ( PO(i, j, k, d) + WB(d+1, l) + βp2(d, i) )

PO,mloop1(i, j, k, l) = min_{k<d<l} ( PO(i, j, k, d) + WB′(d+1, l) + βp2(d, i) )
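To make the shape of these minimizations concrete, here is a small sketch (not the thesis implementation) of a Viloop-style recurrence. The energy functions `eint` and `V` are assumed to be supplied by the caller rather than taken from the DP09 model, and the internal loop size is capped at 30 unpaired bases, as in Section 5.4:

```python
import math

def viloop(i, l, eint, V, max_loop=30):
    """Minimum free energy of an internal loop closed by base pair i.l:
    minimize eint(i, d, e, l) + V(d, e) over inner base pairs d.e with
    i < d < e < l, keeping the number of unpaired loop bases <= max_loop."""
    best = math.inf
    for d in range(i + 1, l):
        for e in range(d + 1, l):
            # unpaired bases on the two sides of the internal loop
            if (d - i - 1) + (l - e - 1) <= max_loop:
                best = min(best, eint(i, d, e, l) + V(d, e))
    return best
```

Capping the loop size makes the double loop over (d, e) a constant amount of work per closing pair, which is what keeps the internal loop terms from contributing an extra factor of n² to the running time.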
5.4 Time and space complexity

In this section we present the pseudocode for a dynamic programming algorithm that computes the solutions to the above recurrences, and based on this pseudocode explain the time and space complexity of our algorithm. First, we assume that all energies which are base cases are calculated in constant time. Then, energies are computed according to the following schedule. For each value of T, T = 2, 3, . . . , n, where n is the length of the input RNA sequence:

• compute all energies over regions [i, j] with j − i + 1 = T;
• compute all energies over gapped regions [i, j] ∪ [k, l] with j − i + l − k + 2 = T.

The time requirement is O(n^5), since we set the maximum loop size for an internal loop to be 30. All energies computed are saved as described in the schedule, resulting in a space requirement of Θ(n^4).

5.5 Summary

In this chapter we proposed a novel MFE-based method for prediction of RNA pseudoknotted secondary structure that expands the class of structures handled in O(n^5) time and O(n^4) space. We described the class of structures our algorithm can handle and provided the recurrences for our algorithm, based on the DP09 energy model of HotKnots V2.0 [11].

Chapter 6

Comparison of Iterative HFold and CCJ

In this chapter we present a thorough comparison of our proposed algorithms, Iterative HFold and CCJ, with the best existing methods for prediction of RNA secondary structure including pseudoknots, namely HotKnots V2.0, IPknot and HFold. As mentioned in Chapter 2, in the literature on hierarchical folding there are reports of counter-examples to the hierarchical folding hypothesis, where base pairs that are initially part of the pseudoknot-free structure for a molecule later change as the pseudoknot forms. This motivates a comparison of HFold versus Iterative HFold, in order to see how a method that sticks strictly to the hypothesis (i.e., HFold) compares with a method that allows for some base pair changes (i.e., Iterative HFold).
By comparing the CCJ algorithm, a method based on MFE, with HFold and Iterative HFold, we can further evaluate the merit of the hypothesis that sequences fold into their MFE structures.

We start by describing the experimental settings for each of our computational experiments. In Section 6.2 we compare the robustness of HFold and Iterative HFold with respect to partial information; that is, the degree to which they provide accurate predictions as a function of how much information about Gbig, the true pseudoknot-free secondary structure, is provided as input. Then, in Section 6.3, we compare HFold, HFold-PKonly and Iterative HFold when a (possibly inaccurate) computational prediction of Gbig is provided as input. In Section 6.5 we compare Iterative HFold and CCJ with two of the best existing methods for pseudoknotted secondary structure prediction, namely HotKnots V2.0 and IPknot. Sections 6.6 and 6.7 report on the running time and memory usage of our methods. In Section 6.8 we compare Iterative HFold's accuracy with that of ShapeKnots. In Section 6.9 we discuss the merits and differences of each method. Finally, in Section 6.10, we provide a summary of this chapter.

6.1 Experimental settings

In this section we explain the details of our computational experiments.

6.1.1 Definition of Gbig and Gsmall

To test the robustness of HFold, HFold-PKonly and Iterative HFold on a given RNA sequence, we need to provide partial information about the true pseudoknot-free structure as the input structure for that sequence. To obtain the true pseudoknot-free structure, Gbig, we remove the minimum number of pseudoknotted base pairs from the reference structure needed to make the reference structure pseudoknot-free. If the reference structure is pseudoknot-free, then Gbig is the same as the reference structure itself. We call the base pairs removed from the reference structure Gsmall.
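To illustrate the split, the following toy sketch separates a set of base pairs into a pseudoknot-free part and the crossing pairs that were removed. It is a greedy sketch only (the function names are ours): the thesis removes the minimum number of pseudoknotted base pairs, which a greedy scan does not guarantee in general.

```python
def crossing(p, q):
    """True if base pairs p and q cross, i.e. i < k < j < l for
    p = (i, j), q = (k, l) taken in left-to-right order."""
    (i, j), (k, l) = sorted([p, q])
    return i < k < j < l

def split_pk_free(pairs):
    """Greedily split `pairs` into a pseudoknot-free subset (a stand-in
    for Gbig) and the removed crossing pairs (a stand-in for Gsmall)."""
    kept, removed = [], []
    for p in sorted(pairs):
        if any(crossing(p, q) for q in kept):
            removed.append(p)
        else:
            kept.append(p)
    return kept, removed
```

On the H-type pseudoknot of Figure 6.1, for instance, the pairs 1.32 and 40.61 end up in the pseudoknot-free part, while the crossing pair 16.53 is removed.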
Blue base pairs in Figure 6.1 (i.e., the stems of 1.32 to 12.21, and 40.61 to 46.55) represent base pairs of the Gbig structure, and green base pairs (i.e., the band of 16.53 to 20.49) represent the Gsmall structure.

Figure 6.1: H-type pseudoknot. The blue base pairs (i.e., the stems of 1.32 to 12.21, and 40.61 to 46.55) belong to the Gbig structure and the green base pairs (i.e., the band of 16.53 to 20.49) belong to the Gsmall structure, as defined in Section 6.1.1. This figure was produced using the VARNA software [26].

6.1.2 Robustness test

One of our goals is to understand the degree to which our methods are robust with respect to partial information, that is, whether they provide a reliable prediction even when limited information about the true pseudoknot-free structure, Gbig, is available. For this purpose we generate subset structures of the corresponding Gbig for each RNA sequence in the HK-PK and HK-PK-free data sets. For each α, 0.05 ≤ α ≤ 0.95 in steps of 0.05, we choose each base pair of the Gbig structure with probability α. We also generate structures with 1% and 99% of the information in Gbig (i.e., α = 0.01 and α = 0.99). We repeat this step 100 times to generate 100 substructures of Gbig for each value of α and each RNA sequence in our data sets. We then run our methods on all 100 substructures for each RNA sequence and α value, and calculate the bootstrap 95% percentile confidence interval for the average F-measure of these 100 cases as the accuracy interval for that method, RNA sequence and α value. We also compare our methods when the true pseudoknot-free structure Gbig is provided.

6.1.3 Accuracy comparison tests

We compare the accuracy of HFold, HFold-PKonly and Iterative HFold with each other on different input structures, and with other methods, namely
CCJ, SimFold [8], HotKnots V2.0 [11, 76] and IPknot [83]. We first describe the settings we choose for the latter two methods in our experiments. We then describe the ways in which we choose input structures for HFold and its variants.

HotKnots

As mentioned in Section 2.1.2, HotKnots produces up to 20 output structures. In our experiments, we choose the structure with the lowest energy value among the 20 output structures as the final structure predicted by HotKnots. When reporting prediction accuracy for HotKnots, we report the bootstrap 95% percentile confidence interval for the average F-measure of the lowest energy structure over all RNA sequences in our data set.

IPknot

We run IPknot using the provided source code and the default parameters for scoring model and level (i.e., scoring model = McCaskill and level = 2). The default values provided for the base pair weights are not the same on the IPknot website (i.e., γ1 = 2 and γ2 = 16), in its source code (i.e., for some cases γ1 = 2 and γ2 = 4, and for others γ1 = 1 and γ2 = 1), and in the provided Perl script (i.e., γ1 = 4 and γ2 = 8). We run IPknot with all of these values, with and without refinement, and provide IPknot's bootstrap 95% confidence intervals for average F-measure on all of our data sets as a table in Appendix A. Based on its performance, we present IPknot's results with default settings (i.e., no refinement, scoring model = McCaskill and level = 2) and γ1 = 4 and γ2 = 8 for comparison with other methods.

Different versions of HFold

We compare the average accuracy of HFold, HFold-PKonly and Iterative HFold with different input structures. To determine which input structures are good to use when Gbig is not known, we compare three different options. Since HFold, HFold-PKonly and Iterative HFold cannot accept pseudoknotted input structures, we use the following methods to produce pseudoknot-free input structures for them.
First, we use HotKnots hotspots [11], i.e., the 20 lowest energy pseudoknot-free stems produced in the first phase of HotKnots. We choose the lowest free energy structure predicted by each of our methods as their final prediction given these hotspots. Second, we use SimFold's MFE structure [8], where the energy parameters of SimFold are changed to match those of HotKnots V2.0. Third, we use SimFold's first 50 suboptimal structures.

6.1.4 Running time

We ran all methods, except the CCJ algorithm, on the same platform (MacBook Pro, OS X 10.5.8, with a 2.53 GHz Intel Core 2 Duo processor and 4 GB of 1067 MHz DDR3 RAM). We use the time command to measure the running time of our methods on each sequence (while no other task was running on the platform), and record the wall clock time. Due to the large memory requirement of the CCJ algorithm, especially for long RNA strands (> 60 nucleotides), we used a more powerful shared server platform (openSUSE 13.1 with 64 Intel(R) Xeon(R) CPU E7-4830 2.13 GHz processors and 128 GB of memory) to run the CCJ algorithm. Therefore we are unable to compare CCJ's running time with that of the other methods in this work.

6.1.5 Memory usage

To find the memory usage of the programs, we use the Valgrind package [68] and record the total heap usage as the memory usage of each program. IPknot and HotKnots are written entirely in C/C++, so we can easily find their memory usage by running Valgrind. However, the Iterative HFold program is a Perl script that runs a few C/C++ programs (HFold, HFold-PKonly and SimFold) sequentially. We therefore find the memory usage of each C/C++ component using Valgrind and assign the maximum as the memory usage of Iterative HFold.

6.2 Robustness comparison

One of our goals is to learn what the accuracy of each of HFold, HFold-PKonly, and Iterative HFold is when partial information about Gbig is available.
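The protocol of Section 6.1.2 can be sketched as follows. This is an illustrative reimplementation, not the scripts used for the thesis experiments; `scores` stands for the 100 per-substructure F-measures obtained for one sequence and one α value.

```python
import random

def sample_substructure(gbig_pairs, alpha, rng):
    """Keep each base pair of Gbig independently with probability alpha."""
    return [p for p in gbig_pairs if rng.random() < alpha]

def bootstrap_percentile_ci(scores, n_boot=1000, lo=2.5, hi=97.5, seed=0):
    """Bootstrap percentile confidence interval for the mean of `scores`:
    resample with replacement n_boot times and take the lo/hi percentiles
    of the resampled means."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(scores, k=len(scores))) / len(scores)
        for _ in range(n_boot)
    )
    return means[int(n_boot * lo / 100)], means[int(n_boot * hi / 100)]
```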
Figure 6.2 shows the results of this robustness evaluation, for pseudoknotted structures of the HK-PK data set (see Section 2.2) (Figure 6.2 (a)), pseudoknot-free structures of the HK-PK-free data set (Figure 6.2 (b)), and the overall results (Figure 6.2 (c)). Since HFold-PKonly cannot add pseudoknot-free base pairs to the given input structure, we do not compare its performance here with HFold and Iterative HFold. However, we provide the detailed performance of all versions of HFold, including HFold-PKonly, in Appendix B.

As shown in Figure 6.2 (a), which pertains to pseudoknotted structures of the HK-PK data set, when provided with ≈ 1% of the Gbig structure as input, Iterative HFold's bootstrap 95% percentile confidence interval of average F-measure is higher than that of HFold. Iterative HFold continues to be significantly superior to HFold until approximately 90% of Gbig is available, after which HFold is more accurate. Iterative HFold is most successful when little information about Gbig is known, because it can add both pseudoknot-free and pseudoknotted base pairs. In particular, using methods 3 and 4 (see Section 4.4), Iterative HFold first finds a low energy pseudoknot-free structure that includes the given input structure (by extending the stems of the given structure), and then adds pseudoknotted base pairs to further lower the energy of the overall structure. However, when the vast majority of base pairs of Gbig are provided as input, HFold dominates, as it keeps the base pairs of the input structure, thereby often adding base pairs of Gsmall. When 100% of Gbig is provided as input, HFold's bootstrap 95% percentile confidence interval is (85.74%, 91.87%), compared with (79.36%, 87.41%) for Iterative HFold.

As shown in Figure 6.2 (a), Iterative HFold's average accuracy on pseudoknotted structures steadily increases from about 54% to 79% as the user provides 1% to 40% of the input structure.
This improvement in accuracy slows down but still persists as further structural information is provided. If we compare the slope of the curve for Iterative HFold's average accuracy to that of HFold in Figure 6.2 (a), we can see that HFold's slope is steeper than that of Iterative HFold, making Iterative HFold more robust than HFold.

For pseudoknot-free structures of the HK-PK-free data set, as shown in Figure 6.2 (b), HFold performs better than Iterative HFold. Even with 1% information about Gbig, HFold achieves a bootstrap 95% confidence interval of (79.76%, 84.32%), in comparison with (79.14%, 83.84%) for Iterative HFold on the same inputs. Roughly, HFold's success on pseudoknot-free structures is because it often adds base pairs that do not cross those provided as part of the input, and thus are likely to be in Gbig.

When 100% of Gbig is provided as input, the overall bootstrap 95% confidence interval for HFold is (96.11%, 97.24%), compared with (93.85%, 96.07%) for Iterative HFold.

Figure 6.2: Comparison of robustness of HFold and Iterative HFold. Robustness results for pseudoknotted structures of the HK-PK data set (a), pseudoknot-free structures of the HK-PK-free data set (b), and all structures (c). The X axes show the percentage of the Gbig structure that is provided as input, and the Y axes show bootstrap 95% percentile confidence intervals for average F-measure.
Dashed lines show the borders of the bootstrap 95% percentile interval for average F-measure, and solid lines show the average F-measure itself.

6.3 Accuracy comparison of different versions of HFold

Often, partial information about Gbig is not available; this is the case for many RNAs of unknown function reported by the ENCODE consortium [1]. Therefore, we next compare the quality of results obtained by HFold, HFold-PKonly and Iterative HFold when given a pseudoknot-free input G that is predicted by existing computational methods. One way to produce an input structure is to use an MFE pseudoknot-free structure prediction method, such as MFold. We chose SimFold, as it is an implementation of MFold and, because of its energy parameters, gives more accurate predictions than MFold [10]. If some base pairs of a structure are known, for example through comparative analysis with related structures, the user can input such information as a pseudoknot-free structure to Iterative HFold and expect a better prediction result. Here we compare two methods for predicting G, namely SimFold and the hotspots produced by HotKnots V2.0. In the following section we also evaluate using SimFold's suboptimal structures as input to HFold and Iterative HFold. Table 6.1 reports the bootstrap 95% percentile confidence intervals of average F-measure. The accuracy of HFold-PKonly is significantly worse than that of HFold and Iterative HFold, both with the output of SimFold and with the HotKnots hotspots as input, so we do not discuss HFold-PKonly further.

For pseudoknotted structures, using HotKnots hotspots as input is far superior to using SimFold as input, for both HFold and Iterative HFold. This appears to be because MFE structures predicted by SimFold tend to have more base pairs than the true pseudoknot-free structure Gbig, so that HFold and Iterative HFold are unlikely to add pseudoknotted base pairs to the input structure.
For pseudoknot-free structures, using SimFold as input is somewhat better than using HotKnots hotspots, but the permutation test (with significance p-value of < 0.05) indicates that the difference is not significant.

For pseudoknotted structures, the confidence intervals for HFold and Iterative HFold with HotKnots hotspots are respectively (73.35%, 83.53%) and (72.83%, 83.37%), and on pseudoknot-free structures they are respectively (75.53%, 80.79%) and (74.93%, 80.26%). Again, based on the result of the permutation test, the difference in the results of HFold and Iterative HFold on pseudoknotted and pseudoknot-free structures is not significant. Similarly, the permutation test shows that the difference in prediction accuracy of HFold and Iterative HFold on SimFold input is not significant.

Table 6.1: Comparison of bootstrap 95% percentile confidence intervals of average F-measure of different versions of HFold when given the SimFold structure as input vs. when given HotKnots hotspots structures as input.

Input      | Hotspots                                          | SimFold (MFE)
           | PKonly         | HFold          | Iter. HFold     | PKonly         | HFold          | Iter. HFold
HK-PK      | (55.54, 71.06) | (73.35, 83.53) | (72.83, 83.37)  | (50.57, 63.53) | (50.69, 63.54) | (51.42, 64.39)
HK-PK-free | (31.37, 38.52) | (75.53, 80.79) | (74.93, 80.26)  | (78.42, 83.21) | (78.33, 83.27) | (78.31, 83.17)

Table 6.2: Comparison of bootstrap 95% percentile confidence intervals of average F-measure with existing methods.

Input      | Iter. HFold (hotspots) | HotKnots       | SimFold        | IPknot (default) | CCJ
HK-PK      | (72.83, 83.37)         | (73.60, 83.35) | (45.34, 57.73) | (54.56, 66.25)   | —
HK-PK*     | (73.19, 83.88)         | (74.01, 83.96) | (44.65, 57.40) | (53.72, 66.01)   | (74.38, 84.78)
HK-PK-free | (74.93, 80.26)         | (76.74, 81.95) | (78.78, 83.55) | (77.31, 81.79)   | (74.75, 80.13)

Table 6.3: Comparison of bootstrap 95% percentile confidence intervals of average F-measure with existing methods on the DK-pk16, DK-pk16* and IP-pk168 data sets.
Input     | Iter. HFold (hotspots) | HotKnots       | IPknot (default) | CCJ
DK-pk16   | (68.05, 81.85)         | (69.11, 83.81) | (65.42, 75.81)   | —
DK-pk16*  | (69.99, 84.48)         | (72.10, 87.18) | (64.68, 77.11)   | (66.42, 81.76)
IP-pk168  | (72.65, 79.86)         | (65.51, 72.96) | (58.20, 66.09)   | (72.45, 79.68)

Table 6.4: Comparison of average F-measure, sensitivity and PPV with existing methods on the HK-PK, HK-PK*, HK-PK-free, DK-pk16, DK-pk16* and IP-pk168 data sets. The values outside parentheses show the average F-measure, while the values inside parentheses show average sensitivity and PPV, respectively.

Input      | Iter. HFold (hotspots) | HotKnots             | IPknot (default)     | CCJ
           | F (sen, PPV)           | F (sen, PPV)         | F (sen, PPV)         | F (sen, PPV)
HK-PK      | 78.27 (80.16, 77.31)   | 78.63 (78.88, 79.18) | 60.05 (58.75, 64.56) | —
HK-PK*     | 78.71 (80.71, 77.69)   | 79.14 (79.45, 79.66) | 60.04 (58.29, 64.16) | 79.79 (82.55, 78.04)
HK-PK-free | 77.57 (80.29, 75.72)   | 79.33 (81.29, 77.99) | 79.62 (83.58, 76.66) | 77.43 (80.54, 75.20)
DK-pk16    | 75.02 (77.78, 73.40)   | 76.61 (77.64, 76.47) | 70.71 (69.42, 73.18) | —
DK-pk16*   | 77.30 (80.70, 75.30)   | 79.88 (81.27, 79.59) | 71.04 (69.43, 74.10) | 74.18 (80.11, 70.25)
IP-pk168   | 76.29 (80.82, 73.67)   | 69.30 (69.01, 72.68) | 62.21 (59.62, 68.10) | 76.15 (81.02, 73.33)

6.4 Iterative HFold with SimFold's suboptimal structures

To further investigate which input structures are good to use when Gbig is not known, we use the first 50 suboptimal structures produced by SimFold (including the MFE structure). Then, for each RNA sequence, we run our methods on all 50 suboptimal structures and choose the result with the lowest free energy as the final result for that RNA sequence. With this approach, the bootstrap 95% percentile confidence intervals of average F-measure for HFold and Iterative HFold are (61.80%, 80.63%) and (67.70%, 79.57%) respectively for pseudoknotted structures, and (77.17%, 82.35%) and (76.27%, 81.46%) respectively for pseudoknot-free structures.
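For reference, the three accuracy measures reported in Table 6.4 can be computed from a predicted and a reference set of base pairs as in the sketch below (standard definitions; this is not code from the thesis):

```python
def accuracy_measures(predicted, reference):
    """Return (sensitivity, PPV, F-measure) of a predicted set of base
    pairs against a reference structure: sensitivity = TP/|reference|,
    PPV = TP/|predicted|, and F is their harmonic mean."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)  # correctly predicted base pairs
    if tp == 0:
        return 0.0, 0.0, 0.0
    sen = tp / len(reference)
    ppv = tp / len(predicted)
    return sen, ppv, 2 * sen * ppv / (sen + ppv)
```

Note that sensitivity exceeds PPV exactly when more base pairs are predicted than the reference contains, which is the pattern observed for Iterative HFold and CCJ in Table 6.4.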
The permutation test indicates that the difference between these results and the corresponding results when the input structures are hotspots is not significant. We also test the significance of the results of HFold with the first 50 suboptimal structures versus Iterative HFold with the same input structures, and versus Iterative HFold with hotspots, for both pseudoknotted and pseudoknot-free structures. Although the bootstrap 95% percentile confidence intervals for average F-measure are slightly different, the permutation test indicates that the difference is not significant. Similarly, the results of Iterative HFold with the first 50 suboptimal structures are not significantly better or worse than the results of HFold with hotspots as input structures, for both pseudoknotted and pseudoknot-free structures.

6.5 Accuracy comparison with existing methods

For comparisons with other methods already in the literature, we choose to use our Iterative HFold method with HotKnots hotspots as input structures, based on its overall good accuracy in Section 6.3. We compare this method with two of the best-performing methods [73] for prediction of pseudoknotted structures, namely HotKnots V2.0 [11], an MFE-based heuristic method, and IPknot [83], a method that is based on maximum expected accuracy. CompaRNA, prepared by Puton et al. [73], is a website for continuous comparison of RNA secondary structure prediction methods on both the PDB data set and RNA STRAND. We chose IPknot because it was the best-performing non-comparative pseudoknot prediction method that can handle long RNA sequences, based on the ranking on their website as of March 25, 2014 (http://genesilico.pl/comparna/). We also noticed that Puton et al. used HotKnots V1 for their comparison, and not the more recently available and better performing HotKnots V2.0. Therefore we chose to include HotKnots in our comparisons as well.
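The significance claims in this chapter rest on permutation tests. A minimal paired (sign-flip) variant, sketched here under the assumption that the two methods are scored on the same sequences, looks like the following; it is illustrative, not the exact test used in the thesis:

```python
import random

def paired_permutation_test(xs, ys, n_perm=10000, seed=0):
    """Two-sided p-value for the difference in mean score between two
    methods, given paired per-sequence scores xs and ys: randomly flip the
    sign of each paired difference and count how often the permuted mean
    difference is at least as extreme as the observed one."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(xs, ys)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    return hits / n_perm
```

A p-value below 0.05 is then read, as above, as a significant difference between the two methods.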
Since the focus of this thesis is on prediction of pseudoknotted structures, we do not compare our results with Co-Fold [72] or other methods for prediction of pseudoknot-free structures.

Table 6.2 presents the bootstrap 95% percentile confidence intervals of average F-measure for Iterative HFold with hotspots as input, HotKnots V2.0, SimFold, IPknot with default settings (see Section 6.1.3), and CCJ on the HK-PK and the HK-PK-free data sets. Since CCJ does not run for RNA sequences of length > 195 (i.e., it crashes with a memory error), we removed the 3 RNA sequences with length > 195 (the removed sequences are 196, 214 and 400 bases long) from our HK-PK set and named the new set HK-PK*. We present the bootstrap 95% percentile confidence intervals of average F-measure for all of our methods on this data set as well. For pseudoknotted structures in the HK-PK data set, our permutation tests show that the difference in accuracy of Iterative HFold and HotKnots is not significant. However, the superior accuracy of Iterative HFold compared with SimFold and IPknot is significant. Similarly, on the HK-PK* data set, the difference in accuracy of Iterative HFold, HotKnots and CCJ is not significant, but they all achieve significantly better accuracy than SimFold and IPknot. For pseudoknot-free structures, the difference in accuracy between IPknot, Iterative HFold, HotKnots, CCJ and SimFold is not significant.

Table 6.3 presents the bootstrap 95% percentile confidence intervals of average F-measure for Iterative HFold (with hotspots as input), HotKnots, IPknot (with default settings) and CCJ on the IP-pk168 and the DK-pk16 data sets. The DK-pk16* data set is the same as the DK-pk16 data set with the 3 RNA sequences of length > 195 removed. Our permutation tests show that the difference in accuracy of Iterative HFold, HotKnots and IPknot on the DK-pk16 data set is not significant. Similarly, the difference in accuracy of Iterative HFold, HotKnots, IPknot and CCJ on the DK-pk16* data set is not significant.
However, the superior accuracy of Iterative HFold and CCJ compared with HotKnots and IPknot on the IP-pk168 data set is significant, while the difference in accuracy between Iterative HFold and CCJ is not significant.

Table 6.4 presents the average F-measure, sensitivity and PPV for Iterative HFold (with hotspots as input), HotKnots, IPknot (with default settings) and CCJ on all four of our data sets. We note that both Iterative HFold and CCJ have higher average sensitivity than average PPV on all of our data sets, which suggests that they both predict more base pairs than are in the reference structures. IPknot, on the other hand, has higher average PPV than sensitivity on all of the data sets, with the exception of HK-PK-free.

6.6 Running time comparison

Since prediction of pseudoknotted structures is of interest to us, we only report the running time comparison on the pseudoknotted structures of the HK-PK data set. Figure 6.3 presents the results of the running time comparison between Iterative HFold and HotKnots in a log plot (with logs in base 10). The X axis shows log(time) for the HotKnots data points and the Y axis shows log(time) for Iterative HFold on the HK-PK data set. RNA sequences in this data set are between 26 and 400 bases long. HFold runs significantly faster than HotKnots and finishes in under 1.5 seconds for even the longest RNA sequence in our data set (400 bases). HotKnots is faster than Iterative HFold on sequences of up to 47 bases, after which Iterative HFold is faster. Iterative HFold runs in less than 8.3 seconds for all RNA sequences in this data set, whereas HotKnots runs for over 6000 seconds (about 1.7 hours) on the longest RNA sequence in our data set. The running time of both HFold and Iterative HFold grows with sequence length, whereas HotKnots' running time is not directly correlated with RNA length.
For example, HotKnots runs for 1665.94 seconds on one RNA sequence of length 195 (RNase P RNA, Wastewater SL-C, molecule ID: ASE-00360), while it runs for 203.12 seconds on another RNA sequence of length 195 (RNase P RNA, volunteer ESH26-4, molecule ID: ASE-00131).

When reporting times of Iterative HFold and HotKnots on the HK-PK data set, we did not include the time required to produce the input structures for Iterative HFold. Since we only run HotKnots V2.0 partially to produce hotspots, it does not take as long as running HotKnots to completion. For example, for the longest RNA sequence in our data set (RNase P RNA, Agrobacterium tumefaciens, strain A348, molecule ID: A.tum.RNaseP, 400 nucleotides long), it only takes 0.41 seconds to produce the hotspots. The average time needed to produce hotspots for the RNA sequences in this data set is 0.20 seconds. (The time required to get the hotspots for all RNA sequences in this data set is provided in Appendix C.) Therefore, even adding the time required to calculate hotspots to the running time of Iterative HFold, we still find Iterative HFold to be faster than HotKnots.

IPknot is significantly faster than both HFold and Iterative HFold. For all sequences in this data set, IPknot produces output in less than 0.8 seconds. Due to the extensive memory requirements of the CCJ algorithm, its running time is slower than Iterative HFold, HotKnots and IPknot. For detailed information about the performance of each method see Appendix C.

[Figure 6.3: Time comparison. Comparison of running times of Iterative HFold and HotKnots in a log plot. The X axis shows log10(time) for HotKnots and the Y axis shows log10(time) for Iterative HFold.]

6.7 Memory usage comparison

Here we present the memory usage of HFold, Iterative HFold and HotKnots on our HK-PK pseudoknotted structures.
Since HotKnots predicts and keeps about 20 structures in memory, its memory consumption can vary significantly from one sequence to another and is not predictable. Up to 47 bases, HotKnots sometimes uses less memory than HFold or Iterative HFold, but for RNA sequences of 47 bases or longer, HotKnots uses much more memory than HFold and Iterative HFold. Iterative HFold's memory usage is very similar to HFold's and increases at a very low rate with the length of the RNA sequence: it starts at 48.69 MB for RNA sequences of length 26 and increases to 61.33 MB for the longest RNA sequence in this data set (400 bases long). HotKnots, however, uses as little as 16.53 MB for an RNA of length 30 bases (tmRNA, Legionella pneumophila, molecule ID: LP-PK1) and as much as 91.23 GB (i.e., 93419 MB) for the longest RNA sequence in this data set.

When reporting the memory consumption of Iterative HFold and HotKnots on the HK-PK data set, we did not include the memory required to produce the input structures for Iterative HFold. Since we only run HotKnots V2.0 partially to produce hotspots, it does not consume as much memory. For example, for the longest RNA sequence in our data set (RNase P RNA, Agrobacterium tumefaciens, strain A348, molecule ID: A.tum.RNaseP, 400 nucleotides long), it only takes 4 MB of memory to produce the hotspots. We also note that since calculating hotspots and running Iterative HFold are done sequentially, the memory consumption is calculated as the maximum of the two, so the memory consumption of Iterative HFold for this sequence remains the same even including the memory needed for calculating hotspots.

IPknot uses much less memory than all the other methods. For the longest RNA sequence in this data set, IPknot uses less than 5.5 MB of memory, in comparison to 61.33 MB for HFold and Iterative HFold and 91.23 GB for HotKnots.
CCJ, on the other hand, uses 119.72 MB of memory for the shortest sequence, and its usage grows so quickly that Valgrind runs out of memory and is unable to report the heap usage for the RNA sequence of length 195. For detailed information about the memory usage of each method see Appendix C.

6.8 Comparison with ShapeKnots

Similar to HotKnots, the ShapeKnots method of Hajdin et al. [40] is a heuristic algorithm for prediction of pseudoknotted structures. This method incorporates SHAPE reactivity data as a pseudo-energy term in the prediction method. SHAPE reactivity data is only available for a limited number of RNA sequences, so we cannot compare Iterative HFold with ShapeKnots on our data set. Therefore, we use the data set of Hajdin et al. to compare the two methods. Hajdin et al. have 18 RNA sequences in their training set and 6 RNA sequences in their test set. We run Iterative HFold with hotspots for each RNA sequence and choose the lowest energy structure as the final output of our program. For ShapeKnots, we use the sensitivity and positive predictive values reported in the work of Hajdin et al. [40] to compare with Iterative HFold. Table 6.5 shows the results of this comparison. In all but one sequence of the test set, Iterative HFold obtains higher accuracy than ShapeKnots. The exception is the HIV-1 5' pseudoknot domain; Hajdin et al. note that the accepted structure of the HIV-1 5' pseudoknot domain is based on a SHAPE-directed prediction, and thus an accuracy comparison between ShapeKnots and Iterative HFold may be biased towards ShapeKnots. In the training set, however, Iterative HFold does not perform as well as ShapeKnots. This might be because the parameters of ShapeKnots were tuned on the training set to achieve the highest possible accuracy. Iterative HFold's average accuracy on the training and test data sets of Hajdin et al. [40] is 75.91% and 87.54% respectively, compared to 89.13% and 92.02% for ShapeKnots.
Since both the training and test data sets are small, we cannot make more general statements about the significance of the differences in accuracy between the two methods.

Table 6.5: Comparison of Iterative HFold F-measure with ShapeKnots on SHAPE data

Training set                              Len  PK | Iter. HFold (sen / ppv / F) | ShapeKnots (sen / ppv / F)
Pre-Q1 riboswitch, B. subtilis             34   1 | 62.5 / 100 / 76.9           | 100 / 100 / 100
Telomerase pseudoknot, human               47   1 | 100 / 100 / 100             | 100 / 100 / 100
tRNA(asp), yeast                           75   0 | 81.0 / 100 / 89.5           | 95.2 / 95.2 / 95.2
TPP riboswitch, E. coli                    79   0 | 46.5 / 47.6 / 47.1          | 95.4 / 87.5 / 91.3
SARS corona virus pseudoknot               82   1 | 69.2 / 86.3 / 69.2          | 84.6 / 88.0 / 86.3
cyclic-di-GMP riboswitch, V. cholerae      97   0 | 85.5 / 81.0 / 83.2          | 89.3 / 86.2 / 87.7
SAM I riboswitch, T. tengcongenis         118   1 | 79.5 / 91.2 / 84.9          | 92.3 / 97.3 / 94.7
M-Box riboswitch, B. subtilis             154   0 | 87.5 / 91.3 / 89.4          | 87.5 / 91.3 / 89.3
P546 domain, bI3 group I intron           155   0 | 55.4 / 57.4 / 56.4          | 94.6 / 96.4 / 95.5
Lysine riboswitch, T. maritima            174   1 | 85.7 / 94.7 / 90.0          | 87.3 / 88.7 / 88.0
Group I intron, Azoarcus sp.              214   1 | 52.4 / 54.1 / 53.2          | 92.1 / 95.1 / 93.5
Signal recognition particle RNA, human    301   0 | 70.0 / 73.7 / 71.8          | 55.0 / 53.9 / 54.4
Hepatitis C virus IRES domain             336   1 | 71.2 / 74.0 / 72.5          | 92.3 / 96.0 / 94.1
RNase P, B. subtilis                      405   1 | 55.7 / 59.3 / 57.4          | 75.6 / 79.8 / 77.7
Group II intron, O. iheyensis             412   1 | 87.9 / 95.9 / 91.7          | 93.2 / 97.6 / 95.3
Group I intron, T. thermophila            425   1 | 83.2 / 85.2 / 84.2          | 93.9 / 91.2 / 92.5
5' domain of 23S rRNA, E. coli            511   0 | 84.0 / 72.5 / 77.8          | 92.4 / 76.4 / 83.6
5' domain of 16S rRNA, E. coli            530   0 | 73.6 / 69.0 / 71.2          | 89.9 / 80.6 / 84.9

Test set                                  Len  PK | Iter. HFold (sen / ppv / F) | ShapeKnots (sen / ppv / F)
Fluoride riboswitch, P. syringae           66   1 | 100 / 100 / 100             | 93.7 / 93.7 / 93.7
Adenine riboswitch, V. vulnificus          71   0 | 100 / 100 / 100             | 100 / 100 / 100
tRNA(phe), E. coli                         76   0 | 100 / 100 / 100             | 100 / 84.0 / 91.3
5S rRNA, E. coli                          120   0 | 91.4 / 91.4 / 91.4          | 85.7 / 76.9 / 81.1
5' domain of 16S rRNA, H. volcanii        473   0 | 90.3 / 82.3 / 86.1          | 89.6 / 82.7 / 86.0
HIV-1 5' pseudoknot domain                500   1 | 45.4 / 50.4 / 47.7          | 100 / 100 / 100
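The sen, ppv and F columns above are related by the harmonic mean, and the confidence intervals reported throughout this chapter are percentile bootstrap intervals. A minimal sketch of both computations follows; the list of per-sequence F-measures passed to the bootstrap is illustrative only, not thesis data.

```python
import random

def f_measure(sen, ppv):
    """F-measure as the harmonic mean of sensitivity and PPV."""
    if sen + ppv == 0:
        return 0.0
    return 2.0 * sen * ppv / (sen + ppv)

def bootstrap_ci(values, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean:
    resample with replacement, take the mean of each resample, and
    read off the alpha/2 and 1 - alpha/2 percentiles."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2.0) * n_resamples)]
    hi = means[int((1.0 - alpha / 2.0) * n_resamples) - 1]
    return lo, hi

# Pre-Q1 riboswitch row of Table 6.5: sen = 62.5, ppv = 100 -> F = 76.9
print(round(f_measure(62.5, 100.0), 1))

# Illustrative per-sequence F-measures (not thesis data):
print(bootstrap_ci([76.9, 100.0, 89.5, 47.1, 69.2, 83.2, 84.9, 89.4]))
```

Permutation tests, used for the significance claims in this chapter, follow the same resampling spirit but shuffle the method labels between paired results instead of resampling one method's scores.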
6.9 Discussion

In Sections 6.9.1 and 6.9.2 we provide more insight into the differences and merits of Iterative HFold and CCJ relative to HotKnots and IPknot respectively. Section 6.9.4 provides more insight into the energy model used in this work.

6.9.1 HotKnots

In this section we focus on comparing Iterative HFold, when given HotKnots hotspots as input, with HotKnots V2.0, and also compare CCJ with HotKnots V2.0. The differences in accuracy of Iterative HFold, CCJ and HotKnots V2.0 are insignificant on three of our data sets, namely HK-PK (HK-PK* when comparing with CCJ), HK-PK-free and DK-pk16 (DK-pk16* when comparing with CCJ), but Iterative HFold and CCJ are significantly better than HotKnots V2.0 on the fourth data set, IP-pk168.

6.9.2 IPknot

In this section we focus on comparing Iterative HFold, when given HotKnots hotspots as input, with IPknot, and also compare CCJ with IPknot. The differences in accuracy of Iterative HFold, CCJ and IPknot are insignificant on two of our data sets, namely HK-PK-free and DK-pk16 (DK-pk16* when comparing with CCJ), but Iterative HFold and CCJ are significantly better than IPknot on the HK-PK (HK-PK* when comparing with CCJ) and IP-pk168 data sets.

We note that Sato et al. [83] find the performance of IPknot with predictions using "NUPACK" superior to all other versions of IPknot. However, due to the large time and space complexity of calculating exact pseudoknotted base pair probabilities with the "NUPACK" scoring model (O(n^5) time and O(n^4) space), this version of IPknot is not practical for RNA sequences of length > 80 nucleotides. Therefore, we did not compare our results with it. Appendix A provides detailed performance of IPknot with different settings.
In this thesis we used IPknot with γ1 = 4 and γ2 = 8, which produces the best results on all our data sets.

A disadvantage of IPknot relative to Iterative HFold and CCJ is that, to get the best prediction, the user needs to provide some guidance as to what type of structure to predict for the given sequence, e.g., whether it is pseudoknot-free or pseudoknotted.

6.9.3 Comparison of CCJ, HFold and Iterative HFold

In this section we evaluate the merits of the different hypotheses about RNA folding by comparing the performance of CCJ, HFold and Iterative HFold. More specifically, we would like to see how a method that is purely based on MFE performs versus a method that strictly adheres to the hierarchical folding hypothesis and a method that is based on the "relaxed" hierarchical folding hypothesis. We did not find the differences in accuracy of CCJ, HFold and Iterative HFold significant on any of our data sets. However, we observed some interesting findings by looking at the results more carefully. Table 6.6 summarizes these findings.

Table 6.6: Comparison of the number of times each of the proposed algorithms (i.e., CCJ, Iterative HFold and HFold) predicted the structure with the minimum free energy, or the structure with the highest F-measure, in each of the four data sets used in this thesis. The table also provides the number of times Iterative HFold and HFold predicted structures with higher accuracy than the structure predicted by the CCJ algorithm.

DK-pk16*        CCJ  Iter. HFold  HFold
# min energy     12            6      6
# highest F       1            5      6
# F > CCJ         0            8     10

IP-pk168        CCJ  Iter. HFold  HFold
# min energy    168          155    143
# highest F     112          117    109
# F > CCJ         0            7     11

HK-PK*          CCJ  Iter. HFold  HFold
# min energy     83           64     59
# highest F      50           47     49
# F > CCJ         0            8     10

HK-PK-free      CCJ  Iter. HFold  HFold
# min energy    333          294    278
# highest F     199          212    214
# F > CCJ         0           33     43

As seen in Table 6.6, predicting the structure with the minimum free energy does not seem to be directly correlated with finding the structure
with the highest accuracy among all methods. More specifically, on the DK-pk16* data set CCJ produces the lowest energy structure in all cases but produces the highest accuracy in only one case. This offers some support for the relaxed hierarchical folding hypothesis. We note that this may also depend on the energy model used.

Figure 6.4 shows an example of the structure produced by the CCJ algorithm, compared with the structure predicted by the Iterative HFold algorithm. The accuracy of the CCJ structure is 64.4% while Iterative HFold's structure has an accuracy of 88.1%. As seen in Figure 6.4, both Iterative HFold and CCJ predicted the bands in the reference structure. However, the structure predicted by CCJ missed the nested substructure and instead added another band.

We also notice that by following the relaxed hierarchical folding hypothesis, and hence allowing changes to the input structure, Iterative HFold more often finds structures with lower energy than those predicted by HFold, in all of our data sets. However, this does not result in a significant improvement in prediction accuracy compared to HFold. Figure 6.5 shows an example of one such case, in which finding a lower energy structure results in predicting the reference structure.

In many cases, especially for longer RNA strands in the HK-PK-free data set, the MFE structures predicted by the CCJ algorithm are pseudoknotted. This suggests that the energy parameters encourage the addition of pseudoknots, and so parameter estimation specifically for CCJ would be useful. Figure 6.6 shows an example of structures predicted by CCJ vs. Iterative HFold.

Addition of pseudoknots in HFold's predictions is the least among HFold, CCJ and Iterative HFold. This is mainly because the hotspots input to HFold must also be part of the structure output by HFold, reducing HFold's flexibility in adding pseudoknots (compared with CCJ and Iterative HFold).
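Accuracy figures such as these come from comparing the base pairs of a predicted structure against the reference. A minimal sketch of that comparison for pseudoknot-free dot-bracket strings follows; the example strings are illustrative, and handling pseudoknotted structures would require additional bracket alphabets.

```python
def base_pairs(db):
    """Parse a pseudoknot-free dot-bracket string into a set of
    base pairs given as 0-based index tuples."""
    stack, pairs = [], set()
    for idx, ch in enumerate(db):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.add((stack.pop(), idx))
    return pairs

def sen_ppv(reference, predicted):
    """Sensitivity: fraction of reference pairs recovered.
    PPV: fraction of predicted pairs that are in the reference."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    tp = len(ref & pred)
    sen = tp / len(ref) if ref else 0.0
    ppv = tp / len(pred) if pred else 0.0
    return sen, ppv

# toy example: prediction recovers 2 of 3 reference pairs
sen, ppv = sen_ppv("(((...)))", "((.....))")  # sen = 2/3, ppv = 1.0
```

The F-measure reported in the tables of this chapter is the harmonic mean of these two quantities.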
Iterative HFold's predictions are mostly similar to those of HFold on the RNA sequences of the HK-PK-free data set. However, in a few cases, especially for longer RNA strands, Iterative HFold's predictions are also pseudoknotted.

We also found that on shorter RNA sequences, Iterative HFold is unable to modify the given structure when the input structure is overcrowded with base pairs. To see why, recall that the method within Iterative HFold that enables structure modification is method 2, which first runs HFold-PKonly on sequence S and the given input structure G. If HFold-PKonly results in a structure G ∪ G′ such that G′ is not the empty structure, it then runs HFold with sequence S and structure G′ and stores the result; otherwise, it simply stores G as the result. When a short RNA sequence is overcrowded with base pairs, HFold-PKonly is unable to add pseudoknotted base pairs. Therefore, the result of method 2 is the same as the input structure.

[Figure 6.4: Example structure predicted by the CCJ algorithm vs. Iterative HFold, for RFA 00632 of the HK-PK data set: (a) reference structure; (b) structure predicted by the CCJ algorithm; (c) structure predicted by Iterative HFold.]

[Figure 6.5: An example where Iterative HFold finds a structure with lower energy than the structure predicted by HFold, for EC PK4 of the HK-PK data set: (a) reference structure; (b) hotspot provided as input to both HFold and Iterative HFold; (c) structure predicted by HFold; (d) structure predicted by Iterative HFold.]

[Figure 6.6: Example structure predicted by the CCJ algorithm vs. Iterative HFold, for CRW 00628 of the HK-PK-free data set: (a) reference structure; (b) structure predicted by the CCJ algorithm; (c) structure predicted by Iterative HFold.]

[Figure 6.7: An example of an overcrowded input structure provided to Iterative HFold, for SRP 00265 of the HK-PK-free data set: (a) reference structure; (b) hotspot provided as input to Iterative HFold; (c) structure predicted by Iterative HFold.]
Figure 6.7 shows one such example.

6.9.4 Energy model

In this thesis we use the HotKnots V2.0 DP09 [11] energy parameters in our implementation of Iterative HFold. To investigate the degree to which the energy model may be causing mis-predictions by HotKnots V2.0 or Iterative HFold, we considered the degree to which the maximum accuracy structures produced by these methods, i.e., the structures with the highest F-measure, are better than the minimum free energy structures. Table 6.7 presents the bootstrap 95% percentile confidence intervals of average F-measure for both. For example, the last two rows of the table show that, knowing the reference structure, if we choose the maximum accuracy structure among the 20 output structures predicted by HotKnots for each RNA sequence, the bootstrap 95% percentile confidence interval of average F-measure of HotKnots increases to (84.50%, 91.48%) for pseudoknotted structures of the HK-PK data set (vs. (73.60%, 83.35%) when choosing the lowest energy structure) and to (88.32%, 91.08%) for pseudoknot-free structures of the HK-PK-free data set (vs. (76.74%, 81.95%) when choosing the lowest energy structure).

Similarly, if we compare the maximum accuracy structure output by Iterative HFold with its minimum free energy structure, whether Iterative HFold is given HotKnots hotspots or the first 50 suboptimal structures as input, the bootstrap 95% percentile confidence intervals of average F-measure also show improvement (see Table 6.7). The improvement is significant in all but one case, namely Iterative HFold with hotspot structures as input on the pseudoknotted structures of the HK-PK data set.
We conclude that improved energy parameter values for pseudoknotted structures may further improve the accuracy of both HotKnots and Iterative HFold.

Table 6.7: Comparison of bootstrap 95% percentile confidence intervals of average F-measure between the minimum energy structures and the maximum accuracy structures on the HK-PK and HK-PK-free data sets.

Method                        Data set    Min energy      Max accuracy    Permutation test
Iter. HFold - hotspots        HK-PK       (72.83, 83.37)  (78.56, 87.05)  Not significant
Iter. HFold - hotspots        HK-PK-free  (74.93, 80.26)  (87.70, 90.57)  Significant
Iter. HFold - 50 suboptimals  HK-PK       (67.70, 79.57)  (80.41, 88.14)  Significant
Iter. HFold - 50 suboptimals  HK-PK-free  (76.27, 81.46)  (90.05, 93.00)  Significant
HotKnots                      HK-PK       (73.60, 83.35)  (84.50, 91.48)  Significant
HotKnots                      HK-PK-free  (76.74, 81.95)  (88.32, 91.08)  Significant

6.10 Summary

In this chapter we presented a thorough comparison of our proposed algorithms, Iterative HFold and CCJ, with the best existing methods for prediction of RNA secondary structure including pseudoknots, namely HotKnots V2.0, IPknot and HFold. Both Iterative HFold and CCJ are significantly more accurate than IPknot while matching the accuracy of HotKnots on the HK-PK data set (respectively the HK-PK* data set when comparing with CCJ). Iterative HFold and CCJ are superior to both IPknot and HotKnots on the IP-pk168 data set. Moreover, both Iterative HFold and IPknot use less memory and run much faster than HotKnots on long sequences.

Iterative HFold also has a lower rate of accuracy deterioration than HFold as information about the true pseudoknot-free structure is lost, so it is more robust than HFold.
This is particularly helpful when the given input structure may be unreliable and/or only limited information about the true pseudoknot-free structure is available.

We compared three different ways to generate pseudoknot-free structures for input to Iterative HFold, namely the MFE structures and the first 50 suboptimal structures produced by SimFold, and HotKnots hotspots. On the HK-PK and HK-PK-free data sets, the accuracy of Iterative HFold is not significantly different between the first 50 suboptimal structures produced by SimFold and HotKnots hotspots, while it is significantly better on both of those than on the MFE structures.

We used CCJ, HFold and Iterative HFold to compare the performance of a pure MFE-based algorithm with that of a method that strictly adheres to the hierarchical folding hypothesis and a method that is based on the relaxed hierarchical folding hypothesis. We found that predicting the MFE structure does not necessarily translate to predicting the highest accuracy structure. On long RNA sequences, CCJ tends to add more pseudoknotted base pairs. On short RNA sequences, when the given input structure is overcrowded with base pairs, Iterative HFold is unable to modify the input structure to predict the reference structure. Since HFold is unable to add both pseudoknotted and pseudoknot-free base pairs to a given input structure, in many cases it does not predict the reference structure. While the differences in accuracy of HFold, Iterative HFold and CCJ on all our data sets are not significant, based on the overall "shape" of the predicted structures, we believe Iterative HFold is the winner.

Comparing the accuracy of the minimum free energy structures with the maximum accuracy structures in this work, we found that, on average, the minimum free energy structure has significantly poorer F-measure than the maximum accuracy structure.
This suggests that an improved energy model for pseudoknotted structure prediction may improve the accuracy of prediction algorithms for pseudoknotted structures.

Chapter 7

Sparsification of the CCJ algorithm for space reduction

In this chapter, we describe a method called sparsification, and show how sparsification can be used to reduce the space complexity of the CCJ algorithm.

As mentioned in Section 6.7, the CCJ algorithm is memory-intensive, and this limits its practicality. Thus, our goal is to reduce CCJ's memory requirements without further restricting the class of structures handled by the algorithm. We describe a new way of using candidate lists to reduce the space complexity of the CCJ algorithm to O(n^2 Z) instead of O(n^4), and provide the recurrences for the sparse CCJ algorithm. We note that previously published uses of the sparsification technique either focused on pseudoknot-free methods (due to the complexity of pseudoknotted methods) or on pseudoknotted methods with simplified energy models (base pair maximization only), and they mainly aimed to reduce the time complexity of the methods. To the best of our knowledge, this is the first time that the sparsification technique has been applied to a complex pseudoknotted method with a realistic energy model and parameters, with a focus on reducing space complexity.

We start this chapter by recalling the recurrences for MFE pseudoknot-free structure prediction for an RNA sequence S. In Section 7.1.1 we describe the concept of sparsification, and how it can improve the time complexity of MFE structure prediction for single-stranded RNA. In Section 7.2 we explain the changes to the CCJ recurrences needed for successful sparsification. We present the sparsified recurrences of the CCJ algorithm in Section 7.3, and in Section 7.4 we discuss the space complexity of the sparse version of the CCJ algorithm. Finally, in Section 7.5 we provide a summary of this chapter.
7.1 Background on sparsification

To introduce sparsification, we first consider pseudoknot-free MFE secondary structure prediction, and recall the MFE recurrences for RNA pseudoknot-free structure prediction. As explained in Section 4.2, we define the W and V recurrences as follows. Let W_{i,j} be the energy of the MFE pseudoknot-free secondary structure for the subsequence s_i s_{i+1} ... s_j. If i ≥ j, W_{i,j} = 0, since the subsequence is empty. Otherwise, either i.j is a base pair in the MFE structure for s_i ... s_j, or the MFE structure can be decomposed into two independent subparts. These two cases correspond to the two rows of the following recurrence for W_{i,j}:

W_{i,j} = min { V_{i,j},  min_{i ≤ r < j} ( W_{i,r} + W_{r+1,j} ) }

where V_{i,j} is the free energy of the MFE structure for s_i ... s_j that contains the base pair i.j. If i ≥ j, V_{i,j} is set to ∞. Otherwise, i.j closes a hairpin loop, an internal loop, or a multiloop in the MFE structure for s_i ... s_j.

Focusing on the second case of the W recurrence, we can see that the time complexity bottleneck in W_{i,j} is due to considering O(n) possible points r in the computation of W_{i,r} + W_{r+1,j}. The sparsification idea originated in the work of Wexler et al. [99], who aimed to reduce the time complexity of MFE calculation for single RNA molecules. Their idea was to not store all the values in the structure matrix W = {W_{i,j}}, but instead keep only some values, called candidates. Wexler et al. showed that sparsification can reduce the average time complexity from Θ(n^3) to O(n^2 ψ(n)), where ψ(n) is the maximum size of the candidate list.

Backofen et al. [12] modified the sparsification idea of Wexler et al. to achieve a better space complexity. Their modification was based on the observation that some of the stored values are not necessary throughout the complete run of the algorithm. Backofen et al. thereby further reduced the time and space complexity of the MFE algorithm for prediction of pseudoknot-free structures.
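As a concrete illustration, the W recurrence above can be sketched as a dynamic program. This is a toy version only: it replaces the full V energies (hairpin, internal loop, multiloop) with a score of −1 per base pair, i.e., base pair maximization rather than the energy model used in this thesis.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def mfe_W(S):
    """Toy W/V recurrence over 0-based indices; V is approximated by
    a score of -1 per base pair instead of loop energies."""
    n = len(S)
    W = [[0.0] * n for _ in range(n)]
    V = [[float("inf")] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # V[i][j]: i pairs with j, enclosing the best fold of [i+1, j-1]
            if (S[i], S[j]) in PAIRS:
                V[i][j] = -1.0 + (W[i + 1][j - 1] if i + 1 <= j - 1 else 0.0)
            # W[i][j] = min(V[i][j], min over O(n) split points r)
            best = V[i][j]
            for r in range(i, j):
                best = min(best, W[i][r] + W[r + 1][j])
            W[i][j] = min(0.0, best)
    return W[0][n - 1]
```

For example, mfe_W("GGCC") returns −2.0, since the two G.C pairs can be nested. The inner loop over r is exactly the Θ(n) split-point scan that sparsification targets.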
Later, the sparsification idea was used to reduce the time and space complexity of an algorithm for predicting the joint structure of interacting RNAs [81], and to reduce the time and space complexity of some pseudoknot prediction algorithms. We next describe the sparsification method for RNA MFE prediction, following the definitions of Backofen et al. [12].

7.1.1 Sparsification of the MFE pseudoknot-free algorithm

The main idea of the sparsification technique presented in the work of Backofen et al. [12] and Wexler et al. [99] is to rewrite the MFE prediction recurrences in terms of co-terminus and partitionable solutions. We provide the necessary definitions based on the work of Backofen et al. [12].

A secondary structure R_{i,j} admits a split point q, i < q ≤ j, if for no base pair k.l in R_{i,j}, i ≤ k < l ≤ j, we have k < q ≤ l. In other words, a secondary structure R admits a split point q if q partitions R into two secondary structures R1 and R2 that correspond to the weakly closed regions [i, q−1] and [q, j] respectively. Let Q_{i,j} be the set of all possible split points with respect to R_{i,j}, i.e., Q_{i,j} = {q | i < q ≤ j}. A secondary structure R is called partitionable with respect to R_{i,j} if it admits at least one split point in the set Q_{i,j}; otherwise it is called co-terminus with respect to R_{i,j}. Note that an empty structure, i.e., when i = j, is co-terminus by definition, as Q_{i,j} = ∅.

Let us rewrite the MFE prediction recurrences as follows:

L_{i,j} = min { L^c_{i,j}, L^p_{i,j} }

where

L^c_{i,j} = f(i, j)
L^p_{i,j} = min_{q ∈ Q_{i,j}} { L_{i,q−1} + L_{q,j} }

and L^p_{i,j} = ∞ if i = j, i > j, or Q_{i,j} = ∅ (since L^p is a minimum, an undefined partitionable solution is set to ∞ so that it is never selected).

Here L_{i,j} is the previously mentioned W_{i,j} recurrence, with L^c_{i,j} playing the role of V_{i,j} and Q_{i,j} = {i+1, ..., j}.
We refer to L_{i,j} as a solution to the MFE folding problem for S_{i,j}, and to L^c_{i,j} and L^p_{i,j} as co-terminus and partitionable solutions for S_{i,j}.

The main property, which yields the time and space complexity improvements, is the triangle inequality property stated below.

Triangle inequality: A recurrence relation satisfies the triangle inequality property if for every R_{i,j} and for every split point q ∈ Q_{i,j}, L_{i,j} ≤ L_{i,q−1} + L_{q,j}.

It is clear that the W_{i,j} recurrence obeys this inequality.

A secondary structure R for sequence S is an optimal secondary structure if the energy associated with R is equal to W_{1,n}, where n is the length of S. We note that there might be more than one optimal secondary structure for a given sequence. Similarly, a secondary structure R_{i,j} over region [i, j] is optimal if its energy is equal to W_{i,j}.

We define a region [i, j] as optimally co-terminus (OCT) if every optimal secondary structure R_{i,j} is co-terminus. A split point q ∈ Q_{i,j} is called an optimal split point of R_{i,j} if L^p_{i,j} = L_{i,q−1} + L_{q,j}.

Wexler et al. observed that it is sufficient to examine only a subset of the split points to compute L^p_{i,j}. Specifically, they observed that for every region [i, j], there is an optimal split point q ∈ Q_{i,j} such that region [q, j] is an OCT. Let us define Q^{oct}_{i,j} = {q ∈ Q_{i,j} | [q, j] is an OCT}. Then the split points considered by L^p_{i,j} can be restricted as follows:

L^p_{i,j} = min_{q ∈ Q^{oct}_{i,j}} { L_{i,q−1} + L_{q,j} }.

Now let us define the sparsity parameter Z(S), for the region [1, n] corresponding to the string S, as the number of subregions of S_{1,n} that are OCT. We use Z in place of Z(S) when the context is clear. Since every region of size 1 is by definition an OCT, Z ≥ n.

In a regular MFE calculation, all the values in the upper triangle of an n × n matrix are calculated and kept in memory. Hence, the regular MFE algorithm requires Θ(n^2) space.
As mentioned at the start of this section, the bottleneck in calculating W_{i,j} comes from the second row of the recurrence, which corresponds to the calculation of L^p_{i,j}. With no sparsification, Θ(n) values for possible split points are checked in the calculation of L^p_{i,j}; thus, the total time complexity is Θ(n^3). In the sparse case, instead of checking Θ(n) possible split points we only check the O(|Q^{oct}|) split points, and the total time complexity is reduced to O(nZ), as follows:

\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} |Q^{oct}_{i,j}|  ≤  \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} |Q^{oct}_{1,j}|  ≤  \sum_{i=1}^{n-1} Z  <  nZ.

For the MFE pseudoknot-free secondary structure prediction problem, Wexler et al. [99] estimated the expected value of Z and found it to be very small (< 6 for their test set of 50000 mRNA sequences from the NCBI databases, with an average length of 1992 nucleotides), significantly smaller than O(n^2). Their estimate was based on a probabilistic model of polymer folding and measured by simulation.

Backofen et al. [12] presented an algorithm for the MFE pseudoknot-free structure prediction problem that, using the sparsification technique, reduces the space complexity to O(Z). Their algorithm is based on the observation that some of the values stored by the algorithm of Wexler et al. [99] are not needed throughout the complete run of the algorithm. In other
We note that a fragment may correspond toan ungapped region as well as a gapped region.A split of a fragment F , into two fragments F1 and F2, is defined asa tuple (F1, F2) such that F = F1 ∪ F2 and F1 ∩ F2 = ∅. A recurrencecase (i.e., a line of a possibly multi-line recurrence) describes how to decom-pose calculation of an energy value on a given fragment into calculations onsmaller fragments, effectively splitting the given fragment or reducing thesize of the fragment. The type of a recurrence case is a short motif that de-scribes roughly how the given fragment is split. (The type can equivalentlybe associated with the fragment split.) The type of a recurrence case can beobtained as follows: in the arc diagram, label each subregion of a fragmentFi with number i, label a gap with “G”, and concatenate the characters inorder from left to right.Figure 7.1 shows examples of recurrence types in the CCJ algorithm. Inthis figure, we haveP(i, l) = mini<j<d<k<lPK(i, j − 1, d+ 1, k − 1) + PK(j, d, k, l)that corresponds to a 1212 recurrence type, and we have the first two casesof the PK recurrence, that span over the gapped region [i, j] ∪ [k, l],mini≤d<jPK(i, d, k, l) + WP(d+ 1, j)andmink<d≤lPK(i, j, d, l) + WP(k, d− 1)correspond to a 12G1 recurrence type and a 1G21 recurrence type respec-tively.Fix a recurrence that can associate an energy with a given fragment Fand fix a case of type T of this recurrence. F is optimally decomposable by987.1. Background on sparsificationl1 2 1 2i j d k(a) Recurrence type 1212 of P(i, l) corresponding tomini<j<d<k<l PK(i, j − 1, d+ 1, k − 1) + PK(j, d, k, l).j21 1Gk li d(b) Recurrence type 12G1 of P(i, l) corresponding tomini≤d<j PK(i, d, k, l) + WP(d+ 1, j).l1G 21i j k d(c) Recurrence type 1G21 of P(i, l) corresponding tomink<d≤l PK(i, j, d, l) + WP(k, d− 1).Figure 7.1: Examples of recurrence types in the CCJ algorithm.997.1. 
a split of type T (T-OD) if a case of type T can split F into (F1, F2) and the energy associated with F (by the fixed recurrence) is equal to that of F1 + F2. A fragment F is optimally decomposable with respect to a set of split types 𝒯 (𝒯-OD) iff F is T-OD for some T ∈ 𝒯.

For example, a fragment [i,l] is 12-OD with respect to the recurrence W-12 iff W(i,l) = min_{i<d≤l} W(i,d−1) + W(d,l). Similarly, a fragment [i,j] ∪ [k,l] is 12G1-OD with respect to the recurrence PK-12G1 iff PK(i,j,k,l) = min_{i≤d<j} PK(i,d,k,l) + WP(d+1,j).

Möhl et al. [67] showed that in each recursion case, certain optimally decomposable fragments do not have to be considered for computing an optimal solution, because each decomposition using these fragments can be replaced by a decomposition using a smaller fragment. For example, in recurrence case 12G2 of PfromL(i,j,k,l), which splits the fragment [i,j] ∪ [k,l] into F1 = [i,d−1] and F2 = [d,j] ∪ [k,l], if F2 is 12G2-OD with respect to recurrence case 12G2 of PfromL(d,j,k,l), then ∃ e > d : PfromL(d,j,k,l) = WP(d,e−1) + PfromL(e,j,k,l). Therefore, we can replace every split of PfromL(i,j,k,l) into WP(i,d−1) + PfromL(d,j,k,l) by WP(i,d−1) + WP(d,e−1) + PfromL(e,j,k,l), which is as good as WP(i,d−1) + PfromL(d,j,k,l). We note that this argument is split-type specific and cannot be used for all split types.

Considering this observation, Möhl et al. defined sets of recurrence types in a specific algorithm such that a recurrence case is optimally decomposable if it is optimally decomposable with respect to at least one of the types within the set. We later provide such sets of recurrence types for the CCJ algorithm in Section 7.2.
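As a small illustration (ours, over a hypothetical table W of already-computed values), the 12-OD test above is simply a check that the stored value can be recovered from some split:

```python
def is_12_od(W, i, l):
    """Check whether fragment [i, l] is 12-OD with respect to recurrence
    W-12, i.e. whether W(i, l) = min over i < d <= l of W(i, d-1) + W(d, l).
    W is assumed to be a precomputed table indexed as W[i][l]."""
    if i >= l:
        return False
    best_split = min(W[i][d - 1] + W[d][l] for d in range(i + 1, l + 1))
    return W[i][l] == best_split
```

A fragment failing this check for every type in the relevant set 𝒯 is exactly one that must be kept as a candidate.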
For example, we have the following set of recurrence types corresponding to PfromL case 12G2:

    𝒯^CCJ_fromL12G2 = {12G2, 12G1}.

To see how this set of recurrence types is used, consider recurrence case 12G2 of PfromL(i,j,k,l), which splits the fragment [i,j] ∪ [k,l] into F1 = [i,d−1] and F2 = [d,j] ∪ [k,l]. Suppose that the second fragment, F2, can be optimally decomposed by a split of a type in 𝒯^CCJ_fromL12G2, say 12G2, such that PfromL(d,j,k,l) = WP(d,e−1) + PfromL(e,j,k,l). Then we could have split the fragment [i,j] ∪ [k,l] into F′1 = [i,e−1] and F′2 = [e,j] ∪ [k,l], and still have a solution that is as good as that obtained by the (F1, F2) split. Continuing this idea recursively, we can reformulate the recurrences in such a way that the second fragments are not optimally decomposable with respect to the sets of recurrence types within the algorithm.

Such non-optimally-decomposable fragments correspond to entries of candidate lists. Note that separate candidate lists are maintained for each sparsified recursion case. A fragment is added to the candidate list for recursion case T iff it is not optimally decomposable with respect to the sets of split types within the algorithm. When pushing a candidate onto its corresponding candidate list, Möhl et al. used the notation

    L_T(<fixed boundaries>, T)(<remaining boundaries>, w),

where the first parentheses hold the tuple of fixed boundaries together with the case type T, and the second parentheses hold the tuple of the remaining boundaries and the corresponding energy value w. Fixed boundaries are the boundaries that are the same in F and F2.
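The push-and-lookup discipline behind this notation can be sketched as follows (our illustration; the container, the case name "fromL12G2", and the toy WP table are hypothetical, not the thesis's implementation):

```python
from collections import defaultdict

# One candidate list per (recursion case, fixed boundaries), each holding
# pairs (remaining boundary, energy value w), mirroring the push notation
# L_T(<fixed boundaries>, T)(<remaining boundaries>, w).
class CandidateLists:
    def __init__(self):
        self._lists = defaultdict(list)

    def push(self, case, fixed, remaining, energy):
        # e.g. push("fromL12G2", (j, k, l), d, PfromL(d, j, k, l))
        self._lists[(case, fixed)].append((remaining, energy))

    def get(self, case, fixed):
        return self._lists.get((case, fixed), [])

def sparse_fromL_12G2(cands, WP, i, j, k, l):
    """Sparse evaluation of case 12G2 of PfromL(i, j, k, l):
    minimize WP(i, d-1) + w over stored candidates (d, w) only,
    instead of over every split point i < d <= j."""
    best = float("inf")
    for d, w in cands.get("fromL12G2", (j, k, l)):
        if i < d <= j:
            best = min(best, WP[i][d - 1] + w)
    return best
```

Because the fixed boundaries (j, k, l) key the list, the same candidates are reused for every left end i, which is what makes the space saving possible.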
For example, in recurrence case 12G2 of PfromL(i,j,k,l) = min_{i<d≤j} WP(i,d−1) + PfromL(d,j,k,l), when we find a fragment [d,j] ∪ [k,l] that is not 12G2-OD, we push it onto the candidate list as

    L_fromL12G2(j, k, l, 12G2), (d, PfromL(d,j,k,l)).

Here the boundaries of fragment F are i, j, k, l and the boundaries of F2 are d, j, k, l; therefore, the fixed boundaries are j, k, l, and the remaining boundary, which goes into the second tuple, is d. Note that PfromL(d,j,k,l) holds the energy value in the PfromL energy matrix.

As in the sparsification of the pseudoknot-free MFE algorithm, the recurrences of the algorithm for pseudoknotted structures can then be rewritten to consider only the candidates when partitioning the fragments. For example, the sparse version of case 12G2 of PfromL(i,j,k,l) is

    PfromL(i,j,k,l) = min_{(d,w) ∈ L_fromL12G2(j,k,l,12G2)} WP(i,d−1) + w.

7.2 Changes to the CCJ recurrences

We describe a new way of using candidate lists to reduce the space complexity of the CCJ algorithm to O(n^2 Z) instead of O(n^4), and provide the recurrences for the sparse CCJ algorithm. We modify the CCJ recurrences to facilitate sparsification while keeping the class of structures handled the same. Since our focus here is to reduce CCJ's space complexity, we apply the sparsification technique only to the 4-dimensional matrices; we defer sparsification for time reduction to future work. We provide the modified recurrences in this section and introduce the split types for every partitionable recurrence case in the CCJ algorithm (see Chapter 5). Then, in Section 7.3, we provide the sparsification of CCJ together with a proof of correctness.

The modified recurrences are of the form P_{X,mloop} and P_{X,mloopab}, where X ∈ {L, R, M, O} and a, b ∈ {0, 1}. All other cases are essentially the same as those provided in Chapter 5, except for the following notational additions in some cases:

1. case types are highlighted when a recurrence case is partitionable;
2.
each case has a superscript indicating what will be kept in memory for the corresponding energy matrix.

In the following recurrences, a superscript ∗ means the value is not stored in memory but is recomputed as needed; a superscript C means we keep only its candidates and do not store the rest; a superscript i means we keep only a copy of its ith slice (i.e., a 3D matrix indexed by j, k, l). Similarly, i..i+K means we store slices i through i+K (i.e., K 3D matrices, where K is a constant). Here MLS is the Maximum Loop Size of an internal loop (i.e., 30 in this work).

W(i,l) = min of:
    W(i,l−1)                                            (12′)
    min_{i<d≤l} W(i,d−1) + W(d,l)                       (12)
    V(i,l)
    P(i,l) + P_s

V(i,l) = min of:
    V_hairpin(i,l)
    V_stacked(i,l)                                      (1′21′)
    V_iloop(i,l)                                        (1′′21′′)
    V_mloop(i+1,l−1) + b + β2(l,i) + a

P(i,l) = min_{i<j<d<k<l} ( PK(i,j−1,d+1,k−1) + PK(j,d,k,l) )    (1212)

PK^{C,i}(i,j,k,l) = min of:
    min_{i≤d<j} PK(i,d,k,l) + WP(d+1,j)                 (12G1)
    min_{k<d≤l} PK(i,j,d,l) + WP(k,d−1)                 (1G21)
    P_L(i,j,k,l) + γ2(j,i) + P_b
    P_R(i,j,k,l) + γ2(l,k) + P_b
    P_M(i,j,k,l) + γ2(j,k) + P_b
    P_O(i,j,k,l) + γ2(l,i) + P_b

P_L^{C,i..i+MLS}(i,j,k,l) = min of:
    P_{L,iloop}(i,j,k,l)
    P_{L,mloop}(i,j,k,l) + b′
    P_{fromL}(i+1,j−1,k,l) + γ2(j,i)                    (1′21′G2)

P_R^{i}(i,j,k,l) = min of:
    P_{R,iloop}(i,j,k,l)
    P_{R,mloop}(i,j,k,l) + b′
    P_{fromR}(i,j,k+1,l−1) + γ2(l,k)                    (1G2′12′)

P_M^{i}(i,j,k,l) = min of:
    P_{M,iloop}(i,j,k,l)
    P_{M,mloop}(i,j,k,l) + b′
    P_{fromM}(i,j−1,k+1,l) + γ2(j,k)                    (12′G2′1)
    γ2(i,l), if i = j and k = l

P_O^{C,i..i+MLS}(i,j,k,l) = min of:
    P_{O,iloop}(i,j,k,l)
    P_{O,mloop}(i,j,k,l) + b′
    P_{fromO}(i+1,j,k,l−1) + γ2(l,i)                    (1′2G21′)

P_{fromL}^{C,i..i+1}(i,j,k,l) = min of:
    min_{i<d≤j} WP(i,d−1) + P_{fromL}(d,j,k,l)          (12G2)
    min_{i≤d<j} P_{fromL}(i,d,k,l) + WP(d+1,j)          (12G1)
    P_R(i,j,k,l) + γ2(l,k) + P_b
    P_M(i,j,k,l) + γ2(j,k) + P_b
    P_O(i,j,k,l) + γ2(l,i) +
P_b

P_{fromR}^{i}(i,j,k,l) = min of:
    min_{k<d≤l} P_{fromR}(i,j,d,l) + WP(k,d−1)          (1G21)
    min_{k≤d<l} P_{fromR}(i,j,k,d) + WP(d+1,l)          (1G12)
    P_M(i,j,k,l) + γ2(j,k) + P_b
    P_O(i,j,k,l) + γ2(l,i) + P_b

P_{fromM}^{i}(i,j,k,l) = min of:
    min_{i≤d<j} P_{fromM}(i,d,k,l) + WP(d+1,j)          (12G1)
    min_{k<d≤l} P_{fromM}(i,j,d,l) + WP(k,d−1)          (1G21)
    P_L(i,j,k,l) + γ2(j,i) + P_b
    P_R(i,j,k,l) + γ2(l,k) + P_b

P_{fromO}^{C,i..i+1}(i,j,k,l) = min of:
    min_{i<d≤j} WP(i,d−1) + P_{fromO}(d,j,k,l)          (12G2)
    min_{k≤d<l} P_{fromO}(i,j,k,d) + WP(d+1,l)          (1G12)
    P_L(i,j,k,l) + γ2(j,i) + P_b
    P_R(i,j,k,l) + γ2(l,k) + P_b

V_hairpin^∗(i,l) = eH(i,l)

V_stacked^∗(i,l) = eS(i,l)

V_iloop^∗(i,l) = min_{i<d<e<l} ( eint(i,d,e,l) + V(d,e) )       (121)

V_mloop(i,l) = min of:
    V_mloop(i,l−1) + c                                  (12′)
    V_mloop(i+1,l) + c                                  (1′2)
    P(i,l) + P_sm + b
    min_{i<d≤l} WM(i,d−1) + WM′(d,l)                    (12)

WM^∗(i,l) = min { WM′(i,l), c × (l−i+1) }

WM′(i,l) = min of:
    WM′(i,l−1) + c                                      (12′)
    min_{i<d≤l} WM(i,d−1) + WM′(d,l)                    (12)
    V(i,l) + β2(i,l) + b
    P(i,l) + P_sm + b

WB^∗(i,l) = min { WB′(i,l), c′ × (l−i+1) }

WB′(i,l) = min of:
    WB′(i,l−1) + c′                                     (12′)
    min_{i<d≤l} WB(i,d−1) + WB′(d,l)                    (12)
    V(i,l) + βp2(i,l) + b′ + P_ps
    P(i,l) + P_sm + b′ + P_ps

WP^∗(i,l) = min { WP′(i,l), P_up × (l−i+1) }

WP′(i,l) = min of:
    WP′(i,l−1) + P_up                                   (12′)
    min_{i<d≤l} WP(i,d−1) + WP′(d,l)                    (12)
    V(i,l) + γ2(i,l) + P_ps
    P(i,l) + P_sp + P_ps

P_{L,iloop}^∗(i,j,k,l) = min of:
    P_L(i+1,j−1,k,l) + estP(i,j)                        (1′21′G2)
    min_{i<d<d′<j} ( eintP(i,d,d′,j) + P_L(d,d′,k,l) )  (1′′21′′G2)

P_{L,mloop}^∗(i,j,k,l) = min of:
    P_{L,mloop10}(i+1,j−1,k,l) + a′ + βp2(j,i)          (1′21′G2)
    P_{L,mloop01}(i+1,j−1,k,l) + a′ + βp2(j,i)          (1′21′G2)

P_{L,mloop10}^{i,i+1}(i,j,k,l) = min of:
    min_{i<d≤j} WB′(i,d−1) + P_{L,mloop00}(d,j,k,l)     (12G2)
    min_{i<d<j} P_{L,mloop10}(i,d,k,l) + WB(d+1,j)      (12G1)

P_{L,mloop01}^{i,i+1}(i,j,k,l) = min_{i≤d<j} ( P_{L,mloop00}(i,d,k,l) + WB′(d+1,j) )    (12G1)

P_{L,mloop00}^{C,i}(i,j,k,l) = min of:
    min_{i<d≤j} WB(i,d−1)
                + P_{L,mloop00}(d,j,k,l)                (12G2)
    min_{i≤d<j} P_{L,mloop00}(i,d,k,l) + WB(d+1,j)      (12G1)
    P_L(i,j,k,l) + βp2(j,i)

P_{R,iloop}^∗(i,j,k,l) = min of:
    P_R(i,j,k+1,l−1) + estP(k,l)                        (1G2′12′)
    min_{k<d<d′<l} ( eintP(k,d,d′,l) + P_R(i,j,d,d′) )  (1G2′′12′′)

P_{R,mloop}^∗(i,j,k,l) = min of:
    P_{R,mloop10}(i,j,k+1,l−1) + a′ + βp2(l,k)          (1G2′12′)
    P_{R,mloop01}(i,j,k+1,l−1) + a′ + βp2(l,k)          (1G2′12′)

P_{R,mloop10}^{i}(i,j,k,l) = min of:
    P_{R,mloop10}(i,j,k+1,l) + c′                       (1G2′1)
    min_{k<d≤l} WB′(k,d−1) + P_{R,mloop00}(i,j,d,l)     (1G21)

P_{R,mloop01}^{i}(i,j,k,l) = min of:
    P_{R,mloop01}(i,j,k,l−1) + c′                       (1G12′)
    min_{k≤d<l} P_{R,mloop00}(i,j,k,d) + WB′(d+1,l)     (1G12)

P_{R,mloop00}^{i}(i,j,k,l) = min of:
    min_{k<d≤l} WB(k,d−1) + P_{R,mloop00}(i,j,d,l)      (1G21)
    min_{k≤d<l} P_{R,mloop00}(i,j,k,d) + WB(d+1,l)      (1G12)
    P_R(i,j,k,l) + βp2(l,k)

P_{M,iloop}^∗(i,j,k,l) = min of:
    P_M(i,j−1,k+1,l) + estP(j−1,k+1)                    (12′G2′1)
    min_{i<d<j<k<d′<l} ( eintP(d,j,k,d′) + P_M(i,d,d′,l) )      (12′′G2′′1)

P_{M,mloop}^∗(i,j,k,l) = min of:
    P_{M,mloop10}(i,j−1,k+1,l) + a′ + βp2(j,k)          (12′G2′1)
    P_{M,mloop01}(i,j−1,k+1,l) + a′ + βp2(j,k)          (12′G2′1)

P_{M,mloop10}^{i}(i,j,k,l) = min of:
    P_{M,mloop10}(i,j−1,k,l) + c′                       (12′G1)
    min_{i<d<j} WB′(d,j) + P_{M,mloop00}(i,d−1,k,l)     (12G1)

P_{M,mloop01}^{i}(i,j,k,l) = min of:
    P_{M,mloop01}(i,j,k+1,l) + c′                       (1G2′1)
    min_{k<d≤l} P_{M,mloop00}(i,j,d,l) + WB′(k,d−1)     (1G21)

P_{M,mloop00}^{i}(i,j,k,l) = min of:
    min_{i≤d<j} WB(d+1,j) + P_{M,mloop00}(i,d,k,l)      (12G1)
    min_{k<d≤l} P_{M,mloop00}(i,j,d,l) + WB(k,d−1)      (1G21)
    P_M(i,j,k,l) + βp2(j,k)

P_{O,iloop}^∗(i,j,k,l) = min of:
    P_O(i+1,j,k,l−1) + estP(i,l)                        (1′2G21′)
    min_{i<d<j<k<d′<l} ( eintP(i,d,d′,l) + P_O(d,j,k,d′) )      (1′′2G21′′)

P_{O,mloop}^∗(i,j,k,l) = min of:
    P_{O,mloop10}(i+1,j,k,l−1) + a′ + βp2(l,i)          (1′2G21′)
    P_{O,mloop01}(i+1,j,k,l−1) + a′ + βp2(l,i)          (1′2G21′)

P_{O,mloop10}^{i,i+1}(i,j,k,l) = min of:
    min_{i<d≤j} WB′(i,d−1) + P_{O,mloop00}(d,j,k,l)     (12G2)
    min_{k<d<l} P_{O,mloop10}(i,j,k,d) + WB(d+1,l)      (1G12)

P_{O,mloop01}^{i,i+1}(i,j,k,l) = min_{k≤d<l}
( P_{O,mloop00}(i,j,k,d) + WB′(d+1,l) )    (1G12)

P_{O,mloop00}^{C,i}(i,j,k,l) = min of:
    min_{i<d≤j} WB(i,d−1) + P_{O,mloop00}(d,j,k,l)      (12G2)
    min_{k≤d<l} P_{O,mloop00}(i,j,k,d) + WB(d+1,l)      (1G12)
    P_O(i,j,k,l) + βp2(l,i)

A complete list of fragment types in the CCJ algorithm is as follows:

𝒯^CCJ_12′, 𝒯^CCJ_12, 𝒯^CCJ_1′21′, 𝒯^CCJ_1′′21′′, 𝒯^CCJ_1212, 𝒯^CCJ_1′21′G2, 𝒯^CCJ_1G2′12′, 𝒯^CCJ_12′G2′1, 𝒯^CCJ_1′2G21′, 𝒯^CCJ_12G2, 𝒯^CCJ_12G1, 𝒯^CCJ_1G21, 𝒯^CCJ_1G12, 𝒯^CCJ_121, 𝒯^CCJ_1′21′G2, 𝒯^CCJ_1′′21′′G2, 𝒯^CCJ_1′2G2, 𝒯^CCJ_12′G1, 𝒯^CCJ_1G2′′12′′, 𝒯^CCJ_1G2′1, 𝒯^CCJ_1G12′, 𝒯^CCJ_1G21, 𝒯^CCJ_12′G2′1, 𝒯^CCJ_12′′G2′′1, 𝒯^CCJ_1′2G21′, 𝒯^CCJ_1′′2G21′′, 𝒯^CCJ_1G12′

In these fragment types, 1′ denotes a single base (similarly 2′), and 1′′ denotes an empty region belonging to an internal loop (similarly 2′′).

Since our main goal here is to use sparsification for space optimization, we consider only cases in which the second fragment spans the gapped region (i.e., would require a 4D matrix). We defer sparsification for time reduction to future work. We will need the following sets of recurrence types:

    𝒯^CCJ_1212 = {12G1, 1G21}
    𝒯^CCJ_fromL12G2 = {12G2, 12G1}
    𝒯^CCJ_fromO12G2 = {12G2, 1G12}
    𝒯^CCJ_Lmloop10,12G2 = {12G2, 12G1}
    𝒯^CCJ_Lmloop00,12G2 = {12G2, 12G1}
    𝒯^CCJ_Omloop10,12G2 = {12G2, 1G12}
    𝒯^CCJ_Omloop00,12G2 = {12G2, 1G12}

To obtain the sparsified recursion equations from the original recurrences, we modify each non-constant recurrence case that minimizes over splits (F1, F2) of a fragment F, so that it minimizes only over fragments F2 that are not 𝒯^CCJ_T-OD. Such non-𝒯^CCJ_T-OD fragments correspond to the entries of candidate lists. Separate candidate lists are maintained for each sparsified recursion case; a fragment is added to the candidate list for recursion case T iff it is not 𝒯^CCJ_T-OD. Pseudocode of the sparsified CCJ algorithm is shown in Appendix D.
Note that some of our CCJ recurrences can use the same candidate list, as their F2 fragments are similar.

7.3 Sparsification of the CCJ algorithm

For each recurrence X that we plan to sparsify, we introduce its corresponding sparsified version X^s. From the definition of each recurrence X of CCJ, we derive the definition of the corresponding sparsified recurrence X^s by changing every occurrence of a recurrence Y in X to the corresponding sparsified recurrence Y^s. Moreover, we introduce the following changes to the sparsified recurrences by restricting certain minimizations to run only over "candidates".

1. In case 12G2 of P_fromL:
       min_{i<d≤j} WP(i,d−1) + P_fromL(d,j,k,l)
   add the condition d ∈ cands(P_fromL, 12G2, j, k, l), which holds if
       ∄ e > d : P^s_fromL(d,j,k,l) = WP(d,e−1) + P^s_fromL(e,j,k,l).

2. In case 12G2 of P_fromO:
       min_{i<d≤j} WP(i,d−1) + P_fromO(d,j,k,l)
   add the condition d ∈ cands(P_fromO, 12G2, j, k, l), which holds if
       ∄ e > d : P^s_fromO(d,j,k,l) = WP(d,e−1) + P^s_fromO(e,j,k,l).

3. In case 1212 of P:
       min_{i<j<d<k<l} PK(i,j−1,d+1,k−1) + PK(j,d,k,l)
   add the condition d ∈ cands(P, 1212, j, k, l), which holds if
       ∄ e (j ≤ e < d) : PK^s(j,d,k,l) = PK^s(j,e,k,l) + WP(e+1,d).

4. In case 12G2 of P_L,mloop10:
       min_{i<d≤j} WB′(i,d−1) + P_L,mloop00(d,j,k,l)
   add the condition d ∈ cands(P_L,mloop10, 12G2, j, k, l), which holds if
       ∄ e : P^s_L,mloop00(d,j,k,l) = WB(d,e−1) + P^s_L,mloop00(e,j,k,l)
   and
       ∄ e : P^s_L,mloop00(d,j,k,l) = P^s_L,mloop00(d,e,k,l) + WB(e+1,j).

5. In case 12G2 of P_L,mloop00:
       min_{i<d≤j} WB(i,d−1) + P_L,mloop00(d,j,k,l)
   add the condition d ∈ cands(P_L,mloop00, 12G2, j, k, l), which holds iff d ∈ cands(P_L,mloop10, 12G2, j, k, l).

6.
In case 12G2 of P_O,mloop10:
       min_{i<d≤j} WB′(i,d−1) + P_O,mloop00(d,j,k,l)
   add the condition d ∈ cands(P_O,mloop10, 12G2, j, k, l), which holds if
       ∄ e : P^s_O,mloop00(d,j,k,l) = WB(d,e−1) + P^s_O,mloop00(e,j,k,l)
   and
       ∄ e : P^s_O,mloop00(d,j,k,l) = P^s_O,mloop00(d,j,k,e) + WB(e+1,l).

7. In case 12G2 of P_O,mloop00:
       min_{i<d≤j} WB(i,d−1) + P_O,mloop00(d,j,k,l)
   add the condition d ∈ cands(P_O,mloop00, 12G2, j, k, l), which holds iff d ∈ cands(P_O,mloop10, 12G2, j, k, l).

We note that the candidate condition

    ∄ e > d : P^s_fromL(d,j,k,l) = WP(d,e−1) + P^s_fromL(e,j,k,l)

is equivalent to

    P^s_fromL(d,j,k,l) < min_{e>d} WP(d,e−1) + P^s_fromL(e,j,k,l),

where the minimum can be computed by running only over candidates (as we will show later). In this form, the check causes no sparsification overhead, since both sides must be calculated by the algorithm anyway in order to evaluate the recurrences. The other candidate checks are performed analogously. Moreover, since the candidate conditions d ∈ cands(P_L,mloop10, 12G2, j, k, l) and d ∈ cands(P_L,mloop00, 12G2, j, k, l) are identical, these cases can share the same candidate list. Similarly, we use one candidate list for both P_O,mloop10 and P_O,mloop00.

Theorem 7.3.1 The sparsified version of each CCJ recurrence is equivalent to the non-sparsified recurrence.

Note that for the correctness proof, we rely on certain (inverse) triangle inequalities; for example, in the case of W we have ∀x < y ≤ z : W(x,z) ≤ W(x,y−1) + W(y,z), which follows from the definition of W.
Analogous inequalities hold for WP, WB, and WB′.

Proof. We show, for fragments (i,l) and (i,j,k,l), by simultaneous induction on the fragment size (respectively, l−i and j−i+l−k), that all changes in the definition of sparse CCJ (from the original to the sparsified recurrences of CCJ) are equivalent; in particular, the values of the CCJ recursions and their corresponding sparsified versions are identical for each fragment. It suffices to show the equivalence of the changed minimization cases explicitly. Moreover, in each case it suffices to show that there exists a minimum that is a candidate. By the induction hypothesis, the sparsified recursions for all smaller fragments do not have to be distinguished from the non-sparsified ones.

1. Choose the largest d, i < d ≤ j, such that
       WP(i,d−1) + P_fromL(d,j,k,l)
   is minimal. We show that d ∈ cands(P_fromL, 12G2, j, k, l). Assuming the opposite, choose e (e > d) such that
       P_fromL(d,j,k,l) = WP(d,e−1) + P_fromL(e,j,k,l).
   Now,
       WP(i,d−1) + P_fromL(d,j,k,l)
         = WP(i,d−1) + WP(d,e−1) + P_fromL(e,j,k,l)
         ≥ WP(i,e−1) + P_fromL(e,j,k,l).
   This contradicts the choice of d, since e > d and
       WP(i,e−1) + P_fromL(e,j,k,l) ≤ WP(i,d−1) + P_fromL(d,j,k,l).
   Consequently, d is a candidate.

2. Analogous to the previous case. Choose the largest d, i < d ≤ j, such that
       WP(i,d−1) + P_fromO(d,j,k,l)
   is minimal. Assume d ∉ cands(P_fromO, 12G2, j, k, l); then there is an e > d such that
       WP(i,d−1) + P_fromO(d,j,k,l) = WP(i,d−1) + WP(d,e−1) + P_fromO(e,j,k,l) ≥ WP(i,e−1) + P_fromO(e,j,k,l),
   contradicting the choice of d.

3. For fixed i < j < k < l, choose the smallest d (j < d < k) such that PK(i,j−1,d+1,k−1) + PK(j,d,k,l) is minimal. Assume (j,d,k) ∉
cands(P, 1212, i, l); then for some e it holds that
       PK(i,j−1,d+1,k−1) + PK(j,d,k,l)
         = PK(i,j−1,d+1,k−1) + PK(j,e,k,l) + WP(e+1,d)
         ≥ PK(i,j−1,e+1,k−1) + PK(j,e,k,l),      (**)
   which contradicts the choice of d. For inequality (**), note that
       PK(i,j−1,d+1,k−1) + WP(e+1,d) ≤ PK(i,j−1,e+1,k−1)
   by the definition of PK.

4. Choose the largest d such that WB′(i,d−1) + P_L,mloop00(d,j,k,l) is minimal. Assume d ∉ cands(P_L,mloop10, 12G2, j, k, l); then there are two cases:
   • There exists some e such that
         WB′(i,d−1) + P_L,mloop00(d,j,k,l)
           = WB′(i,d−1) + WB(d,e−1) + P_L,mloop00(e,j,k,l)
           ≥ WB′(i,e−1) + P_L,mloop00(e,j,k,l);
     the latter inequality holds by the definition of WB′. Contradiction.
   • There exists some e such that
         P_L,mloop00(d,j,k,l) = P_L,mloop00(d,e,k,l) + WB(e+1,j).
     In this case, case 12G1 of P_L,mloop10(i,j,k,l) yields a smaller or equal value, since
         WB′(i,d−1) + P_L,mloop00(d,j,k,l)
           = WB′(i,d−1) + P_L,mloop00(d,e,k,l) + WB(e+1,j)
           ≥ P_L,mloop10(i,e,k,l) + WB(e+1,j).

5. Choose the largest d such that WB(i,d−1) + P_L,mloop00(d,j,k,l) is minimal. Assume d ∉ cands(P_L,mloop00, 12G2, j, k, l); then there are two cases:
   • There exists some e such that
         WB(i,d−1) + P_L,mloop00(d,j,k,l)
           = WB(i,d−1) + WB(d,e−1) + P_L,mloop00(e,j,k,l)
           ≥ WB(i,e−1) + P_L,mloop00(e,j,k,l).
     Contradiction.
   • There exists some e such that
         P_L,mloop00(d,j,k,l) = P_L,mloop00(d,e,k,l) + WB(e+1,j).
     In this case, case 12G1 of P_L,mloop00(i,j,k,l) yields a smaller or equal value, since
         WB(i,d−1) + P_L,mloop00(d,j,k,l)
           = WB(i,d−1) + P_L,mloop00(d,e,k,l) + WB(e+1,j)
           ≥ P_L,mloop00(i,e,k,l) + WB(e+1,j).

6. The last two cases are analogous to the proofs above for P_L,mloop10 and P_L,mloop00, with P_O,mloop10 and P_O,mloop00 in their place. □

7.4 Space complexity analysis

In the previous section, we aimed to reduce the space complexity of the CCJ algorithm by sparsifying the 4-dimensional matrices only.
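A back-of-the-envelope comparison illustrates the scale of this saving (our illustration; the values of n and Z are hypothetical, and K stands in for the constant number of 3D slices kept in memory):

```python
def dense_entries(n):
    # one unsparsified 4D matrix over (i, j, k, l): Theta(n^4) entries
    return n ** 4

def sparse_entries(n, Z, K=30):
    # K 3D slices kept in memory, plus candidate lists of total size O(n^2 * Z)
    return K * n ** 3 + n ** 2 * Z

# For n = 400 and Z at its lower bound n (empirically Z tends to be small),
# a single matrix shrinks from 25.6 billion entries to under 2 billion.
n, Z = 400, 400
dense, sparse = dense_entries(n), sparse_entries(n, Z)
```

The smaller Z is relative to n^2, the larger the gap grows; at the worst case Z = n^2, the candidate lists alone degrade back to Θ(n^4).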
The sparse version of CCJ will be more space efficient than the original algorithm on a sequence of length n if only a small fraction of all Θ(n^4) substrings of S are OCTs. In the sparse CCJ algorithm, we reduced the space requirement of every 4-dimensional matrix of the original CCJ algorithm to one of the following cases.

1. Not saving the whole matrix, and instead calculating each value in constant time on the fly. These matrices carry a ∗ superscript.

2. Saving only a few 3-dimensional slices of the matrix. These matrices carry i, or i, i+1, or i, i+1, ..., i+MLS as superscript, where MLS is the maximum loop size (here 30). The maximum space used by these matrices is therefore O(n^3).

3. Saving only a candidate list of possible substrings of the 4-dimensional matrix, in which case the space complexity is O(n^2 × Z), where Z is the number of OCT substrings of S and n ≤ Z ≤ n^2.

Therefore, no recurrence of the CCJ algorithm uses more than O(n^2 Z) space.

7.5 Summary

In this chapter we presented theoretical results on applying the sparsification technique to the CCJ algorithm, reducing its space complexity. We provided pseudocode for the sparse version of CCJ and proved that sparse CCJ is identical to the original CCJ algorithm while using less space. We note that the same technique can also be used to reduce CCJ's time complexity, and we plan to pursue this as future work.

Chapter 8

Conclusions and future work

The main goal of this doctoral thesis is to present novel energy-based methods that alleviate the major shortcomings of methods for prediction of pseudoknotted RNA secondary structures based on the thermodynamic energy model.
These shortcomings include slow running time, poor prediction accuracy, handling only a limited class of pseudoknots, and limited opportunity for providing structural information as input to guide prediction.

To this end, we focused on developing new methods for pseudoknotted secondary structure prediction, and on comparing these new methods with existing methods, both in terms of time and memory efficiency and in terms of accuracy. We focused on methods that are based on variants of the hierarchical folding hypothesis and on minimum free energy prediction. We used the energy model of HotKnots V2.0, which has already yielded good predictions not only for the HotKnots method but also for methods based on maximum expected accuracy [83].

Since differences in accuracy may not be reliable when data sets are small, we used statistically sound methods to compare the performance of our methods with each other and with some existing methods. We proposed several methods: in particular, Iterative HFold, which is based on the "relaxed" hierarchical folding hypothesis, and CCJ, an MFE-based method for prediction of RNA pseudoknotted secondary structures. In the case of the Iterative HFold algorithm, which takes as input an RNA sequence and a pseudoknot-free secondary structure, we considered several methods of producing the best input structure. For the CCJ algorithm, we observed that its practicality is limited by its memory usage; to alleviate this problem, we used a sparsification technique to reduce CCJ's memory requirement.

In the following sections, we first provide a summary of our experiments and results, then discuss possible limitations of our evaluations, and finally propose possible directions for future research.

8.1 Accuracy measures

To compare the performance of our proposed algorithms with each other and with two of the best existing methods, we used the F-measure, the harmonic mean of sensitivity and positive predictive value (PPV).
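For concreteness, a minimal sketch (ours) of computing sensitivity, PPV, and F-measure from sets of reference and predicted base pairs:

```python
def f_measure(true_pairs, pred_pairs):
    """F-measure of a predicted structure against a reference structure,
    where each structure is a set of base pairs (i, j)."""
    true_pairs, pred_pairs = set(true_pairs), set(pred_pairs)
    tp = len(true_pairs & pred_pairs)     # correctly predicted base pairs
    if tp == 0:
        return 0.0
    sensitivity = tp / len(true_pairs)    # fraction of true pairs predicted
    ppv = tp / len(pred_pairs)            # fraction of predictions that are true
    return 2 * sensitivity * ppv / (sensitivity + ppv)
```

For example, predicting two of three reference pairs plus one spurious pair gives sensitivity = PPV = 2/3, hence F-measure 2/3.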
Instead of using the average F-measure alone, we used the bootstrap 95% percentile confidence interval of the average F-measure, which conveys more information about the distribution of accuracies. To ensure the significance of differences in accuracy, we used a two-sided permutation test with significance level p < 0.05.

8.2 Robustness comparison of HFold and Iterative HFold

To assess the accuracy of HFold and Iterative HFold when partial information about the true pseudoknot-free structure, Gbig, is available, we generated subsets of the Gbig structures for each RNA sequence in the HK-PK and HK-PK-free data sets. Since the slope of the accuracy curve is one number that captures robustness, we compared the slope of the curve of Iterative HFold's average accuracy to that of HFold on all subsets of Gbig.

On pseudoknotted structures of the HK-PK data set, when provided with ≈1% up to ≈90% of the Gbig structure as input, Iterative HFold's bootstrap 95% percentile confidence interval of average F-measures is higher than that of HFold. Iterative HFold is most successful when little information about Gbig is known, because it can add both pseudoknot-free and pseudoknotted base pairs. However, when the vast majority of the base pairs of Gbig are provided as input, HFold dominates, as it keeps the base pairs of the input structure, thereby often adding base pairs of Gsmall.

Comparing the slope of the curve of Iterative HFold's average accuracy to that of HFold, we observed that HFold's slope is steeper, making Iterative HFold more robust than HFold. For pseudoknot-free structures of the HK-PK-free data set, HFold performs better than Iterative HFold, because it often adds base pairs that do not cross those provided as part of the input, and thus are likely to be in Gbig.
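The two statistical procedures used throughout these comparisons can be sketched as follows (our simplified illustration; the thesis's exact resampling details may differ). The bootstrap resamples the per-sequence F-measures to bound the mean; the paired permutation test randomly flips the sign of per-sequence differences between two methods.

```python
import random

def bootstrap_ci(values, n_boot=10000, alpha=0.05, rng=None):
    """Bootstrap percentile confidence interval for the mean of `values`."""
    rng = rng or random.Random(0)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def paired_permutation_test(a, b, n_perm=10000, rng=None):
    """Two-sided permutation test on the mean of paired differences a[i]-b[i]:
    under the null hypothesis, each difference's sign is exchangeable."""
    rng = rng or random.Random(0)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(n_perm):
        s = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(s) / len(diffs) >= observed:
            hits += 1
    return hits / n_perm    # p-value; compare against 0.05
```

A difference is reported as significant when the returned p-value is below the chosen 0.05 level.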
8.3 Choosing the best input structure

We considered different ways of choosing the input structure for the HFold and Iterative HFold algorithms when partial information about Gbig is not available. For pseudoknotted structures, we found that using HotKnots hotspots as input is far superior to using the MFE structure produced by SimFold, for both HFold and Iterative HFold. This appears to be because the MFE structures predicted by SimFold tend to have more base pairs than the true pseudoknot-free structure Gbig, so that HFold and Iterative HFold are unlikely to add pseudoknotted base pairs to the input structure. For pseudoknot-free structures, the difference in accuracy between SimFold input and HotKnots hotspots is not significant.

We further investigated using the first 50 suboptimal structures produced by SimFold (including the MFE structure) versus HotKnots hotspots as input to both HFold and Iterative HFold. Although the bootstrap 95% percentile confidence intervals for the average F-measures of HFold and Iterative HFold with the first 50 suboptimal structures versus hotspots differ slightly, the permutation test indicates that the difference is not significant.

8.4 Accuracy comparison

We compared the accuracy of Iterative HFold with hotspots as input against that of CCJ, HotKnots V2.0 and IPknot.
The differences in accuracy of Iterative HFold, CCJ and HotKnots V2.0 are insignificant on three of our data sets, namely HK-PK (HK-PK* when comparing with CCJ), HK-PK-free and DK-pk16 (DK-pk16* when comparing with CCJ), but Iterative HFold and CCJ are significantly better than HotKnots V2.0 on the fourth, the IP-pk168 data set. The differences in accuracy of Iterative HFold, CCJ and IPknot are insignificant on two of our data sets, namely HK-PK-free and DK-pk16 (DK-pk16* when comparing with CCJ), but Iterative HFold and CCJ are significantly better than IPknot on the HK-PK (HK-PK* when comparing with CCJ) and IP-pk168 data sets.

8.5 Time and memory usage comparison

We compared the running time of Iterative HFold with that of HotKnots V2.0 and IPknot. Due to the extensive memory requirements of the CCJ algorithm, its run time is slower than that of Iterative HFold, HotKnots and IPknot. HotKnots is faster than Iterative HFold on short sequences (up to 47 bases), beyond which Iterative HFold becomes the faster of the two. Iterative HFold runs in less than 8.3 seconds on every RNA sequence in this data set, whereas HotKnots runs for over 6000 seconds (about 1.7 hours) on the longest RNA sequence. IPknot is significantly faster than both HFold and Iterative HFold; for all sequences in this data set, IPknot produces output in less than 0.8 seconds.

HotKnots' memory consumption can vary significantly from one sequence to another and is not predictable: it uses as little as 16.53 MB for an RNA of length 30 bases (LP-PK1) and as much as 91.23 GB (93419 MB) for the longest RNA sequence in the HK-PK data set. Iterative HFold's memory usage is very similar to HFold's and increases at a very low rate with the length of the RNA sequence, starting at 48.69 MB for RNA sequences of length 26 and increasing to 61.33 MB for the longest RNA sequence in this data set (400 bases long).
IPknot uses much less memory than all other methods: for the longest RNA sequence in this data set, IPknot uses less than 5.5 MB of memory, in comparison to 61.33 MB for HFold and Iterative HFold and 91.23 GB for HotKnots. CCJ, on the other hand, uses 119.72 MB of memory for the shortest sequence, and its usage grows so quickly that the software we used to measure heap usage, Valgrind, runs out of memory and is unable to report the heap usage for the RNA sequence of length 195.

8.6 Merits and shortcomings of each method

In this section we highlight the merits and shortcomings of each method based on the results presented in this work.

8.6.1 HotKnots

One advantage of CCJ over HotKnots is that CCJ guarantees the minimum free energy structure as its output; being a heuristic, HotKnots cannot provide this guarantee. One potential advantage of HFold and Iterative HFold over HotKnots is that they minimize the free energy of the possibly pseudoknotted output structure relative to the given input structure; their way of adding pseudoknotted stems is therefore better motivated energetically than that of HotKnots. HotKnots' advantage over both CCJ and Iterative HFold is its energy model and parameters. However, HotKnots' accuracy is comparable to that
IPknot, is much faster and uses lessmemory than all other methods compared in this work.8.6.3 HFoldA disadvantage of HFold over Iterative HFold and CCJ, is that if the giveninput structure is not accurate, HFold is unable to predict the correct outputstructure. However, HFold is faster and uses less memory than both IterativeHFold and CCJ, and has comparable accuracy to that of Iterative HFold andCCJ on all of our data sets.8.6.4 Iterative HFoldWhile CCJ and Iterative HFold’s performance is not significantly differenton our data sets, Iterative HFold is considerably less expensive to run thanCCJ. While the density-2 class of structures is quite general, it does notinclude some of the biologically important motifs that CCJ can handle.For example, Iterative HFold cannot handle the ABCBDEDCAFFE motif(simplified motif of the SAM-IV riboswitch).8.6.5 CCJOne advantage of CCJ over Iterative HFold is that it predicts the minimumfree energy structure for a given RNA molecule within the general class ofTGB structures. However, by predicting the structure with the minimumfree energy, the CCJ algorithm does not outperform Iterative HFold.1208.7. Comparing the folding hypotheses8.7 Comparing the folding hypothesesThrough our extensive comparisons, we aimed to investigate how a methodthat is purely based on MFE performs versus a method that strictly adheresto the hierarchical folding hypothesis and a method that is based on the“relaxed” hierarchical folding hypothesis.We observed that predicting the structure with the minimum free energydoes not seem to be directly correlated with finding the structure with thehighest accuracy among all methods.We also noticed that by following the relaxed hierarchical folding hy-pothesis, hence allowing changes to the input structure, Iterative HFoldmore often finds structures with lower energy than HFold in all of our datasets. 
However, this does not result in a significant improvement in prediction accuracy compared to HFold.

On long RNA sequences in the HK-PK-free data set, the MFE structures predicted by the CCJ algorithm were pseudoknotted, suggesting that the energy parameters encourage the addition of pseudoknots. Therefore, parameter estimation specifically for CCJ would be useful. Among HFold, CCJ and Iterative HFold, HFold adds the fewest pseudoknots in its predictions. This is mainly because hotspots are part of the MFE structure and HFold is incapable of adding both pseudoknotted and pseudoknot-free base pairs. Iterative HFold's predictions are mostly similar to those of HFold on the RNA sequences of the HK-PK-free data set. However, in a few cases, especially on longer RNA strands, Iterative HFold's predictions are also pseudoknotted.

Although Iterative HFold can generally modify the input structure, it is unable to do so in cases where the input structure is overcrowded with base pairs.

Finally, we found that, on our data sets, the relaxed hierarchical folding hypothesis, in which the given input structure only guides the prediction of the final structure and may not necessarily be part of it, is the hypothesis with the most potential for predicting RNA pseudoknotted secondary structures.

8.8 Energy model limitation

Comparing the accuracy of the minimum free energy structures with that of the maximum accuracy structures in this work, we found that, on average, the minimum free energy structure has a significantly poorer F-measure than the maximum accuracy structure. This suggests that an improved energy model for pseudoknotted structure prediction may improve the accuracy of prediction algorithms.

8.9 Data and accuracy measure limitation

To calculate F-measure we rely heavily on the correctness of the known RNA structures in the publicly available data sets.
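For concreteness, F-measure over secondary structures is typically computed on sets of base pairs as the harmonic mean of precision and recall. A minimal sketch (the pair sets below are invented for illustration):

```python
def f_measure(predicted, reference):
    """F-measure between two sets of base pairs (i, j) with i < j."""
    predicted, reference = set(predicted), set(reference)
    if not predicted or not reference:
        return 0.0
    tp = len(predicted & reference)   # correctly predicted base pairs
    precision = tp / len(predicted)   # TP / (TP + FP)
    recall = tp / len(reference)      # TP / (TP + FN)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Invented example: 3 of 4 predicted pairs occur in the reference structure.
ref = {(1, 20), (2, 19), (3, 18), (5, 10), (6, 9)}
pred = {(1, 20), (2, 19), (3, 18), (4, 17)}
print(round(f_measure(pred, ref), 3))  # precision 0.75, recall 0.6 -> 0.667
```

A mislabeled reference structure lowers both precision and recall simultaneously, which is why errors in the curated data sets directly distort this measure.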
However, we observed in some cases that the known structure contained long regions of unpaired bases while, according to our prediction methods, such a region may fold into a low energy structure, thus stabilizing the overall structure. On the other hand, we know that biological RNA molecules do not fold in isolation, which may affect the availability of these unpaired regions. In our prediction methods, however, we assumed that an RNA molecule folds in isolation, without interaction with other molecules.

8.10 Possible future directions

One possible direction for future research is estimating energy parameters for Iterative HFold and CCJ, and comparing the two hypotheses once again when each method performs at its best. Andronescu et al. achieved, on average, about 7% prediction accuracy improvement over the standard energy model for pseudoknot-free structures using their constraint generation method [9]. They further used their constraint generation program to estimate pseudoknotted energy parameters for HotKnots V2.0, and achieved an average of 11% accuracy improvement over the energy model of Dirks and Pierce [33]. We therefore believe improved prediction accuracy may be achieved by estimating the energy parameters for the Iterative HFold (i.e., HFold) and CCJ algorithms using the constraint generation program of Andronescu et al. [9–11]. One possible direction for energy parameter estimation is to use the pseudoknot-free energy parameters of Andronescu et al. [9, 10], and to focus on estimating only the pseudoknotted parameters of the energy models.

Zakov et al. [103] used machine learning techniques and a rich parameterization of the energy model and achieved about 19% accuracy improvement over the energy model of Andronescu et al. [10]. They based their structural features on the features in the standard energy model (i.e., the Turner99 energy model) but, taking into account portions of larger structural and sequential context, decomposed the score of each structural element into many fine-grained local features, resulting overall in about 70,000 parameters. Therefore, another possible direction is to explore the effect of such rich energy parameterization on the performance of our Iterative HFold and CCJ methods.

Alternatively, we could incorporate the pseudoknotted energy parameters of a better energy model, for example the energy model of Cao and Chen [18, 19], to obtain better results.

An alternative approach for computationally producing input structures for Iterative HFold, which may be worth exploring in future research, would be to use the most highly probable base pairs, as calculated using the partition function [61]. Even better may be to calculate base pair probabilities for base pairs of pseudoknotted RNA structures; however, this requires Θ(n5) time. Since HFold finds the minimum free energy structure in O(n3) time, conditional on the given input structure, another direction for future research is to investigate ways to develop an O(n3)-time partition function version of HFold that can produce pseudoknotted base pair probabilities conditional on the given input structure.

Following the recent work of Aghaeepour and Hoos [3], one can investigate ensemble-based prediction of RNA pseudoknotted secondary structures. Aghaeepour and Hoos introduced a generic RNA secondary structure prediction procedure, called AveRNA, that takes an RNA sequence as input and produces one final structure by using an ensemble of existing RNA secondary structure prediction methods. AveRNA first runs its underlying methods to obtain a set of structures, then combines these structures on a per-base-pair basis to produce the final structure. Aghaeepour and Hoos evaluated their method for prediction of RNA pseudoknot-free secondary structures and achieved about 1.3% accuracy improvement on the RNA STRAND data set [7] over the best energy model of Andronescu et al. [10], namely BL-FR.
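AveRNA's per-base-pair combination can be sketched roughly as a vote over the pairs proposed by the underlying predictors. This is only a simplified illustration (AveRNA's actual combination procedure and its tuned per-method weighting are more involved), and the three structure sets below are invented.

```python
from collections import Counter

def combine_per_pair(structures, min_votes):
    """Keep base pairs proposed by at least min_votes predictors,
    resolving base-sharing conflicts in favour of higher vote counts."""
    votes = Counter(pair for s in structures for pair in s)
    chosen, used = [], set()
    for (i, j), v in votes.most_common():
        if v >= min_votes and i not in used and j not in used:
            chosen.append((i, j))
            used.update((i, j))
    return sorted(chosen)

# Invented predictions from three hypothetical methods.
s1 = {(1, 12), (2, 11), (3, 10)}
s2 = {(1, 12), (2, 11), (4, 9)}
s3 = {(1, 12), (3, 10), (4, 9)}
print(combine_per_pair([s1, s2, s3], min_votes=2))
```

A pseudoknotted ensemble member would contribute crossing pairs to the vote in exactly the same way, which is one reason mixing pseudoknot-free and pseudoknotted predictors in such an ensemble is plausible.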
Currently, the underlying ensemble of AveRNA consists of RNA pseudoknot-free secondary structure prediction methods. It would be worth exploring different ensembles of RNA pseudoknotted secondary structure prediction methods (including Iterative HFold and CCJ), and possibly a mixture of both pseudoknot-free and pseudoknotted prediction methods, and comparing the performance of AveRNA with that of the best existing methods on a large data set of RNA structures such as RNA STRAND.

Bibliography

[1] Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature, 447(7146):799–816, June 2007.

[2] Nimo M. Abdi and Kurt Fredrick. Contribution of 16S rRNA nucleotides forming the 30S subunit A and P sites to translation in Escherichia coli. RNA, 11(11):1624–1632, November 2005.

[3] Nima Aghaeepour and Holger Hoos. Ensemble-based prediction of RNA secondary structures. BMC Bioinformatics, 14(1):139+, April 2013.

[4] Cagri Aksay, Raheleh Salari, Emre Karakoc, Can Alkan, and S. Cenk Sahinalp. taveRNA: a web suite for RNA algorithms and applications. Nucleic Acids Research, 35(Web Server issue), July 2007.

[5] T. Akutsu. Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Disc. App. Math., 104(1-3):45–62, 2000.

[6] S. L. Alam, J. F. Atkins, and R. F. Gesteland. Programmed ribosomal frameshifting: Much ado about knotting! PNAS, 96(25):14177–14179, 1999.

[7] Mirela Andronescu, Vera Bereg, Holger H. Hoos, and Anne Condon. RNA STRAND: The RNA secondary structure and statistical analysis database. BMC Bioinformatics, 9(1):340+, August 2008.

[8] Mirela Andronescu, Zhi Chuan, and Anne Condon. Secondary structure prediction of interacting RNA molecules. Journal of Molecular Biology, 345(5):987–1001, February 2005.

[9] Mirela Andronescu, Anne Condon, Holger H. Hoos, David H. Mathews, and Kevin P. Murphy. Efficient parameter estimation for RNA secondary structure prediction.
Bioinformatics, 23(13):i19–i28, July 2007.

[10] Mirela Andronescu, Anne Condon, Holger H. Hoos, David H. Mathews, and Kevin P. Murphy. Computational approaches for RNA energy parameter estimation. RNA, 16(12):2304–2318, December 2010.

[11] Mirela S. Andronescu, Cristina Pop, and Anne E. Condon. Improved free energy parameters for RNA pseudoknotted secondary structure prediction. RNA (New York, N.Y.), 16(1):26–42, January 2010.

[12] Rolf Backofen, Dekel Tsur, Shay Zakov, and Michal Ziv-Ukelson. Sparse RNA folding: Time and space efficient algorithms. Journal of Discrete Algorithms, 9(1):12–31, March 2011.

[13] Maximillian H. Bailor, Xiaoyan Sun, and Hashim M. Al-Hashimi. Topology links RNA secondary structure with global conformation, dynamics, and adaptation. Science, 327(5962):202–206, January 2010.

[14] Mini Balakrishnan, Philip J. Fay, and Robert A. Bambara. The kissing hairpin sequence promotes recombination within the HIV-1 5' leader region. J. Biol. Chem., 276(39):36482–36492, September 2001.

[15] Stanislav Bellaousov and David H. Mathews. ProbKnot: fast prediction of RNA secondary structure including pseudoknots. RNA (New York, N.Y.), 16(10):1870–1880, October 2010.

[16] Stephan Bernhart, Ivo Hofacker, Sebastian Will, Andreas Gruber, and Peter Stadler. RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics, 9(1):474+, November 2008.

[17] Deepika Calidas, Hiram Lyon, and Gloria M. Culver. The N-terminal extension of S12 influences small ribosomal subunit assembly in Escherichia coli. RNA, January 2014.

[18] Song Cao and Shi-Jie Chen. Predicting RNA pseudoknot folding thermodynamics. Nucleic Acids Research, 34(9):2634–2652, 2006.

[19] Song Cao and Shi-Jie Chen. Predicting structures and stabilities for h-type pseudoknots with interhelix loops. RNA, 15(4):696–706, April 2009.

[20] Song Cao and Shi-Jie Chen. Physics-Based De Novo Prediction of RNA 3D Structures. J. Phys. Chem.
B, 115(14):4216–4226, March 2011.

[21] Kung-Yao Chang and Ignacio Tinoco. The structure of an RNA kissing hairpin complex of the HIV TAR hairpin loop and its complement. Journal of Molecular Biology, 269(1):52–66, May 1997.

[22] Ho-Lin L. Chen, Anne Condon, and Hosna Jabbari. An O(n5) algorithm for MFE prediction of kissing hairpins and 4-chains in nucleic acids. Journal of Computational Biology, 16(6):803–815, June 2009.

[23] Samuel S. Cho, David L. Pincus, and D. Thirumalai. Assembly mechanisms of RNA pseudoknots are determined by the stabilities of constituent secondary structures. Proceedings of the National Academy of Sciences, 106(41):17349–17354, October 2009.

[24] A. Condon and H. Jabbari. Computational prediction of nucleic acid secondary structure: Methods, applications, and challenges. Theoretical Computer Science, 410(4-5):294–301, February 2009.

[25] The FANTOM Consortium, P. Carninci, T. Kasukawa, S. Katayama, J. Gough, M. C. Frith, N. Maeda, R. Oyama, T. Ravasi, B. Lenhard, C. Wells, R. Kodzius, K. Shimokawa, V. B. Bajic, S. E. Brenner, S. Batalov, A. R. R. Forrest, M. Zavolan, M. J. Davis, L. G. Wilming, V. Aidinis, J. E. Allen, A. Ambesi-Impiombato, R. Apweiler, R. N. Aturaliya, T. L. Bailey, M. Bansal, L. Baxter, K. W. Beisel, T. Bersano, H. Bono, A. M. Chalk, K. P. Chiu, V. Choudhary, A. Christoffels, D. R. Clutterbuck, M. L. Crowe, E. Dalla, B. P. Dalrymple, B. de Bono, G. Della Gatta, D. di Bernardo, T. Down, P. Engstrom, M. Fagiolini, G. Faulkner, C. F. Fletcher, T. Fukushima, M. Furuno, S. Futaki, M. Gariboldi, P. Georgii-Hemming, T. R. Gingeras, T. Gojobori, R. E. Green, S. Gustincich, M. Harbers, Y. Hayashi, T. K. Hensch, N. Hirokawa, D. Hill, L. Huminiecki, M. Iacono, K. Ikeo, A. Iwama, T. Ishikawa, M. Jakt, A. Kanapin, M. Katoh, Y. Kawasawa, J. Kelso, H. Kitamura, H. Kitano, G. Kollias, S. P. T. Krishnan, A. Kruger, S. K. Kummerfeld, I. V. Kurochkin, L. F. Lareau, D. Lazarevic, L.
Lipovich, J. Liu, S. Liuni, S. McWilliam, Madan M. Babu, M. Madera, L. Marchionni, H. Matsuda, S. Matsuzawa, H. Miki, F. Mignone, S. Miyake, K. Morris, S. Mottagui-Tabar, N. Mulder, N. Nakano, H. Nakauchi, P. Ng, R. Nilsson, S. Nishiguchi, S. Nishikawa, F. Nori, O. Ohara, Y. Okazaki, V. Orlando, K. C. Pang, W. J. Pavan, G. Pavesi, G. Pesole, N. Petrovsky, S. Piazza, J. Reed, J. F. Reid, B. Z. Ring, M. Ringwald, B. Rost, Y. Ruan, S. L. Salzberg, A. Sandelin, C. Schneider, C. Schönbach, K. Sekiguchi, C. A. M. Semple, S. Seno, L. Sessa, Y. Sheng, Y. Shibata, H. Shimada, K. Shimada, D. Silva, B. Sinclair, S. Sperling, E. Stupka, K. Sugiura, R. Sultana, Y. Takenaka, K. Taki, K. Tammoja, S. L. Tan, S. Tang, M. S. Taylor, J. Tegner, S. A. Teichmann, H. R. Ueda, E. van Nimwegen, R. Verardo, C. L. Wei, K. Yagi, H. Yamanishi, E. Zabarovsky, S. Zhu, A. Zimmer, W. Hide, C. Bult, S. M. Grimmond, R. D. Teasdale, E. T. Liu, V. Brusic, J. Quackenbush, C. Wahlestedt, J. S. Mattick, D. A. Hume, RIKEN Genome Exploration Research Group, Genome Science Group Genome Network Project Core Group, C. Kai, D. Sasaki, Y. Tomaru, S. Fukuda, M. Kanamori-Katayama, M. Suzuki, J. Aoki, T. Arakawa, J. Iida, K. Imamura, M. Itoh, T. Kato, H. Kawaji, N. Kawagashira, T. Kawashima, M. Kojima, S. Kondo, H. Konno, K. Nakano, N. Ninomiya, T. Nishio, M. Okada, C. Plessy, K. Shibata, T. Shiraki, S. Suzuki, M. Tagami, K. Waki, A. Watahiki, Y. Okamura-Oho, H. Suzuki, J. Kawai, and Y. Hayashizaki. The transcriptional landscape of the mammalian genome. Science, 309(5740):1559–1563, September 2005.

[26] Kévin Darty, Alain Denise, and Yann Ponty. VARNA: Interactive drawing and editing of the RNA secondary structure. Bioinformatics, 25(15):1974–1975, August 2009.

[27] Katherine E. Deigan, Tian W. Li, David H. Mathews, and Kevin M. Weeks. Accurate SHAPE-directed RNA structure determination. Proceedings of the National Academy of Sciences, 106(1):97–102, January 2009.

[28] B. A. L. M. Deiman and C. W. A. Pleij.
Pseudoknots: A vital feature in viral RNA. Seminars in Virol., 8(3):166–175, 1997.

[29] C. Dennis. The brave new world of RNA. Nature, 418(6894):122–124, July 2002.

[30] Svetlana Deryusheva and Joseph G. Gall. Novel small Cajal-body-specific RNAs identified in Drosophila: probing guide RNA function. RNA, 19(12):1802–1814, December 2013.

[31] Feng Ding, Shantanu Sharma, Poornima Chalasani, Vadim V. Demidov, Natalia E. Broude, and Nikolay V. Dokholyan. Ab initio RNA folding by discrete molecular dynamics: From structure prediction to folding mechanisms. RNA, 14(6):1164–1173, June 2008.

[32] Ye Ding, Chi Yu, and Charles E. Lawrence. Sfold web server for statistical folding and rational design of nucleic acids. Nucleic Acids Research, 32(Web Server issue), July 2004.

[33] R. M. Dirks and N. A. Pierce. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem., 24(13):1664–1677, October 2003.

[34] Robert M. Dirks and Niles A. Pierce. Triggered amplification by hybridization chain reaction. Proceedings of the National Academy of Sciences of the United States of America, 101(43):15275–15278, October 2004.

[35] Chuong B. Do, Daniel A. Woods, and Serafim Batzoglou. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics, 22(14):e90–e98, July 2006.

[36] Tom Fawcett. An introduction to ROC analysis. Pattern Recognition Letters, 27(8):861–874, June 2006.

[37] Brice Felden. RNA structure: experimental analysis. Current Opinion in Microbiology, 10(3):286–291, June 2007.

[38] David P. Gardner, Pengyu Ren, Stuart Ozer, and Robin R. Gutell. Statistical potentials for hairpin and internal loops improve the accuracy of the predicted RNA structure. Journal of Molecular Biology, 413(2):473–483, October 2011.

[39] Wolfgang Gerlach and Robert Giegerich. GUUGle: a utility for fast exact matching under RNA complementary rules including GU base pairing. Bioinformatics, 22(6):762–764, March 2006.

[40] Christine E.
Hajdin, Stanislav Bellaousov, Wayne Huggins, Christopher W. Leonard, David H. Mathews, and Kevin M. Weeks. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots. Proceedings of the National Academy of Sciences of the United States of America, 110(14):5498–5503, April 2013.

[41] Monir Hajiaghayi, Anne Condon, and Holger Hoos. Analysis of energy-based algorithms for RNA secondary structure prediction. BMC Bioinformatics, 13(1):22+, February 2012.

[42] Benjamin J. Hale, Cai-Xia Yang, and Jason W. Ross. Small RNA regulation of reproductive function. Mol. Reprod. Dev., 81(2):148–159, February 2014.

[43] Tiffany L. Halo, Kaylin M. McMahon, Nicholas L. Angeloni, Yilin Xu, Wei Wang, Alyssa B. Chinen, Dmitry Malin, Elena Strekalova, Vincent L. Cryns, Chonghui Cheng, Chad A. Mirkin, and C. Shad Thaxton. NanoFlares for the detection, isolation, and culture of live tumor cells from human blood. Proceedings of the National Academy of Sciences, pages 201418637+, November 2014.

[44] Michiaki Hamada, Kengo Sato, and Kiyoshi Asai. Improving the accuracy of predicting secondary structure for aligned RNA sequences. Nucleic Acids Research, 39(2):393–402, January 2011.

[45] Arif O. Harmanci, Gaurav Sharma, and David H. Mathews. TurboFold: iterative probabilistic estimation of secondary structures for multiple RNA sequences. BMC Bioinformatics, 12:108+, April 2011.

[46] Y. He, T. Ye, M. Su, C. Zhang, A. Ribbe, W. Jiang, and C. Mao. Hierarchical self-assembly of DNA into symmetric supramolecular polyhedra. Nature, 452:198–201, 2008.

[47] T. Hesterberg, S. Monaghan, D. S. Moore, A. Cipson, and R. Epstein. Bootstrap methods and permutation tests, chapter 18. W. H. Freeman and Company, New York, 2005.

[48] I. L. Hofacker, W. Fontana, P. F. Stadler, L. S. Bonhoeffer, M. Tacker, and P. Schuster. Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie / Chemical Monthly, 125(2):167–188, February 1994.

[49] Ivo L. Hofacker.
RNA secondary structure analysis using the Vienna RNA package. Current Protocols in Bioinformatics, Chapter 12, February 2004.

[50] Christine E. Holt and Erin M. Schuman. The central dogma decentralized: New perspectives on RNA function and local translation in neurons. Neuron, 80(3):648–657, October 2013.

[51] Chun-Hsiang Huang, Chin L. Lu, and Hsien-Tai Chiu. A heuristic approach for detecting RNA h-type pseudoknots. Bioinformatics, 21(17):3501–3508, September 2005.

[52] H. Jabbari, A. Condon, and S. Zhao. Novel and efficient RNA secondary structure prediction using hierarchical folding. J Comput Biol, 15(2):139–163, March 2008.

[53] Hosna Jabbari and Anne Condon. A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures. BMC Bioinformatics, 15(1):147+, 2014.

[54] G. F. Joyce. Directed evolution of nucleic acid enzymes. Annu Rev Biochem, 73:791–836, 2004.

[55] Bjarne Knudsen and Jotun Hein. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Research, 31(13):3423–3428, July 2003.

[56] K. Lee, S. Varma, J. Santalucia, and P. R. Cunningham. In vivo determination of RNA structure-function relationships: analysis of the 790 loop in ribosomal RNA. J. Mol. Biol., 269(5):732–743, 1997.

[57] Zhi John J. Lu, Jason W. Gloor, and David H. Mathews. Improved RNA secondary structure prediction by maximizing expected pair accuracy. RNA (New York, N.Y.), 15(10):1805–1813, October 2009.

[58] R. B. Lyngsø. Complexity of pseudoknot prediction in simple models. In Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella, editors, ICALP, volume 3142 of Lecture Notes in Computer Science, pages 919–931. Springer, 2004.

[59] R. B. Lyngsø and C. N. Pedersen. RNA pseudoknot prediction in energy-based models. J. Comput. Biol., 7(3-4):409–427, 2000.

[60] Thomas J. Macke, David J. Ecker, Robin R. Gutell, Daniel Gautheret, David A.
Case, and Rangarajan Sampath. RNAMotif, an RNA secondary structure definition and search algorithm. Nucleic Acids Research, 29(22):4724–4735, November 2001.

[61] D. H. Mathews. Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization. RNA, 10(8):1178–1190, 2004.

[62] D. H. Mathews. Predicting RNA secondary structure by free energy minimization. Theor. Chem. Acc.: Theory, Computation, and Modeling (Theoretica Chimica Acta), pages 1–9, 2006.

[63] D. H. Mathews, J. Sabina, M. Zuker, and D. H. Turner. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288(5):911–940, 1999.

[64] David H. Mathews, Matthew D. Disney, Jessica L. Childs, Susan J. Schroeder, Michael Zuker, and Douglas H. Turner. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proceedings of the National Academy of Sciences of the United States of America, 101(19):7287–7292, May 2004.

[65] John S. Mattick and Igor V. Makunin. Non-coding RNA. Human Molecular Genetics, 15(suppl 1):R17–R29, April 2006.

[66] W. J. Melchers, J. G. Hoenderop, H. J. Bruins Slot, C. W. Pleij, E. V. Pilipenko, V. I. Agol, and J. M. Galama. Kissing of the two predominant hairpin loops in the coxsackie B virus 3' untranslated region is the essential structural feature of the origin of replication required for negative-strand RNA synthesis. J. Virol., 71(1):686–696, 1997.

[67] Mathias Möhl, Raheleh Salari, Sebastian Will, Rolf Backofen, and S. Cenk Sahinalp. Sparsification of RNA structure prediction including pseudoknots. Algorithms for Molecular Biology, 5(1):39+, December 2010.

[68] Nicholas Nethercote and Julian Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Not., 42(6):89–100, June 2007.

[69] R. Nussinov and A. B. Jacobson.
Fast algorithm for predicting the secondary structure of single-stranded RNA. Proceedings of the National Academy of Sciences of the United States of America, 77(11):6309–6313, November 1980.

[70] Aleksey Y. Ogurtsov, Svetlana A. Shabalina, Alexey S. Kondrashov, and Mikhail A. Roytberg. Analysis of internal loops within the RNA secondary structure in almost quadratic time. Bioinformatics, 22(11):1317–1324, June 2006.

[71] Marc Parisien and Francois Major. The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature, 452(7183):51–55, March 2008.

[72] Jeff R. Proctor and Irmtraud M. Meyer. CoFold: an RNA secondary structure prediction method that takes co-transcriptional folding into account. Nucleic Acids Research, 41(9):e102, March 2013.

[73] Tomasz Puton, Lukasz P. Kozlowski, Kristian M. Rother, and Janusz M. Bujnicki. CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction. Nucleic Acids Research, 41(7):4307–4323, April 2013.

[74] B. Rastegari and A. Condon. Parsing nucleic acid pseudoknotted secondary structure: algorithm and applications. J Comput Biol, 14(1):16–32, 2007.

[75] J. Reeder and R. Giegerich. Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics. BMC Bioinformatics, 5, 2004.

[76] Jihong Ren, Baharak Rastegari, Anne Condon, and Holger H. Hoos. HotKnots: Heuristic prediction of RNA secondary structures including pseudoknots. RNA, 11(10):1494–1504, October 2005.

[77] E. Rivas and S. R. Eddy. A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol., 285(5):2053–2068, 1999.

[78] P. W. Rothemund, N. Papadakis, and E. Winfree. Algorithmic self-assembly of DNA Sierpinski triangles. PLoS Biology, 2(12):e431, 2004.

[79] Paul W. K. Rothemund. Folding DNA to create nanoscale shapes and patterns. Nature, 440(7082):297–302, March 2006.

[80] Jianhua Ruan, Gary D. Stormo, and Weixiong Zhang.
An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots. Bioinformatics, 20(1):58–66, January 2004.

[81] Raheleh Salari, Mathias Möhl, Sebastian Will, S. Sahinalp, and Rolf Backofen. Time and Space Efficient RNA-RNA Interaction Prediction via Sparse Folding. In Bonnie Berger, editor, Research in Computational Molecular Biology, volume 6044 of Lecture Notes in Computer Science, chapter 31, pages 473–490. Springer Berlin / Heidelberg, Berlin, Heidelberg, 2010.

[82] Ashesh A. Saraiya, Tek N. Lamichhane, Christine S. Chow, John SantaLucia, and Philip R. Cunningham. Identification and role of functionally important motifs in the 970 loop of Escherichia coli 16S ribosomal RNA. Journal of Molecular Biology, 376(3):645–657, February 2008.

[83] Kengo Sato, Yuki Kato, Michiaki Hamada, Tatsuya Akutsu, and Kiyoshi Asai. IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming. Bioinformatics, 27(13):i85–i93, July 2011.

[84] Stefan E. Seemann, Jan Gorodkin, and Rolf Backofen. Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments. Nucleic Acids Research, 36(20):6355–6362, November 2008.

[85] F. C. Simmel and W. U. Dittmer. DNA nanodevices. Small, 1(3):284–299, 2005.

[86] Jana Sperschneider and Amitava Datta. KnotSeeker: Heuristic pseudoknot detection in long RNA sequences. RNA, 14(4):630–640, April 2008.

[87] Jana Sperschneider and Amitava Datta. DotKnot: pseudoknot prediction using the probability dot plot under a refined energy model. Nucleic Acids Research, 38(7):e103+, April 2010.

[88] Jana Sperschneider, Amitava Datta, and Michael J. Wise. Heuristic RNA pseudoknot prediction including intramolecular kissing hairpins. RNA, 17(1):27–38, January 2011.

[89] Jana Sperschneider, Amitava Datta, and Michael J. Wise. Predicting pseudoknotted structures across two RNA sequences.
Bioinformatics (Oxford, England), 28(23):3058–3065, December 2012.

[90] Junilda Spirollari, Jason T. Wang, Kaizhong Zhang, Vivian Bellofatto, Yongkyu Park, and Bruce A. Shapiro. Predicting consensus structures for RNA alignments via pseudo-energy minimization. Bioinformatics and Biology Insights, 3:51–69, 2009.

[91] D. W. Staple and S. E. Butcher. Pseudoknots: RNA structures with diverse functions. PLoS Biol., 3(6), 2005.

[92] Peter Steffen, Björn Voß, Marc Rehmsmeier, Jens Reeder, and Robert Giegerich. RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics, 22(4):500–503, February 2006.

[93] R Development Core Team. The R project for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2011.

[94] I. Tinoco and C. Bustamante. How RNA folds. J. Mol. Biol., 293(2):271–281, 1999.

[95] Y. Uemura, A. Hasegawa, S. Kobayashi, and T. Yokomori. Tree adjoining grammars for RNA structure prediction. Theor. Comput. Sci., 210(2):277–303, 1999.

[96] F. H. D. van Batenburg, A. P. Gultyaev, and C. W. A. Pleij. PseudoBase: structural information on RNA pseudoknots. Nucleic Acids Research, 29(1):194–195, January 2001.

[97] H. Varian. Bootstrap tutorial. Mathematica Journal, 9(4):768–775, 2005.

[98] Z. Weinberg, E. E. Regulski, M. C. Hammond, J. E. Barrick, Z. Yao, W. L. Ruzzo, and R. B. Breaker. The aptamer core of SAM-IV riboswitches mimics the ligand-binding site of SAM-I riboswitches. RNA, 14:822–828, 2008.

[99] Ydo Wexler, Chaya Zilberstein, and Michal Ziv-Ukelson. A study of accessible motifs and RNA folding complexity. Journal of Computational Biology, 14(6):856–872, 2007.

[100] Kevin A. Wilkinson, Edward J. Merino, and Kevin M. Weeks. RNA SHAPE chemistry reveals nonhierarchical interactions dominate equilibrium structural transitions in tRNAAsp transcripts. Journal of the American Chemical Society, 127(13):4659–4667, April 2005.

[101] Sebastian Will, Tejal Joshi, Ivo L. Hofacker, Peter F.
Stadler, and Rolf Backofen. LocARNA-P: Accurate boundary prediction and improved detection of structural RNAs. RNA, 18(5):900–914, May 2012.

[102] Zhenjiang Xu and David H. Mathews. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences. Bioinformatics, 27(5):626–632, March 2011.

[103] Shay Zakov, Yoav Goldberg, Michael Elhadad, and Michal Ziv-Ukelson. Rich parameterization improves RNA structure prediction. Journal of Computational Biology, 18(11):1525–1542, November 2011.

[104] Jian Zhang, Yu-Jie Zhang, and Wei Wang. An RNA Base Discrete State Model toward Tertiary Structure Prediction. Chinese Physics Letters, 27(11):118702+, November 2010.

[105] M. Zuker. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research, 31:3406–3415, 2003.

Appendix A
IPknot detailed performance

Among all the different versions of IPknot we tested, we found that all but the γ1 = 1, γ2 = 1 setting produce similar confidence intervals for all but the HK-PK-free data set, for which γ1 = 4 and γ2 = 8 produces the best result (see below). While running parameter refinement with one iteration improved the confidence intervals on the HK-PK data set, it did not result in any improvement in accuracy on the rest of our data sets, as in many cases IPknot failed to produce results.
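The bootstrap 95% percentile confidence intervals reported throughout this thesis follow a standard recipe [47, 97]: resample the per-sequence F-measures with replacement, record the mean of each resample, and take the 2.5th and 97.5th percentiles of those means. A minimal sketch (the score values below are invented):

```python
import random

def bootstrap_ci(scores, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    n = len(scores)
    # Mean of each bootstrap resample (drawn with replacement), sorted.
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Invented per-sequence F-measures (percent).
scores = [62.0, 71.5, 55.0, 80.2, 66.3, 74.1, 59.8, 68.0]
lo, hi = bootstrap_ci(scores)
print(round(lo, 2), round(hi, 2))
```

With enough resamples the interval endpoints stabilize; intervals for two methods that overlap heavily, as in the tables below, indicate that the observed accuracy difference may not be significant.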
We note that our results on the IP-pk168 data setwith different weight parameters perfectly match the results of Sato et al.[83].Table A.1: Comparison of bootstrap 95% percentile confidence interval ofaverage F-measure of IPknot with different default settings and no refine-ment.input γ1 = 1, γ1 = 2, γ1 = 2, γ1 = 4data set γ2 = 1 γ2 = 4 γ2 = 16 γ2 = 8HK - PKed (40.70, 54.82) (53.27, 65.91) (57.31, 69.41) (54.67, 66.07)HK - PK free (38.04, 40.42) (39.35, 41.54) (38.88, 40.97) (77.31, 81.72)pk168 (47.00, 56.66) (56.86, 64.81) (60.73, 68.87) (58.17, 66.16)DK-pk16 (64.90, 74.50) (66.22, 77.80) (66.70, 79.15) (65.44, 75.74)Table A.2: Comparison of bootstrap 95% percentile confidence interval ofaverage F-measure of IPknot with different default settings and 1 iterationrefinement.input γ1 = 1, γ1 = 2, γ1 = 2, γ1 = 4,data set γ2 = 1 γ2 = 4 γ2 = 16 γ2 = 8HK - PKed (50.63, 66.80) (57.88, 71.53) (59.06, 72.02) (56.52, 69.26)HK - PK free (32.52, 36.34) (37.42, 39.95) (36.19, 38.60) (37.18, 39.49)pk168 (58.53, 68.93) (66.31, 75.39) (65.98, 75.00) (64.10, 73.20)DK-pk16 (51.57, 79.59) (55.99, 78.76) (55.22, 77.84) (62.16, 76.88)136Appendix BIterative HFold’s detailedperformance comparisonB.1 Robustness comparisonTables B.1, B.2 and B.3 provide complete presenting robustness comparisonof HFold-PKonly, HFold, and Iterative HFold, when provided with differentpercentage of Gbig information of the HK-PK and HK-PK-free data sets.Note that the reported interval in each case is the bootstrap 95% confidenceinterval for F-measure of the 100 structures with 1 ≤ α ≤ 99 percent infor-mation about the Gbig structure. The 100% information is the bootstrap95% confidence interval for F-measure, where input structure is Gbig.B.2 Comparison of correlation between falsepositive ratesTable B.4 provides Pearson correlation coefficient of HFold and IterativeHFold with HotKnots hotspots, SimFold MFE and SimFold first 50 subop-timal structures. 
Here FP represents the false positive rate and F represents the F-measure as the accuracy measure. We used R statistical tools to calculate the correlation coefficients. We used all 20 hotspots and all 50 suboptimal structures for each RNA sequence in the HK-PK data set for this calculation.

As seen in Table B.4, there is a negative correlation between false positive rates and prediction accuracy in every case except for the hotspots false positive rate in Iterative HFold, in which case the correlation is close to 0, indicating that there is little correlation. Not surprisingly, the negative correlation is quite strong for the MFE structures, because there are many more false positives in these structures.

Table B.1: Comparison of robustness of different versions of HFold on the HK-PK data set.

% Gbig info.   PKonly             HFold              Iterative
1%             (2.35%, 3.04%)     (47.58%, 59.19%)   (47.95%, 59.62%)
5%             (11.09%, 13.42%)   (52.50%, 62.24%)   (54.51%, 64.56%)
10%            (20.41%, 24.43%)   (56.79%, 65.01%)   (60.64%, 69.68%)
15%            (27.97%, 32.95%)   (60.18%, 67.36%)   (65.08%, 73.36%)
20%            (35.19%, 40.57%)   (62.77%, 69.19%)   (68.51%, 76.54%)
25%            (41.07%, 46.93%)   (64.50%, 70.36%)   (70.77%, 78.79%)
30%            (46.53%, 52.78%)   (65.87%, 71.54%)   (72.54%, 80.85%)
35%            (51.60%, 57.78%)   (67.65%, 72.96%)   (74.18%, 82.30%)
40%            (55.68%, 61.87%)   (68.72%, 73.76%)   (75.30%, 83.27%)
45%            (59.68%, 65.82%)   (69.39%, 74.51%)   (75.85%, 83.74%)
50%            (63.32%, 69.33%)   (70.25%, 75.17%)   (76.50%, 84.64%)
55%            (66.71%, 72.65%)   (71.18%, 75.96%)   (77.06%, 85.12%)
60%            (69.81%, 75.65%)   (72.04%, 76.75%)   (77.71%, 85.53%)
65%            (72.73%, 78.66%)   (73.10%, 77.78%)   (78.03%, 85.93%)
70%            (75.28%, 81.25%)   (74.21%, 78.89%)   (78.12%, 86.06%)
75%            (78.07%, 83.87%)   (75.76%, 80.43%)   (78.41%, 86.34%)
80%            (80.57%, 86.39%)   (77.19%, 81.82%)   (78.60%, 86.54%)
85%            (82.73%, 88.52%)   (78.63%, 83.41%)   (78.70%, 86.68%)
90%            (84.91%, 90.63%)   (80.62%, 85.73%)   (79.01%, 86.97%)
95%            (87.19%, 92.89%)   (83.18%, 88.73%)   (79.19%, 87.09%)
99%            (88.70%, 94.42%)   (85.17%, 91.21%)   (79.31%, 87.34%)
100%           (89.03%, 94.83%)   (85.74%, 91.87%)   (79.36%, 87.41%)

Table B.2: Comparison of robustness of different versions of HFold on the HK-PK-free data set.

% Gbig info.   PKonly             HFold              Iterative
1%             (1.43%, 1.51%)     (79.76%, 84.32%)   (79.14%, 83.84%)
5%             (7.01%, 7.21%)     (83.26%, 86.76%)   (80.98%, 85.14%)
10%            (14.08%, 14.40%)   (86.13%, 88.99%)   (82.86%, 86.43%)
15%            (21.12%, 21.56%)   (88.37%, 90.78%)   (84.59%, 87.91%)
20%            (28.02%, 28.60%)   (89.97%, 92.01%)   (85.67%, 88.86%)
25%            (34.52%, 35.17%)   (91.10%, 92.97%)   (86.80%, 89.73%)
30%            (40.70%, 41.44%)   (91.98%, 93.68%)   (87.63%, 90.45%)
35%            (46.70%, 47.46%)   (92.78%, 94.34%)   (88.52%, 91.20%)
40%            (52.20%, 52.98%)   (93.34%, 94.85%)   (89.01%, 91.71%)
45%            (57.39%, 58.19%)   (93.82%, 95.18%)   (89.63%, 92.22%)
50%            (62.37%, 63.16%)   (94.31%, 95.62%)   (89.99%, 92.58%)
55%            (66.87%, 67.66%)   (94.59%, 95.87%)   (90.32%, 92.83%)
60%            (71.29%, 72.05%)   (94.90%, 96.17%)   (90.77%, 93.26%)
65%            (75.45%, 76.17%)   (95.17%, 96.37%)   (91.08%, 93.54%)
70%            (79.28%, 80.00%)   (95.40%, 96.59%)   (91.44%, 93.85%)
75%            (83.01%, 83.71%)   (95.57%, 96.74%)   (91.76%, 94.16%)
80%            (86.56%, 87.21%)   (95.75%, 96.88%)   (92.02%, 94.37%)
85%            (89.86%, 90.50%)   (95.87%, 97.00%)   (92.60%, 94.82%)
90%            (93.11%, 93.70%)   (95.98%, 97.12%)   (92.99%, 95.22%)
95%            (96.08%, 96.68%)   (96.06%, 97.19%)   (93.46%, 95.64%)
99%            (98.41%, 98.99%)   (96.09%, 97.23%)   (93.76%, 95.98%)
100%           (98.96%, 99.54%)   (96.11%, 97.24%)   (93.85%, 96.07%)

Table B.3: Comparison of robustness of different versions of HFold over all structures of the HK-PK and HK-PK-free data sets.
% Gbig info.   PKonly             HFold              Iterative
1%             (1.63%, 1.81%)     (73.76%, 78.56%)   (73.33%, 78.22%)
5%             (7.87%, 8.50%)     (77.22%, 81.32%)   (76.05%, 80.32%)
10%            (15.41%, 16.48%)   (80.33%, 83.80%)   (78.75%, 82.51%)
15%            (22.61%, 23.89%)   (82.72%, 85.81%)   (81.05%, 84.44%)
20%            (29.61%, 31.03%)   (84.41%, 87.26%)   (82.64%, 85.85%)
25%            (36.01%, 37.50%)   (85.58%, 88.33%)   (83.97%, 87.01%)
30%            (42.07%, 43.64%)   (86.56%, 89.11%)   (85.06%, 88.03%)
35%            (47.90%, 49.46%)   (87.52%, 89.98%)   (86.09%, 88.89%)
40%            (53.14%, 54.65%)   (88.18%, 90.53%)   (86.65%, 89.51%)
45%            (58.12%, 59.57%)   (88.64%, 90.97%)   (87.29%, 90.04%)
50%            (62.81%, 64.23%)   (89.25%, 91.48%)   (87.68%, 90.48%)
55%            (67.10%, 68.45%)   (89.67%, 91.85%)   (88.07%, 90.80%)
60%            (71.23%, 72.59%)   (90.08%, 92.24%)   (88.54%, 91.25%)
65%            (75.11%, 76.48%)   (90.54%, 92.58%)   (88.86%, 91.53%)
70%            (78.72%, 80.05%)   (90.97%, 92.98%)   (89.17%, 91.81%)
75%            (82.19%, 83.56%)   (91.48%, 93.39%)   (89.43%, 92.11%)
80%            (85.51%, 86.85%)   (91.91%, 93.75%)   (89.70%, 92.33%)
85%            (88.58%, 89.93%)   (92.35%, 94.14%)   (90.15%, 92.72%)
90%            (91.57%, 92.92%)   (92.94%, 94.65%)   (90.58%, 93.10%)
95%            (94.38%, 95.75%)   (93.60%, 95.28%)   (90.93%, 93.49%)
99%            (96.54%, 97.91%)   (94.09%, 95.75%)   (91.25%, 93.80%)
100%           (97.03%, 98.44%)   (94.22%, 95.91%)   (91.31%, 93.90%)

Table B.4: Comparison of correlation between false positive rates in input structures and output structures.

                Iter. HFold (F)   HFold (F)
Hotspots - FP   0.04              −0.16
MFE - FP        −0.78             −0.79
Sub50 - FP      −0.35             −0.36

Appendix C

Time and memory comparison

C.1 Time comparison

Tables C.1, C.2 and C.3 provide complete data presenting the running time comparison of HFold, Iterative HFold, HotKnots V2.0 and IPknot on the HK-PK data set. The tables also show the time required to obtain hotspots as input structures for HFold and Iterative HFold. Timing is presented in seconds.

C.2 Memory comparison

Tables C.4, C.5 and C.6 provide complete data presenting the memory (total heap usage) comparison of HFold, Iterative HFold, HotKnots V2.0 and IPknot on the HK-PK data set. Memory usage is presented in megabytes.
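As a sketch of the methodology behind the intervals reported in Tables A.1 through B.3, a bootstrap 95% percentile confidence interval for an average F-measure can be computed as follows. The per-structure F-measures and resample count here are hypothetical illustrations, not values from the data sets:

```python
import random

def bootstrap_percentile_ci(values, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Resamples with replacement, computes the mean of each resample, and
    returns the alpha/2 and 1 - alpha/2 percentiles of those means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical per-structure F-measures for one method on one data set.
f_measures = [0.81, 0.62, 0.93, 0.55, 0.77, 0.70, 0.88, 0.66]
lo, hi = bootstrap_percentile_ci(f_measures, n_resamples=2000)
```

The interval (lo, hi) brackets the sample mean; the tables report such intervals scaled to percentages.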
Table C.1: Running time comparison (times in seconds).

Name  Len.  HFold  Iter.  HotKnots  Hotspots  IPknot
A.tum.RNaseP  400  1.47  5.81  6041.40  0.41  0.72
tobacco-mosaic-virus  214  0.40  5.57  3404.73  0.23  0.19
telo.human  210  0.63  4.43  181.68  0.22  0.20
TMR-00009  196  0.40  8.27  233.82  0.22  0.16
ASE-00131  195  0.30  1.21  203.12  0.22  0.14
ASE-00360  195  0.33  8.22  1665.94  0.22  0.15
ASE-00429  189  0.29  3.18  567.86  0.22  0.13
CRW-00659  170  0.27  4.68  72.71  0.21  0.12
CRW-00641  168  0.24  5.26  53.05  0.21  0.12
CRW-00611  167  0.24  4.39  532.22  0.21  0.09
CRW-00687  153  0.20  3.83  36.05  0.21  0.09
TMR-00047  130  0.27  2.05  31.39  0.20  0.07
Coxsackie  114  0.15  0.64  7.84  0.20  0.05
TMV.R  105  0.15  1.68  19.73  0.20  0.04
HDV-anti  91  0.15  1.19  11.50  0.20  0.04
RFA-00632  91  0.15  0.81  10.74  0.20  0.03
RFA-00636  90  0.15  0.81  4.03  0.20  0.03
HDV  87  0.14  0.79  4.98  0.20  0.03
TYMV  86  0.14  1.20  9.95  0.20  0.03
TMV.L  84  0.15  1.10  10.53  0.19  0.03
CSFV-IRES  76  0.14  1.08  2.64  0.20  0.02
BVDV-IRES  73  0.13  1.07  1.42  0.19  0.02
CoxB3  73  0.13  1.03  1.22  0.19  0.02
satRPV  73  0.13  1.03  6.67  0.19  0.02
PDB-01009  71  0.13  0.56  2.43  0.19  0.03
SARS-CoV  69  0.13  1.01  1.93  0.20  0.02
PDB-01021  68  0.13  0.56  1.02  0.20  0.02
PDB-01023  68  0.13  0.55  1.24  0.20  0.02
EC-S15  67  0.14  1.01  2.87  0.20  0.02
PDB-00944  65  0.13  0.55  1.27  0.20  0.02
Tt-LSU-P3P7  65  0.18  0.56  1.88  0.20  0.02
HCV-229E  61  0.13  0.96  1.05  0.19  0.02
PRRSV-16244B  58  0.13  0.97  2.29  0.19  0.02
PRRSV-LV  58  0.13  0.95  1.44  0.19  0.01
HCV-Ires  56  0.12  0.54  1.47  0.19  0.01

Table C.2: Running time comparison, continued (times in seconds).

Name  Len.  HFold  Iter.  HotKnots  Hotspots  IPknot
Ec-PK4  52  0.12  0.93  0.89  0.20  0.01
AKV-MuLV  50  0.18  1.22  1.54  0.21  0.01
BaEV  50  0.19  0.93  1.37  0.19  0.01
Cas-Br-E-MuLv  50  0.12  0.92  1.47  0.19  0.01
FeLV  50  0.13  0.91  1.57  0.19  0.01
SNV  50  0.13  0.67  1.04  0.19  0.01
GaLV  49  0.13  0.92  1.44  0.19  0.01
Hs-SRP-pkn  47  0.12  0.53  2.80  0.20  0.01
Bt-PrP  45  0.13  0.91  1.21  0.19  0.01
HIV-1-RT-2-3a  45  0.12  0.66  0.34  0.19  0.01
Hs-Prp  45  0.13  0.79  0.76  0.20  0.01
minimalIBV  45  0.12  0.91  0.70  0.20  0.01
HIV-1-RT-2-2b  42  0.13  0.90  0.26  0.19  0.01
HIV-1-RT-2-6b  42  0.12  0.90  0.29  0.19  0.01
Ni-VS  42  0.12  0.91  0.47  0.19  0.01
HIV-1-RT-2-5a  41  0.12  0.90  0.42  0.19  0.01
HIV-1-RT-1-8  39  0.12  0.78  0.24  0.19  0.01
HIV-1-RT-2-1b  39  0.12  0.90  0.26  0.19  0.01
SRV-1  38  0.12  0.90  0.86  0.19  0.01
TMV-L  38  0.13  0.78  1.32  0.19  0.01
HIV-1-RT-1-1  37  0.13  0.90  0.33  0.19  0.01
HIV-1-RT-1-17  37  0.12  0.90  0.33  0.19  0.01
HIV-1-RT-1-3a  37  0.12  0.90  0.36  0.20  0.01
HIV-1-RT-1-6  37  0.12  0.90  0.34  0.19  0.01
HIV-1-RT-1-7  37  0.12  0.90  0.28  0.19  0.01
HIV-1-RT-1-9b  37  0.12  0.90  0.39  0.19  0.01
HIV-1-RT-2-10  37  0.12  0.90  0.35  0.19  0.01
HIV-1-RT-2-11  37  0.12  0.90  0.27  0.19  0.01
HIV-1-RT-2-12  37  0.12  0.78  0.27  0.19  0.01
HIV-1-RT-2-4a  37  0.12  0.89  0.40  0.20  0.01
HIV-1-RT-2-7a  36  0.12  0.89  0.30  0.19  0.01
pKA-A  36  0.12  0.65  0.42  0.19  0.01

Table C.3: Running time comparison, continued (times in seconds).

Name  Len.  HFold  Iter.  HotKnots  Hotspots  IPknot
EIAV  35  0.12  0.89  0.37  0.19  0.01
FIV  35  0.12  0.89  0.30  0.19  0.01
HIVRT32  35  0.12  0.89  0.31  0.20  0.01
HIVRT322  35  0.12  0.89  0.40  0.20  0.01
HIVRT33  35  0.12  0.89  0.32  0.19  0.01
HIV-1-RT-2-9  34  0.12  0.89  0.27  0.19  0.01
MMTV  34  0.12  0.89  0.71  0.19  0.01
MMTV-vpk  34  0.12  0.89  0.34  0.19  0.01
MMTVgag-pro  34  0.12  0.89  0.95  0.19  0.01
T2-gene32  33  0.12  0.65  0.26  0.19  0.01
Ec-PK1  30  0.12  0.89  0.31  0.20  0.01
LP-PK1  30  0.12  0.89  0.24  0.19  0.01
BWYV  28  0.12  0.88  0.31  0.19  0.01
PEMV  28  0.12  0.89  0.25  0.19  0.01
T4-gene32  28  0.12  0.89  0.26  0.19  0.01
BLV  27  0.12  0.52  0.31  0.19  0.01
BYDV-NY-RPV  27  0.12  0.88  0.30  0.19  0.01
CABYV  27  0.12  0.88  0.27  0.19  0.01
BChV  26  0.12  0.88  0.27  0.19  0.01
PLRV-S  26  0.12  0.88  0.29  0.19  0.01
PLRV-W  26  0.12  0.88  0.29  0.19  0.01

Table C.4: Memory usage comparison (memory in MB).

Name  Len.  HFold  Iter.  HotKnots  IPknot
A.tum.RNaseP  400  61.33  61.33  93419.00  5.49
tobacco-mosaic-virus  214  53.85  53.85  44241.76  2.10
telo.human  210  53.71  53.71  8355.85  2.05
TMR-00009  196  53.25  53.25  11833.91  1.81
ASE-00131  195  53.21  53.21  12805.66  1.76
ASE-00360  195  53.21  53.21  55708.86  1.94
ASE-00429  189  53.02  53.02  28543.33  1.77
CRW-00659  170  52.42  52.42  4175.79  1.44
CRW-00641  168  52.36  52.36  3503.01  1.44
CRW-00611  167  52.33  52.33  26824.57  1.42
CRW-00687  153  51.91  51.91  3387.13  1.26
TMR-00047  130  51.24  51.24  2730.81  1.01
Coxsackie  114  50.80  50.80  969.34  0.84
TMV.R  105  50.56  50.56  2354.81  0.78
HDV-anti  91  50.20  50.20  1707.54  0.90
RFA-00632  91  50.20  50.20  1528.68  0.78
RFA-00636  90  50.17  50.17  634.52  0.70
HDV  87  50.10  50.10  755.96  0.68
TYMV  86  50.07  50.07  1478.82  0.62
TMV.L  84  50.02  50.02  1455.98  0.57
CSFV-IRES  76  49.82  49.82  457.97  0.52
BVDV-IRES  73  49.75  49.75  252.15  0.50
CoxB3  73  49.75  49.75  245.71  0.48
satRPV  73  49.75  49.75  1141.53  0.49
PDB-01009  71  49.70  49.70  431.24  1.01
SARS-CoV  69  49.66  49.66  357.78  0.47
PDB-01021  68  49.63  49.63  197.48  0.45
PDB-01023  68  49.63  49.63  268.22  0.45
EC-S15  67  49.61  49.61  534.31  0.44
PDB-00944  65  49.56  49.56  261.16  0.44
Tt-LSU-P3P7  65  49.56  49.56  418.68  0.41
HCV-229E  61  49.47  49.47  213.69  0.41
PRRSV-16244B  58  49.40  49.40  462.97  0.39
PRRSV-LV  58  49.40  49.40  322.98  0.39
HCV-Ires  56  49.35  49.35  329.64  0.38

Table C.5: Memory usage comparison, continued (memory in MB).

Name  Len.  HFold  Iter.  HotKnots  IPknot
Ec-PK4  52  49.26  49.26  213.75  0.33
AKV-MuLV  50  49.21  49.21  335.54  0.4
BaEV  50  49.21  49.21  285.20  0.36
Cas-Br-E-MuLv  50  49.21  49.21  328.05  0.40
FeLV  50  49.21  49.21  359.76  0.34
SNV  50  49.21  49.21  221.59  0.34
GaLV  49  49.19  49.19  281.26  0.30
Hs-SRP-pkn  47  49.15  49.15  681.41  0.31
Bt-PrP  45  49.10  49.10  303.82  0.59
HIV-1-RT-2-3a  45  49.10  49.10  39.97  0.33
Hs-Prp  45  49.10  49.10  186.98  0.31
minimalIBV  45  49.10  49.10  162.06  0.30
HIV-1-RT-2-2b  42  49.04  49.04  20.13  0.26
HIV-1-RT-2-6b  42  49.04  49.04  29.99  0.27
Ni-VS  42  49.04  49.04  79.84  0.28
HIV-1-RT-2-5a  41  49.01  49.01  66.23  0.27
HIV-1-RT-1-8  39  48.97  48.97  16.75  0.24
HIV-1-RT-2-1b  39  48.97  48.97  23.33  0.26
SRV-1  38  48.95  48.95  220.73  0.34
TMV-L  38  48.95  48.95  368.74  0.25
HIV-1-RT-1-1  37  48.93  48.93  46.26  0.26
HIV-1-RT-1-17  37  48.93  48.93  49.33  0.24
HIV-1-RT-1-3a  37  48.93  48.93  56.45  0.23
HIV-1-RT-1-6  37  48.93  48.93  49.32  0.25
HIV-1-RT-1-7  37  48.93  48.93  29.82  0.24
HIV-1-RT-1-9b  37  48.93  48.93  62.54  0.32
HIV-1-RT-2-10  37  48.93  48.93  49.32  0.29
HIV-1-RT-2-11  37  48.93  48.93  26.47  0.25
HIV-1-RT-2-12  37  48.93  48.93  26.55  0.24
HIV-1-RT-2-4a  37  48.93  48.93  66.74  0.23
HIV-1-RT-2-7a  36  48.90  48.90  36.30  0.25
pKA-A  36  48.90  48.90  76.08  0.32

Table C.6: Memory usage comparison, continued (memory in MB).
Name  Len.  HFold  Iter.  HotKnots  IPknot
EIAV  35  48.88  48.88  65.99  0.23
FIV  35  48.88  48.88  37.03  0.21
HIVRT32  35  48.88  48.88  42.99  0.21
HIVRT322  35  48.88  48.88  69.10  0.25
HIVRT33  35  48.88  48.88  43.22  0.21
HIV-1-RT-2-9  34  48.86  48.86  29.66  0.23
MMTV  34  48.86  48.86  176.54  0.23
MMTV-vpk  34  48.86  48.86  52.92  0.25
MMTVgag-pro  34  48.86  48.86  252.39  0.23
T2-gene32  33  48.84  48.84  23.32  0.22
Ec-PK1  30  48.78  48.78  42.94  0.23
LP-PK1  30  48.78  48.78  16.53  0.19
BWYV  28  48.73  48.73  42.48  0.18
PEMV  28  48.73  48.73  20.09  0.18
T4-gene32  28  48.73  48.73  26.47  0.19
BLV  27  48.71  48.71  42.56  0.18
BYDV-NY-RPV  27  48.71  48.71  42.25  0.18
CABYV  27  48.71  48.71  32.66  0.23
BChV  26  48.69  48.69  29.43  0.17
PLRV-S  26  48.69  48.69  36.06  0.18
PLRV-W  26  48.69  48.69  35.89  0.17

Appendix D

Pseudocode for the sparse CCJ algorithm

Algorithm 1 Pseudocode for the sparse CCJ algorithm.
  1: initialize all candidate lists as empty
  2: initialize all 2D and 3D matrices to their initial values
  3: for i := n to 1 do
  4:   W(i, i−1) := 0
  5:   P(i, i) := +∞
  6:   WM′(i, i) = WB′(i, i) = WP′(i, i) := +∞
  7:   if !(i ≤ j < k−1 < l) then
  8:     PK^i(j, k, l) = P^i_L(j, k, l) = P^i_M(j, k, l) =
  9:     P^i_R(j, k, l) = P^i_O(j, k, l) := +∞
 10:   P^i_fromM(i, l, l) = P^i_fromO(i, l, l) := 0
 11:   P^i_fromL(i, l, l) = P^i_fromR(i, l, l) := γ2(i, l) + γ2(l, i)
 12:   for l := i to n do
 13:     ▷ —— V recurrence ——
 14:     V_hairpin := eH(i, l)
 15:     V_stacked := eS(i, l)
 16:     V_iloop := min over i<d<d′<l, d−i+l−d′ ≤ MLS of { eint(i, d, d′, l) + V(d, d′) }
 17:     V_mloop := Vmloop(i+1, l−1) + β2(l, i) + a + b
 18:     V(i, l) := min{ V_hairpin, V_stacked, V_iloop, V_mloop }
 19:     ▷ —— Vmloop recurrence ——
 20:     Vmloop12 := min over i<d<l of { WM′(i, d−1) + WM′(d, l) }
 21:     VmloopP := min over i≤d<e≤l of { P(d, e) + Ps + b + c × (d−i + l−e) }
 22:     Vmloop(i, l) := min{ Vmloop12, VmloopP }
 23:     ▷ —— P recurrence ——
 24:     P(i, l) := min over (j, d, k, w) ∈ L^1212_PK(l) of { PK^i(j−1, d+1, k−1) + w }
 25:     ▷ —— W recurrence ——
 26:     W12 := min over i<d≤l of { W(i, d−1) + W(d, l) }
 27:     W(i, l) := min{ W(i, l−1), V(i, l), W12, P(i, l) + Ps }
 28:     ▷ —— WM′ recurrence ——
 29:     WM′12′ := WM′(i, l−1) + c × (1)
 30:     WM′12 := min over i<d<l of { WM(i, d−1) + WM′(d, l) }
 31:     WM′(i, l) := min{ WM′12′, WM′12, V(i, l) + β2(i, l) + b, P(i, l) + Psm + b }

Algorithm 2 CCJ Alg Part 2.
 32:     ▷ —— WB′ recurrence ——
 33:     WB′12′ := WB′(i, l−1) + c × (1)
 34:     WB′12 := min over i<d<l of { WB(i, d−1) + WB′(d, l) }
 35:     WB′ := min{ WB′12′, WB′12, V(i, l) + βp2(i, l) + b′, P(i, l) + Psm + b′ }
 36:     WB′(i, l) := WB′
 37:     ▷ —— WP′ recurrence ——
 38:     WP′12′ := WP′(i, l−1) + Pup × (1)
 39:     WP′12 := min over i<d<l of { WP(i, d−1) + WP′(d, l) }
 40:     WP′(i, l) := min{ WP′12′, WP′12, V(i, l) + γ2(i, l), P(i, l) + Psp }
 41:     ▷ —— recurrences for [i, j] ∪ [k, l] ——
 42:     for j := l−2 to i do
 43:       for k := j+2 to l do
 44:         ▷ —— PL,mloop10 recurrence ——
 45:         PL,mloop10_12G2 := min over (d, w) ∈ L^12G2_{PL,m00}(j, k, l) of { WB′(i, d−1) + w }
 46:         PL,mloop10_12G1 := min over i≤d<j of { P^i_{L,mloop10}(d, k, l) + WB(d+1, j) }
 47:         P^i_{L,mloop10}(j, k, l) := min{ PL,mloop10_12G2, PL,mloop10_12G1 }
 48:         P^{i+1}_{L,mloop10}(j, k, l) := P^i_{L,mloop10}(j, k, l)
 49:         ▷ —— PL,mloop01 recurrence ——
 50:         P^i_{L,mloop01}(j, k, l) := min over i≤d<j of { P^i_{L,mloop00}(d, k, l) + WB′(d+1, j) }
 51:         P^{i+1}_{L,mloop01}(j, k, l) := P^i_{L,mloop01}(j, k, l)
 52:         ▷ —— PL,mloop00 recurrence ——
 53:         PL,mloop00_12G2 := min over (d, w) ∈ L^12G2_{PL,m00}(j, k, l) of { WB(i, d−1) + w }
 54:         PL,mloop00_12G1 := min over i≤d<j of { P^i_{L,mloop00}(d, k, l) + WB(d+1, j) }
 55:         P^i_{L,mloop00}(j, k, l) := min{ P^i_L(j, k, l) + βp2(j, i), PL,mloop00_12G2, PL,mloop00_12G1 }
 56:         ▷ —— pushing the candidates into the lists ——
 57:         if (P^i_{L,mloop00}(j, k, l) < min over (d, w) ∈ L^12G2_{PL,m00}, d>i of { WB(i, d−1) + w })
             and (P^i_{L,mloop00}(j, k, l) < min over i<d≤j of { P^i_{L,mloop00}(d−1, k, l) + WB(d, j) }) then
 58:           push (i, P^i_{L,mloop00}(j, k, l)) onto L^12G2_{PL,m00}(j, k, l)

Algorithm 3 CCJ Alg Part 3.
 59:         ▷ —— PR,mloop10 recurrence ——
 60:         PR,mloop10_1G2′1 := P^i_{R,mloop10}(j, k+1, l) + c′
 61:         PR,mloop10_1G21 := min over k<d≤l of { P^i_{R,mloop00}(j, d, l) + WB′(k, d−1) }
 62:         P^i_{R,mloop10}(j, k, l) := min{ PR,mloop10_1G2′1, PR,mloop10_1G21 }
 63:         ▷ —— PR,mloop01 recurrence ——
 64:         PR,mloop01_1G12′ := P^i_{R,mloop01}(j, k, l−1) + c′
 65:         PR,mloop01_1G12 := min over k≤d<l of { P^i_{R,mloop00}(j, k, d) + WB′(d+1, l) }
 66:         P^i_{R,mloop01}(j, k, l) := min{ PR,mloop01_1G12′, PR,mloop01_1G12 }
 67:         ▷ —— PR,mloop00 recurrence ——
 68:         PR,mloop00_1G21 := min over k<d≤l of { P^i_{R,mloop00}(j, d, l) + WB(k, d−1) }
 69:         PR,mloop00_1G12 := min over k≤d<l of { P^i_{R,mloop00}(j, k, d) + WB(d+1, l) }
 70:         P^i_{R,mloop00}(j, k, l) := min{ P^i_R(j, k, l) + βp2(l, k), PR,mloop00_1G21, PR,mloop00_1G12 }
 71:         ▷ —— PM,mloop10 recurrence ——
 72:         PM,mloop10_12′G1 := P^i_{M,mloop10}(j−1, k, l) + c′
 73:         PM,mloop10_12G1 := min over i<d<j of { P^i_{M,mloop00}(d−1, k, l) + WB′(d, j) }
 74:         P^i_{M,mloop10}(j, k, l) := min{ PM,mloop10_12′G1, PM,mloop10_12G1 }
 75:         ▷ —— PM,mloop01 recurrence ——
 76:         PM,mloop01_1G2′1 := P^i_{M,mloop01}(j, k+1, l) + c′
 77:         PM,mloop01_1G21 := min over k<d≤l of { P^i_{M,mloop00}(j, d, l) + WB′(k, d−1) }
 78:         P^i_{M,mloop01}(j, k, l) := min{ PM,mloop01_1G2′1, PM,mloop01_1G21 }
 79:         ▷ —— PM,mloop00 recurrence ——
 80:         PM,mloop00_12G1 := min over i≤d<j of { P^i_{M,mloop00}(d, k, l) + WB(d+1, j) }
 81:         PM,mloop00_1G21 := min over k<d≤l of { P^i_{M,mloop00}(j, d, l) + WB(k, d−1) }
 82:         P^i_{M,mloop00}(j, k, l) := min{ P^i_M(j, k, l) + βp2(j, k), PM,mloop00_12G1, PM,mloop00_1G21 }

Algorithm 4 CCJ Alg Part 4.
 83:         ▷ —— PO,mloop10 recurrence ——
 84:         PO,mloop10_12G2 := min over (d, w) ∈ L^12G2_{PO,m00}(j, k, l) of { WB′(i, d−1) + w }
 85:         PO,mloop10_1G12 := min over k≤d<l of { P^i_{O,mloop10}(j, k, d) + WB(d+1, l) }
 86:         P^i_{O,mloop10}(j, k, l) := min{ PO,mloop10_12G2, PO,mloop10_1G12 }
 87:         P^{i+1}_{O,mloop10}(j, k, l) := P^i_{O,mloop10}(j, k, l)
 88:         ▷ —— PO,mloop01 recurrence ——
 89:         P^i_{O,mloop01}(j, k, l) := min over k≤d<l of { P^i_{O,mloop00}(j, k, d) + WB′(d+1, l) }
 90:         P^{i+1}_{O,mloop01}(j, k, l) := P^i_{O,mloop01}(j, k, l)
 91:         ▷ —— PO,mloop00 recurrence ——
 92:         PO,mloop00_12G2 := min over (d, w) ∈ L^12G2_{PO,m00}(j, k, l) of { WB(i, d−1) + w }
 93:         PO,mloop00_1G12 := min over k≤d<l of { P^i_{O,mloop00}(j, k, d) + WB(d+1, l) }
 94:         P^i_{O,mloop00}(j, k, l) := min{ P^i_O(j, k, l) + βp2(l, i), PO,mloop00_12G2, PO,mloop00_1G12 }
 95:         ▷ —— pushing the candidates into the lists ——
 96:         if (P^i_{O,mloop00}(j, k, l) < min over (d, w) ∈ L^12G2_{PO,m00}, d>i of { WB(i, d−1) + w })
             and (P^i_{O,mloop00}(j, k, l) < min over k≤d<l of { P^i_{O,mloop00}(j, k, d) + WB(d+1, l) }) then
 97:           push (i, P^i_{O,mloop00}(j, k, l)) onto L^12G2_{PO,m00}(j, k, l)
 98:         ▷ PL recurrence: to handle the i..i+MLS slices of PL we use rotation via mod
 99:         define rot(x) := x mod MLS
100:         PL,iloop := min{ P^rot(i+1)_L(j−1, k, l) + estP(i, j),
                 min over i<d<d′<j, d−i+j−d′ ≤ MLS of { eintP(i, d, d′, j) + P^rot(d)_L(d′, k, l) } }
101:         PL,mloop := min{ P^{i+1}_{L,mloop10}(j−1, k, l) + a′ + βp2(j, i),
                 P^{i+1}_{L,mloop01}(j−1, k, l) + a′ + βp2(j, i) }
102:         P^rot(i)_L(j, k, l) := min{ PL,iloop, PL,mloop, P^{i+1}_fromL(j−1, k, l) + γ2(j, i) }
103:         P^rot(i+1)_L(j, k, l) := P^rot(i)_L(j, k, l)

Algorithm 5 CCJ Alg Part 5.
104:         ▷ —— PR recurrence ——
105:         PR,iloop := min{ P^i_R(j, k+1, l−1) + estP(k, l),
                 min over k<d<d′<l, d−k+l−d′ ≤ MLS of { eintP(k, d, d′, l) + P^i_R(j, d, d′) } }
106:         PR,mloop := min{ P^i_{R,mloop10}(j, k+1, l−1) + a′ + βp2(l, k),
                 P^i_{R,mloop01}(j, k+1, l−1) + a′ + βp2(l, k) }
107:         P^i_R(j, k, l) := min{ PR,iloop, PR,mloop + b′, P^i_fromR(j, k+1, l−1) + γ2(l, k) }
108:         ▷ —— PM recurrence ——
109:         if i = j and k = l then
110:           P^i_M(j, k, l) := γ2(i, l)
111:         else
112:           PM,iloop := min{ P^i_M(j−1, k+1, l) + estP(j−1, k+1),
                   min over i<d<j<k<d′<l, j−d+d′−k ≤ MLS of { eintP(d, j, k, d′) + P^i_M(d, d′, l) } }
113:           PM,mloop := min{ P^i_{M,mloop10}(j−1, k+1, l) + a′ + βp2(j, k),
                   P^i_{M,mloop01}(j−1, k+1, l) + a′ + βp2(j, k) }
114:           P^i_M(j, k, l) := min{ PM,iloop, PM,mloop + b′, P^i_fromM(j−1, k+1, l) + γ2(j, k) }
115:         ▷ —— PO recurrence ——
116:         PO,iloop := min{ P^rot(i+1)_O(j, k, l−1) + estP(i, l),
                 min over i<d<j<k<d′<l, d−i+l−d′ ≤ MLS of { eintP(i, d, d′, l) + P^rot(d)_O(j, k, d′) } }
117:         PO,mloop := min{ P^{i+1}_{O,mloop10}(j, k, l−1) + a′ + βp2(l, i),
                 P^{i+1}_{O,mloop01}(j, k, l−1) + a′ + βp2(l, i) }
118:         P^rot(i)_O(j, k, l) := min{ PO,iloop, PO,mloop, P^{i+1}_fromO(j, k, l−1) + γ2(l, i) }
119:         P^rot(i+1)_O(j, k, l) := P^rot(i)_O(j, k, l)
120:         ▷ —— PfromL recurrence ——
121:         PfromL_12G2 := min over (d, w) ∈ L^12G2_fromL(j, k, l) of { WP(i, d−1) + w }
122:         PfromL_12G1 := min over i≤d<j of { P^i_fromL(d, k, l) + WP(d+1, j) }
123:         P^i_fromL(j, k, l) := min{ PfromL_12G2, PfromL_12G1, P^i_R(j, k, l) + γ2(l, k) + Pb,
                 P^i_M(j, k, l) + γ2(j, k) + Pb, P^i_O(j, k, l) + γ2(l, i) + Pb }
124:         P^{i+1}_fromL(j, k, l) := P^i_fromL(j, k, l)

Algorithm 6 CCJ Alg Part 6.
125:         ▷ —— pushing the candidates into the lists ——
126:         if P^i_fromL(j, k, l) < PfromL_12G2 then
127:           push (i, P^i_fromL(j, k, l)) onto L^12G2_fromL(j, k, l)
128:         ▷ —— PfromR recurrence ——
129:         PfromR_1G21 := min over k<d≤l of { P^i_fromR(j, d, l) + WP(k, d−1) }
130:         PfromR_1G12 := min over k≤d<l of { P^i_fromR(j, k, d) + WP(d+1, l) }
131:         P^i_fromR(j, k, l) := min{ PfromR_1G21, PfromR_1G12,
                 P^i_M(j, k, l) + γ2(j, k) + Pb, P^i_O(j, k, l) + γ2(l, i) + Pb }
132:         ▷ —— PfromM recurrence ——
133:         PfromM_12G1 := min over i≤d<j of { P^i_fromM(d, k, l) + WP(d+1, j) }
134:         PfromM_1G21 := min over k<d≤l of { P^i_fromM(j, d, l) + WP(k, d−1) }
135:         P^i_fromM(j, k, l) := min{ PfromM_12G1, PfromM_1G21,
                 P^i_L(j, k, l) + γ2(j, i) + Pb, P^i_R(j, k, l) + γ2(l, k) + Pb }
136:         ▷ —— PfromO recurrence ——
137:         PfromO_12G2 := min over (d, w) ∈ L^12G2_fromO(j, k, l) of { WP(i, d−1) + w }
138:         PfromO_1G12 := min over k≤d<l of { P^i_fromO(j, k, d) + WP(d+1, l) }
139:         P^i_fromO(j, k, l) := min{ PfromO_12G2, PfromO_1G12,
                 P^i_L(j, k, l) + γ2(j, i) + Pb, P^i_R(j, k, l) + γ2(l, k) }
140:         P^{i+1}_fromO(j, k, l) := P^i_fromO(j, k, l)
141:         ▷ —— pushing the candidates into the lists ——
142:         if P^i_fromO(j, k, l) < PfromO_12G2 then
143:           push (i, P^i_fromO(j, k, l)) onto L^12G2_fromO(j, k, l)
144:         PK_12G1 := min over i≤d<j of { PK^i(d, k, l) + WP(d+1, j) }
145:         PK_1G21 := min over k<d≤l of { PK^i(j, d, l) + WP(k, d−1) }
146:         PK^i(j, k, l) := min{ PK_12G1, PK_1G21,
                 P^i_L(j, k, l) + γ2(j, i) + Pb, P^i_M(j, k, l) + γ2(j, k) + Pb,
                 P^i_R(j, k, l) + γ2(l, k) + Pb, P^i_O(j, k, l) + γ2(l, i) + Pb }
147:         if PK^i(j, k, l) < PK_12G1 then
148:           push (i, j, k, PK^i(j, k, l)) onto L^1212_PK(l)
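The rot(x) := x mod MLS device of pseudocode lines 98 and 99 keeps only MLS slices of a DP table in memory and reuses them cyclically, since each slice i depends only on nearby slices. The following is a minimal, hypothetical sketch of this rolling-buffer pattern; `fill_slice`, `MLS`, and `N` are invented stand-ins for illustration, not the CCJ recurrences themselves:

```python
MLS = 4          # number of slices retained (band width)
N = 10           # DP dimension over which i iterates

def rot(x):
    """Map slice index x to a position in the rolling buffer."""
    return x % MLS

# Rolling buffer holding only MLS slices instead of all N slices.
buf = [[0] * N for _ in range(MLS)]

def fill_slice(i):
    # Stand-in recurrence: slice i depends only on slice i+1, which is
    # still resident in the buffer when we overwrite position rot(i).
    src = buf[rot(i + 1)] if i + 1 < N else [0] * N
    buf[rot(i)] = [v + 1 for v in src]

# Fill slices in decreasing i, as in the CCJ outer loop.
for i in range(N - 1, -1, -1):
    fill_slice(i)
# buf[rot(0)] now holds the slice for i = 0.
```

Memory drops from O(N) slices to O(MLS) slices, at the cost that only slices within the band are addressable, which is exactly why the sparse CCJ recurrences bound internal loop spans by MLS.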
