UBC Theses and Dissertations

Empirical analysis of the MEA and MFE RNA secondary structure prediction

Hajiaghayi, Monir

Abstract

RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and protein synthesis. Since their functions largely depend on their folded structures, accurate and efficient methods for RNA structure prediction are increasingly valuable. A number of computational approaches have been developed to predict RNA secondary structure. There have been some recent advances for two of these approaches, namely Minimum Free Energy (MFE) and Maximum Expected Accuracy (MEA). The goals of this thesis are twofold: 1) to empirically analyze the accuracy (i.e., the number of correctly predicted base pairs) of the MFE and MEA algorithms under different energy models, and 2) to estimate free energy parameters specifically for MEA using the constraint generation (CG) approach. We present new results on the degree to which different factors influence the prediction accuracy of MFE and MEA. The factors we consider are: structural information provided in addition to the RNA sequence, thermodynamic parameters, and data set size. Structural information significantly increases the accuracy of these methods. The relative performance of MFE and MEA changes according to the free energy parameters used; however, both perform best with Andronescu et al.'s BL* parameter set. Larger data sets result in less variation in the prediction accuracy of the MFE and MEA algorithms. Furthermore, we try to find better free energy parameters for the MEA algorithm. For this purpose, we adapt the CG approach so that it specifically optimizes parameters for MEA. Overall, parameters trained using MFE slightly outperform the parameters estimated for MEA. For the MEA algorithm, we also study the effect of the parameter γ, which controls the trade-off between sensitivity and positive predictive value (PPV). We find that when γ=1, the accuracy of MEA (in terms of F-measure on the BL* parameter set) is better than its accuracy for other values of γ. We also find that the sensitivity and PPV of MEA change for different values of γ under the BL* parameter set.
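
The accuracy measures named in the abstract (sensitivity, PPV, and F-measure) can be illustrated with a minimal sketch. It assumes the commonly used definitions over sets of base pairs: sensitivity is the fraction of reference pairs that are predicted, PPV is the fraction of predicted pairs that are in the reference, and F-measure is their harmonic mean. The thesis may use a variant of these definitions; the function name `pair_metrics` and the example structures are hypothetical and chosen only for illustration.

```python
def pair_metrics(reference, predicted):
    """Return (sensitivity, PPV, F-measure) for two sets of base pairs.

    Each base pair is a tuple (i, j) of sequence positions. This follows
    the standard definitions and is an assumption, not the thesis's code.
    """
    # Normalize pairs so that (i, j) and (j, i) count as the same pair.
    ref = {tuple(sorted(p)) for p in reference}
    pred = {tuple(sorted(p)) for p in predicted}

    true_positives = len(ref & pred)

    # Guard against empty structures to avoid division by zero.
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0

    if sensitivity + ppv > 0:
        f_measure = 2 * sensitivity * ppv / (sensitivity + ppv)
    else:
        f_measure = 0.0
    return sensitivity, ppv, f_measure


# Hypothetical example: 4 reference pairs, 3 predicted pairs, 2 in common.
reference_pairs = {(1, 10), (2, 9), (3, 8), (4, 7)}
predicted_pairs = {(1, 10), (2, 9), (5, 12)}
sens, ppv, f = pair_metrics(reference_pairs, predicted_pairs)
print(f"sensitivity={sens:.3f} PPV={ppv:.3f} F={f:.3f}")
```

Under these definitions, a prediction method that outputs more base pairs tends to gain sensitivity at the cost of PPV, which is the trade-off that γ adjusts in MEA; the F-measure summarizes both in a single number.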

License

Attribution-NonCommercial-NoDerivatives 4.0 International
