UBC Theses and Dissertations


Deep learning model for scoring circular RNAs and interpreting back-splicing

Asmae, Mohammad Mahdi

Abstract

Circular RNAs (circRNAs) are a group of covalently closed, non-coding RNA molecules generated through a process called back-splicing. CircRNAs are involved in various biological processes, such as gene regulation and translation, and they play important roles in diseases, including cancer. Their closed-loop structure gives them exceptional stability compared to linear RNAs. These attributes make circRNAs promising candidates for cancer diagnosis and for monitoring disease progression. Although many computational tools have been developed to identify circRNAs from RNA sequencing data, their results are often affected by high false-positive rates, and the low number of supporting reads makes validation difficult. Most existing tools rely solely on alignment-based read detection and provide neither a confidence score nor biological interpretability. There is therefore a need for models that can predict circRNA formation directly from sequence features, providing a way to distinguish genuine circRNAs from false positives. This study introduces a deep learning model that predicts circRNA formation from sequence-based features. The proposed model incorporates biological knowledge of circRNA biogenesis and combines convolutional layers with a multi-head attention mechanism to capture both local and long-range dependencies. Two simulated datasets derived from public databases, together with RNA-seq data from prostate cancer cell lines, were used to train and evaluate the model. The model achieves accuracy comparable or superior to that of existing methods and shows high agreement with experimentally validated datasets. It can also identify sequence motifs corresponding to RNA-binding protein sites involved in circRNA biogenesis. Overall, this work presents an interpretable framework for circRNA prediction that bridges computational modeling and biological understanding. The proposed approach improves confidence in circRNA identification and contributes to a deeper understanding of circRNA formation.
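The combination of convolutional layers and multi-head attention described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the thesis's architecture: the layer sizes, the random untrained weights, the mean pooling, and every name (`one_hot`, `conv1d_relu`, `multi_head_attention`, `score_junction`) are assumptions chosen only to show how local motif detectors and attention over positions could feed a single confidence score.

```python
import math
import random

BASES = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as 4-dim one-hot vectors (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def conv1d_relu(x, kernels):
    """Valid 1D convolution plus ReLU: each filter acts as a local motif detector."""
    k = len(kernels[0])
    out = []
    for i in range(len(x) - k + 1):
        window = x[i:i + k]
        out.append([
            max(0.0, sum(w * v for wr, vr in zip(f, window) for w, v in zip(wr, vr)))
            for f in kernels
        ])
    return out

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_attention(x, n_heads, rng):
    """Scaled dot-product self-attention; heads are concatenated feature-wise."""
    d_model = len(x[0])
    d_head = d_model // n_heads
    heads, weights = [], []
    for _ in range(n_heads):
        q = matmul(x, rand_matrix(d_model, d_head, rng))
        k = matmul(x, rand_matrix(d_model, d_head, rng))
        v = matmul(x, rand_matrix(d_model, d_head, rng))
        k_t = [list(col) for col in zip(*k)]  # transpose of k
        scores = matmul(q, k_t)
        attn = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(attn, v))
        weights.append(attn)
    out = [sum((h[i] for h in heads), []) for i in range(len(x))]
    return out, weights

def score_junction(seq, n_filters=8, kernel_size=5, n_heads=2, seed=0):
    """Return a score in (0, 1) and the per-head attention maps (hypothetical setup)."""
    rng = random.Random(seed)
    feats = conv1d_relu(one_hot(seq), [rand_matrix(kernel_size, 4, rng) for _ in range(n_filters)])
    attended, weights = multi_head_attention(feats, n_heads, rng)
    pooled = [sum(col) / len(attended) for col in zip(*attended)]  # mean over positions
    logit = sum(rng.gauss(0.0, 0.1) * p for p in pooled)
    return 1.0 / (1.0 + math.exp(-logit)), weights
```

Calling `score_junction` on a sequence flanking a candidate back-splice junction returns a score and one attention matrix per head. With untrained weights the score carries no biological meaning; in a trained model, the convolution filters and attention maps are the kind of components one would inspect when looking for motifs.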

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International