Open Collections
UBC Theses and Dissertations
Machine learning in transcriptome analysis using long RNA sequencing data
Hafezqorani, Saber
Abstract
The advent of long-read RNA sequencing technologies has advanced transcriptomic research by enabling the sequencing of entire transcripts. This advancement holds the promise of uncovering novel insights into the complex nature of eukaryotic transcriptomes, characterized by phenomena such as alternative splicing and intron retention. However, the vast amounts of data generated by these technologies necessitate the development of sophisticated tools for effective data analysis. This thesis addresses these challenges through two main objectives: the development of a novel simulator for long-read RNA sequencing data, capable of accurately mimicking transcriptome-specific features, and the application of deep learning techniques for the robust analysis of RNA sequencing data.

The first objective addresses the need for a simulator tailored specifically to RNA sequencing data from Oxford Nanopore Technologies. The aim is a tool that not only generates reads mimicking real sequencing outputs but also incorporates critical transcriptomic features such as expression levels and intron retention events. This simulator serves as a resource for the development and refinement of transcriptome analysis tools, offering a cost-effective alternative to extensive sequencing experiments for generating ground-truth benchmarking data.

The second objective explores the potential of deep learning in transcriptomics, focusing on the development of a nucleotide sequence embedding method. It aims to capture the complex, long-term dependencies within sequences, a task that has proven challenging due to the variable length and intricate nature of RNA sequences. By leveraging deep learning's capacity to learn feature representations implicitly, this research seeks to enhance the accuracy and efficiency of sequence classification and clustering tasks within transcriptomic studies.
Item Metadata
Title | Machine learning in transcriptome analysis using long RNA sequencing data
Creator | Hafezqorani, Saber
Publisher | University of British Columbia
Date Issued | 2024
Language | eng
Date Available | 2024-07-29
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0444844
Degree Grantor | University of British Columbia
Graduation Date | 2024-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace