Open Collections
UBC Theses and Dissertations
Integrating nanopore long-range genomic variant phasing to improve resolution of allele-specific expression from short-read RNA-seq
Chang, Keng Man Glenn
Abstract
Cancer genomes often carry somatic mutations that disrupt gene expression on specific alleles, leading to Allele-Specific Expression (ASE). ASE may dysregulate cancer driver genes by favoring loss-of-function alleles, causing haploinsufficiency. However, detecting ASE requires precise genomic variant phasing, which earlier approaches such as pseudo-phasing algorithms, short-read phasing and population phasing do not provide. These methods are limited by low accuracy, small phase blocks and an inability to phase rare and somatic variants.
Our objective is to develop and validate a bioinformatics pipeline for genome-wide ASE gene detection that integrates Nanopore long-read sequencing for precise genomic variant phasing with short-read RNA-sequencing for enhanced ASE detection. We conduct an exploratory analysis of ASE data using a pan-cancer cohort from the Personalized OncoGenomic (POG) project. Our aim is to identify the genomic mechanisms responsible for genes exhibiting ASE, explore known instances of ASE, and use ASE to identify and validate dysregulated cancer genes that may contribute to tumor development.
To achieve these goals, we have developed IMPALA, a pipeline for detecting ASE genes and identifying potential genetic mechanisms behind ASE. We have also developed a novel ASE simulator that generates synthetic RNA-seq data containing known ASE genes. On the simulated data, IMPALA achieves an average F1 score of 0.93. We performed ASE analysis on 179 tumor samples from the POG cohort, in which an average of 26% of phased genes exhibited ASE.
We explore several cis-acting genetic mechanisms that can lead to ASE, including allelic CNV imbalance, allelic promoter methylation silencing, and nonsense-mediated decay. We also examine known instances of ASE, such as X-inactivation genes and imprinting genes, and their correlation with allelic methylation of imprinting control regions. Furthermore, we use ASE to investigate clinically relevant cancer genes, focusing on somatic mutations found on the major expressing allele. Leveraging ASE, we identify 631 significant noncoding mutations linked to the expression of cancer genes, which may serve as markers for cis-acting regulatory mechanisms. Lastly, we examine specific cases of abnormal ASE in cancer genes, such as DUSP22, to uncover potential genetic mechanisms behind these ASE events.
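To make the evaluation described in the abstract more concrete, the following is a minimal sketch of how ASE calls on phased read counts could be scored against a simulated ground truth with an F1 score. It is not IMPALA's implementation: the abstract does not state which statistical model the pipeline uses, so the exact two-sided binomial test against a 50:50 null, the `alpha` and `min_major_fraction` thresholds, the gene names, and the read counts are all illustrative assumptions.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    observing k successes. Intended for small illustrative counts only;
    very large n would overflow the float conversion of comb().
    """
    if n == 0:
        return 1.0
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def call_ase(hap1_reads: int, hap2_reads: int,
             alpha: float = 0.05, min_major_fraction: float = 0.65) -> bool:
    """Hypothetical ASE call: significant imbalance AND a minimum effect size.

    Both thresholds are assumptions chosen for this example.
    """
    n = hap1_reads + hap2_reads
    if n == 0:
        return False
    major_fraction = max(hap1_reads, hap2_reads) / n
    p_value = binomial_two_sided_p(hap1_reads, n)
    return p_value < alpha and major_fraction >= min_major_fraction


def f1_score(predicted: set, truth: set) -> float:
    """Harmonic mean of precision and recall over sets of gene names."""
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up per-gene read counts for haplotype 1 and haplotype 2.
counts = {
    "GENE_A": (90, 10),
    "GENE_B": (52, 48),
    "GENE_C": (5, 45),
    "GENE_D": (30, 20),
}
# Genes a hypothetical simulator was told to make allele-specific.
truth = {"GENE_A", "GENE_C", "GENE_D"}

predicted = {gene for gene, (h1, h2) in counts.items() if call_ase(h1, h2)}
score = f1_score(predicted, truth)
print(sorted(predicted), f"F1 = {score:.2f}")
```

In this toy example GENE_D is a simulated ASE gene with only a mild imbalance, so it is missed and recall drops. This illustrates why a simulator with known ASE genes is useful for benchmarking a detection pipeline.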
Item Metadata
Title | Integrating nanopore long-range genomic variant phasing to improve resolution of allele-specific expression from short-read RNA-seq
Creator | Chang, Keng Man Glenn
Supervisor | |
Publisher | University of British Columbia
Date Issued | 2023
Genre | |
Type | |
Language | eng
Date Available | 2023-12-21
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0438328
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia
Graduation Date | 2024-05
Campus | |
Scholarly Level | Graduate
Rights URI | |
Aggregated Source Repository | DSpace