Open Collections
UBC Theses and Dissertations
A novel statistical method for the accurate identification of RNA-edits with application to human cancers
Giuliany, Ryan S.
Abstract
RNA-editing is the post-transcriptional, enzymatic modification of RNA molecules resulting in an
altered nucleotide sequence. These modifications play a critical role in mammalian tissues and are
essential for proper liver function and neuronal development, among other processes. The advent
of high-throughput sequencing (HTS) technologies (e.g. Illumina HiSeq) has renewed interest in
RNA-editing discovery because it allows whole genome and transcriptome sequences to be
interrogated simultaneously. In the past several months, a number of studies have been published
describing methods and results of RNA-editing discovery in HTS data. These methods have
been ad hoc approaches that repurpose SNP calling tools designed for genome-based variant
detection. However, the statistical properties of RNA-editing warrant specialized analytical strategies
that leverage the non-uniform substitution distributions inherent in RNA-editing processes.

A novel statistical framework, called Auditor, is reported. It simultaneously analyzes the genomic and
transcriptomic base-counts and infers the likelihood of an RNA-edit at each position in the
transcriptome. The model leverages the inherent correlation between the RNA and DNA
sequences while encoding the non-uniform substitution distributions induced by RNA-editing, which
confers increased sensitivity. Further, a Random Forest-based technical artifact removal tool that
accurately identifies sequencing and alignment errors has been implemented, greatly increasing the
specificity of the method. The combination of these approaches leads to a robust, principled method
that accurately detects RNA-edits in the presence of both biological and technical noise.

It is systematically shown, both in a simulation study and on real matched whole genome and
transcriptome data generated from 11 lymphoma samples, that Auditor significantly outperforms
similar but simpler statistical frameworks, including a Samtools/bcftools-based approach that
resembles the one used in a recently published study. Finally, by profiling 11 diffuse large B-cell lymphomas and
16 triple negative breast cancers with Auditor, it is shown that RNA-editing is an active process
in human malignancies. Surprisingly, consistent patterns of nucleotide substitutions and regional
enrichment of RNA-edits in 3′ UTRs suggest that RNA-editing processes are invariant between
cell lineages, between tumours of similar histological subtypes, and even between cancers from distinct
tissues of origin.
Item Metadata
| Title | A novel statistical method for the accurate identification of RNA-edits with application to human cancers |
| Creator | Giuliany, Ryan S. |
| Publisher | University of British Columbia |
| Date Issued | 2012 |
| Genre | |
| Type | |
| Language | eng |
| Date Available | 2013-01-31 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
| DOI | 10.14288/1.0072919 |
| URI | |
| Degree (Theses) | |
| Program (Theses) | |
| Affiliation | |
| Degree Grantor | University of British Columbia |
| Graduation Date | 2012-11 |
| Campus | |
| Scholarly Level | Graduate |
| Rights URI | |
| Aggregated Source Repository | DSpace |