Mutation discovery in regions of segmental cancer genome amplifications from next generation sequencing of tumours

UBC Theses and Dissertations

Mutation discovery in regions of segmental cancer genome amplifications from next generation sequencing of tumours Anamaria, Crisan

Abstract

Next generation sequencing now enables cost-effective enumeration of the full mutational complement of a tumour genome, in particular single nucleotide variants (SNVs). However, most current computational and statistical models for analyzing next generation sequencing data do not account for cancer-specific biological properties, including somatic segmental copy number alterations (CNAs), which require special treatment of the data. Here we present CoNAn-SNV (Copy Number Annotated SNV): a novel algorithm for the inference of SNVs that overlap copy number alterations. The method models the notion that genomic regions of segmental duplication and amplification induce an extended ‘genotype space’ in which a subset of genotypes exhibit heavily skewed allelic distributions at SNVs, rendering them undetectable by methods that assume diploidy. CoNAn-SNV introduces the concept of modelling allelic counts from sequencing data using a panel of Binomial mixture models, where the number of mixture components for a given locus in the genome is informed by a discrete copy number state given as input. We applied CoNAn-SNV to a previously published whole genome shotgun data set obtained from a lobular breast cancer and show that it detects 24 experimentally revalidated somatic non-synonymous mutations that were not detected by copy number insensitive SNV discovery algorithms. Importantly, ROC analysis shows that the increased sensitivity of CoNAn-SNV does not result in a disproportionate loss of specificity. Our results indicate that in genomically unstable tumours, copy number annotation for SNV detection will be critical to fully characterize the mutational landscape of cancer genomes. The Binomial mixture model framework, however, is unable to fully cope with the complexity of tumour sequence data.
We explore substituting the Beta-Binomial framework for the Binomial mixture model framework to overcome this limitation. The effectiveness of this substitution is assessed on the lobular breast carcinoma data set and on 30 exon capture data sets, all derived from triple negative breast cancers. The performance of the Binomial and Beta-Binomial mixture models is evaluated against a cohort of exon capture test cases, and we report ROC analysis and F-measures.
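
The idea of a copy-number-informed genotype space can be illustrated with a small sketch. This is not the CoNAn-SNV implementation: the abstract does not give the model's parameterization, so everything below is an assumption made for illustration. In particular, the sketch assumes one mixture component per possible number of variant alleles (0 up to the copy number), a uniform prior over those components, an expected variant fraction of k/copy_number clipped by a hypothetical `error_rate`, and, for the Beta-Binomial variant, a hypothetical `precision` parameter controlling overdispersion. The function names are likewise invented here.

```python
from math import lgamma, log, exp


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_binomial(a, n, mu):
    """Log Binomial pmf: a variant reads out of n, success probability mu."""
    return log_choose(n, a) + a * log(mu) + (n - a) * log(1.0 - mu)


def log_beta_fn(x, y):
    """Log of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def log_beta_binomial(a, n, mu, precision):
    """Log Beta-Binomial pmf with mean mu; smaller precision = more dispersion."""
    alpha, beta = mu * precision, (1.0 - mu) * precision
    return (log_choose(n, a)
            + log_beta_fn(a + alpha, n - a + beta)
            - log_beta_fn(alpha, beta))


def genotype_posteriors(a, n, copy_number, error_rate=0.01, precision=None):
    """Posterior over genotypes k = 0..copy_number (k = variant allele count).

    Assumes a uniform prior over genotypes (an illustrative simplification).
    With precision=None each component is Binomial, otherwise Beta-Binomial.
    """
    log_liks = []
    for k in range(copy_number + 1):
        # Expected variant read fraction, kept away from 0 and 1 so that
        # sequencing errors have non-zero probability.
        mu = min(max(k / copy_number, error_rate), 1.0 - error_rate)
        if precision is None:
            log_liks.append(log_binomial(a, n, mu))
        else:
            log_liks.append(log_beta_binomial(a, n, mu, precision))
    # Normalise in log space for numerical stability.
    top = max(log_liks)
    weights = [exp(x - top) for x in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


def prob_variant(a, n, copy_number, **kwargs):
    """Posterior probability that at least one allele carries the variant."""
    return 1.0 - genotype_posteriors(a, n, copy_number, **kwargs)[0]


# 4 variant reads out of 40: a 10% allelic fraction.
diploid_call = prob_variant(4, 40, copy_number=2)
amplified_call = prob_variant(4, 40, copy_number=5)
print(f"diploid model:   P(variant) = {diploid_call:.4f}")
print(f"copy number 5:   P(variant) = {amplified_call:.4f}")
```

Under these assumptions, a locus with 4 variant reads out of 40 is poorly explained by any diploid genotype other than homozygous reference, whereas at copy number 5 the one-variant-allele genotype (expected fraction 0.2) fits well, so the same counts support a variant call only when the copy number state is taken into account.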

Item Metadata

Title	Mutation discovery in regions of segmental cancer genome amplifications from next generation sequencing of tumours
Creator	Anamaria, Crisan
Publisher	University of British Columbia
Date Issued	2010
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-10-22
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0071361
URI	http://hdl.handle.net/2429/29454
Degree (Theses)	Master of Science - MSc
Program (Theses)	Bioinformatics
Affiliation	Science, Faculty of
Degree Grantor	University of British Columbia
Graduation Date	2010-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
