dd-PyClone : improving clonal subpopulation inference from single cells and bulk sequencing data

UBC Theses and Dissertations

Salehi, Sohrab

Abstract

Improving our understanding of intra-tumour heterogeneity in cancer has important clinical implications, including an opportunity to understand the mechanisms behind relapses and drug resistance. Next generation bulk sequencing is a mature technology that has been used to study subclonal tumour populations at an aggregate level. Inference of populations from bulk sequencing requires sophisticated computational deconvolution methods. An alternative is to identify populations directly with single cell sequencing. However, single cell sequencing is a very error-prone process, and this impedes its ability to completely replace bulk sequencing for now. In this work we present dd-PyClone, a statistical model that combines single cell and bulk sequencing data to study clonal subpopulation architecture and to improve clustering assignment and cellular prevalence estimates of a set of genomic loci. We introduce a genotype simulation scheme based on a phylogenetic tree, termed the Generalized Dollo model, that is aware of both single nucleotide variants and copy number aberrations. This model is an improvement over previous genotype generator models in that it also accounts for the evolutionary process before a rare event (here the single nucleotide variant) occurs. We show that incorporating genomic loci co-occurrence patterns from single cell sequencing studies is beneficial when inferring clonal subpopulation structure from bulk sequencing data. Our method outperforms existing methods in simulation studies and performs comparably in real dataset benchmarking. We also show that our method is fairly robust to the choice of hyperparameters and performs reasonably in the presence of noise. We hope that our method will further the understanding of the evolutionary basis of cancer.

Item Metadata

Title	dd-PyClone : improving clonal subpopulation inference from single cells and bulk sequencing data
Creator	Salehi, Sohrab
Publisher	University of British Columbia
Date Issued	2015
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2016-01-04
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI	10.14288/1.0223066
URI	http://hdl.handle.net/2429/56179
Degree (Theses)	Master of Science - MSc
Program (Theses)	Bioinformatics
Affiliation	Science, Faculty of
Degree Grantor	University of British Columbia
Graduation Date	2016-02
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/2.5/ca/
Aggregated Source Repository	DSpace
