UBC Theses and Dissertations

A computational framework for large-scale cell lineage reconstruction

Kiyota, Brett

Abstract

The advent of CRISPR-Cas9 genome editing technologies has spurred the development of large-scale cell lineage tracing systems. These techniques integrate synthetic DNA barcodes into the chromosomes of a single cell; the barcodes are subsequently manipulated via genome editing to introduce heritable mutations. As the cell proliferates to form a complex multicellular system, mutations accumulate as barcodes are transmitted from one cell generation to the next. At a later time of observation, single-cell RNA sequencing (scRNA-seq) can be used to capture all of the mutated barcodes in each cell, enabling the reconstruction of a tree that aims to recapitulate the cell division history. While analogous to evolutionary tree reconstruction, cell phylogeny estimation faces unique challenges of scale, as many organisms are composed of millions to trillions of nucleated cells. This motivated the Yachie laboratory to develop a divide-and-conquer method that aims to augment existing tools to enable large-scale reconstruction with distributed computing. A proof-of-concept of this framework, published in 2022, culminated in the reconstruction of a tree containing over 235 million sequences. Given the promise shown by these initial demonstrations, my project focused on further developing this framework to overcome the noise, sparsity, and biases typically associated with the molecular readouts from scRNA-seq technologies. Specifically, noise and sparsity were addressed through the development of a bootstrapping strategy based on orthogonal tree agreement. A rejection sampling strategy based on sequence distance was also developed to overcome possible biases that may be encountered in real biological datasets. Together, these methods provided a relatively unbiased estimate of confidence for proposed tree branches, which could be leveraged to significantly improve reconstruction accuracy.
The overall findings of this project suggested widespread improvement in the rigor and robustness of the reconstruction framework, marking meaningful steps towards accurate cell lineage reconstruction in large-scale sequencing datasets. The Yachie laboratory is currently applying the present approaches to a high-content cell lineage tracing system to resolve the developmental cell division history of an adult mouse, a resource that would deepen our understanding of developmental biology.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International