UBC Theses and Dissertations

Computational methods for the analysis of large-scale single-cell data

Subedi, Sishir

Abstract

Innovation in single-cell technology has deepened our understanding of basic biology and disease. The ability to profile genomic information at single-cell resolution allows researchers to uncover cellular heterogeneity and the regulatory dynamics governing cellular processes. A growing number of studies using these technologies have generated vast amounts of data across multiple modalities; however, robust and scalable computational resources for extracting insightful biological information from large-scale omics data are lacking. In this dissertation, I present three novel computational methods that address key challenges in analyzing large-scale single-cell data, focusing on the following areas. (1) Representation learning of cancer patient data: The complex heterogeneity of cancer cells within and across patients has hindered the identification of shared biological processes and the characterization of patient-specific disease mechanisms. I describe PICASA, an interpretable latent space decomposition framework based on a cross-attention mechanism and contrastive learning. The model characterizes active gene modalities common among cancer patients while isolating patient-specific genomic effects. (2) Scalable cell annotation: Many existing annotation methods in the single-cell field are adopted from bulk sequencing analysis and do not scale to single-cell datasets containing millions of cells. I describe ASAP, a highly scalable method based on random projection and factorization, specialized to annotate million-scale single-cell data with limited computational resources. (3) Cell-cell communication at single-cell resolution: The current approach of studying cell interactions at the cluster level loses information about heterogeneous cellular interactions. I present SPRUCE, an interpretable deep-learning model that captures the major pairwise cell interaction patterns regulating the tumour microenvironment.
Through the development of cutting-edge computational methods, I demonstrate the complexities of analyzing large-scale single-cell data and inferring meaningful biological information. I highlight how these models inform our understanding of basic biology and disease mechanisms, such as the dynamic interplay between immune and cancer cells in the tumour microenvironment, unbiased annotation of cell populations associated with unique phenotypes, and identification of condition-specific regulatory mechanisms in cancer. In summary, I present computational methodologies, share insightful findings, and formulate biological hypotheses to advance our understanding of single-cell biology.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International