UBC Theses and Dissertations

Learning cellular hierarchies through structured topic modeling

Ye, Patricia Er

Abstract

The human immune system relies on the function and balance of various immune cell subsets and their interactions. Immune cells undergo a series of differentiation steps, following a lineage-tree structure that stems from hematopoietic stem cells, to reach their mature cell state. During differentiation of immune cells in both homeostasis and pathological processes, many cellular features, including gene expression patterns, are shared by fully differentiated immune cell subtypes. The process of immune cell differentiation is complex and not fully understood. Additionally, aberrant immune cell function and balance play a major role in the pathogenesis of many immunological disorders, including systemic lupus erythematosus (SLE). In this thesis, I propose LaRCH, a tree-structured neural topic model, as a method to quantitatively characterize shared hierarchical features between cell subsets. In this model, single-cell gene expression profiles are represented by a mixture of topics consisting of latent features that follow an underlying tree structure, mirroring the dynamics of cellular differentiation. I present findings from the model trained on simulated single-cell RNA sequencing (scRNA-seq) data based on cell-sorted bulk RNA-seq data, and on an scRNA-seq dataset of over 1.2 million cells from individuals with variable lupus disease phenotypes. The cellular topic profiles estimated by the model markedly improve cell-type deconvolution accuracy over traditional methods. Trained model parameters of LaRCH illustrate cell-type-specific transcriptomic differences between SLE phenotypes, revealing the contributions of multiple immune cell types to the manifestations of lupus. I also identify a number of candidate genes that may be implicated in the mechanisms driving lupus pathogenesis.
Ultimately, LaRCH is able to capture the hierarchical context between immune cell subsets by simultaneously identifying shared and distinct latent features amongst cell subtypes within heterogeneous cell samples.
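
As a rough illustration of what a mixture of tree-structured topics can look like, the toy sketch below builds one gene distribution per leaf of a small binary tree by summing per-node gene weights along the root-to-leaf path, then mixes those leaf topics to get a cell's expected expression profile. This is not LaRCH's actual parameterization, which the abstract does not specify: the path-sum construction, the heap-style node indexing, and every name here (`path_to_root`, `leaf_topics`, `expected_profile`, `node_weights`, `theta`) are assumptions made for illustration only.

```python
import math


def path_to_root(node):
    """Heap-style node indices (root = 1), ordered from the root down to `node`."""
    path = []
    while node >= 1:
        path.append(node)
        node //= 2
    return path[::-1]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def leaf_topics(node_weights, depth):
    """One gene distribution per leaf.

    Assumption (hypothetical, for illustration): a leaf's topic is the softmax
    of the per-gene weights summed over every node on its root-to-leaf path,
    so weights placed on an ancestor are shared by all of its descendants.
    """
    n_genes = len(node_weights[1])
    first_leaf = 2 ** (depth - 1)
    topics = {}
    for leaf in range(first_leaf, 2 ** depth):
        summed = [0.0] * n_genes
        for node in path_to_root(leaf):
            for g in range(n_genes):
                summed[g] += node_weights[node][g]
        topics[leaf] = softmax(summed)
    return topics


def expected_profile(theta, topics):
    """Expected gene proportions of a cell with topic proportions `theta`."""
    n_genes = len(next(iter(topics.values())))
    return [sum(theta[k] * topics[k][g] for k in topics) for g in range(n_genes)]


# Toy tree with two levels: root (1) and two leaves (2, 3); three genes.
node_weights = {
    1: [2.0, 0.0, 0.0],  # gene 0 is elevated in the shared ancestor
    2: [0.0, 2.0, 0.0],  # gene 1 is specific to leaf 2
    3: [0.0, 0.0, 2.0],  # gene 2 is specific to leaf 3
}
topics = leaf_topics(node_weights, depth=2)
theta = {2: 0.75, 3: 0.25}  # a cell that is mostly leaf-2-like
profile = expected_profile(theta, topics)
```

In this toy setup, gene 0 receives the same probability in both leaf topics because its weight sits on the shared root, while genes 1 and 2 separate the two leaves. That is the sense in which a tree-structured topic model can represent shared and distinct features at once.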

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International