Machine learning approaches for single-cell multiomics data integration and generation

UBC Theses and Dissertations

Machine learning approaches for single-cell multiomics data integration and generation Niu, Yi Nian

Abstract

Single-cell multiomics technologies generate paired or multiple measurements of different cellular properties (modalities), such as gene expression and chromatin accessibility. However, multiomics technologies are more expensive than their single-modality counterparts, resulting in smaller and fewer available multiomics datasets. Here, we present scPairing, a variational autoencoder model inspired by Contrastive Language-Image Pre-Training (CLIP), which embeds different modalities from the same single cells into a common embedding space. We leverage the common embedding space to generate novel multiomics data following bridge integration. Through extensive benchmarking, we show that scPairing constructs an embedding space that fully captures both coarse and fine biological structures. Then, we use scPairing to generate new multiomics data from retina and immune cells. Furthermore, we extend scPairing to co-embed three modalities and generate a new trimodal dataset of bone marrow immune cells. Researchers can use these generated multiomics datasets to discover new biological relationships across modalities or confirm existing hypotheses without the need for costly multiomics technologies.
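To make the idea of a CLIP-style common embedding space concrete, the sketch below computes a generic symmetric contrastive loss between two sets of per-cell embeddings, one per modality. It is an illustration of the general CLIP objective only, not the loss used in scPairing: the function names, the `temperature` value, and the toy "RNA" and "ATAC" vectors are all assumptions made for this example, and the thesis model is a variational autoencoder with components not shown here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(emb_a, emb_b, temperature=0.1):
    """Symmetric contrastive loss: row i of emb_a and row i of emb_b are
    assumed to come from the same cell (hypothetical helper, not scPairing's API)."""
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the a-to-b and b-to-a classification losses.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy embeddings for three cells (made-up numbers, for illustration only).
rna_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
atac_matched = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
atac_shuffled = [atac_matched[1], atac_matched[2], atac_matched[0]]

loss_matched = clip_style_loss(rna_emb, atac_matched)
loss_shuffled = clip_style_loss(rna_emb, atac_shuffled)
print(loss_matched < loss_shuffled)
```

In this toy setting the loss is lower when each cell's two embeddings point in similar directions than when the pairing is shuffled, which is the property a contrastive objective rewards when aligning modalities in a shared space.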

Item Metadata

Title	Machine learning approaches for single-cell multiomics data integration and generation
Creator	Niu, Yi Nian
Supervisor	Ding, Jiarui
Publisher	University of British Columbia
Date Issued	2024
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2025-07-31
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0444847
URI	http://hdl.handle.net/2429/88818
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2024-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
