UBC Theses and Dissertations

Theory and algorithms for spatial transcriptomics denoising

Kubal, Sharvaj

Abstract

Spatial transcriptomics goes one step beyond single-cell RNA sequencing, yielding high-dimensional images of gene expression in a tissue, which offer the prospect of understanding cell signaling. Recent breakthroughs in spatial transcriptomics have drastically increased the area of tissue that can be profiled. However, sequencing all RNA molecules over large spatial areas is prohibitively expensive. One would like to reduce the number of RNAs sequenced, but doing so introduces excessive technical noise into the measured gene expression data. To counter this problem, we develop theory and algorithms for spatial transcriptomics denoising, based on low rank matrix recovery and spatial smoothing. We propose two novel procedures for estimating the true underlying gene expression image: (1) a low rank maximum-likelihood-type estimator with graph-based total variation regularization, and (2) a switching procedure that switches between 'risky' estimators that work well in practice and 'safe' estimators that have well-known properties. Our methods are backed by theoretical recovery guarantees, as well as by tests on real data suggesting that it is possible to reduce the number of RNAs sequenced by more than 10-fold without significantly increasing recovery error. Finally, as a generalization of the analysis employed above, we establish some convergence rates for the estimation of structured discrete probability distributions.
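
To make the low rank recovery idea concrete, the following is a minimal sketch on simulated data. It is not the thesis's estimator: a plain truncated SVD stands in for the low rank maximum-likelihood-type estimator, and the graph-based total variation regularization and the switching procedure are not included. The simulation setup (grid size, Gaussian-bump spot factors, gamma-distributed gene factors, Poisson sampling at a mean depth of 2 counts per entry) and the function names `simulate`, `truncated_svd_denoise`, and `relative_error` are assumptions made purely for illustration.

```python
import numpy as np


def simulate(n_side=20, n_genes=100, rank=3, depth=2.0, seed=0):
    """Simulate a low rank expression-rate matrix on a grid and Poisson counts.

    All modeling choices here are illustrative assumptions, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    # Spot coordinates on a regular n_side x n_side grid in the unit square.
    xs, ys = np.meshgrid(np.linspace(0, 1, n_side), np.linspace(0, 1, n_side))
    coords = np.column_stack([xs.ravel(), ys.ravel()])
    # Spatially smooth spot factors: one Gaussian bump per latent component.
    centers = rng.uniform(0, 1, size=(rank, 2))
    sq_dist = ((coords[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    spot_factors = np.exp(-sq_dist / 0.1)            # shape (n_spots, rank)
    gene_factors = rng.gamma(1.0, 1.0, size=(rank, n_genes))
    rates = spot_factors @ gene_factors              # rank <= `rank`, nonnegative
    rates *= depth / rates.mean()                    # mean expected count = depth
    counts = rng.poisson(rates)                      # shallow, noisy measurements
    return rates, counts


def truncated_svd_denoise(counts, rank):
    """Keep the top `rank` singular components of the counts, clipped at zero."""
    u, s, vt = np.linalg.svd(np.asarray(counts, dtype=float), full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return np.clip(low_rank, 0.0, None)


def relative_error(estimate, truth):
    """Frobenius-norm error of `estimate`, relative to the norm of `truth`."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)


if __name__ == "__main__":
    rates, counts = simulate()
    denoised = truncated_svd_denoise(counts, rank=3)
    print("raw counts relative error:", relative_error(counts, rates))
    print("denoised relative error:  ", relative_error(denoised, rates))
```

In this toy setting the rank is known in advance; with real data it would have to be chosen or replaced by a penalty, which is one reason a regularized likelihood-based estimator is attractive.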

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International