UBC Theses and Dissertations
Cellular composition variation drives coexpression-based gene function prediction
Wu, Qinkai
Coexpression analysis has been widely used for gene function prediction, based on the principle of guilt by association. Most studies use transcriptomic data obtained from bulk tissues, where the measured expression level of each gene reflects the contributions of multiple cell types. Previous work has already documented how variability in cellular composition affects coexpression analysis. However, the connection between the predictability of gene functions, coexpression networks and cell type profiles has not been studied. I hypothesized that one reason bulk-data-derived coexpression networks contain signals relevant to function prediction is that they contain information about genes’ expression profiles across cell types. Focusing on human brain datasets, I applied several approaches to test this hypothesis, including creating simulated bulk datasets from single-nucleus data and deconvolving bulk data. I find that much of the predictive power can be attributed to variation in cell type proportions. Consequently, a more explicit and interpretable function prediction can be made directly from expression patterns across cell types, which not only yields similar results but also clearly reveals the associations between functional terms and specific cell types. These findings have important implications for coexpression analysis and function prediction.
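The idea that composition variation alone can produce coexpression can be illustrated with a toy simulation. The following is a minimal sketch, not the procedure used in the thesis: the function name `simulate_bulk`, the Dirichlet model for cell type proportions, and the three-cell-type, six-gene profile matrix are all assumptions made for illustration. Each simulated bulk sample is a proportion-weighted mixture of fixed cell type profiles, with no noise and no within-cell-type regulation.

```python
import numpy as np


def simulate_bulk(profiles, n_samples, rng, concentration=1.0):
    """Mix fixed cell type profiles into simulated bulk samples.

    profiles: array of shape (n_cell_types, n_genes), mean expression of
        each gene in each cell type (hypothetical values here).
    Returns (bulk, proportions) with shapes (n_samples, n_genes) and
    (n_samples, n_cell_types).
    """
    n_types = profiles.shape[0]
    # Assumption: proportions vary across samples as a symmetric Dirichlet.
    proportions = rng.dirichlet(np.full(n_types, concentration), size=n_samples)
    # Bulk expression is the proportion-weighted sum of cell type profiles.
    bulk = proportions @ profiles
    return bulk, proportions


rng = np.random.default_rng(0)

# Hypothetical profiles: genes 0-1 mark cell type 0, genes 2-3 mark
# cell type 1, genes 4-5 mark cell type 2.
profiles = np.array([
    [10.0, 8.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 9.0, 7.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 12.0, 6.0],
])

bulk, proportions = simulate_bulk(profiles, n_samples=200, rng=rng)

# Gene-by-gene Pearson correlation across samples (the coexpression matrix).
coexpression = np.corrcoef(bulk, rowvar=False)

print(np.round(coexpression, 2))
```

In this toy setting, markers of the same cell type are almost perfectly correlated and markers of different cell types are negatively correlated, even though the only thing that varies between samples is the mixing proportions.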
Attribution-NonCommercial-NoDerivatives 4.0 International