Open Collections
UBC Theses and Dissertations
Investigations into the contributions of co-expression to computational gene function prediction
Adrian-Hamazaki, Alex
Abstract
It is widely accepted in genomics that co-expression of RNA transcripts suggests a commonality of function. This observation is explicitly leveraged in machine learning methods that predict gene function, where it is often combined with other features such as protein interactions and sequence similarity. For example, including co-expression data from human tissue expression boosts performance for predicting Gene Ontology (GO) annotations. However, the biological underpinnings of this observation have not been well investigated. Building on earlier results from our group, I show that gene function is predictable from co-expression substantially because co-expression reflects differences in expression between cell types, and these differences are also intrinsic to the ground truth labels. Using simulations and analyses of real data, I show that variance in the cellular composition of bulk samples contributes positively to function learnability, and I attribute this to the cell type marker gene content of the GO terms. I further show that cell type profiles, in which the relationship between gene expression and cell type is made transparent, are effective for predicting gene function while increasing interpretability. Finally, I show that a greater breadth of cell type expression information can improve predictive performance, and I attribute the performance of specific cell-type-related GO terms to the expression of specific cell types. My results have implications for how gene function prediction methods are developed, evaluated, and interpreted.
Item Metadata
Title | Investigations into the contributions of co-expression to computational gene function prediction
Creator | Adrian-Hamazaki, Alex
Supervisor | |
Publisher | University of British Columbia
Date Issued | 2024
Genre | |
Type | |
Language | eng
Date Available | 2024-10-28
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0447166
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia
Graduation Date | 2025-05
Campus | |
Scholarly Level | Graduate
Rights URI | |
Aggregated Source Repository | DSpace