Open Collections
UBC Theses and Dissertations
Large-scale mining of differential expression data for insight into gene function
Sicherman, Jordan
Abstract
A persistent challenge in genetics and genomics is the interpretation of “hit lists” of genes, which has led to the development and near-universal application of methods such as Gene Ontology (GO) enrichment analysis. While these methods have been of unquestionable utility, GO enrichment and similar approaches based on gene annotations leave much to be desired, and they are often used as a “sanity check” rather than as a way to make discoveries. To offer a complementary perspective with the potential to remedy some existing challenges, I developed and evaluated an algorithm that helps put hit lists of genes into biological context by performing large-scale mining of patterns of differential expression (DE). In this work, I present the development and evaluation of my algorithm, which mines over 10,000 transcriptomic datasets in a process I term “condition enrichment”. The output of the algorithm is a list of biological condition comparisons (drug treatments, diseases, etc.) scored according to their relatedness (in terms of DE) to the query genes. I show that performing searches on gene sets of a priori interest enables my algorithm to effectively identify known gene-condition relationships in real and simulated data, providing a useful summary of the condition comparisons most highly associated with the differential expression of the gene set. Finally, I present a powerful open-source web application to provide researchers access to Gemma DE, in the hope that it will aid future research.
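To make the shape of the output described in the abstract concrete, here is a minimal toy sketch of ranking condition comparisons against a query gene list. It is not the scoring method developed in the thesis, which this record does not specify. The data, the function name `rank_conditions`, the `alpha` threshold, and the fraction-of-DE-genes score are all hypothetical choices made for illustration.

```python
# Hypothetical toy data, invented for illustration only:
# condition comparison -> {gene: (log2 fold change, p-value)}
TOY_DE_RESULTS = {
    "drug A vs. vehicle": {
        "GENE1": (2.1, 0.001),
        "GENE2": (-1.8, 0.004),
        "GENE3": (0.1, 0.80),
    },
    "disease B vs. control": {
        "GENE1": (0.2, 0.60),
        "GENE2": (0.1, 0.90),
        "GENE3": (1.5, 0.01),
    },
}


def rank_conditions(de_results, query_genes, alpha=0.05):
    """Rank comparisons by the fraction of measured query genes with p < alpha.

    This simple fraction is an assumed stand-in score, not the thesis's metric.
    """
    scores = []
    for comparison, gene_stats in de_results.items():
        # Only consider query genes that were actually measured in this comparison.
        measured = [g for g in query_genes if g in gene_stats]
        if not measured:
            continue
        n_de = sum(1 for g in measured if gene_stats[g][1] < alpha)
        scores.append((comparison, n_de / len(measured)))
    # Highest-scoring comparisons first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


ranking = rank_conditions(TOY_DE_RESULTS, ["GENE1", "GENE2"])
for comparison, score in ranking:
    print(f"{comparison}: {score:.2f}")
```

In this toy data both query genes are differentially expressed under the drug treatment and neither under the disease comparison, so the drug comparison ranks first. A real implementation would need to account for effect sizes, multiple testing, and differences between datasets, none of which this sketch attempts.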
Item Metadata
Title | Large-scale mining of differential expression data for insight into gene function
Creator | Sicherman, Jordan
Publisher | University of British Columbia
Date Issued | 2021
Language | eng
Date Available | 2021-08-11
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-ShareAlike 4.0 International
DOI | 10.14288/1.0401373
Degree Grantor | University of British Columbia
Graduation Date | 2021-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace