UBC Theses and Dissertations
Prioritizing genes with functionally distinct splice isoforms
Bhuiyan, Shamsuddin A.
Most mammalian genes generate multiple transcripts via splicing, and we do not know the function of most splice variants. Currently, there is a debate about how many splice variants are likely nonfunctional or “noisy” transcripts. My thesis explores the claim that alternative splicing vastly increases the genome’s functional diversity in the context of noisy splicing, and in doing so attempts to identify candidate cases for which alternative splicing is likely to be of consequence. To ground computational analyses of genes with multiple splice variants in experimental data, the field needs a corpus of genes that have experimental evidence of functionally distinct splice isoforms (FDSIs). We curated the literature for 743 genes and found that ~5% had literature evidence of FDSIs. This suggests that the claim that alternative splicing vastly increases genomic functional diversity is extrapolated from a few key genes. Next, I developed a pipeline to identify candidate genes with FDSIs using long-read RNA-seq data. The output of my pipeline is a computationally prioritized list of candidate genes likely to have FDSIs based on features such as expression, conservation, functional domains, and coding potential. From an initial set of 6,799 genes with multiple splice variants, I prioritized 79 candidate genes. While I had limited long-read data, my work aids in establishing guidelines for high-throughput prioritization of genes with FDSIs for future study. With our collaborators, I investigated a specific application of my pipeline to the voltage-gated calcium channel gene Cacna1e. Using novel long-read data, I established a set of 2,110 splice variants for Cacna1e. Based on properties of the channel, I determined that at most 154 splice variants are likely to encode a functional channel. My results highlighted the amount of potential noise produced by one gene’s expression. Through my investigation, I added to the growing body of literature in support of noisy splicing.
I also provided the field with a list of interesting genes with multiple splice variants. This includes a gold standard set of genes from the experimental literature, and a novel set of prioritized genes. Both sets of genes will be useful for future studies of gene function.
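To put the counts quoted above in proportion, the following minimal sketch computes the share of screened genes that were prioritized and the upper bound on the share of Cacna1e splice variants likely to encode a functional channel. The variable names and the `percent` helper are illustrative assumptions, not part of the thesis pipeline; only the four counts come from the abstract.

```python
# Counts taken from the abstract; all names below are illustrative only.
genes_screened = 6799          # genes with multiple splice variants
genes_prioritized = 79         # candidate genes prioritized as likely FDSIs
cacna1e_variants = 2110        # Cacna1e splice variants from long-read data
cacna1e_functional_max = 154   # upper bound on variants likely to encode a functional channel


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


prioritized_pct = percent(genes_prioritized, genes_screened)
functional_pct = percent(cacna1e_functional_max, cacna1e_variants)

print(f"Prioritized genes: {prioritized_pct}% of screened genes")
print(f"Cacna1e variants likely functional: at most {functional_pct}%")
```

Under these figures, roughly 1.2% of the screened genes were prioritized, and at most about 7.3% of the Cacna1e splice variants are likely to encode a functional channel.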
Attribution-NonCommercial-NoDerivatives 4.0 International