UBC Theses and Dissertations

Meta-analysis strategies for inference of transcriptional regulatory targets

Morin, Alexander

Abstract

Despite considerable advances in genomic methodologies and computational strategies, associating transcription regulators (TRs) with their target genes remains highly challenging. Identifying these regulatory interactions is important for understanding the specific activity of each regulator, for providing concrete examples to inform the molecular mechanisms of gene regulation, and for training models that predict regulation in the absence of direct experimentation. My research evaluates the reproducibility of TR-gene patterns among the growing wealth of publicly available genomic experimentation. More specifically, I adopt a meta-analytic framework and analyze thousands of genomic datasets in mouse and human. My explicit hypothesis is that data aggregation can prioritize direct TR-target interactions that may be less apparent in any one dataset. First, I focus on TR perturbation and ChIP-seq experiments for eight neurologically relevant regulators. Here, I nominate the candidate gene targets that are supported across datasets and methodologies, and more generally highlight the considerations of aggregating these data types across studies. This first project not only presents a novel examination of direct experimentation for eight important mammalian TRs, but also provides a framework extensible to other TRs. Next, I turn to single-cell data. Using hundreds of single-cell RNA-seq datasets, I identify, for each human and mouse TR, the genes whose expression patterns are most coordinated with that regulator (coexpression). I demonstrate that in aggregate, these correlative patterns predict independently described TR-target pairs on par with ChIP-seq, a more direct line of evidence. I then combine the coexpression and ChIP-seq rankings for all available TRs, highlighting thousands of regulatory interactions that are supported across both data types and species.
Collectively, this study represents the first analysis of the reproducibility of TR coexpression patterns across a broad corpus of single-cell data. Beyond the identified interactions that I provide as community resources, my research contributes to our understanding of the methodologies used to infer regulation. Importantly, the information I have assembled should provide a useful reference as well as a source of hypothesis generation for experimental and computational biologists alike.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International