UBC Theses and Dissertations

Molecular interpretation of genome-wide association studies using multiomics analysis

Farhadi Hassan Kiadeh, Farnush

Abstract

Genome-wide association studies have found thousands of single-nucleotide polymorphisms associated with various human traits. Recently, a powerful statistical approach called MetaXcan has been proposed for interpreting genome-wide associations at the gene level. We extended MetaXcan to a multiomics application, using a brain cortex reference dataset that includes gene expression, DNA methylation, and histone acetylation data from approximately 400 individuals. Our approach, Multi-MetaXcan, consists of three steps. In the first step, we use regularized regression to build models that predict gene expression and variation in epigenomic modifications from single-nucleotide polymorphisms. We call these models genotype-based imputation models. In the second step, we apply these models to map genome-wide associations to gene-level and epigenomic-level associations. In the third step, we summarize all molecular-level associations at the gene level by building epigenome-based imputation models that predict gene expression levels from nearby epigenomic marks such as CpG sites and transcriptionally active regions. In summary, Multi-MetaXcan identifies trait-associated genes whose expression levels are affected by single-nucleotide polymorphisms through their influence on intermediate molecular traits such as DNA methylation and histone acetylation. We applied Multi-MetaXcan to a genome-wide association study of major depressive disorder. As a result, we discovered 12 genes, 25 transcriptionally active regions, and 163 CpG sites associated with major depressive disorder, corresponding to 74 genes in total. Of these genes, 26 fall within or close to previously identified major depressive disorder-associated genomic regions. Importantly, including epigenomic data yielded an additional 62 genes that were not identified by the gene expression imputation model alone.
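The first two steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: ridge regression stands in for the unspecified regularized regression, the gene-level statistic follows the published MetaXcan summary-statistic formula, and all data, function names (`fit_ridge`, `gene_level_z`), and parameter values are hypothetical. The third step (epigenome-based imputation) is not sketched because the abstract does not give its details.

```python
import numpy as np


def fit_ridge(genotypes, trait, alpha=1.0):
    """Step 1 (sketch): closed-form ridge regression of a molecular trait
    (e.g. expression of one gene) on SNP dosages; returns one weight per SNP.
    Ridge is an assumption; the abstract only says "regularized regression"."""
    X = genotypes - genotypes.mean(axis=0)   # center each SNP
    y = trait - trait.mean()                 # center the trait
    n_snps = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_snps), X.T @ y)


def gene_level_z(weights, gwas_z, snp_cov):
    """Step 2 (sketch): MetaXcan-style association z-score,
    z_g = sum_l w_l * (sigma_l / sigma_g) * z_l,
    where sigma_l is the SD of SNP l in the reference panel and
    sigma_g = sqrt(w' Gamma w) is the SD of the predicted trait."""
    sigma_l = np.sqrt(np.diag(snp_cov))
    sigma_g = np.sqrt(weights @ snp_cov @ weights)
    return float(np.sum(weights * sigma_l * gwas_z) / sigma_g)


# Simulated reference panel (hypothetical): 400 individuals, 5 SNPs.
rng = np.random.default_rng(0)
n_individuals, n_snps = 400, 5
genotypes = rng.binomial(2, 0.3, size=(n_individuals, n_snps)).astype(float)
true_effects = np.array([0.5, 0.0, -0.3, 0.0, 0.2])
expression = genotypes @ true_effects + rng.normal(0.0, 0.5, n_individuals)

# Step 1: genotype-based imputation model for one gene.
weights = fit_ridge(genotypes, expression, alpha=1.0)

# Step 2: map SNP-level GWAS z-scores (made-up values) to the gene level.
snp_cov = np.cov(genotypes, rowvar=False)
gwas_z = np.array([3.0, 0.1, -2.5, 0.2, 1.0])
z_gene = gene_level_z(weights, gwas_z, snp_cov)
print("imputation weights:", np.round(weights, 3))
print("gene-level z-score:", round(z_gene, 3))
```

The same two functions would apply unchanged to a CpG site or a transcriptionally active region by replacing `expression` with the corresponding methylation or acetylation measurement. Note that the gene-level statistic does not depend on the overall scale of the weights, since the numerator and `sigma_g` scale together.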

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International