BIRS Workshop Lecture Videos

Statistical and experimental methods for causal inference at complex trait associated loci

Brown, Christopher

Description

Genome-wide association studies (GWAS) have identified thousands of loci that contribute to risk for complex diseases. The majority of the heritability of complex disease risk lies within the noncoding regions of the genome. This has led to the hypothesis that the causal variants at GWAS-associated loci lead to changes in local gene expression. Because of linkage disequilibrium, and because cis-regulatory elements (CREs) may target genes over large distances, it is often unclear which variant or gene affects disease risk. Identifying these variants and genes, however, will improve understanding of disease etiology and reveal targets for novel therapeutic development. Recent work from efforts such as GTEx has identified genetic variation associated with gene expression variation for essentially every gene. Despite this wealth of data, the characterization of causal mechanisms at complex trait associated loci remains a significant challenge. To address this challenge, we have developed and applied high-throughput computational and experimental approaches to identify candidate disease genes and the functional regulatory variants that mediate disease risk. We have focused on cardiovascular disease (CVD) and molecular trait mapping in the liver as model systems. Existing studies have focused on easily ascertained cell types, while the liver, which plays a critical role in regulating cholesterol and lipid metabolism, and where many CVD-associated variants likely affect gene expression, has remained understudied. We have deeply phenotyped liver biopsies and iPSC-derived hepatocytes from more than 400 donors, collecting RNA-seq along with histone modification and transcription factor ChIP-seq data. We have used these data to identify thousands of genetic variants associated with allele-specific transcription factor binding, histone modification, gene expression, and splicing.
Comparison to data from the GTEx and Roadmap Epigenomics projects demonstrates that many of these associations are specific to the liver. We demonstrate that multi-phenotype molecular trait mapping improves statistical power to detect associations and results in improved resolution at identified loci. We have integrated these data with CVD GWAS data using a novel multi-phenotype causal inference framework based on Mendelian randomization to predict the precise variants, CREs, and genes that underlie CVD risk. Using a combination of massively parallel reporter assays, genome-edited stem cells, CRISPR interference, and in vivo mouse models, we establish rs2277862-CPNE1, rs10889356-ANGPTL3, rs10889356-DOCK7, and rs10872142-FRK as causal SNP-gene sets for CVD. These results demonstrate that a molecular trait mapping framework can rapidly identify causal genes and variants contributing to complex human traits and that, at many GWAS loci, candidate genes have been falsely implicated based on proximity to the lead SNP.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International