BIRS Workshop Lecture Videos



Deep learning approaches to denoise, impute, integrate and decode functional genomic data

Kundaje, Anshul


We present interpretable deep learning approaches to address three key challenges in integrative analysis of functional genomic data. (1) Data denoising: The quality of functional genomic data is affected by a myriad of experimental parameters, which makes it challenging to draw accurate inferences from chromatin profiling experiments that vary in those parameters. We introduce a convolutional denoising algorithm that learns a mapping from suboptimal to high-quality datasets. It overcomes various sources of noise and variability, substantially enhancing and recovering signal when applied to low-quality chromatin profiling datasets across individuals, cell types, and species. Our method has the potential to improve data quality at reduced cost. (2) Data imputation: It is largely infeasible to perform hundreds of genome-wide assays targeting diverse transcription factors (TFs) and epigenomic marks in hundreds of cellular contexts, due to cost and material constraints. We have developed multi-task, multi-modal deep neural networks that predict chromatin marks and in vivo binding events of hundreds of TFs by integrating regulatory DNA sequence with just two assays, namely ATAC-seq (or DNase-seq) and RNA-seq, performed in a target cell type of interest. We train our models on large reference compendia from ENCODE/Roadmap Epigenomics and obtain high prediction accuracy in new cellular contexts, thereby significantly expanding the context-specific annotation of the non-coding genome. (3) Decoding the context-specific regulatory architecture of the genome: Finally, we develop novel, efficient interpretation engines for extracting predictive and biologically meaningful patterns from integrative deep learning models of TF binding and chromatin accessibility. We obtain new insights into TF binding sequence affinity models (e.g., the significance of flanking sequences and fusion motifs), infer high-resolution point binding events of TFs, dissect higher-order cis-regulatory sequence grammars (including density and spatial constraints), learn chromatin architectural features correlated with chromatin marks, unravel the dynamic regulatory drivers of cellular differentiation, and score the regulatory influence of non-coding genetic variants. We provide early access to all associated code and frameworks at



Attribution-NonCommercial-NoDerivatives 4.0 International