UBC Theses and Dissertations


Representation learning strategies for the epigenome and chromatin structure using recurrent neural models

Dsouza, Kevin Bradley


In this Ph.D. thesis, we propose frameworks for designing informative position-specific representations from epigenomic and structural genomic signals. We use recurrent priors in our analysis because each genomic position is strongly correlated with nearby positions, and we implement these priors using recurrent neural models. We demonstrate that the representations we learn are helpful for a range of tasks, including locating known genomic elements, identifying conserved sites, correlating with established genomic measures, enabling accurate decoding, finding elements that drive 3D conformation, attributing relative positional importance, and performing in silico modifications. In designing these representations, we study two classes of representation learning strategies that differ in their underlying philosophy: autoencoding and categorical encoding. We show that the usefulness of a representation depends on the strategy used to design it.



Attribution-ShareAlike 4.0 International