UBC Theses and Dissertations

Bayesian analysis of continuous time Markov chains with applications to phylogenetics

Zhao, Tingting

Abstract

Bayesian analysis of continuous time, discrete state space time series is an important and challenging problem, where incomplete observation and large parameter sets call for user-defined priors based on known properties of the process. Generalized Linear Models (GLMs) have largely unexplored potential for constructing such prior distributions. We show that an important challenge with Bayesian generalized linear modelling of Continuous Time Markov Chains (CTMCs) is that classical Markov Chain Monte Carlo (MCMC) techniques are too inefficient to be practical in that setting. We propose two computational methods to address this issue. The first algorithm uses an auxiliary variable construction combined with an Adaptive Hamiltonian Monte Carlo (AHMC) algorithm. The second algorithm combines Hamiltonian Monte Carlo (HMC) and the Local Bouncy Particle Sampler (LBPS) to take advantage of the sparsity in certain high-dimensional rate matrices. We propose a characterization of a class of sparse factor graphs on which LBPS can be efficient. We also provide a framework to assess the computational complexity of the algorithm. An important aspect of the practical implementation of the Bouncy Particle Sampler (BPS) is the simulation of event times. Default implementations use conservative thinning bounds, which can slow down the algorithm and limit its computational performance. In the second algorithm, we develop exact analytical solutions for the random event times in the context of CTMCs. Together, the two sampling algorithms and our model make it efficient, in terms of both computation and the analyst's time, to construct stochastic processes informed by prior knowledge, such as known properties of the states of the process. We demonstrate the flexibility and scalability of our framework using both synthetic and real phylogenetic protein data.
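The remark about conservative thinning bounds can be illustrated with a small, generic sketch. It is not the thesis's CTMC derivation: the toy rate lambda(t) = A / (1 + t), the constant A, and all function names are assumptions chosen only because this rate has a simple closed-form inverse. The sketch compares sampling the first event time by thinning under a tight and a loose bound against exact inversion of the integrated rate.

```python
import math
import random

A = 3.0  # hypothetical rate scale, chosen only for illustration


def rate(t):
    """Toy event rate lambda(t) = A / (1 + t); decreasing, so lambda(t) <= A."""
    return A / (1.0 + t)


def first_event_thinning(rng, bound):
    """First event time by thinning; returns (time, number of proposals)."""
    t, proposals = 0.0, 0
    while True:
        # Candidate point from a homogeneous Poisson process with rate `bound`.
        t += rng.expovariate(bound)
        proposals += 1
        # Accept the candidate with probability lambda(t) / bound.
        if rng.random() < rate(t) / bound:
            return t, proposals


def first_event_exact(rng):
    """First event time by solving A * log(1 + t) = E with E ~ Exp(1)."""
    e = rng.expovariate(1.0)
    return math.exp(e / A) - 1.0


def summarize(n=20000, seed=0):
    """Average event times and proposal counts for each sampling scheme."""
    rng = random.Random(seed)
    out = {"exact_mean": sum(first_event_exact(rng) for _ in range(n)) / n}
    for name, bound in (("tight", A), ("loose", 10 * A)):
        draws = [first_event_thinning(rng, bound) for _ in range(n)]
        out[name + "_mean"] = sum(t for t, _ in draws) / n
        out[name + "_proposals"] = sum(k for _, k in draws) / n
    return out


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.3f}")
```

All three schemes target the same event-time distribution, but the loose bound needs roughly ten times as many proposals per event as the tight one, while inversion needs a single exponential draw. This is the kind of overhead that an analytical solution avoids.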

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
