UBC Theses and Dissertations

Modelling complex biologging data with hidden Markov models
Sidrow, Evan

Abstract

Hidden Markov models (HMMs) are commonly used to identify latent processes from observed time series, but it is challenging to fit them to the large and complex time series collected by modern sensors. Using data from threatened resident killer whales (Orcinus orca) as a case study, this dissertation provides solutions to three common challenges in identifying latent behaviour from biologging data. First, time series are now collected at high frequencies and exhibit fine-scale dependence structures that violate common assumptions of HMMs. I propose a hierarchical approach that divides time series into sequences of curves. A coarse-scale HMM models the sequence of curves while a related fine-scale HMM describes the structure within each curve. The fine-scale model includes autoregression to capture dependence, in addition to moving-window summaries and Fourier analysis to capture intricate fine-scale structures. I use simulation and case studies to show that this framework produces more interpretable state estimates and more accurate parameter estimates than existing methods. Second, many modern biologging time series include labels of the latent process of interest. Labels can improve predictive accuracy, but sparse labels can have a negligible influence on parameter estimates. I introduce a weighted likelihood approach that increases the relative influence of labeled observations. I use this approach to develop two HMMs that model the foraging behaviour of killer whales at different scales. Using cross-validated evaluation metrics, I show that my weighted method produces more accurate and more interpretable models than unweighted methods. Finally, applying HMMs to very large time series is computationally demanding, as expectation-maximization (EM) algorithms typically iterate through the entire data set for every parameter update. I propose a novel optimization algorithm that combines a partial E step with variance-reduced stochastic optimization within the M step. I prove that the algorithm converges under standard regularity conditions and test it empirically using simulations and case studies; it converges in fewer epochs, with less computation time, and to regions of higher likelihood than standard numerical optimization techniques. In total, this dissertation introduces novel methods to develop and train HMMs for complex biologging time series collected by modern sensors.
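
The hierarchical structure described in the first contribution can be illustrated with nested forward recursions: a coarse-scale HMM treats each whole curve as a single "observation", and the emission probability of that curve under a given coarse state is the likelihood of a fine-scale HMM attached to that state. The sketch below is a minimal illustration under the assumption of Gaussian AR(1) state-dependent densities at the fine scale; all names (fine_log_lik, coarse_tpm, fine_params, ...) are hypothetical and this is not the dissertation's code.

```python
# Minimal sketch: nested forward algorithms for a hierarchical HMM in which each
# coarse state carries its own fine-scale HMM with Gaussian AR(1) emissions.
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def fine_log_lik(curve, tpm, delta, phi, mu, sigma):
    """Log-likelihood of one curve under a fine-scale HMM whose state-k density is
    Gaussian AR(1): y_t | state k ~ N(mu[k] + phi[k] * y_{t-1}, sigma[k]^2)."""
    T, K = len(curve), len(delta)
    logp = np.zeros((T, K))
    logp[0] = norm.logpdf(curve[0], mu, sigma)        # no lagged term for the first point
    for t in range(1, T):
        logp[t] = norm.logpdf(curve[t], mu + phi * curve[t - 1], sigma)
    log_alpha = np.log(delta) + logp[0]               # forward recursion in log space
    for t in range(1, T):
        log_alpha = logsumexp(log_alpha[:, None] + np.log(tpm), axis=0) + logp[t]
    return logsumexp(log_alpha)

def coarse_log_lik(curves, coarse_tpm, coarse_delta, fine_params):
    """Coarse-scale forward recursion: the 'emission' log-density of curve j under
    coarse state i is the fine-scale HMM log-likelihood with that state's parameters."""
    N = len(coarse_delta)
    log_b = np.array([[fine_log_lik(c, **fine_params[i]) for i in range(N)]
                      for c in curves])
    log_alpha = np.log(coarse_delta) + log_b[0]
    for j in range(1, len(curves)):
        log_alpha = logsumexp(log_alpha[:, None] + np.log(coarse_tpm), axis=0) + log_b[j]
    return logsumexp(log_alpha)
```

Moving-window summaries or Fourier coefficients of each curve could enter as additional observation streams with their own state-dependent densities; the nesting itself is what keeps the coarse-scale emission of an entire curve tractable.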
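
One generic way to realize the weighted-likelihood idea in the second contribution is to raise the likelihood contribution of labeled time points to a power w >= 1 inside the forward recursion, while also restricting the state at those points to the labeled value. The sketch below, for a simple Gaussian HMM, illustrates that device under these assumptions; it is not necessarily the dissertation's exact formulation, and names such as weighted_log_lik and the encoding labels[t] = -1 for "unlabeled" are hypothetical.

```python
# Minimal sketch: weighted forward recursion for a Gaussian HMM with sparse labels.
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def weighted_log_lik(y, labels, w, tpm, delta, mu, sigma):
    """labels[t] is a known state index or -1 if unlabeled; labeled time points both
    restrict the state and have their emission log-density multiplied by w."""
    K = len(delta)
    logp = norm.logpdf(y[:, None], mu, sigma)          # (T, K) emission log-densities

    def restrict(log_alpha, t):
        """Zero out (in log space) every state other than the labeled one."""
        if labels[t] >= 0:
            mask = np.full(K, -np.inf)
            mask[labels[t]] = 0.0
            return log_alpha + mask
        return log_alpha

    w0 = w if labels[0] >= 0 else 1.0
    log_alpha = restrict(np.log(delta) + w0 * logp[0], 0)
    for t in range(1, len(y)):
        wt = w if labels[t] >= 0 else 1.0
        log_alpha = logsumexp(log_alpha[:, None] + np.log(tpm), axis=0) + wt * logp[t]
        log_alpha = restrict(log_alpha, t)
    return logsumexp(log_alpha)
```

Setting w = 1 recovers an ordinary partially labeled HMM likelihood; increasing w inflates the relative influence of the sparse labels on the parameter estimates, which is the effect the abstract describes.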
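
The third contribution pairs a partial E step with a variance-reduced (SVRG-style) stochastic update in the M step. To keep the sketch short and runnable it is written for a toy two-component Gaussian mixture rather than a full HMM, but the control flow mirrors the general pattern: a full E step and full-data gradient at a periodic snapshot, then minibatch E steps and variance-reduced gradient steps in between. Step sizes, batch sizes, and all names are hypothetical; this is not the dissertation's algorithm.

```python
# Minimal sketch of the variance-reduced stochastic EM pattern on a toy mixture model.
import numpy as np

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])  # toy observations
mu = np.array([-1.0, 1.0])      # component means to be estimated
pi = np.array([0.5, 0.5])       # mixing weights held fixed; unit variances assumed

def responsibilities(yb, mu):
    """Posterior component probabilities (the E-step quantities) for a batch yb."""
    logp = -0.5 * (yb[:, None] - mu) ** 2 + np.log(pi)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

def grad_negQ(yb, rb, mu):
    """Gradient (w.r.t. mu) of the negative expected complete-data log-likelihood,
    averaged over the batch, with responsibilities rb held fixed."""
    return -(rb * (yb[:, None] - mu)).mean(axis=0)

step, batch = 0.05, 32
for epoch in range(20):
    mu_snap = mu.copy()
    resp = responsibilities(y, mu_snap)          # full E step at the snapshot
    full_grad = grad_negQ(y, resp, mu_snap)      # full-data gradient at the snapshot
    for _ in range(len(y) // batch):
        idx = rng.choice(len(y), batch, replace=False)
        rb = responsibilities(y[idx], mu)        # partial E step: refresh only this batch
        g = (grad_negQ(y[idx], rb, mu)
             - grad_negQ(y[idx], resp[idx], mu_snap)
             + full_grad)                        # SVRG-style variance-reduced gradient
        mu = mu - step * g
print(mu)  # should move toward the data-generating means (-2, 3)
```

For an HMM, the minibatches would typically be subsequences of the time series and the responsibilities would come from local forward-backward passes, which is where the per-iteration savings over a full EM sweep would come from.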

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International