Open Collections
UBC Theses and Dissertations
Bayesian modelling of DNA secondary structure kinetics : revisiting path space approximations and posterior inference in exponentially large state spaces
Lovrod, Jordan
Abstract
Nucleic acid strands, which react by forming and breaking Watson-Crick base pairs, can be designed to form complex nanoscale structures or devices. Controlling such systems requires accurate predictions of the reaction rates and folding pathways of the interacting strands. These kinetic properties can be modelled using continuous-time Markov chains (CTMCs), whose states and transitions correspond to secondary structures and elementary base pair changes, respectively. The transient dynamics of a CTMC are determined by a kinetic model, which assigns transition rates to pairs of states. The rate of a reaction can be estimated using its CTMC's mean first passage time (MFPT), which can be computed exactly by solving a linear system, or approximated via Monte Carlo simulation. However, both approaches may be computationally infeasible for rare-event reactions in larger systems. This limitation can be addressed by constructing truncated CTMCs, which include only a small subset of states and transitions, selected either manually or through simulation. In recent work, posterior inference in an Arrhenius-type kinetic model was performed using a fixed set of manually truncated CTMCs and a small experimental dataset of DNA reaction rates. We extend this Bayesian approach, using a larger dataset of 1105 reactions, a new prior model that is directly motivated by the physical meaning of the parameters and is compatible with experimental measurements of elementary rates, and larger truncated state spaces, constructed stochastically using the recently introduced pathway elaboration method. Despite a significantly higher computational cost, we find that the larger state spaces do not necessarily lead to more accurate rate predictions than the small, manually designed state spaces. For posterior approximation, we apply the standard random-walk Metropolis algorithm and the gradient-based No-U-Turn Sampler (NUTS).
Our posterior approximations, which are often multimodal, recover an expected correlation structure among the kinetic parameters. However, we also uncover severe numerical instability in the MFPT computations. Due to numerous design limitations in the legacy software, a significant refactoring effort was required to implement the above extensions, which also improved performance and reproducibility.
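The abstract notes that an MFPT can be computed exactly by solving a linear system. The following is a minimal sketch of that idea on a toy three-state chain, not code from the thesis: the function name `mean_first_passage_times`, the rate constants, and the chain itself are illustrative assumptions. For a CTMC with generator matrix Q and a set of target states, the MFPT vector t over the non-target states satisfies Q_TT t = -1, where Q_TT is Q restricted to the non-target states.

```python
import numpy as np


def mean_first_passage_times(Q, targets):
    """Return the MFPT into `targets` from every state of a CTMC.

    Q is the generator matrix: off-diagonal entries are transition rates
    and each row sums to zero. Target states get an MFPT of zero.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    target_set = set(targets)
    transient = [i for i in range(n) if i not in target_set]

    # Restrict the generator to non-target states and solve Q_TT t = -1.
    Q_tt = Q[np.ix_(transient, transient)]
    t = np.linalg.solve(Q_tt, -np.ones(len(transient)))

    mfpt = np.zeros(n)
    mfpt[transient] = t
    return mfpt


# Hypothetical toy chain: 0 <-> 1 -> 2, with state 2 as the target.
k01, k10, k12 = 2.0, 1.0, 3.0
Q = np.array([
    [-k01, k01, 0.0],
    [k10, -(k10 + k12), k12],
    [0.0, 0.0, 0.0],
])

mfpt = mean_first_passage_times(Q, targets=[2])
print(mfpt)
```

In realistic secondary-structure state spaces the restricted matrix is large and sparse, so a dense solve like this one would be replaced by a sparse solver. Such systems can also be ill-conditioned when rates span many orders of magnitude, which is one plausible source of the numerical instability the abstract reports.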
Item Metadata
Title | Bayesian modelling of DNA secondary structure kinetics : revisiting path space approximations and posterior inference in exponentially large state spaces
Creator | Lovrod, Jordan
Supervisor | |
Publisher | University of British Columbia
Date Issued | 2023
Genre | |
Type | |
Language | eng
Date Available | 2023-04-20
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0431312
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia
Graduation Date | 2023-05
Campus | |
Scholarly Level | Graduate
Rights URI | |
Aggregated Source Repository | DSpace