Data from: A Poissonian model of indel rate variation for phylogenetic tree inference
Zhai, Yongliang; Alexandre, Bouchard-Cote
Description
Abstract
While indel rate variation has been observed and analyzed in detail, it is not taken into account by current indel-aware phylogenetic reconstruction methods. In this work, we introduce a continuous time stochastic process, the geometric Poisson indel process, that generalizes the Poisson indel process by allowing insertion and deletion rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling than methods that ignore either indels or indel rate variation.
Usage notes
Molluscan RNA data
Item Metadata
| Field | Value |
| --- | --- |
| Title | Data from: A Poissonian model of indel rate variation for phylogenetic tree inference |
| Creator | Zhai, Yongliang; Alexandre, Bouchard-Cote |
| Date Issued | 2021-05-19 |
| Notes | Dryad version number: 1; Version status: submitted; Dryad curation status: Published; Sharing link: https://datadryad.org/stash/share/8lPS2uIKU09qZwl0Y8iG_lK4CvPyUDS6yUXsIeX9hsI; Storage size: 84068; Visibility: public |
| Date Available | 2020-06-24 |
| Provider | University of British Columbia Library |
| License | CC0 1.0 |
| DOI | 10.14288/1.0398010 |
| Aggregated Source Repository | Dataverse |