UBC Faculty Research and Publications

Models and data of AMPlify : a deep learning tool for antimicrobial peptide prediction Li, Chenkai; Warren, René L.; Birol, Inanc

Abstract

Objectives Antibiotic resistance is a rising global threat to human health and is prompting researchers to seek effective alternatives to conventional antibiotics, which include antimicrobial peptides (AMPs). Recently, we have reported AMPlify, an attentive deep learning model for predicting AMPs in databases of peptide sequences. In our tests, AMPlify outperformed the state-of-the-art. We have illustrated its use on data describing the American bullfrog (Rana [Lithobates] catesbeiana) genome. Here we present the model files and training/test data sets we used in that study. The original model (the balanced model) was trained on a balanced set of AMP and non-AMP sequences curated from public databases. In this data note, we additionally provide a model trained on an imbalanced set, in which non-AMP sequences far outnumber AMP sequences. We note that the balanced and imbalanced models would serve different use cases, and both would serve the research community, facilitating the discovery and development of novel AMPs. Data description This data note provides two sets of models, as well as two AMP and four non-AMP sequence sets for training and testing the balanced and imbalanced models. Each model set includes five single sub-models that form an ensemble model. The first model set corresponds to the original model trained on a balanced training set that has been described in the original AMPlify manuscript, while the second model set was trained on an imbalanced training set.

Item Citations and Data

Rights

Attribution 4.0 International (CC BY 4.0)