Open Collections
UBC Research Data
Data from: Viral dark matter and virus–host interactions resolved from publicly available microbial genomes
Roux, Simon; Hallam, Steven J.; Woyke, Tanja; Sullivan, Matthew B.
Description
Abstract
The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus-host interactions precludes accurate prediction of their roles and impacts. Here we mined publicly available bacterial and archaeal genomic datasets to identify 12,498 high‑confidence viral genomes linked to their microbial hosts. These data augment public datasets 10-fold, provide first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7-38% of 'unknown' sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and co‑infection prevalences, as well as evaluation of in silico virus-host linkage predictions. Together these findings illustrate the value of mining viral signal from microbial genomes.

Usage notes
VirSorter Curated Dataset genbank file package
This zip package includes the annotated genbank files (affiliation against PFAM and Refseq) of all viral sequences predicted in the VirSorter Curated Dataset, organized in different folders based on the host of the virus (i.e. the taxonomy of the genomic data in which the viral sequence was identified).
File: VirSorter_Curated_Dataset_genbank-files.zip
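Because the package is described as a set of host-named folders, a first step after downloading it might be to list which files belong to which host. The following is a minimal sketch, not part of the dataset's documentation. It assumes that the immediate parent folder of each file is the host label and that GenBank files carry one of the extensions `.gb`, `.gbk`, or `.genbank`; neither detail is stated in the record, and the function name `group_genbank_by_host` is hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: these extensions are a guess; the record does not name the
# file extension used inside the archive.
GENBANK_SUFFIXES = {".gb", ".gbk", ".genbank"}


def group_genbank_by_host(archive):
    """Map each host folder name to the GenBank-like files found under it.

    `archive` may be a filesystem path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            path = PurePosixPath(name)  # zip member names use "/" separators
            if path.suffix.lower() not in GENBANK_SUFFIXES:
                continue
            # Assumption: the immediate parent folder is the host label.
            host = path.parent.name or "(archive root)"
            groups[host].append(name)
    return dict(groups)
```

Calling `group_genbank_by_host("VirSorter_Curated_Dataset_genbank-files.zip")` on the downloaded file would return a dictionary keyed by folder name. If the real archive nests hosts more deeply or uses another extension, the suffix set and the parent-folder rule would need adjusting.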
Item Metadata
Title | Data from: Viral dark matter and virus–host interactions resolved from publicly available microbial genomes
Creator |
Date Issued | 2021-05-19
Subject |
Type |
Notes | Dryad version number: 1; Version status: submitted; Dryad curation status: Published; Sharing link: https://datadryad.org/stash/share/jRLPkvzapf6WFit0ripShjyrxJljZcIK_r0-hHioIz8; Storage size: 254533367; Visibility: public
Date Available | 2020-06-30
Provider | University of British Columbia Library
License | CC0 1.0
DOI | 10.14288/1.0397892
URI |
Publisher DOI |
Rights URI |
Aggregated Source Repository | Dataverse