2015
DOI: 10.7717/peerj.985

VirSorter: mining viral signal from microbial genomic data

Abstract: Viruses of microbes impact all ecosystems where microbes drive key energy and substrate transformations including the oceans, humans and industrial fermenters. However, despite this recognized importance, our understanding of viral diversity and impacts remains limited by too few model systems and reference genomes. One way to fill these gaps in our knowledge of viral diversity is through the detection of viral signal in microbial genomic data. While multiple approaches have been developed and applied for the …

Cited by 1,012 publications (1,229 citation statements)
References 64 publications
“…However, reanalysis of these data suggests that the lung sample conclusions were misled by excessive bacterial DNA content, and the mice virome analyses suffered from inflated false positives due to relaxed thresholds for in silico detection of ARG (E-value <10⁻³ on ARDB+). To guide future work, we suggest that (i) bacterial DNA contamination be quantified in viromes using analyses such as those presented here and/or automated software now available (VirSorter; Roux et al, 2015), (ii) automated analyses use a conservative threshold to quantify bona fide ARG and (iii) discovery-based work proceed with added caution. Specifically, the latter exploratory cutoffs should use a bit-score >70 threshold complemented with manual inspection for removing the kind of false positives identified in this study.…”
Section: Discussion (mentioning)
confidence: 98%
“…Bacterial genomes include representatives of 4 phyla. A total of 132 prophage sequences were identified including 99 prophages identified by CyVerse [54] implementation of VirSorter [55] in the categories 1, 2, 4, and 5, and 33 prophages identified by manual curation based on the presence of hallmark phage genes and analysis of synteny with closely related strains. Coordinates of 35 prophages predicted by VirSorter had to be manually adjusted to eliminate bacterial genes (including ribosomal RNAs and other housekeeping genes) and to separate 2 prophage sequences called as one prophage over an intervening stretch of bacterial genes.…”
Section: Isolate Reference Viruses (Ivgs) (mentioning)
confidence: 99%
“…This tool is powerful and highly scalable: its first application was to nearly 15 000 publicly available archaeal and bacterial genomes, where VirSorter identified 12 498 new host-associated viruses and their genomes that augmented publicly available viral genome reference data sets by ~10-fold (Roux et al, 2015b). Furthermore, VirSorter scales to handle contigs derived from metagenomic data sets (Roux et al, 2015a). vContact: This App assigns contigs to taxonomic groups using the presence or absence of shared PCs along the length of the contig. This is critical as viruses lack a universal gene marker (Edwards and Rohwer, 2005) and <0.1% of viruses in natural environments are represented in public databases (Brum et al, 2015), which necessitates new approaches to taxonomically classify surveyed viral genomes.…”
Section: VirSorter (mentioning)
confidence: 99%
“…This App identifies viral sequences in microbial genomes and metagenomic data sets (Roux et al, 2015a). This is necessary as viral genomes are underrepresented in databases; for example, 92% of 1659 genome-sequenced phages derive from only 4 of 54 known bacterial phyla (Roux et al, 2015b).…”
Section: VirSorter (mentioning)
confidence: 99%