2021
DOI: 10.1101/2021.04.12.438782
Preprint

Gauge your phage: Benchmarking of bacteriophage identification tools in metagenomic sequencing data

Abstract: Background: As the relevance of bacteriophages in shaping diversity in microbial ecosystems is becoming increasingly clear, the prediction of phage sequences in metagenomic datasets has become a topic of considerable interest, which has led to the development of many novel bioinformatic tools. A comprehensive comparative analysis of these tools has so far not been performed. Methods: We benchmarked ten state-of-the-art phage identification tools. We used artificial contigs generated from complete RefSeq genom…

Cited by 16 publications (27 citation statements)
References 70 publications
“…Different viral strains or subtypes can be distinguished through viromics, opening up new possibilities to distinguish evolutionary and ecological dynamics of uncultivated viruses in infected hosts and in natural biomes. Recent benchmarking studies based on simulated or mock community data provide information on the advantages and disadvantages of different computational tools for identifying and classifying viruses [24][25][26]. Viruses that are relatively closely related to known ones can be identified by direct sequence-similarity searches of whole genomes or taxon-specific hallmark genes, which are also used for meaningful phylogenies and taxonomic classification.…”
Section: Improving Taxonomic Classification Through Computational Ana...
Citation type: mentioning
confidence: 99%
“…In this study, we used prediction tools that search for viruses. However, because VirSorter2 has been reported to have difficulty distinguishing plasmids from viral sequences (49,52,53), we wanted to address the possibility that predictions from any of the tools may resemble other types of Neisseria MGEs. Specifically, we compared our predictions to known Neisseria plasmids and the Gonococcal Genetic Island (GGI).…”
Section: Few Predictions Are Similar To Neisseria Plasmids and The Go...
Citation type: mentioning
confidence: 99%
“…Several tools exist to perform each step, but while pre-processing, filtering and assembly can rely on gold standard softwares and workflows, viral identification and classification may vary greatly between different tools. Many of the virus mining tools were implemented in the last decade, and differ on the approach used, such as prophage detection, homology, machine learning, random forest or hybrid approaches, which is a combination of some of the aforementioned (Ho et al, 2021). Furthermore, one major drawback of virus metagenomic analysis is the variety of tools used in each step, which needs to be installed separately, may require different data formats and an extended set of dependencies, which may clash in their versions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%