Characterization of integrated prophages within diverse species of clinical nontuberculous mycobacteria

Glickman, Cody; Kammlade, Sara M.; Hasan, Nabeeh A.; Epperson, L. Elaine; Davidson, Rebecca M.; Strong, Michael

doi:10.21203/rs.3.rs-30072/v1

Cited by 1 publication

(1 citation statement)

References 47 publications

(55 reference statements)

“…VirSorter and VirSorter2 were primarily developed to identify viral regions in metagenomes rather than prophages in bacterial genomes—although they have been used for that e.g. in Glickman et al (2020). By openly providing the Prophage Prediction Comparison framework, creating a framework to install and test different software, and defining a straightforward approach to labelling prophages in GenBank files, we hope to expand our gold-standard set of genomes and mitigate many of our biases.…”

Section: Caveats (mentioning)

confidence: 99%

Philympics 2021: Prophage Predictions Perplex Programs

Roach, McNair, Giles, et al. 2021. Preprint.

Most bacterial genomes contain integrated bacteriophages—prophages—in various states of decay. Many are active and able to excise from the genome and replicate, while others are cryptic prophages, remnants of their former selves. Over the last two decades, many computational tools have been developed to identify the prophage components of bacterial genomes, and it is a particularly active area for the application of machine learning approaches. However, progress is hindered and comparisons thwarted because there are no manually curated bacterial genomes that can be used to test new prophage prediction algorithms. Here, we present a library of gold-standard bacterial genome annotations that include manually curated prophage annotations, and a computational framework to compare the predictions from different algorithms. We use this suite to compare all extant stand-alone prophage prediction algorithms to identify their strengths and weaknesses. We provide a FAIR dataset for prophage identification, and demonstrate the accuracy, precision, recall, and f1 score from the analysis of seven different algorithms for the prediction of prophages. We discuss caveats and concerns in this analysis and how those concerns may be mitigated.
