2017
DOI: 10.1038/srep39194

PaPrBaG: A machine learning approach for the detection of novel pathogens from NGS data

Abstract: The reliable detection of novel bacterial pathogens from next-generation sequencing data is a key challenge for microbial diagnostics. Current computational tools usually rely on sequence similarity and often fail to detect novel species when closely related genomes are unavailable or missing from the reference database. Here we present the machine learning based approach PaPrBaG (Pathogenicity Prediction for Bacterial Genomes). PaPrBaG overcomes genetic divergence by training on a wide range of species with k…

Citations: cited by 66 publications (107 citation statements)
References: 49 publications
“…Results obtained for the "left" and "right" half of the set did not differ from those presented in Table 1 by more than 0.001. As previously shown (Deneke et al, 2017), BLAST fails to classify some of the reads at all. To compare its performance to the machine learning approaches, we define accuracy as the ratio of correct predictions to the number of all data points in a set.…”
Section: Single Reads; citation type: mentioning
confidence: 65%
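
The accuracy definition in the statement above (correct predictions divided by all data points, so reads that a tool leaves unclassified count against it) can be illustrated with a minimal sketch. The function name, the use of `None` for an unclassified read, and the example labels are assumptions made for illustration; they are not taken from the citing paper or from PaPrBaG.

```python
from typing import Optional, Sequence


def accuracy_over_all_reads(
    predictions: Sequence[Optional[str]],
    labels: Sequence[str],
) -> float:
    """Ratio of correct predictions to the number of all data points.

    A prediction of None marks a read the tool could not classify
    (hypothetical convention); it stays in the denominator and is
    never counted as correct.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        1 for pred, truth in zip(predictions, labels)
        if pred is not None and pred == truth
    )
    return correct / len(labels)


# Hypothetical toy data: one read is left unclassified, one is misclassified.
labels = ["pathogenic", "nonpathogenic", "pathogenic", "nonpathogenic"]
predictions = ["pathogenic", None, "nonpathogenic", "nonpathogenic"]
print(accuracy_over_all_reads(predictions, labels))  # 0.5
```

Under this convention a tool that answers correctly on every read it classifies, but skips half of the reads, still scores only 0.5, which is what makes a similarity-based tool comparable to classifiers that always return a prediction.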
“…This could help track a biological threat back to its source in case of a malicious attack or accidental release. Deneke et al (2017) presented PaPrBaG, a random forest approach for predicting whether an Illumina read originates from a pathogenic or a nonpathogenic bacterium and showed that it generalizes to novel, previously unseen species. They introduce the concept of a pathogenic potential to differentiate between predicted probabilities of a given phenotype and true pathogenicity, which can only be realized in the biological context of a full genome and a specific host.…”
Section: Motivation; citation type: mentioning
confidence: 99%