2019
DOI: 10.1101/607127
Preprint

Evaluation of parameters affecting performance and reliability of machine learning-based antibiotic susceptibility testing from whole genome sequencing data

Abstract: Prediction of antibiotic resistance phenotypes from whole genome sequencing data by machine learning methods has been proposed as a promising platform for the development of sequence-based diagnostics. However, there has been no systematic evaluation of factors that may influence performance of such models, how they might apply to and vary across clinical populations, and what the implications might be in the clinical setting. Here, we performed a meta-analysis of seven large Neisseria gono…

citations
Cited by 21 publications
(34 citation statements)
references
References 63 publications
“…The majority (59.4%) of these isolates had MICs that were lower than expected, indicative of increased susceptibility unexplained by the genetic determinants in our model. Overall MIC variance explained by known resistance mutations was relatively low (adjusted R² = 0.667), in agreement with prior studies that employed whole-genome supervised learning algorithms to predict azithromycin resistance [13]. MIC variance explained by known resistance mutations was also low for ceftriaxone (adjusted R² = 0.674) but higher for ciprofloxacin (adjusted R² = 0.937), with 2.02% and 2.90% of strains, respectively, exhibiting two dilutions or lower reported MICs compared to predictions, similarly indicating unexplained susceptibility.…”
Section: Results | Citation type: supporting
confidence: 86%
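The two quantities in this excerpt, adjusted R² and the share of isolates with MICs well below prediction, can be illustrated with a minimal sketch. It is not the cited authors' code: the function names are hypothetical, and it assumes that "two dilutions or lower" means a reported MIC at least two doubling (log2) dilutions below the predicted MIC.

```python
import math


def adjusted_r2(observed, predicted, n_predictors):
    """Adjusted R^2 for a fitted model (standard textbook formula)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors used by the model.
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def fraction_lower_than_predicted(observed_mic, predicted_mic, dilutions=2):
    """Share of isolates whose reported MIC sits at least `dilutions`
    doubling dilutions below the predicted MIC (assumed interpretation)."""
    flagged = sum(
        1
        for o, p in zip(observed_mic, predicted_mic)
        if math.log2(p) - math.log2(o) >= dilutions
    )
    return flagged / len(observed_mic)
```

With made-up MICs of 0.25, 1.0, 0.5 and 2.0 against a predicted MIC of 1.0 for each isolate, only the first isolate is two dilutions below prediction, so the flagged fraction is 0.25.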
“…The average very major error rate (VME), which is defined as resistant genomes that are erroneously predicted to be susceptible, and the average major error rate (ME), which is defined as susceptible genomes that are erroneously predicted to be resistant, tend to go down as gene set size increases. Although the core gene set models described in Fig 1 have lower F1 scores and higher error rates than full-genome models that have been published previously [21–24, 27, 32], their accuracies are striking given the small sizes of the input data sets and the removal of well-annotated AMR genes.…”
Section: Results | Citation type: mentioning
confidence: 78%
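The VME and ME definitions quoted above can be sketched as follows. The excerpt does not state the denominators, so this sketch assumes the common convention of dividing VME by the number of truly resistant genomes and ME by the number of truly susceptible genomes. The function name and the "R"/"S" label encoding are hypothetical.

```python
def error_rates(true_labels, predicted_labels):
    """Return VME and ME rates for 'R' (resistant) / 'S' (susceptible) labels.

    Assumed denominators: VME over truly resistant genomes,
    ME over truly susceptible genomes.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Very major error: resistant genome predicted susceptible.
    vme = sum(1 for t, p in pairs if t == "R" and p == "S")
    # Major error: susceptible genome predicted resistant.
    me = sum(1 for t, p in pairs if t == "S" and p == "R")
    n_res = sum(1 for t in true_labels if t == "R")
    n_sus = sum(1 for t in true_labels if t == "S")
    return {
        "VME": vme / n_res if n_res else float("nan"),
        "ME": me / n_sus if n_sus else float("nan"),
    }
```

For four resistant genomes with one missed and six susceptible genomes with two called resistant, this gives a VME of 0.25 and an ME of about 0.33.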
“…Following Hicks et al. [28], we refer to the average of the (optimal) sensitivity and specificity as balanced accuracy (bACC). Finally, we selected the sparsest model that allowed maximization of the bACC up to 1 point, in order to reduce the risk of overfitting.…”
Section: Methods | Citation type: mentioning
confidence: 99%
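The selection rule in this excerpt can be sketched under one reading of "up to 1 point": among candidate models, keep those whose bACC is within one percentage point of the best bACC, then take the one with the fewest features. That reading, the function names, and the `bacc` / `n_features` keys are assumptions made for illustration, not the cited authors' implementation.

```python
def balanced_accuracy(sensitivity, specificity):
    """bACC as defined in the excerpt: mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


def sparsest_within_tolerance(candidates, tolerance=0.01):
    """Pick the candidate with the fewest features whose bACC is within
    `tolerance` of the best bACC (0.01 = one percentage point, assumed)."""
    best = max(c["bacc"] for c in candidates)
    eligible = [c for c in candidates if c["bacc"] >= best - tolerance]
    return min(eligible, key=lambda c: c["n_features"])


# Hypothetical candidates, e.g. points along a regularization path.
candidates = [
    {"n_features": 200, "bacc": 0.950},
    {"n_features": 50, "bacc": 0.945},
    {"n_features": 10, "bacc": 0.900},
]
chosen = sparsest_within_tolerance(candidates)
```

Here the 50-feature model is chosen: it gives up half a point of bACC relative to the 200-feature model, while the 10-feature model falls outside the tolerance.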
“…Second, regarding the type of ML algorithms, boosting algorithms [4, 8, 21], penalized regression models [10, 17, 23], decision trees [26], random forest [10, 27], neural networks [17], and set cover machines [22, 26] have already been successfully deployed in this context. While each algorithm has its own merits and shortcomings, several studies reported comparable global performance for various algorithms, with specific variations by drug and microbial species [10, 17, 28]. Finally, different kinds of antibiotic susceptibility information can be considered: either discrete when the objective is to distinguish susceptible from resistant (or non-susceptible) ones [10, 17, 21, 22], or continuous, where one seeks to predict the minimum inhibitory concentration (MIC) of the antimicrobial agent itself [3, 4, 8].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
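The distinction between discrete and continuous targets in this excerpt can be made concrete with a small sketch. The breakpoint passed in is a placeholder chosen by the caller, not a value from any clinical guideline, and both function names are hypothetical. Log2-transforming the MIC is one common choice for a regression target because MICs are measured in doubling dilutions; the excerpt itself does not specify a transform.

```python
import math


def binary_label(mic, breakpoint):
    """Discrete target: 'R' if the MIC exceeds the breakpoint, else 'S'.

    `breakpoint` is a caller-supplied, hypothetical threshold.
    """
    return "R" if mic > breakpoint else "S"


def log2_mic(mic):
    """Continuous target: MIC on a log2 (doubling-dilution) scale."""
    return math.log2(mic)
```

A classifier would be trained on the output of `binary_label`, while a regression model would be trained on `log2_mic` and its predictions converted back to a dilution step if needed.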