2019
DOI: 10.3389/fgene.2019.00922

Machine Learning Predicts Accurately Mycobacterium tuberculosis Drug Resistance From Whole Genome Sequencing Data

Abstract: Background: Tuberculosis disease, caused by Mycobacterium tuberculosis, is a major public health problem. The emergence of M. tuberculosis strains resistant to existing treatments threatens to derail control efforts. Resistance is mainly conferred by mutations in genes coding for drug targets or converting enzymes, but our knowledge of these mutations is incomplete. Whole genome sequencing (WGS) is an increasingly common approach to rapidly characterize isolates and identify mutations predicting antimicrobial …

Cited by 61 publications (76 citation statements)
References 29 publications
“…Resistance co-occurrence is especially common in first-line drugs, since standard regimens require them to be used together. Existing machine learning methods for TB prediction in the literature have focused on single-drug prediction (Periwal et al., 2011; Zhang et al., 2013; Farhat et al., 2016; Yang et al., 2018; Deelder et al., 2019), and ignored epistasis and correlation of resistance between drugs. Building a multi-label model to account for both of the latter may improve predictive performance and be useful for extracting important MDR- or XDR-TB resistance-associated mutations.…”
Section: Introduction (mentioning)
confidence: 99%
“…We tested the robustness of SUSPECT-RIF in detecting well-characterized Mtb mutations from whole genome databases [52], where it outperformed all databases [58][59][60][61] tested, with an accuracy of 99.4%. When considering protein structural effects, where only coding mutations can be analysed, these metrics are also comparable to a whole genome sequencing-based predictor (accuracy: 95.1%) [63], which is built on larger datasets. This latter tool possibly also includes variation within non-coding regions (raw data not available for comparison) [63], which cannot be assessed through our method.…”
Section: Discussion (mentioning)
confidence: 97%
“…To obtain a dataset to train and evaluate our method on, we combine data from the Pathosystems Resource Integration Center (PATRIC) [47] In order to map the raw sequence data to the reference genome, we use a method similar to that used in previous work [7,8]. We use the BWA software [25], specifically, the bwa-mem program.…”
Section: Data (mentioning)
confidence: 99%