Background: The World Health Organization recommends universal drug susceptibility testing for Mycobacterium tuberculosis complex to guide treatment decisions and improve outcomes. We assessed whether DNA sequencing can accurately predict antibiotic susceptibility profiles for first-line anti-tuberculosis drugs. Methods: Whole-genome sequences and associated phenotypes to isoniazid, rifampicin, ethambutol and pyrazinamide were obtained for isolates from 16 countries across six continents. For each isolate, mutations associated with drug-resistance and drug-susceptibility were identified across nine genes, and individual phenotypes were predicted unless mutations of unknown association were also present. To identify how whole-genome sequencing might direct first-line drug therapy, complete susceptibility profiles were predicted. These were predicted to be pan-susceptible if predicted susceptible to isoniazid and to other drugs, or contained mutations of unknown association in genes affecting these other drugs. We simulated how negative predictive value changed with drug-resistance prevalence. Results: 10,209 isolates were analysed. The greatest proportion of phenotypes was predicted for rifampicin (9,660/10,130; 95.4%) and the lowest for ethambutol (8,794/9,794; 89.8%). Isoniazid, rifampicin, ethambutol and pyrazinamide resistance was correctly predicted with 97.1%, 97.5%, 94.6% and 91.3% sensitivity, and susceptibility with 99.0%, 98.8%, 93.6% and 96.8% specificity, respectively. 5,250 (89.5%) drug profiles were correctly predicted for 5,865/7,516 (78.0%) isolates with complete phenotypic profiles. Among these, 3,952/4,037 (97.9%) predictions of pan-susceptibility were correct. The negative predictive value for 97.5% of simulated drug profiles exceeded 95% where the prevalence of drug-resistance was below 47.0%. Conclusions: Phenotypic testing for first-line drugs can be phased down in favour of DNA sequencing to guide anti-tuberculosis drug therapy.
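The dependence of negative predictive value (NPV) on resistance prevalence can be illustrated with the standard Bayes' rule formula. The sketch below is not the authors' simulation, which worked on complete drug profiles. It is a single-drug illustration that plugs in the isoniazid sensitivity and specificity reported above, and the function name and the prevalence values are hypothetical choices made for the example.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that an isolate predicted susceptible is truly susceptible.

    Here 'positive' means resistant, so prevalence is the fraction of
    resistant isolates in the tested population.
    """
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Isoniazid figures from the abstract; the prevalence grid is an assumption.
ISONIAZID_SENSITIVITY = 0.971
ISONIAZID_SPECIFICITY = 0.990

if __name__ == "__main__":
    for prevalence in (0.05, 0.20, 0.47):
        npv = negative_predictive_value(
            ISONIAZID_SENSITIVITY, ISONIAZID_SPECIFICITY, prevalence
        )
        print(f"prevalence={prevalence:.2f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, NPV falls as prevalence rises, which is why the abstract states its NPV result relative to a prevalence threshold.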
Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testing (DST) has become a World Health Organization priority. We previously developed a software tool, Mykrobe predictor, which provided offline species identification and drug resistance predictions for M. tuberculosis from whole genome sequencing (WGS) data. Performance was insufficient to support the use of WGS as an alternative to conventional phenotype-based DST, due to mutation catalogue limitations.
Motivation: Resistance co-occurrence within first-line anti-tuberculosis (TB) drugs is a common phenomenon. Existing methods based on genetic data analysis of Mycobacterium tuberculosis (MTB) have been able to predict resistance of MTB to individual drugs, but have not considered resistance co-occurrence and cannot capture the latent structure of genomic data that corresponds to lineages. Results: We used a large cohort of TB patients from 16 countries across six continents where whole-genome sequences for each isolate and associated phenotype to anti-TB drugs were obtained using drug susceptibility testing recommended by the World Health Organization. We then proposed an end-to-end multi-task model with deep denoising auto-encoder (DeepAMR) for multiple drug classification and developed DeepAMR_cluster, a clustering variant based on DeepAMR, for learning clusters in the latent space of the data. The results showed that DeepAMR outperformed the baseline model and four machine learning models, with mean AUROC from 94.4% to 98.7% for predicting resistance to four first-line drugs [i.e. isoniazid (INH), ethambutol (EMB), rifampicin (RIF), pyrazinamide (PZA)], multi-drug resistant TB (MDR-TB) and pan-susceptible TB (PANS-TB: MTB that is susceptible to all four first-line anti-TB drugs). In the case of INH, EMB, PZA and MDR-TB, DeepAMR achieved its best mean sensitivity of 94.3%, 91.5%, 87.3% and 96.3%, respectively. In the case of RIF and PANS-TB, it achieved 94.2% and 92.2% sensitivity, which were lower than the baseline model by 0.7% and 1.9%, respectively. t-SNE visualization shows that DeepAMR_cluster captures lineage-related clusters in the latent space. Availability and implementation: The details of source code are provided at http://www.robots.ox.ac.uk/~davidc/code.php. Supplementary information: Supplementary data are available at Bioinformatics online.
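The six prediction targets named above (four drugs plus MDR-TB and PANS-TB) can all be derived from the four per-drug phenotypes. The sketch below shows one way to build such a multi-task label vector. It is not taken from the DeepAMR source code: the function name, the "R"/"S" encoding and the dictionary layout are assumptions made for illustration. MDR-TB is taken in its usual sense of resistance to at least isoniazid and rifampicin, and PANS-TB follows the definition given in the abstract.

```python
DRUGS = ("INH", "RIF", "EMB", "PZA")


def derive_labels(phenotype):
    """Build binary labels for the four drugs plus MDR-TB and PANS-TB.

    phenotype maps each drug code to "R" (resistant) or "S" (susceptible).
    This encoding is a hypothetical choice for the example.
    """
    missing = [drug for drug in DRUGS if drug not in phenotype]
    if missing:
        raise ValueError(f"missing phenotype for: {', '.join(missing)}")

    # 1 = resistant, 0 = susceptible
    labels = {drug: int(phenotype[drug] == "R") for drug in DRUGS}
    # MDR-TB: resistant to at least isoniazid and rifampicin
    labels["MDR-TB"] = int(labels["INH"] == 1 and labels["RIF"] == 1)
    # PANS-TB: susceptible to all four first-line drugs
    labels["PANS-TB"] = int(all(labels[drug] == 0 for drug in DRUGS))
    return labels


if __name__ == "__main__":
    example = {"INH": "R", "RIF": "R", "EMB": "S", "PZA": "S"}
    print(derive_labels(example))
```

Because MDR-TB and PANS-TB are deterministic functions of the per-drug labels, the tasks are strongly correlated, which is the kind of co-occurrence a multi-task model is meant to exploit.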
The dN/dS ratio provides evidence of adaptation or functional constraint in protein-coding genes by quantifying the relative excess or deficit of amino acid-replacing versus silent nucleotide variation. Inexpensive sequencing promises a better understanding of parameters, such as dN/dS, but analyzing very large data sets poses a major statistical challenge. Here, I introduce genomegaMap for estimating within-species genome-wide variation in dN/dS, and I apply it to 3,979 genes across 10,209 tuberculosis genomes to characterize the selection pressures shaping this global pathogen. GenomegaMap is a phylogeny-free method that addresses two major problems with existing approaches: 1) It is fast no matter how large the sample size and 2) it is robust to recombination, which causes phylogenetic methods to report artefactual signals of adaptation. GenomegaMap uses population genetics theory to approximate the distribution of allele frequencies under general, parent-dependent mutation models. Coalescent simulations show that substitution parameters are well estimated even when genomegaMap’s simplifying assumption of independence among sites is violated. I demonstrate the ability of genomegaMap to detect genuine signatures of selection at antimicrobial resistance-conferring substitutions in Mycobacterium tuberculosis and describe a novel signature of selection in the cold-shock DEAD-box protein A gene deaD/csdA. The genomegaMap approach helps accelerate the exploitation of big data for gaining new insights into evolution within species.
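For reference, the ratio discussed above is conventionally written as follows. The symbol ω and the three-way reading are the standard textbook convention, not notation taken from the genomegaMap paper itself. Here d_N is the number of nonsynonymous (amino acid-replacing) substitutions per nonsynonymous site and d_S is the number of synonymous (silent) substitutions per synonymous site.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{deficit of amino acid-replacing change (functional constraint)} \\
\omega = 1 & \text{no difference (consistent with neutrality)} \\
\omega > 1 & \text{excess of amino acid-replacing change (adaptation)}
\end{cases}
```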