One-sentence summary: A unified multitask deep learning model can identify multidrug-resistant Mycobacterium tuberculosis from sequencing data.
Abstract: The diagnosis of multidrug-resistant and extensively drug-resistant tuberculosis is a global health priority. Whole genome sequencing of clinical Mycobacterium tuberculosis isolates promises to circumvent the long wait times and limited scope of conventional phenotypic drug susceptibility testing, but gaps remain in predicting phenotype accurately from genotypic data. Using targeted or whole genome sequencing and conventional drug resistance phenotyping data from 3,601 Mycobacterium tuberculosis strains, 1,228 of which were multidrug resistant, we implemented the first multitask deep learning framework to predict phenotypic drug resistance to 10 anti-tubercular drugs. The proposed wide and deep neural network (WDNN) achieved improved predictive performance compared to regularized logistic regression and random forest: during cross-validation, the average sensitivities and specificities, respectively, were 92.7% and 92.7% for first-line drugs and 82.0% and 92.8% for second-line drugs. On an independent validation set, the multitask WDNN showed significant performance gains over baseline models, with average sensitivities and specificities, respectively, of 84.5% and 93.6% for first-line drugs and 64.0% and 95.7% for second-line drugs. In addition to learning from samples that have only been partially phenotyped, our proposed multitask architecture shares information across different anti-tubercular drugs and genes to provide a more accurate phenotypic prediction. We use t-distributed Stochastic Neighbor Embedding (t-SNE) visualization and feature importance analyses to examine inter-drug similarities. Deep learning has a clear role in improving drug resistance predictive performance over traditional methods and holds promise in bringing sequencing technologies closer to the bedside.
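As an illustration of how a multitask model might learn from partially phenotyped isolates, the following minimal sketch computes a binary cross-entropy loss only over the (isolate, drug) pairs that were actually tested, and then derives per-drug sensitivity and specificity in the same way. It is not the authors' implementation: the function names `masked_bce` and `sensitivity_specificity`, the use of `None` to encode an untested drug, and the 0.5 decision threshold are all assumptions made for the example.

```python
import math


def masked_bce(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over observed (isolate, drug) pairs only.

    y_true: rows of 0/1 labels (1 = resistant), with None where a drug
            was not phenotyped (hypothetical encoding for this sketch).
    y_prob: rows of predicted resistance probabilities, same shape.
    """
    total, n_observed = 0.0, 0
    for labels, probs in zip(y_true, y_prob):
        for y, p in zip(labels, probs):
            if y is None:  # untested drug: contributes nothing to the loss
                continue
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            n_observed += 1
    return total / n_observed if n_observed else 0.0


def sensitivity_specificity(y_true, y_prob, drug, threshold=0.5):
    """Sensitivity and specificity for one drug column, skipping missing labels."""
    tp = fn = tn = fp = 0
    for labels, probs in zip(y_true, y_prob):
        y = labels[drug]
        if y is None:
            continue
        pred = 1 if probs[drug] >= threshold else 0
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data: three isolates, two drugs; the first isolate lacks a label for drug 1.
y_true = [[1, None], [0, 1], [1, 0]]
y_prob = [[0.9, 0.2], [0.2, 0.7], [0.4, 0.1]]

loss = masked_bce(y_true, y_prob)
sens0, spec0 = sensitivity_specificity(y_true, y_prob, drug=0)
```

Because the untested entry is skipped, the first isolate still contributes to the loss for drug 0, which is the sense in which a shared model can use isolates with incomplete phenotype panels. In a deep learning framework the same idea is usually expressed as an element-wise mask multiplied into the per-output loss.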
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Diagnosing drug resistance remains a barrier to providing appropriate TB treatment. Due to insufficient resources for building diagnostic laboratories, fewer than half of the countries with a high MDR-TB burden have modern diagnostic capabilities (3). Even in the best-equipped laboratories, conventional culture and culture-based drug susceptibility testing (DST) constitute a considerable biohazard and require weeks to months before results are reported because Mycobacterium tuberculosis grows slowly in vitro (1). Molecular diagnostics are now an increasingly common alternative to conventional cultures. The WHO has endorsed three such molecular tests: the GeneXpert MTB/RIF, a rapid RT-PCR-based diagnostic assay that detects RIF resistance; the Hain line probe assay (LPA), which tests for both RIF and INH resistance; and the Hain MTBDRsl, an LPA that tests for resistance to second-line in...