The U.S. Department of Energy Systems Biology Knowledgebase (KBase) is an open-source software and data platform designed to meet the grand challenge of systems biology: predicting and designing biological function from the biomolecular (small scale) to the ecological (large scale). KBase is available for anyone to use, and enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; perform large-scale analyses on scalable computing infrastructure; and combine experimental evidence and conclusions that lead to accurate models of plant and microbial physiology and community dynamics. The KBase platform has (1) extensible analytical capabilities that currently include genome assembly, annotation, ontology assignment, comparative genomics, transcriptomics, and metabolic modeling; (2) a web-browser-based user interface that supports building, sharing, and publishing reproducible and well-annotated analyses with integrated data; (3) access to extensive computational resources; and (4) a software development kit allowing the community to add functionality to the system.
Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have average accuracies between 95% and 96% within ±1 two-fold dilution factor and can predict MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Our strategy for developing whole genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
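The accuracy criterion quoted above, agreement within ±1 two-fold dilution, can be illustrated with a short calculation. The following is a minimal sketch of that metric only, not the authors' evaluation code: the function names and the MIC values are hypothetical, and it assumes MICs are positive concentrations on a two-fold dilution series, so that one dilution step corresponds to a difference of 1 on the log2 scale.

```python
import math


def within_dilution(true_mic, predicted_mic, tolerance=1.0):
    """Return True if the prediction is within `tolerance` two-fold
    dilution steps of the measured MIC (compared on the log2 scale)."""
    return abs(math.log2(predicted_mic) - math.log2(true_mic)) <= tolerance


def dilution_accuracy(true_mics, predicted_mics, tolerance=1.0):
    """Fraction of predictions within `tolerance` two-fold dilution steps."""
    if len(true_mics) != len(predicted_mics):
        raise ValueError("true and predicted MIC lists must have equal length")
    if not true_mics:
        raise ValueError("at least one MIC pair is required")
    hits = sum(
        within_dilution(t, p, tolerance)
        for t, p in zip(true_mics, predicted_mics)
    )
    return hits / len(true_mics)


# Hypothetical MIC values in µg/mL, invented for illustration only.
true_mics = [0.5, 2.0, 8.0, 16.0, 32.0]
predicted_mics = [1.0, 2.0, 2.0, 32.0, 4.0]

accuracy = dilution_accuracy(true_mics, predicted_mics)
print(f"Accuracy within ±1 two-fold dilution: {accuracy:.2f}")
```

In this toy data, a prediction of 1.0 for a measured 0.5 counts as correct (one dilution step apart), while a prediction of 2.0 for a measured 8.0 does not (two steps apart).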