Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt and RegulonDB. We identified the genes that lack experimental evidence of function (the ‘y-ome’), which include 1600 of 4623 unique genes (34.6%), of which 111 have absolutely no evidence of function. An additional 220 genes (4.7%) are pseudogenes or phantom genes. y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with ‘y’ is unannotated, and we discuss the value of the y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.
Background: The efficacy of antibiotics against M. tuberculosis has been shown to be influenced by experimental media conditions. Investigations of M. tuberculosis growth in physiological conditions have described an environment that is different from common in vitro media. Thus, elucidating the interplay between available nutrient sources and antibiotic efficacy has clear medical relevance. While genome-scale reconstructions of M. tuberculosis have enabled the ability to interrogate media differences for the past 10 years, recent reconstructions have diverged from each other without standardization. A unified reconstruction of M. tuberculosis H37Rv would elucidate the impact of different nutrient conditions on antibiotic efficacy and provide new insights for therapeutic intervention.
Results: We present a new genome-scale model of M. tuberculosis H37Rv, named iEK1011, that unifies and updates previous M. tuberculosis H37Rv genome-scale reconstructions. We functionally assess iEK1011 against previous models and show that the model increases correct gene essentiality predictions on two different experimental datasets by 6% (53% to 60%) and 18% (60% to 71%), respectively. We compared simulations between in vitro and approximated in vivo media conditions to examine the predictive capabilities of iEK1011. The simulated differences recapitulated literature defined characteristics in the rewiring of TCA metabolism including succinate secretion, gluconeogenesis, and activation of both the glyoxylate shunt and the methylcitrate cycle. To assist efforts to elucidate mechanisms of antibiotic resistance development, we curated 16 metabolic genes related to antimicrobial resistance and approximated evolutionary drivers of resistance. Comparing simulations of these antibiotic resistance features between in vivo and in vitro media highlighted condition-dependent differences that may influence the efficacy of antibiotics.
Conclusions: iEK1011 provides a computational knowledge base for exploring the impact of different environmental conditions on the metabolic state of M. tuberculosis H37Rv. As more experimental data and knowledge of M. tuberculosis H37Rv become available, a unified and standardized M. tuberculosis model will prove to be a valuable resource to the research community studying the systems biology of M. tuberculosis.
Electronic supplementary material: The online version of this article (10.1186/s12918-018-0557-y) contains supplementary material, which is available to authorized users.
Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt, RefSeq, and RegulonDB. We identified the genes that lack direct experimental evidence of function (the "Y-ome"), which include 1563 of 4653 unique genes (34%), of which 131 have absolutely no evidence of function. An additional 304 genes (6.6%) are pseudogenes or phantom genes. Y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for Y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with "y" is unannotated, and we discuss the value of the Y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.