The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.
Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 in a form that is easy to use both as whole datasets and as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites and transcription units, transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, and expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors and data uniformly processed in-house, which enhances their comparability as well as the traceability of the methods and the reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata.
A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy that applies natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource for making use of such a wealth of knowledge and advancing our understanding of the biology of the model bacterium E. coli K-12.
Background: Our understanding of the regulation of gene expression has benefited from the availability of high-throughput technologies that interrogate the whole genome for the binding of specific transcription factors and gene expression profiles. In the case of widely used model organisms, such as Escherichia coli K-12, the new knowledge gained from these approaches needs to be integrated with the legacy of accumulated knowledge from genetic and molecular biology experiments conducted in the pre-genomic era in order to attain the deepest level of understanding possible based on the available data.
Results: In this paper, we describe an expansion of RegulonDB, the database containing the rich legacy of decades of classic molecular biology experiments supporting what we know about gene regulation and operon organization in E. coli K-12, to include the genome-wide dataset collections from 32 ChIP and 19 gSELEX publications, in addition to around 60 genome-wide expression profiles relevant to the functional significance of these datasets and used in their curation. Three essential features for the integration of this information coming from different methodological approaches are: first, a controlled vocabulary within an ontology for precisely defining growth conditions; second, the criteria to separate elements with enough evidence to consider them involved in gene regulation from isolated transcription factor binding sites without such support; and third, an expanded computational model supporting this knowledge. Altogether, this constitutes the basis for adequately gathering and enabling the comparisons and integration needed to manage and access such wealth of knowledge.
Conclusions: This version 10.0 of RegulonDB is a first step toward what should become the unifying access point for current and future knowledge on gene regulation in E. coli K-12. Furthermore, this model platform and associated methodologies and criteria can be emulated for gathering knowledge on other microbial organisms.
Background: In the genus Streptomyces, one of the most remarkable control mechanisms of physiological processes is carbon catabolite repression (CCR). This mechanism regulates the expression of genes involved in the uptake and utilization of alternative carbon sources. CCR also affects the synthesis of secondary metabolites and morphological differentiation. Although the outcome of CCR is the same in different bacteria, the underlying mechanisms can be quite different. In several streptomycetes, glucose kinase (Glk) is the main glucose-phosphorylating enzyme and has been regarded as a regulatory protein in CCR. To evaluate the paradigmatic model proposed for CCR in Streptomyces, a high-density microarray approach was applied to Streptomyces coelicolor M145 under repressed and non-repressed conditions. The transcriptomic study was extended to assess the role of ScGlk in this model by comparing the transcriptomic profile of S. coelicolor M145 with that of a ∆glk mutant derived from the wild-type strain, complemented with a heterologous glk gene from Zymomonas mobilis (Zmglk), insensitive to CCR but able to grow in glucose (ScoZm strain).
Results: Microarray experiments revealed that glucose influenced the expression of 651 genes. Interestingly, even though the ScGlk protein does not have DNA-binding domains and the glycolytic flux was restored by a heterologous glucokinase, the ScGlk replacement modified the expression of 134 genes. Of these, 91 were also affected by glucose, while 43 appeared to be under the control of ScGlk. This work identified the expression of S. coelicolor genes involved in primary metabolism that were influenced by glucose and/or ScGlk. Aside from describing the metabolic pathways influenced by glucose and/or ScGlk, several unexplored transcriptional regulators involved in the CCR mechanism were disclosed.
Conclusions: The transcriptome of a classical model of CCR was studied in S. coelicolor to differentiate between the effects due to glucose or ScGlk in this regulatory mechanism. Glucose elicited important metabolic and transcriptional changes in this microorganism. While its entry and flow through glycolysis and the pentose phosphate pathway were stimulated, gluconeogenesis was inhibited. Glucose also triggered CCR by repressing transporter systems and the transcription of enzymes required for the utilization of secondary carbon sources. Our results confirm and update the agar model of the CCR in Streptomyces and its dependence on the ScGlk per se. Surprisingly, the expected regulatory function of ScGlk was not found to be as global as previously thought (only 43 out of 779 genes were affected), although it may be accompanied or coordinated by other transcriptional regulators. Aside from describing the metabolic pathways influenced by glucose and/or ScGlk, several unexplored transcriptional regulators involved in the CCR mechanism were disclosed. These findings offer new opportunities to study and understand the CCR in S. coelicolor by increasing the number of known glucose and ScGlk -regulate...