Mining of integrated public transcriptomic and ChIP-Seq (cistromic) datasets can illuminate functions of mammalian cellular signaling pathways not yet explored in the research literature. Here, we designed a web knowledgebase, the Signaling Pathways Project (SPP), which incorporates community classifications of signaling pathway nodes (receptors, enzymes, transcription factors and co-nodes) and their cognate bioactive small molecules. We then mapped over 10,000 public transcriptomic or cistromic experiments to their pathway node or biosample of study. To enable prediction of pathway node-gene target transcriptional regulatory relationships through SPP, we generated consensus ‘omics signatures, or consensomes, which ranked genes based on measures of their significant differential expression or promoter occupancy across transcriptomic or cistromic experiments mapped to a specific node family. Consensomes were validated using alignment with canonical literature knowledge, gene target-level integration of transcriptomic and cistromic data points, and in bench experiments confirming previously uncharacterized node-gene target regulatory relationships. To expose the SPP knowledgebase to researchers, a web browser interface was designed that accommodates numerous routine data mining strategies. SPP is freely accessible at https://www.signalingpathways.org.
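The consensome idea described above, ranking genes by how consistently they are significantly differentially expressed across experiments mapped to a node family, can be illustrated with a minimal sketch. This is not SPP's actual algorithm, which the abstract does not specify. The scoring rule (fraction of experiments in which a gene passes a p-value cutoff), the function name `rank_genes`, the `alpha` parameter, and the gene labels and p-values are all assumptions made for illustration.

```python
from collections import defaultdict

def rank_genes(experiments, alpha=0.05):
    """Rank genes by the fraction of experiments in which they are significant.

    experiments: list of dicts mapping a gene symbol to its p-value in one
    experiment. This scoring rule is a hypothetical stand-in, not SPP's method.
    """
    hits = defaultdict(int)    # experiments where the gene passed the cutoff
    tested = defaultdict(int)  # experiments where the gene was measured
    for experiment in experiments:
        for gene, p_value in experiment.items():
            tested[gene] += 1
            if p_value < alpha:
                hits[gene] += 1
    scores = {gene: hits[gene] / tested[gene] for gene in tested}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Made-up p-values for three hypothetical experiments on one node family.
EXAMPLE_EXPERIMENTS = [
    {"GENE_A": 0.01, "GENE_B": 0.20},
    {"GENE_A": 0.03, "GENE_B": 0.04},
    {"GENE_A": 0.50, "GENE_C": 0.001},
]

if __name__ == "__main__":
    for gene, score in rank_genes(EXAMPLE_EXPERIMENTS):
        print(f"{gene}\t{score:.2f}")
```

A fraction-based score is used here only because it is easy to read. A real ranking would also need to account for effect size, experiment quality, and multiple testing.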
Integrated mining of public transcriptomic and ChIP-Seq datasets has the potential to illuminate facets of mammalian cellular signaling pathways not yet explored in the research literature. Here, we designed a web knowledgebase, the Signaling Pathways Project (SPP), which incorporates stable community classifications of the four major categories of signaling pathway node (receptors, enzymes, transcription factors and co-nodes) and their cognate bioactive small molecules (BSMs). We then mapped over 10,000 public transcriptomic or cistromic experiments to their relevant signaling pathway node, BSM or biosample of study. To provide for prediction of pathway node-target transcriptional regulatory relationships, we generated consensus 'omics signatures, or consensomes, based on measures of significant differential expression of genomic targets across all underlying transcriptomic experiments. To expose the SPP knowledgebase to researchers, a web browser interface accommodates a variety of routine data mining strategies. Consensomes were validated using alignment with literature-based knowledge, gene target-level integration of transcriptomic and ChIP-Seq data points, and in bench experiments that confirmed previously uncharacterized node-gene target regulatory relationships. SPP is freely accessible at https://beta.signalingpathways.org. Individual dataset pages enable integration of SPP with the research literature via digital object identifier (DOI)-driven links from external sites, as well as for citation of datasets to enhance their FAIR status 3,4.
The nuclear receptor (NR) superfamily of ligand-regulated transcription factors directs ligand- and tissue-specific transcriptomes in myriad developmental, metabolic, immunological, and reproductive processes. The NR signaling field has generated a wealth of genome-wide expression data points, but due to deficits in their accessibility, annotation, and integration, the full potential of these studies has not yet been realized. We searched public gene expression databases and MEDLINE for global transcriptomic datasets relevant to NRs, their ligands, and coregulators. We carried out extensive, deep reannotation of the datasets using controlled vocabularies for RNA Source and regulating molecule and resolved disparate gene identifiers to official gene symbols to facilitate comparison of fold changes and their significance across multiple datasets. We assembled these data points into a database, Transcriptomine (http://www.nursa.org/transcriptomine), that allows for multiple, menu-driven querying strategies of this transcriptomic "superdataset," including single and multiple genes, Gene Ontology terms, disease terms, and uploaded custom gene lists. Experimental variables such as regulating molecule, RNA Source, as well as fold-change and P value cutoff values can be modified, and full data records can be either browsed or downloaded for downstream analysis. We demonstrate the utility of Transcriptomine as a hypothesis generation and validation tool using in silico and experimental use cases. Our resource empowers users to instantly and routinely mine the collective biology of millions of previously disparate transcriptomic data points. By incorporating future transcriptome-wide datasets in the NR signaling field, we anticipate Transcriptomine developing into a powerful resource for the NR- and other signal transduction research communities.
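The kind of query described above, selecting data points by gene, regulating molecule and RNA Source and then applying fold-change and P value cutoffs, might look roughly like the following sketch. It does not reflect Transcriptomine's actual schema or interface. The `DataPoint` fields, the `query_points` function, the default cutoffs, and the sample records are all hypothetical, and fold change is assumed to be a ratio in which values below 1 indicate repression.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

@dataclass(frozen=True)
class DataPoint:
    # Hypothetical record layout, not the real Transcriptomine schema.
    gene: str
    regulating_molecule: str
    rna_source: str
    fold_change: float  # assumed to be a ratio: >1 induced, <1 repressed
    p_value: float

def query_points(
    points: Iterable[DataPoint],
    genes: Optional[Set[str]] = None,
    regulating_molecule: Optional[str] = None,
    rna_source: Optional[str] = None,
    fold_change_cutoff: float = 2.0,
    p_value_cutoff: float = 0.05,
) -> List[DataPoint]:
    """Return data points matching the filters and passing both cutoffs."""
    if fold_change_cutoff < 1.0:
        raise ValueError("fold_change_cutoff must be >= 1.0")
    selected = []
    for point in points:
        if genes is not None and point.gene not in genes:
            continue
        if regulating_molecule is not None and point.regulating_molecule != regulating_molecule:
            continue
        if rna_source is not None and point.rna_source != rna_source:
            continue
        # Symmetric cutoff: keep strong induction or strong repression.
        induced = point.fold_change >= fold_change_cutoff
        repressed = point.fold_change <= 1.0 / fold_change_cutoff
        if not (induced or repressed):
            continue
        if point.p_value > p_value_cutoff:
            continue
        selected.append(point)
    return selected

# Made-up records for illustration only.
EXAMPLE_POINTS = [
    DataPoint("GENE1", "ligand X", "liver", 3.2, 0.001),
    DataPoint("GENE1", "ligand X", "liver", 1.3, 0.001),
    DataPoint("GENE2", "ligand X", "liver", 0.4, 0.01),
    DataPoint("GENE1", "ligand Y", "liver", 4.0, 0.2),
    DataPoint("GENE3", "ligand X", "kidney", 5.0, 0.001),
]
```

Calling `query_points(EXAMPLE_POINTS, regulating_molecule="ligand X", rna_source="liver")` keeps only the strongly induced and strongly repressed liver records for that molecule. Loosening `fold_change_cutoff` or `p_value_cutoff` widens the result set, in the way the abstract describes for user-adjustable cutoffs.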
We previously developed a web tool, Transcriptomine, to explore expression profiling data sets involving small-molecule or genetic manipulations of nuclear receptor signaling pathways. We describe advances in biocuration, query interface design, and data visualization that enhance the discovery of uncharacterized biology in these pathways using this tool. Transcriptomine currently contains about 45 million data points encompassing more than 2000 experiments in a reference library of nearly 550 data sets retrieved from public archives and systematically curated. To make the underlying data points more accessible to bench biologists, we classified experimental small molecules and gene manipulations into signaling pathways and experimental tissues and cell lines into physiological systems and organs. Incorporation of these mappings into Transcriptomine enables the user to readily evaluate tissue-specific regulation of gene expression by nuclear receptor signaling pathways. Data points from animal and cell model experiments and from clinical data sets elucidate the roles of nuclear receptor pathways in gene expression events accompanying various normal and pathological cellular processes. In addition, data sets targeting non-nuclear receptor signaling pathways highlight transcriptional cross-talk between nuclear receptors and other signaling pathways. We demonstrate with specific examples how data points that exist in isolation in individual data sets validate each other when connected and made accessible to the user in a single interface. In summary, Transcriptomine allows bench biologists to routinely develop research hypotheses, validate experimental data, or model relationships between signaling pathways, genes, and tissues.
Signaling pathways involving nuclear receptors (NRs), their ligands and coregulators, regulate tissue-specific transcriptomes in diverse processes, including development, metabolism, reproduction, the immune response and neuronal function, as well as in their associated pathologies. The Nuclear Receptor Signaling Atlas (NURSA) is a Consortium focused around a Hub website (www.nursa.org) that annotates and integrates diverse ‘omics datasets originating from the published literature and NURSA-funded Data Source Projects (NDSPs). These datasets are then exposed to the scientific community on an Open Access basis through user-friendly data browsing and search interfaces. Here, we describe the redesign of the Hub, version 3.0, to deploy “Web 2.0” technologies and add richer, more diverse content. The Molecule Pages, which aggregate information relevant to NR signaling pathways from myriad external databases, have been enhanced to include resources for basic scientists, such as post-translational modification sites and targeting miRNAs, and for clinicians, such as clinical trials. A portal to NURSA’s Open Access, PubMed-indexed journal Nuclear Receptor Signaling has been added to facilitate manuscript submissions. Datasets and information on reagents generated by NDSPs are available, as is information concerning periodic new NDSP funding solicitations. Finally, the new website integrates the Transcriptomine analysis tool, which allows for mining of millions of richly annotated public transcriptomic data points in the field, providing an environment for dataset re-use and citation, bench data validation and hypothesis generation. We anticipate that this new release of the NURSA database will have tangible, long term benefits for both basic and clinical research in this field.