The biomedical literature is represented by millions of abstracts available in the Medline database. These abstracts can be queried with the PubMed interface, which provides a keyword-based Boolean search engine. This approach shows limitations in the retrieval of abstracts related to very specific topics, as it is difficult for a non-expert user to find all of the most relevant keywords related to a biomedical topic. Additionally, when searching for more general topics, the same approach may return hundreds of unranked references. To address these issues, text mining tools have been developed to help scientists focus on relevant abstracts. We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program automatically deduces the most discriminative words in comparison to a random selection. These words are used to score other abstracts, including those from recent publications that are not yet annotated, which can then be ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time. MedlineRanker is free to use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
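The ranking idea described in this abstract (learn discriminative words from topic abstracts versus a random background, then score unseen abstracts) can be illustrated with a minimal sketch. This is not MedlineRanker's actual implementation: the smoothed log-odds weighting, the tokenizer, and the function names `word_weights`, `score`, and `rank` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphanumeric words in an abstract."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_weights(topic_abstracts, background_abstracts, pseudocount=1.0):
    """Assumed weighting: smoothed log-odds of a word occurring in a topic
    abstract versus a background (random) abstract."""
    topic_df = Counter(w for a in topic_abstracts for w in tokenize(a))
    back_df = Counter(w for a in background_abstracts for w in tokenize(a))
    n_topic = len(topic_abstracts)
    n_back = len(background_abstracts)
    weights = {}
    for word in set(topic_df) | set(back_df):
        # The pseudocount avoids division by zero for words seen in one set only.
        p_topic = (topic_df[word] + pseudocount) / (n_topic + 2 * pseudocount)
        p_back = (back_df[word] + pseudocount) / (n_back + 2 * pseudocount)
        weights[word] = math.log(p_topic / p_back)
    return weights


def score(abstract, weights):
    """Sum the weights of the known words in an abstract; unknown words add 0."""
    return sum(weights.get(word, 0.0) for word in tokenize(abstract))


def rank(candidates, weights):
    """Order candidate abstracts from most to least topic-like."""
    return sorted(candidates, key=lambda a: score(a, weights), reverse=True)
```

Words frequent in the topic set but rare in the background receive positive weights, so abstracts that share vocabulary with the topic set rise to the top of the ranking even if they carry no manual annotation.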
Background: Little is known about the genes that drive embryonic stem cell differentiation. However, such knowledge is necessary if we are to exploit the therapeutic potential of stem cells. To uncover the genetic determinants of mouse embryonic stem cell (mESC) differentiation, we have generated and analyzed 11-point time-series of DNA microarray data for three biologically equivalent but genetically distinct mESC lines (R1, J1, and V6.5) undergoing undirected differentiation into embryoid bodies (EBs) over a period of two weeks.
Paralog genes arise from gene duplication events during evolution, which often lead to similar proteins that cooperate in common pathways and in protein complexes. Consequently, paralogs show correlated gene expression, although the mechanisms of co-regulation remain unclear. In eukaryotes, genes are regulated in part by distal enhancer elements through looping interactions with gene promoters. These looping interactions can be measured by genome-wide chromatin conformation capture (Hi-C) experiments, which revealed self-interacting regions called topologically associating domains (TADs). We hypothesize that paralogs share common regulatory mechanisms to enable coordinated expression according to TADs. To test this hypothesis, we integrated paralogy annotations with human gene expression data in diverse tissues, genome-wide enhancer–promoter associations and Hi-C experiments in human, mouse and dog genomes. We show that paralog gene pairs are enriched for co-localization in the same TAD, share common enhancer elements more often than expected and have increased contact frequencies over large genomic distances. Combined, our results indicate that paralogs share common regulatory mechanisms and cluster not only in the linear genome but also in the three-dimensional chromatin architecture. This enables concerted expression of paralogs across diverse cell types and indicates evolutionary constraints in functional genome organization.
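The co-localization test mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes TADs are given as sorted, non-overlapping half-open intervals per chromosome and that each gene is reduced to a single position (for example its transcription start site). The names `tad_index`, `same_tad`, and `fraction_same_tad` are hypothetical.

```python
from bisect import bisect_right


def tad_index(tads, chrom, pos):
    """Return the index of the TAD containing pos on chrom, or None.

    tads maps a chromosome name to a sorted list of non-overlapping
    (start, end) intervals, with start inclusive and end exclusive.
    """
    intervals = tads.get(chrom, [])
    starts = [start for start, _ in intervals]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
        return i
    return None


def same_tad(tads, gene_a, gene_b):
    """True if both genes, given as (chrom, pos), fall into the same TAD."""
    chrom_a, pos_a = gene_a
    chrom_b, pos_b = gene_b
    if chrom_a != chrom_b:
        return False
    index_a = tad_index(tads, chrom_a, pos_a)
    return index_a is not None and index_a == tad_index(tads, chrom_b, pos_b)


def fraction_same_tad(tads, pairs):
    """Fraction of gene pairs whose two members share a TAD."""
    if not pairs:
        return 0.0
    return sum(same_tad(tads, a, b) for a, b in pairs) / len(pairs)
```

An enrichment claim would additionally require comparing this fraction for paralog pairs against the fraction for a control set of gene pairs; how such controls are chosen is not stated in the abstract and is left out of the sketch.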
Background: NF-κB is widely involved in lymphoid malignancies; however, the functional roles and specific transcriptomes of NF-κB dimers with distinct subunit compositions have been unclear. Methods: Using combined ChIP-sequencing and microarray analyses, we determined the cistromes and target gene signatures of canonical and non-canonical NF-κB species in Hodgkin lymphoma (HL) cells. Results: We found that the various NF-κB subunits are recruited to regions with redundant κB motifs in a large number of genes. Yet canonical and non-canonical NF-κB dimers up- and downregulate gene sets that are both distinct and overlapping, and are associated with diverse biological functions. p50 and p52 are formed through NIK-dependent p105 and p100 precursor processing in HL cells and are the predominant DNA binding subunits. Logistic regression analyses of combinations of the p50, p52, RelA, and RelB subunits in binding regions that have been assigned to genes they regulate reveal a cross-contribution of p52 and p50 to canonical and non-canonical transcriptomes. These analyses also indicate that the subunit occupancy pattern of NF-κB binding regions and their distance from the genes they regulate are determinants of gene activation versus repression. The pathway-specific signatures of activated and repressed genes distinguish HL from other NF-κB-associated lymphoid malignancies and inversely correlate with gene expression patterns in normal germinal center B cells, which are presumed to be the precursors of HL cells. Conclusions: We provide insights that are relevant for lymphomas with constitutive NF-κB activation and generally for the decoding of the mechanisms of differential gene regulation through canonical and non-canonical NF-κB signaling. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-016-0280-5) contains supplementary material, which is available to authorized users.
DNA microarrays are used to simultaneously measure the levels of thousands of mRNAs in a sample. We illustrate here that a collection of such measurements in different cell types and states is a sound source of functional predictions, provided the microarray experiments are analogous and the cell samples are appropriately diverse. We have used this approach to study stem cells, whose identity and mechanisms of control are not well understood, generating Affymetrix microarray data from more than 200 samples, including stem cells and their derivatives, from human and mouse. The data can be accessed online (StemBase;