Dissection of regulatory networks that control gene transcription is one of the greatest challenges of functional genomics. Using human genomic sequences, models for binding sites of known transcription factors, and gene expression data, we demonstrate that the reverse engineering approach, which infers regulatory mechanisms from gene expression patterns, can reveal transcriptional networks in human cells. To date, such methodologies have been successfully demonstrated only in prokaryotes and lower eukaryotes. We developed computational methods for identifying putative binding sites of transcription factors and for evaluating the statistical significance of their prevalence in a given set of promoters. Focusing on transcriptional mechanisms that control cell cycle progression, our computational analyses revealed eight transcription factors whose binding sites are significantly overrepresented in promoters of genes whose expression is cell-cycle-dependent. The enrichment of some of these factors is specific to certain phases of the cell cycle. In addition, several pairs of these transcription factors show a significant co-occurrence rate in cell-cycle-regulated promoters. Each such pair indicates functional cooperation between its members in regulating the transcriptional program associated with cell cycle progression. The methods presented here are general and can be applied to the analysis of transcriptional networks controlling any biological process.
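The abstract does not say which statistic was used to score the prevalence of binding sites in a promoter set. As an illustration only, the sketch below uses a hypergeometric tail test, a common choice for this kind of overrepresentation question. The function names (`hypergeom_tail`, `site_enrichment`) and the gene identifiers are hypothetical and do not come from the paper.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: number of background promoters
    successes:  background promoters containing a hit for the factor
    draws:      size of the target promoter set
    observed:   target promoters containing a hit
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


def site_enrichment(genes_with_site, target_genes, background_genes):
    """Count target promoters with a site and return (count, p-value).

    Assumption: each promoter is reduced to "has at least one putative
    site" or "has none"; the paper's own scoring may differ.
    """
    background = set(background_genes)
    target = set(target_genes) & background
    with_site = set(genes_with_site) & background
    observed = len(target & with_site)
    p_value = hypergeom_tail(len(background), len(with_site), len(target), observed)
    return observed, p_value


# Toy data: 10 background promoters, 4 of which carry a putative site.
background = [f"g{i}" for i in range(10)]
with_site = ["g0", "g1", "g2", "g3"]
cell_cycle_set = ["g0", "g1", "g9"]
count, p = site_enrichment(with_site, cell_cycle_set, background)
```

The same test could in principle be reused for pairs of factors by treating "promoter contains sites for both factors" as the success condition, although this does not account for the individual enrichment of each factor.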
We present a threefold contribution to the computational task of motif discovery, a key component in the effort to delineate the regulatory map of a genome: (1) We constructed a comprehensive, large-scale, publicly available compendium of transcription factor and microRNA target gene sets derived from diverse high-throughput experiments in several metazoans. We used the compendium as a benchmark for motif discovery tools. (2) We developed Amadeus, a highly efficient, user-friendly software platform for genome-scale detection of novel motifs, applicable to a wide range of motif discovery tasks. Amadeus improves upon extant tools in terms of accuracy, running time, output information, and ease of use, and is the only program that attained a high success rate on the metazoan compendium. (3) We demonstrate that by searching for motifs based on their genome-wide localization or chromosomal distributions (without using a predefined target set), Amadeus uncovers diverse known phenomena, as well as novel regulatory motifs.
A major challenge in the analysis of gene expression microarray data is to extract meaningful biological knowledge out of the huge volume of raw data. Expander (EXPression ANalyzer and DisplayER) is an integrated software platform for the analysis of gene expression data, which is freely available for academic use. It is designed to support all the stages of microarray data analysis, from raw data normalization to inference of transcriptional regulatory networks. The microarray analysis described in this protocol starts with importing the data into Expander 5.0 and is followed by normalization and filtering. Then, clustering and network-based analyses are performed. The gene groups identified are tested for enrichment in function (based on Gene Ontology), co-regulation (using transcription factor and microRNA target predictions) or co-location. The results of each analysis step can be visualized in a number of ways. The complete protocol can be executed in approximately 1 h.
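The normalization and filtering stages mentioned above can be pictured with a small, generic sketch. This is not Expander's implementation: the abstract does not specify its methods, so the choice of log2 transformation, per-sample median centering, and a range-based filter, as well as the function names and the `min_range` threshold, are assumptions made for illustration.

```python
from math import log2
from statistics import median


def log2_median_center(matrix):
    """Log2-transform intensities and subtract each sample's median.

    matrix: list of rows (one per gene), each a list of positive
    intensities (one per sample).
    """
    logged = [[log2(value) for value in row] for row in matrix]
    n_samples = len(logged[0])
    col_medians = [median(row[j] for row in logged) for j in range(n_samples)]
    return [[row[j] - col_medians[j] for j in range(n_samples)] for row in logged]


def filter_by_range(gene_ids, matrix, min_range=1.0):
    """Keep genes whose normalized values span at least min_range.

    min_range is a hypothetical threshold in log2 units.
    """
    return [
        (gene, row)
        for gene, row in zip(gene_ids, matrix)
        if max(row) - min(row) >= min_range
    ]


# Toy data: three genes measured in two samples.
raw = [[2.0, 4.0], [8.0, 8.0], [32.0, 64.0]]
normalized = log2_median_center(raw)
kept = filter_by_range(["a", "b", "c"], normalized, min_range=1.0)
```

Genes that pass such a filter would then feed into the clustering and enrichment steps; the flat gene `b` is dropped here because its values do not vary across samples.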
Background: Gene expression microarrays are a prominent experimental tool in functional genomics that has opened the opportunity for gaining a global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. Therefore, the task of mining meaningful biological knowledge out of the raw data is a major challenge in bioinformatics. There is a particular need for integrative packages that provide biologists with an advanced yet easy-to-use set of algorithms that together cover the whole range of steps in microarray data analysis.