GTRD, the Gene Transcription Regulation Database (http://gtrd.biouml.org), is a database of transcription factor binding sites (TFBSs) identified by ChIP-seq experiments for human and mouse. Raw ChIP-seq data were obtained from ENCODE and SRA and uniformly processed: (i) reads were aligned using Bowtie2; (ii) ChIP-seq peaks were called using the peak callers MACS, SISSRs, GEM and PICS; (iii) peaks for the same factor and peak caller, but from different experimental conditions (cell line, treatment, etc.), were merged into clusters; (iv) the clusters from different peak callers were merged into metaclusters, which were treated as non-redundant sets of TFBSs. In addition to genomic location, the sets contain structured information about cell lines and experimental conditions extracted from the descriptions of the corresponding ChIP-seq experiments. A web interface to GTRD was developed using the BioUML platform. It provides: (i) browsing and display of information; (ii) advanced search, e.g. a search for TFBSs near a specified gene or for all genes potentially regulated by a specified transcription factor; (iii) an integrated genome browser that visualizes the GTRD data (read alignments, peaks, clusters and metaclusters) together with gene structures from the Ensembl database and binding sites predicted using position weight matrices from the HOCOMOCO database.
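The two merging steps, (iii) and (iv), can be illustrated with a minimal sketch. It assumes that "merging" means taking the union of overlapping genomic intervals, which the abstract does not specify; the function names `merge_intervals` and `build_metaclusters` and the tuple layout are hypothetical and are not part of GTRD.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into their unions.

    Assumption: two intervals are merged when they share at least one
    position. The real GTRD criterion may differ.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_metaclusters(peaks):
    """peaks: iterable of (factor, caller, chrom, start, end) tuples.

    Step (iii): merge peaks per factor and peak caller into clusters.
    Step (iv): merge the clusters of all callers into metaclusters.
    Returns {(factor, chrom): [(start, end), ...]}.
    """
    per_caller = defaultdict(list)
    for factor, caller, chrom, start, end in peaks:
        per_caller[(factor, chrom, caller)].append((start, end))

    per_factor = defaultdict(list)
    for (factor, chrom, _caller), intervals in per_caller.items():
        per_factor[(factor, chrom)].extend(merge_intervals(intervals))

    return {key: merge_intervals(ivs) for key, ivs in per_factor.items()}
```

For example, given two overlapping peaks from one caller and a third peak from another caller that overlaps their union, `build_metaclusters` would return a single metacluster spanning all three, while a distant peak stays separate.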
Different signal transduction pathways leading to the activation of transcription factors (TFs) converge at key molecules that master the regulation of many cellular processes. Such crossroads of signalling networks often act as "Achilles' heels", causing disease when they do not function properly. Novel computational tools are needed to analyse gene expression data in the context of signal transduction and gene regulatory pathways and to identify the key nodes in these networks. An integrated computational system, ExPlain (www.biobase.de), was developed for causal interpretation of gene expression data and identification of key signalling molecules. The system uses data from two databases (TRANSFAC and TRANSPATH) and integrates two programs: (1) Composite Module Analyst (CMA) analyses 5'-upstream regions of co-expressed genes and applies a genetic algorithm to reveal composite modules (CMs) consisting of co-occurring single TF binding sites and composite elements; (2) ArrayAnalyzer is a fast network search engine that analyses the signal transduction networks controlling the activities of the corresponding TFs and seeks key molecules responsible for the observed concerted gene activation. The ExPlain system was applied to microarray data on inflammatory bowel diseases (IBD). The results suggest a number of interesting biological hypotheses about the molecular mechanisms of pathological genetic dysregulation.
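The idea of searching a signalling network for key molecules upstream of a set of TFs can be sketched as follows. This is not ArrayAnalyzer's algorithm, which the abstract does not describe; it is a simple illustration that assumes a directed edge list, walks the network backwards from each TF with a breadth-first search, and ranks upstream molecules by how many TFs they can reach. All names (`upstream_within`, `rank_key_nodes`, `max_steps`) are hypothetical.

```python
from collections import defaultdict, deque


def upstream_within(graph_rev, source, max_steps):
    """Return the nodes that can reach `source` in at most max_steps edges.

    graph_rev maps each node to the nodes that signal directly to it.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the step limit
        for parent in graph_rev.get(node, ()):
            if parent not in distance:
                distance[parent] = distance[node] + 1
                queue.append(parent)
    del distance[source]
    return set(distance)


def rank_key_nodes(edges, tfs, max_steps=3):
    """Rank upstream molecules by the number of TFs they can reach.

    edges: iterable of (source, target) pairs, signal flowing source -> target.
    Returns a list of (node, tf_count), highest count first, ties by name.
    """
    graph_rev = defaultdict(list)
    for src, dst in edges:
        graph_rev[dst].append(src)

    counts = defaultdict(int)
    for tf in tfs:
        for node in upstream_within(graph_rev, tf, max_steps):
            counts[node] += 1

    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

A molecule that sits upstream of many of the TFs implicated by the promoter analysis ranks highest here, which is one plausible way to read "key node"; a real system would likely also weight path length and the size of the overall network.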