Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions.

In higher multicellular organisms, cell-type-specific nuclear machinery has an uncanny ability to direct precise patterns of gene expression through the recognition of arrays of cis-elements specified by primary DNA sequence lying in the context of higher-order chromatin structure.
Computationally, however, we have little ability to identify cis-regulatory regions from primary sequence and even less ability to predict cellular compartments into which expression is specified. Improved ability to do so will advance our understanding of eukaryotic gene regulatory mechanisms, facilitate improved annotation of the genome, and provide insight into the potential effects of sequence polymorphisms on gene expression patterns. Moreover, improved understanding of cis-element clusters present in coordinately regulated gene groups may allow for the prediction of gene regulatory network behaviors during development, homeostasis, and disease. To exploit the potential power of conserved cis-element clusters to contribute to our understanding of eukaryotic gene regulation, it is critical to create database resources from multiple sequence analysis methods on the basis of both phylogenetic conservation and known binding-site matches that can be mined for patterns that correlate with experimentally gathered expression profile data. As shown by man...
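The hit density defined in the abstract (co-occurring TF-binding sites counted in a 200-bp moving window through conserved regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the input format, the helper names `shared_sites` and `window_hits`, the 50-bp step, and the reading of "co-occurring" as "the same TF name is predicted in both orthologs" are all assumptions made for illustration, and the example sites are invented.

```python
from typing import List, Tuple

Site = Tuple[int, str]       # (position in bp, TF name), hypothetical format
Block = Tuple[int, int]      # conserved interval [start, end), hypothetical format


def shared_sites(sites_a: List[Site], sites_b: List[Site]) -> List[Site]:
    """Keep sites from sequence A whose TF name is also predicted in sequence B.

    Assumption: "co-occurring" means the TF name appears in both orthologs.
    """
    names_b = {name for _, name in sites_b}
    return [(pos, name) for pos, name in sites_a if name in names_b]


def window_hits(sites: List[Site], conserved: List[Block],
                window: int = 200, step: int = 50) -> List[Tuple[int, int]]:
    """Count sites that fall in conserved blocks, per sliding window.

    Returns (window_start, hit_count) pairs. The 200-bp window comes from
    the abstract; the step size is an assumption.
    """
    if not conserved:
        return []
    # Discard sites that lie outside every conserved block.
    kept = [pos for pos, _ in sites
            if any(start <= pos < end for start, end in conserved)]
    first = min(start for start, _ in conserved)
    last = max(end for _, end in conserved)
    hits = []
    for w in range(first, max(first + 1, last - window + 1), step):
        count = sum(1 for pos in kept if w <= pos < w + window)
        hits.append((w, count))
    return hits


# Invented example data for two aligned orthologous regions.
sites_a = [(10, "SP1"), (120, "GATA"), (250, "MYOD"), (900, "SP1")]
sites_b = [(15, "SP1"), (130, "GATA"), (400, "ETS")]
conserved_a = [(0, 400)]

profile = window_hits(shared_sites(sites_a, sites_b), conserved_a)
print(profile)
```

Plotting `profile` as hit count against window start for each of the two sequences would give a rough analogue of the Regulogram described in the abstract; the published method may weight or normalize hits differently.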