Sequence preferences of DNA-binding proteins are a primary mechanism by which cells interpret the genome. Despite these proteins' central importance in physiology, development, and evolution, comprehensive DNA-binding specificities have been determined experimentally for few proteins. Here, we used microarrays containing all 10-base-pair sequences to examine the binding specificities of 104 distinct mouse DNA-binding proteins representing 22 structural classes. Our results reveal a complex landscape of binding, with virtually every protein analyzed possessing unique preferences. Roughly half of the proteins each recognized multiple distinctly different sequence motifs, challenging our molecular understanding of how proteins interact with their DNA binding sites. This complexity in DNA recognition may be important in gene regulation and in evolution of transcriptional regulatory networks.The interactions between transcription factors (TFs) and their DNA binding sites are an integral part of the gene regulatory networks that control development, core cellular processes, and responses to environmental perturbations. However, only a handful of sequence-specific TFs have been characterized well enough to identify all the sequences that they can and, just as importantly, can not bind. Computational analysis of microarray readout of chromatin immunoprecipitation experiments (ChIP-chip) suggests extensive use of low affinity binding sites in yeast (1), and computational models of gene expression during fly embryonic development suggest that low affinity binding sites contribute as much as high affinity sites (2).The availability of TF binding data spanning the full affinity range would improve our understanding of the biophysical phenomena underlying protein-DNA recognition, and would improve accuracy in analyzing cis regulatory elements. Here we report the comprehensive determination of the DNA binding specificities of 104 known and predicted mouse TFs using the universal protein binding microarray (PBM) technology (3). These TFs represent 22 different DNA binding domain (DBD) structural classes that are the major DBD classes found in metazoan TFs.We created (4) N-terminal GST fusion constructs of the DBDs of 104 known and predicted mouse TFs (Fig. S1 and Table S1). Five of these proteins -Max, Bhlhb2, Gata3, Rfx3, and Sox7 -were also represented as full-length fusions to N-terminal GST, yielding a total set of 109 non-redundant proteins represented by 115 samples (5). Each protein was used in two PBM experiments (6,7) (Figs. S2, S3, S4 and Table S2). DNA binding site motifs initially were derived using the Seed-and-Wobble algorithm (3,8); Seed-and-Wobble first identifies the single 8-mer (ungapped or gapped) with the greatest PBM enrichment score (E-score) (3), and then systematically tests the relative preference of each nucleotide variant at each position both within and outside the seed (5). Later analyses incorporated additional motif finding algorithms, including RankMotif++ (9) and Kafal (5).Beyond simpl...
A transient inflammatory signal can initiate an epigenetic switch from non-transformed to cancer cells via a positive feedback loop involving NF-κB, Lin28, let-7, and IL6. We identify differentially regulated microRNAs important for this switch, and putative transcription factor binding sites in their promoters. STAT3, a transcription factor activated by IL6, directly activates miR-21 and miR-181b-1. Remarkably, transient expression of either microRNA induces the epigenetic switch. MiR-21 and miR-181b-1 respectively inhibit PTEN and CYLD tumor suppressors, leading to increased NF-κB activity required to maintain the transformed state. These STAT3-mediated regulatory circuits are required for the transformed state in diverse cell lines and tumor growth in xenografts, and their transcriptional signatures are observed in colon adenocarcinomas. Thus, STAT3 is not only a downstream target of IL6, but with miR-21, miR-181b-1, PTEN, and CYLD, is part of the positive feedback loop that underlies the epigenetic switch that links inflammation to cancer.
Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity and showing that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.