Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.
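The closing claim, that locally elevated TF concentration can make low-affinity sites functional, can be illustrated with the standard equilibrium occupancy relation for simple 1:1 binding, occupancy = [TF] / ([TF] + Kd). The sketch below is a minimal illustration under that assumption only. The function name and all numeric values (dissociation constants and concentrations) are hypothetical and are not taken from the work summarized above.

```python
def occupancy(tf_conc_nm: float, kd_nm: float) -> float:
    """Equilibrium fractional occupancy of one site, assuming simple 1:1 binding.

    tf_conc_nm: free TF concentration (nM); kd_nm: dissociation constant (nM).
    """
    if tf_conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return tf_conc_nm / (tf_conc_nm + kd_nm)


# Hypothetical values chosen only for illustration (not from the source).
HIGH_AFFINITY_KD_NM = 1.0
LOW_AFFINITY_KD_NM = 100.0  # 100-fold weaker site

if __name__ == "__main__":
    # Compare a nucleus-wide average concentration with locally enriched ones.
    for conc_nm in (1.0, 10.0, 100.0):
        high = occupancy(conc_nm, HIGH_AFFINITY_KD_NM)
        low = occupancy(conc_nm, LOW_AFFINITY_KD_NM)
        print(f"[TF] = {conc_nm:6.1f} nM  high-affinity: {high:.3f}  low-affinity: {low:.3f}")
```

Under these assumed numbers, the low-affinity site is essentially unoccupied at the lowest concentration and reaches half occupancy only when the local concentration equals its Kd, which is the qualitative behavior the model relies on.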
Significance
One-tenth of human genes produce proteins called transcription factors (TFs) that bind to our genome and read the local DNA sequence. They work together to regulate the degree to which each gene is expressed. The affinity with which DNA is bound by a particular TF can vary more than a thousand-fold with different DNA sequences. This study presents the first computational method able to quantify the sequence-affinity relationship almost perfectly over the full affinity range. It achieves this by analyzing data from experiments that use massively parallel DNA sequencing to comprehensively probe protein–DNA interactions. Strikingly, it can accurately predict the effect in vivo of DNA mutations on gene expression levels in fly embryos even for very-low-affinity binding sites.
SUMMARY
Although DNA modifications play an important role in gene regulation, the underlying mechanisms remain elusive. We developed EpiSELEX-seq to probe the sensitivity of transcription factor binding to DNA modification in vitro using massively parallel sequencing. Feature-based modeling quantifies the effect of cytosine methylation (5mC) on binding free energy in a position-specific manner. Application to the human bZIP proteins ATF4 and C/EBPβ, and three different Pbx-Hox complexes shows that 5mCpG can both increase and decrease affinity, depending on where the modification occurs within the protein-DNA interface. The TF paralogs tested vary in their methylation sensitivity, for which we provide a structural rationale. We show that 5mCpG can also enhance in vitro p53 binding, and provide evidence for increased in vivo p53 occupancy at methylated binding sites, correlating with primed-enhancer histone marks. Our results establish a powerful strategy for dissecting epigenomic modulation of protein-DNA interactions and their role in gene regulation.
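To make the link between affinity changes and binding free energy concrete, the sketch below converts a relative affinity (methylated over unmethylated) into a free-energy difference using the standard thermodynamic relation ΔΔG = -RT ln(K_rel). This is a generic conversion, not the EpiSELEX-seq feature-based model itself. The function name, the default temperature, and the example ratios are assumptions introduced for illustration.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # R in kcal/(mol*K)


def ddg_from_relative_affinity(rel_affinity: float, temp_k: float = 298.15) -> float:
    """Return the binding free-energy change (kcal/mol) for a relative affinity.

    rel_affinity is assumed to be Ka(modified) / Ka(unmodified).
    Negative values mean the modification strengthens binding.
    """
    if rel_affinity <= 0:
        raise ValueError("relative affinity must be positive")
    return -GAS_CONSTANT_KCAL * temp_k * math.log(rel_affinity)


if __name__ == "__main__":
    # Hypothetical ratios: a methyl mark that doubles, leaves unchanged,
    # or halves affinity at a given position in the interface.
    for ratio in (2.0, 1.0, 0.5):
        print(f"relative affinity {ratio:4.1f} -> ddG = {ddg_from_relative_affinity(ratio):+.2f} kcal/mol")
```

Because the relation is logarithmic, equal fold-increases and fold-decreases in affinity give free-energy changes of equal magnitude and opposite sign, which is why position-specific effects are naturally expressed in energy units.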