The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanisms: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one that involves sequence-dependent deformations of the DNA helix. By comprehensively analyzing the three-dimensional structures of protein-DNA complexes, we show that the binding of arginines to narrow minor grooves is a widely used mode of protein-DNA recognition. This readout mechanism exploits the fact that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a striking example of this effect. Minor groove narrowing is often associated with the presence of A-tracts, which are AT-rich sequences that exclude the flexible TpA step. These findings suggest that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA binding specificity.
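The A-tract definition above (an AT-rich stretch that contains no TpA step) can be illustrated with a short Python sketch. It is not the authors' method: the function name `find_a_tracts` and the minimum length `min_len=4` are assumptions chosen for illustration, since the abstract gives no length threshold. The sketch only locates candidate A-tracts in a sequence; it does not predict minor groove width or electrostatic potential.

```python
import re


def find_a_tracts(seq, min_len=4):
    """Return (start, end, tract) tuples for A/T runs that contain no TpA step.

    Coordinates are 0-based and end-exclusive. `min_len` is an assumed
    threshold, not a value taken from the abstract.
    """
    seq = seq.upper()
    tracts = []
    # First find maximal runs made up only of A and T.
    for match in re.finditer(r"[AT]+", seq):
        run = match.group()
        offset = match.start()
        start = 0
        # Split the run at every TpA step (a T immediately followed by an A).
        for i in range(len(run) - 1):
            if run[i] == "T" and run[i + 1] == "A":
                if i + 1 - start >= min_len:
                    tracts.append((offset + start, offset + i + 1, run[start:i + 1]))
                start = i + 1
        # Keep the final piece of the run if it is long enough.
        if len(run) - start >= min_len:
            tracts.append((offset + start, offset + len(run), run[start:]))
    return tracts
```

For example, `find_a_tracts("GCAAATTTGC")` reports the single tract `AAATTT`, whereas `find_a_tracts("GCTTTAAAGC")` reports nothing at the default threshold, because the TpA step splits the run into two three-base pieces.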
Specific interactions between proteins and DNA are fundamental to many biological processes. In this review, we provide a revised view of protein-DNA interactions that emphasizes the importance of the three-dimensional structures of both macromolecules. We divide protein-DNA interactions into two categories: those where the protein recognizes the unique chemical signatures of the DNA bases (base readout) and those where the protein recognizes a sequence-dependent DNA shape (shape readout). We further divide base readout into interactions that occur in the major groove and those that occur in the minor groove. Analogously, the readout of DNA shape is subdivided into global shape recognition, for example when the DNA helix exhibits an overall bend, and local shape recognition, for example when a base pair step is kinked or when a region of the minor groove is narrow. Based on the >1500 structures of protein-DNA complexes now available in the Protein Data Bank, we argue that individual DNA-binding proteins combine multiple readout mechanisms to achieve DNA binding specificity. Specificity that distinguishes between families frequently involves base readout in the major groove, while shape readout is often exploited for higher-resolution specificity, to distinguish between members within the same DNA-binding protein family.
It has been known for some time that the double helix is not a uniform structure but rather exhibits sequence-specific variations that, combined with base-specific intermolecular interactions, offer the possibility of numerous modes of protein-DNA recognition. All-atom simulations have revealed mechanistic insights into the structural and energetic basis of various recognition mechanisms for a number of protein-DNA complexes, while coarser-grained simulations have begun to provide an understanding of the function of larger assemblies. Molecular simulations have also been applied to the prediction of transcription factor binding sites, while empirical approaches have been developed to predict nucleosome positioning. Studies that combine and integrate experimental, statistical, and computational data offer the promise of rapid advances in our understanding of protein-DNA recognition mechanisms.