Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials [1,2]. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks [3,4]. The overall architecture of tandem repeat protein structures, which is dictated by the internal geometry and local packing of the repeat building blocks, is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners [5–9], to closed and compact conformations with internal cavities suitable for small-molecule binding and catalysis [10]. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. We have applied these methods to design a series of closed alpha-solenoid [11] repeat structures (alpha-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the N- and C-termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering [12–20], our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed alpha-solenoid repeats with a left-handed helical architecture that, to our knowledge, is not yet present in the protein structure database [21].
Summary: LAGLIDADG meganucleases are DNA-cleaving enzymes used for genome engineering. While their cleavage specificity can be altered using several protein engineering and selection strategies, their overall ‘targetability’ is limited by highly specific indirect recognition of the central four basepairs within their recognition sites. In order to examine the physical basis of indirect sequence recognition and to expand the number of such nucleases available for genome engineering, we have determined the target sites, DNA-bound structures and ‘central four’ cleavage fidelities of nine related enzymes. Subsequent crystallographic analyses of a meganuclease bound to two noncleavable target sites, each containing a single inactivating basepair substitution at its center, indicate that a localized slip of the mutated basepair causes a small change in the DNA backbone conformation that results in a loss of metal occupancy at one binding site, eliminating cleavage activity.
The retargeting of protein–DNA specificity, outside of extremely modular DNA-binding proteins such as TAL effectors, has generally proved to be quite challenging. Here, we describe structural analyses of five different extensively retargeted variants of a single homing endonuclease that have been shown to function efficiently in ex vivo and in vivo applications. The redesigned proteins harbor mutations at up to 53 residues (18%) of their amino acid sequence, primarily distributed across the DNA-binding surface, making them among the most significantly reengineered ligand-binding proteins to date. Specificity is derived from the combined contributions of DNA-contacting residues and of neighboring residues that influence local structural organization. Changes in specificity are facilitated by the ability of all those residues to readily exchange both form and function. The fidelity of recognition is not precisely correlated with the fraction or total number of residues in the protein–DNA interface that are actually involved in DNA contacts, including directional hydrogen bonds. The plasticity of the DNA-recognition surface of this protein, which allows substantial retargeting of recognition specificity without requiring significant alteration of the surrounding protein architecture, reflects the ability of the corresponding genetic elements to maintain mobility and persistence in the face of genetic drift within potential host target sites.
Circular tandem repeat proteins (‘cTRPs’) are de novo designed protein scaffolds (in this and prior studies, based on antiparallel two-helix bundles) that contain repeated protein sequences and structural motifs and form closed circular structures. They can display significant stability and solubility, span a wide range of sizes, and are useful as protein display particles for biotechnology applications. However, cTRPs also demonstrate inefficient self-assembly from smaller subunits. In this study, we describe a new generation of cTRPs, with longer repeats and increased interaction surfaces, which enhanced the self-assembly of two significantly different sizes of homotrimeric constructs. Finally, we demonstrated functionalization of these constructs with (1) a hexameric array of peptide-binding SH2 domains, and (2) a trimeric array of anti-SARS-CoV-2 VHH domains. The latter proved capable of sub-nanomolar binding affinities towards the viral receptor binding domain and potent viral neutralization function.