Combinatorial libraries of de novo amino acid sequences can provide a rich source of diversity for the discovery of novel proteins with interesting and important activities. Randomly generated sequences, however, rarely fold into well-ordered protein-like structures. To enhance the quality of a library, features of rational design must be used to focus sequence diversity into those regions of sequence space that are most likely to yield folded structures. This review describes how focused libraries can be constructed by designing the binary pattern of polar and nonpolar amino acids to favor proteins that contain abundant secondary structure, while simultaneously burying hydrophobic side chains and exposing hydrophilic side chains to solvent. The "binary code" for protein design was used to construct several libraries of de novo proteins, including both α-helical and β-sheet structures. The recently determined solution structure of a binary patterned four-helix bundle is well ordered, thereby demonstrating that sequences that have neither been selected by evolution (in vivo or in vitro) nor designed by computer can form native-like proteins. Examples are presented demonstrating how binary patterned libraries have successfully produced well-ordered structures, cofactor binding, catalytic activity, self-assembled monolayers, amyloid-like nanofibrils, and protein-based biomaterials.
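The binary patterning idea described above can be illustrated with a minimal Python sketch. It is not the authors' library construction method: the two residue sets, the 14-position helix pattern, the 7-position strand pattern, and all function names are assumptions chosen for illustration. Each position in a pattern is marked polar (P) or nonpolar (N), and a library member is sampled by picking any residue from the matching set, so the pattern is fixed while the exact sequence varies.

```python
import random

# Assumed residue sets for illustration; the text above does not list them.
POLAR = "KHEQDN"      # hypothetical polar set
NONPOLAR = "MLIVF"    # hypothetical nonpolar set

# Hypothetical binary patterns: P = polar position, N = nonpolar position.
# The helix pattern places nonpolar residues every 3 to 4 positions so they
# fall on one face of a helix; the strand pattern simply alternates.
HELIX_PATTERN = "PNPPNNPPNPPNNP"
STRAND_PATTERN = "PNPNPNP"


def sample_sequence(pattern, rng):
    """Draw one sequence whose polar/nonpolar pattern matches `pattern`."""
    residues = []
    for symbol in pattern:
        if symbol == "P":
            residues.append(rng.choice(POLAR))
        elif symbol == "N":
            residues.append(rng.choice(NONPOLAR))
        else:
            raise ValueError(f"unknown pattern symbol: {symbol!r}")
    return "".join(residues)


def binary_pattern(sequence):
    """Recover the P/N pattern of a sequence built from the two sets."""
    symbols = []
    for aa in sequence:
        if aa in POLAR:
            symbols.append("P")
        elif aa in NONPOLAR:
            symbols.append("N")
        else:
            raise ValueError(f"residue outside both sets: {aa!r}")
    return "".join(symbols)


def library_size(pattern):
    """Count the distinct sequences consistent with a pattern."""
    size = 1
    for symbol in pattern:
        size *= len(POLAR) if symbol == "P" else len(NONPOLAR)
    return size


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        seq = sample_sequence(HELIX_PATTERN, rng)
        print(seq, binary_pattern(seq))
    print("sequences consistent with the helix pattern:",
          library_size(HELIX_PATTERN))
```

Under these assumptions, every sampled sequence differs in identity but shares the same polar/nonpolar pattern, which is the sense in which diversity is focused rather than random.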
A central challenge of synthetic biology is to enable the growth of living systems using parts that are not derived from nature, but designed and synthesized in the laboratory. As an initial step toward achieving this goal, we probed the ability of a collection of >10⁶ de novo designed proteins to provide biological functions necessary to sustain cell growth. Our collection of proteins was drawn from a combinatorial library of 102-residue sequences, designed by binary patterning of polar and nonpolar residues to fold into stable 4-helix bundles. We probed the capacity of proteins from this library to function in vivo by testing their abilities to rescue 27 different knockout strains of Escherichia coli, each deleted for a conditionally essential gene. Four different strains – ΔserB, ΔgltA, ΔilvA, and Δfes – were rescued by specific sequences from our library. Further experiments demonstrated that a strain simultaneously deleted for all four genes was rescued by co-expression of four novel sequences. Thus, cells deleted for ∼0.1% of the E. coli genome (and ∼1% of the genes required for growth under nutrient-poor conditions) can be sustained by sequences designed de novo.
To probe the potential for enzymatic activity in unevolved amino acid sequence space, we created a combinatorial library of de novo 4-helix bundle proteins. This collection of novel proteins can be considered an "artificial superfamily" of helical bundles. The superfamily of 102-residue proteins was designed using binary patterning of polar and nonpolar residues, and expressed in Escherichia coli from a library of synthetic genes. Sequences from the library were screened for a range of biological functions including heme binding and peroxidase, esterase, and lipase activities. Proteins exhibiting these functions were purified and characterized biochemically. The majority of de novo proteins from this superfamily bound the heme cofactor, and a sizable fraction of the proteins showed activity significantly above background for at least one of the tested enzymatic activities. Moreover, several of the designed 4-helix bundle proteins showed activity in all of the assays, thereby demonstrating the functional promiscuity of unevolved proteins. These studies reveal that de novo proteins, which have neither been designed for function nor subjected to evolutionary pressure (either in vivo or in vitro), can provide rudimentary activities and serve as a "feedstock" for evolution.
To probe the potential for activity in unevolved amino acid sequence space, we created a third-generation combinatorial library of de novo four-helix bundle proteins. The "artificial superfamily" of helical bundles was designed using binary patterning of polar and nonpolar residues, and expressed in Escherichia coli from a library of synthetic genes. WA20, picked from the library, is one of the most stable proteins in the superfamily, and has rudimentary activities such as esterase and lipase. Here we report the crystal structure of WA20, determined by the multiwavelength anomalous dispersion method. Unexpectedly, the WA20 crystal structure is not a monomeric four-helix bundle, but a dimeric four-helix bundle. Each monomer comprises two long α-helices that intertwist with the helices of the other monomer. The two monomers together form a 3D domain-swapped four-helix bundle dimer. In addition, there are two hydrophobic pockets, which may provide substrate binding sites. Small-angle X-ray scattering shows that the molecular weight of WA20 is ~25 kDa and the shape is rod-like (maximum length, Dmax, of ~8 nm), indicating that WA20 forms a dimeric four-helix bundle in solution. These results demonstrate that our de novo protein library contains not only simple monomeric proteins, but also stable and functional multimeric proteins.