Fitness landscapes1,2, depictions of how genotypes manifest at the phenotypic level, form the basis for our understanding of many areas of biology2–7, yet their properties remain elusive. Studies addressing this issue often consider specific genes and their function as a proxy for fitness2,4, experimentally assessing the impact on function of single mutations and their combinations in a specific sequence2,5,8–15 or in different sequences2,3,5,16–18. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported. Here, we chart an extensive region of the local fitness landscape of the green fluorescent protein from Aequorea victoria (avGFP) by measuring the native function, fluorescence, of tens of thousands of derivative genotypes of avGFP. We find that its fitness landscape is narrow, with half of genotypes with two mutations showing reduced fluorescence and half of genotypes with five mutations being completely non-fluorescent. The narrowness is enhanced by epistasis, which was detected in up to 30% of genotypes with multiple mutations, arising mostly through the cumulative impact of slightly deleterious mutations causing a threshold-like decrease of protein stability and concomitant loss of fluorescence. A model of orthologous sequence divergence spanning hundreds of millions of years predicted the extent of epistasis in our data, indicating congruence between the fitness landscape properties at the local and global scales. The characterization of the local fitness landscape of avGFP has important implications for a number of fields including molecular evolution, population genetics and protein design.
Guided by the recent success of an empirical model predicting the folding rates of small two-state folding proteins from the relative contact order (CO) of their native structures, by a theoretical model of protein folding that predicts that the logarithm of the folding rate decreases with the protein chain length $L$ as $L^{2/3}$, and by the finding that the folding rates of multistate folding proteins correlate strongly with their sizes but very poorly with CO, we reexamined the dependence of folding rate on CO and $L$ in an attempt to find a structural parameter that determines folding rates for the totality of proteins. We show that the absolute contact order, AbsCO = CO × $L$, predicts folding rates rather accurately for both two-state and multistate folding proteins, as well as short peptides, and that this AbsCO scales with the protein chain length as $L^{0.70 \pm 0.07}$ for the totality of studied single-domain proteins and peptides.
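As an illustration of the relation AbsCO = CO × L, the following minimal Python sketch computes both quantities from a list of residue contacts. It assumes the commonly used definition of relative contact order, the mean sequence separation of contacting residue pairs divided by the chain length; the abstract itself does not state how contacts are defined, so the contact list, the function names and the example values are hypothetical.

```python
from typing import Iterable, Tuple


def relative_contact_order(contacts: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """Relative contact order: mean sequence separation of contacts, divided by L.

    Assumption: `contacts` holds (i, j) residue-index pairs already judged to be
    in contact by some distance criterion that the abstract does not specify.
    """
    pairs = list(contacts)
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not pairs:
        raise ValueError("at least one contact is required")
    total_separation = sum(abs(i - j) for i, j in pairs)
    return total_separation / (len(pairs) * chain_length)


def absolute_contact_order(contacts: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """AbsCO = CO x L, i.e. the mean sequence separation of the contacts."""
    return relative_contact_order(contacts, chain_length) * chain_length


# Hypothetical toy input: three contacts in a 20-residue chain.
toy_contacts = [(1, 5), (2, 10), (3, 4)]
co = relative_contact_order(toy_contacts, 20)
abs_co = absolute_contact_order(toy_contacts, 20)
```

Under this definition, multiplying by L cancels the normalization by chain length, so AbsCO reduces to the average sequence separation between contacting residues.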
We present a method for predicting folding rates of proteins from their amino acid sequences only, or rather, from their chain lengths and their helicity predicted from their sequences. The method achieves 82% correlation with experiment over all 64 "two-state" and "multistate" proteins (including two artificial peptides) studied up to now.