Background
How, and the extent to which, evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability is a long-standing open question in the field of molecular evolution. We addressed this issue through the first structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures, as well as through available experimental stability and fitness data.
Results
At the amino acid level, we found the protein surface to be more robust against random mutations than the core, this difference being stronger for small proteins. The destabilizing and neutral mutations are more numerous in the core and on the surface, respectively, whereas the stabilizing mutations are about 4% in both regions. At the genetic code level, we observed smallest destabilization for mutations that are due to substitutions of base III in the codon, followed by base I, bases I+III, base II, and other multiple base substitutions. This ranking highly anticorrelates with the codon-anticodon mispairing frequency in the translation process. This suggests that the standard genetic code is optimized to limit the impact of random mutations, but even more so to limit translation errors. At the codon level, both the codon usage and the usage bias appear to optimize mutational robustness and translation accuracy, especially for surface residues.
Conclusion
Our results highlight the non-universality of mutational robustness and its multiscale dependence on protein features, the structure of the genetic code, and the codon usage. Our analyses and approach are strongly supported by available experimental mutagenesis data.
The question of how natural evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability has been asked for decades without definitive answer. We tackled this issue through a structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures. The validity of our results are supported by a very good agreement with experimental mutagenesis data. At the amino acid level, we found the protein surface to be more robust to mutations than the core, in a protein length-dependent manner. About 4% of all mutations were shown to be stabilizing, and a majority of mutations on the surface and in the core to be neutral and destabilizing, respectively. At the nucleobase level, single base substitutions were shown to yield on average less destabilizing amino acid mutations than multiple base substitutions. More precisely, the smallest average destabilization occurs for substitutions of base III in the codon, followed by base I, bases I+III, and base II. This ranking highly anticorrelates with the frequency of codon-anticodon mispairing, and suggests that the standard genetic code is optimized more to limit translation errors than the impact of random mutations. Moreover, the codon usage also appears to be optimized for minimizing the errors at the protein level, especially for surface residues that evolve faster and have therefore been under stronger selection, and for biased codons, suggesting that the codon usage bias also partly aims to optimize protein mutational robustness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.