Many native structures of proteins accommodate complex topological motifs such as knots, lassos, and other geometrical entanglements. How proteins can fold quickly even in the presence of such topological obstacles is a debated question in structural biology. Recently, the hypothesis that energetic frustration might be a mechanism to avoid topological frustration has been put forward, based on the empirical observation that loops involved in entanglements are stabilized by weak interactions between amino acids at their extrema. To test this idea, we use a toy lattice model for the folding of proteins into two almost identical structures, one entangled and one not. As expected, the folding time is longer when random sequences fold into the entangled structure. This also holds under an evolutionary pressure simulated by optimizing the folding time. It turns out that optimized protein sequences in the entangled structure are in fact characterized by frustrated interactions at the closures of entangled loops. This phenomenon is much less pronounced in the control case where the entanglement is not present. Our findings, which are in agreement with experimental observations, corroborate the idea that an evolutionary pressure shapes the folding funnel to avoid topological and kinetic traps.
I. INTRODUCTION

The biological function of most proteins requires them to fold into a well-defined native state, implying that both structure maintenance and efficient folding are kept under selective pressure by evolutionary processes [1]. In particular, direct experimental evidence pointing to some degree of folding rate optimization throughout evolution was recently provided for ribonuclease H, using ancestral sequence reconstruction [2]. Bioinformatics analyses had already uncovered similar evolutionary signals two decades ago for several folds [3], and more recently for a large catalog of protein domains [4].

The latter study was based on the well-known empirical correlation between experimentally measured folding rates of proteins and simple descriptors of the structural organization of the native state [5]. More general features of the folding mechanism are also dictated by the overall topology of the native state [6]. In fact, contact order [7] and other related descriptors are based on the topological properties of the network formed by pairs of residues that are nearby in three-dimensional space [8]. The simpler the network, the faster the predicted folding. The topology of the network of contacts, however, does not necessarily capture the topology of the protein backbone seen as a curve in three-dimensional space, or the possible formation of knots and other entangled motifs.

The discovery of knots in a few proteins [9] indeed came as a surprise, because they seem an unnecessary complication for folding. Their presence could be related to some biological function or stability requirement [10,11], and the mechanisms allowing the dynamics to thread the protein backbone to form knots are under intense inves...