Structural classification shows that the number of different protein folds is surprisingly small. It also appears that proteins are built in a modular fashion, from a relatively small number of components. Here we propose to identify the modular building blocks of proteins with the dark soliton solution of a generalized discrete nonlinear Schrödinger equation. For this we show that practically all protein loops can be obtained simply by scaling the size and by joining together a number of copies of the soliton, one after another. The soliton has only two loop specific parameters and we identify their possible values in Protein Data Bank. We show that with a collection of 200 sets of parameters, each determining a soliton profile that describes a different short loop, we cover over 90 % of all proteins with experimental accuracy. We also present two examples that describe how the loop library can be employed both to model and to analyze the structure of folded proteins.
A comparative classification scheme provides a good basis for several approaches to understand proteins, including prediction of relations between their structure and biological function. But it remains a challenge to combine a classification scheme that describes a protein starting from its well-organized secondary structures and often involves direct human involvement, with an atomary-level physics-based approach where a protein is fundamentally nothing more than an ensemble of mutually interacting carbon, hydrogen, oxygen, and nitrogen atoms. In order to bridge these two complementary approaches to proteins, conceptually novel tools need to be introduced. Here we explain how an approach toward geometric characterization of entire folded proteins can be based on a single explicit elementary function that is familiar from nonlinear physical systems where it is known as the kink soliton. Our approach enables the conversion of hierarchical structural information into a quantitative form that allows for a folded protein to be characterized in terms of a small number of global parameters that are in principle computable from atomary-level considerations. As an example we describe in detail how the native fold of the myoglobin 1M6C emerges from a combination of kink solitons with a very high atomary-level accuracy. We also verify that our approach describes longer loops and loops connecting α helices with β strands, with the same overall accuracy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.