Conformational sampling of protein structures is essential for understanding biochemical functions and for predicting thermodynamic properties such as free energies. Whereas previous approaches rely on sequential sampling procedures, recent developments in generative deep neural networks have made possible the parallel, statistically independent sampling of molecular configurations. To accurately generate samples of large molecular systems from a high-dimensional, multimodal equilibrium distribution, we developed a hierarchical approach based on expressive normalizing flows with rational quadratic neural splines and a coarse-grained representation. Furthermore, system-specific priors and adaptive, property-based controlled learning were designed to reduce the likelihood of generating high-energy structures during sampling. Finally, backmapping from the coarse-grained to a fully atomistic representation is performed with an equivariant transformer model. We demonstrate the applicability of the method on the one-shot configurational sampling of a protein system with more than a hundred amino acids. The results show enhanced expressivity that diminishes the invertibility constraints inherent in the normalizing flow framework. Moreover, the capacity of the hierarchical normalizing flow model was tested on a challenging case study of the folding/unfolding dynamics of the peptide chignolin.
The Targeted Free Energy Perturbation (TFEP) method aims to overcome the time-consuming and computationally intensive stratification process that standard methods require to estimate the free energy difference between two states. To achieve this, TFEP uses a mapping function between the high-dimensional probability densities of these states. The bijectivity and invertibility of normalizing flow neural networks fulfill the requirements for serving as such a mapping function. Despite its theoretical potential for free energy calculations, TFEP has not yet been adopted in practice due to challenges in entropy correction, limitations in energy-based training, and mode collapse when learning density functions of larger systems with many degrees of freedom. In this study, we extend flow-based TFEP to systems in which the two states under consideration have different numbers of atoms by exploring the theoretical basis of the entropic contributions of dummy atoms, and we validate our reasoning with analytical derivations for a model system containing coupled particles. We also extend the TFEP framework to handle systems of hybrid topology, propose auxiliary additions to improve the TFEP architecture, and demonstrate accurate predictions of relative free energy differences for large molecular systems. Our results provide the first practical application of the fast and accurate deep learning-based TFEP method to biomolecules and introduce it as a viable free energy estimation method within the context of drug design.
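To make the role of the mapping function concrete, the following is a minimal sketch of the standard targeted FEP identity, ΔF = -(1/β) ln ⟨exp(-β[U_B(M(x)) - U_A(x)] + ln|det J_M(x)|)⟩_A, applied to a toy problem. Everything specific here is an assumption for illustration and not taken from the study: the two states are 1D harmonic wells, the hand-written affine map stands in for a trained normalizing flow, and the names `tfep_delta_f`, `affine_map`, and `identity_map` are hypothetical.

```python
import numpy as np


def tfep_delta_f(x_a, u_a, u_b, mapping, beta=1.0):
    """Estimate F_B - F_A from samples of state A via the targeted FEP identity."""
    y, log_det_jac = mapping(x_a)
    # Generalized work: energy change under the map minus the Jacobian term.
    work = u_b(y) - u_a(x_a) - log_det_jac / beta
    # Exponential average, shifted by the maximum for numerical stability.
    a = -beta * work
    shift = a.max()
    return -(shift + np.log(np.mean(np.exp(a - shift)))) / beta


# Toy states (assumed): two 1D harmonic wells with different stiffness and center.
beta, k_a, k_b, x0 = 1.0, 1.0, 4.0, 0.5


def u_a(x):
    return 0.5 * k_a * x**2


def u_b(x):
    return 0.5 * k_b * (x - x0) ** 2


def affine_map(x):
    # Stand-in for a trained flow: maps the Boltzmann density of A exactly onto B.
    scale = np.sqrt(k_a / k_b)
    return x0 + scale * x, np.full_like(x, np.log(scale))


def identity_map(x):
    # With the identity map, the estimator reduces to plain (untargeted) FEP.
    return x, np.zeros_like(x)


rng = np.random.default_rng(0)
x_a = rng.normal(0.0, 1.0 / np.sqrt(beta * k_a), size=200_000)  # samples of A

df_exact = 0.5 * np.log(k_b / k_a) / beta  # analytical result for harmonic wells
df_targeted = tfep_delta_f(x_a, u_a, u_b, affine_map, beta)
df_plain = tfep_delta_f(x_a, u_a, u_b, identity_map, beta)

print(f"exact: {df_exact:.4f}  targeted: {df_targeted:.4f}  plain FEP: {df_plain:.4f}")
```

Because the affine map in this toy case transports state A onto state B exactly, the generalized work is the same for every sample and the targeted estimate has zero variance. The identity map recovers ordinary FEP, whose estimate fluctuates with the sample. A learned flow on a molecular system would fall somewhere between these two limits.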