Proteins sample a variety of conformations distinct from their
crystal structure. These structures, their propensities, and the pathways
for moving between them contain an enormous amount of information
about protein function that is hidden from a purely structural perspective.
Molecular dynamics simulations can uncover these alternative conformations
but often at a prohibitively high computational cost. Here we apply
our recent statistical mechanics and artificial intelligence-based
molecular dynamics framework for enhanced sampling of protein loops.
We exemplify the approach through the study of three mutants of the
classical test-piece protein T4 lysozyme. We are able to correctly
rank these according to the stability of their excited state. By analyzing
reaction coordinates, we also obtain crucial insight into why these
specific perturbations in sequence space lead to tremendous variations
in conformational flexibility. Our framework thus allows an accurate
comparison of loop conformation populations with minimal prior human
bias and should be directly applicable to a range of macromolecules
in biology, chemistry, and beyond.
Molecular dynamics (MD) simulations generate valuable all-atom resolution trajectories of complex systems, but both analyzing these high-dimensional data and reaching practically relevant timescales, even with powerful supercomputers, remain open problems. Many specialized sampling and reaction coordinate construction methods exist to alleviate these problems. However, these methods typically do not operate directly on all atomic coordinates and still require prior knowledge of the important distinguishing features of the system, known as order parameters (OPs). Here we present AMINO, an automated method that generates such OPs by screening a very large dictionary of candidate OPs, such as all heavy-atom contacts in a biomolecule. AMINO uses ideas from information theory and rate distortion theory. The OPs learnt by AMINO can then serve as input for designing a reaction coordinate, which can in turn be used in many enhanced sampling methods. Here we outline its key theoretical underpinnings and apply it to systems of increasing complexity. Our applications include a problem of tremendous pharmaceutical and engineering relevance, namely, calculating the binding affinity of a protein-ligand system when all that is known is the structure of the bound system. Our calculations are performed without human intervention and obtain very accurate results compared to long unbiased MD simulations on the Anton supercomputer, in orders of magnitude less computer time. We thus expect AMINO to be useful for the calculation of thermodynamics and kinetics in the study of diverse molecular systems.
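The idea of screening a large dictionary of OPs for redundancy with information theory can be illustrated with a small sketch. This is not the AMINO implementation: it assumes a normalized mutual-information distance between pairs of OP time series and uses a simple greedy farthest-point selection as a stand-in for the clustering step, and it omits the rate distortion criterion for choosing how many OPs to keep. The names `entropy`, `mi_distance`, and `select_ops` and the `bins` parameter are hypothetical.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy (in nats) of a histogram given as raw counts."""
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def mi_distance(x, y, bins=10):
    """Assumed distance 1 - I(X;Y)/H(X,Y): near 0 for redundant OPs, near 1 for independent ones."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    h_xy = entropy(joint)
    if h_xy == 0.0:
        return 0.0  # both OPs are constant, so they carry no distinct information
    h_x = entropy(joint.sum(axis=1))
    h_y = entropy(joint.sum(axis=0))
    return 1.0 - (h_x + h_y - h_xy) / h_xy

def select_ops(trajs, k, bins=10):
    """Greedily pick k OPs that are mutually far apart (stand-in for clustering)."""
    n = len(trajs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = mi_distance(trajs[i], trajs[j], bins)
    chosen = [0]  # arbitrary starting OP
    while len(chosen) < k:
        # distance from every OP to its nearest already-chosen OP
        nearest = dist[:, chosen].min(axis=1)
        chosen.append(int(nearest.argmax()))
    return chosen, dist
```

Given a list of OP time series `trajs` (for example, one array per heavy-atom contact), `select_ops(trajs, k)` returns the indices of `k` OPs chosen so that each new one is as non-redundant as possible with those already selected, together with the pairwise distance matrix.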
# These two authors contributed equally.