As the length of molecular dynamics (MD) trajectories grows with increasing computational power, so does the importance of clustering methods for partitioning trajectories into conformational bins. The vast majority of available methods require users either to have some a priori knowledge about the system to be clustered or to tune clustering parameters through trial and error. Here we present non-parametric uses of two modern clustering techniques suitable for first-pass investigation of an MD trajectory. Being non-parametric, these methods require neither prior knowledge nor parameter tuning. The first method, HDBSCAN, is fast relative to other popular clustering methods and is able to group unstructured or intrinsically disordered systems, such as intrinsically disordered proteins (IDPs), into bins that represent global conformational shifts. HDBSCAN is also useful for determining the overall stability of a system, as it tends to group stable systems into one or two bins, and for identifying transition events between metastable states. The second method, iMWK-Means with explicit rescaling followed by K-Means, is slower than HDBSCAN but performs well with stable, structured systems such as folded proteins and is able to identify higher-resolution details such as changes in the relative position of secondary structural elements. Used in conjunction, these clustering methods allow a user to discern, quickly and without prior knowledge, the stability of a simulated system and to identify both local and global conformational changes.
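The "explicit rescaling followed by K-Means" step can be illustrated with a minimal sketch. It assumes the feature weights are already known and passed in by hand; it does not reproduce how iMWK-Means estimates those weights, and it is not the authors' implementation. The function names, the toy frames, and the weight values are all hypothetical. Seeding is deterministic (farthest-point) so that the result is reproducible.

```python
def rescale(frames, weights):
    """Multiply every feature column by its (assumed, user-supplied) weight."""
    return [[w * x for w, x in zip(weights, frame)] for frame in frames]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: take the first point, then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(nxt)
    return [list(c) for c in centers]


def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style K-Means; returns (labels, centers)."""
    centers = farthest_point_init(points, k)
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy "frames": feature 0 separates two states, feature 1 is large-amplitude noise.
frames = [
    [0.0, 5.0], [0.1, -5.0], [0.2, 4.0],
    [1.0, -4.0], [1.1, 5.0], [0.9, -5.0],
]
weights = [1.0, 0.01]  # hypothetical weights that down-weight the noisy feature
labels, centers = kmeans(rescale(frames, weights), k=2)
print(labels)
```

In this toy case the rescaled data separate along the informative feature, so the first three frames share one label and the last three share the other.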
Thrombin is a multifunctional enzyme that plays an important role in blood coagulation, cell growth, and metastasis. Its enzymatic activity differs significantly depending on whether sodium ions are bound. In the presence of sodium ions, thrombin is highly active in cleaving the coagulated substrates and is referred to as the “fast” form; in their absence, thrombin becomes catalytically less active and is referred to as the “slow” form. Although many experimental studies over the last two decades have attempted to reveal the structural and kinetic differences between these two forms, how Na+ cations mediate the functional switch between the “fast” and “slow” forms remains unclear and disputed. In this work, we employ microsecond-scale all-atom molecular dynamics simulations to investigate the differences in the structural ensembles of sodium-bound/unbound and potassium-bound/unbound thrombin. Our calculations indicate that the regulatory regions, including the 60s and γ loops and exosites I and II, are primarily affected by both the bound and unbound cations. Conformational free energy surfaces, estimated from principal component analysis, further reveal the existence of multiple conformational states. The binding of a cation changes the distribution of these states. Compared with potassium binding, the binding of sodium ions appears to shift the population toward conformational states that might be catalytically favorable. Our study of thrombin in the presence of sodium/potassium ions suggests that Na+-mediated generalized allostery is the mechanism of thrombin’s functional switch between the “fast” and “slow” forms.
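The abstract does not say how the free energy surfaces were estimated from the principal components. One common approach, assumed here purely for illustration, is Boltzmann inversion of a two-dimensional histogram of the projections, F = -kT ln(P / P_max). The sketch below takes precomputed projections onto two principal components; the function name, bin width, temperature, and sample values are hypothetical.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_surface(pc1, pc2, bin_width=0.5, temperature=300.0):
    """Return {(i, j): F} in kcal/mol for every populated histogram bin,
    using F = -kT * ln(count / max_count). Unpopulated bins are omitted
    (their free energy is formally infinite)."""
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(pc1, pc2)
    )
    max_count = max(counts.values())
    kT = K_B * temperature
    return {b: -kT * math.log(n / max_count) for b, n in counts.items()}


# Hypothetical projections: four frames in one basin, two in another.
pc1 = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2]
pc2 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
fes = free_energy_surface(pc1, pc2)
for bin_index, energy in sorted(fes.items()):
    print(bin_index, round(energy, 3))
```

The most populated bin defines the zero of the surface, and a bin with half that population sits kT ln 2 higher. With real trajectories, the bin width and the amount of sampling strongly affect the estimate.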
Displaying a single representative conformation of a biopolymer rather than an ensemble of states mistakenly conveys a static nature rather than the actual dynamic character of biopolymers. However, the fixed nature of print media leaves few apparent alternatives. Here we suggest a standardized methodology for visually indicating the distribution width, standard deviation, and uncertainty of ensembles of states with little loss of the visual simplicity of displaying a single representative conformation. Of particular note, the visualization method clearly distinguishes between isotropic and anisotropic motion of polymer subunits. We also apply this method to ligand binding, suggesting a way to indicate the expected error in many high-throughput docking programs when visualizing the structural spread of the output. We provide several examples in the context of nucleic acids and proteins, with particular insights gained via this method. Such examples include a therapeutic polymer of FdUMP (5-fluoro-2-deoxyuridine-5-O-monophosphate), an apoptosis-inducing topoisomerase-1 (Top1) poison, and nucleotide-binding proteins responsible for ATP hydrolysis from Bacillus subtilis. We also discuss how these methods can be extended to any macromolecular data set with an underlying distribution, including experimental data such as NMR structures.