We introduce the combinatorial averaged transient structure (CATS) clustering method as a means to cluster protein structure ensembles based on the distributions of protein backbone descriptor coordinates. In our study, we use the phi and psi dihedral angles of the protein backbone as descriptors because they are invariant under translation and rotation. The CATS method was developed to produce unique structure ensembles, which are typically difficult to obtain from flat energy landscapes using a one-dimensional separation value (e.g., an RMSD cutoff). By using higher-dimensional descriptor coordinates, we remedy the poor structure resolution that standard clustering algorithms suffer when RMSD fluctuations between structures are large. We compare the performance of CATS to that of the RMSD-based GROMOS clustering method, which may not be the best choice for clustering IDPs because its separation quality relies heavily on cutoff values rather than on energy landscape minima. We demonstrate the performance of CATS and GROMOS by analyzing all-atom molecular dynamics trajectories, taken from prior studies, of the Tau/R2(273-284) fragment in solution with the osmolytes TMAO and urea. Our study reveals that the CATS method produces more unique clusters than the GROMOS method as a result of the higher-dimensional distributions of the descriptor coordinates. The cluster centers produced by CATS correspond to local minima in the multi-dimensional potential of mean force, yielding a structure ensemble that adequately samples the energy landscape.
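The invariance property that motivates the choice of phi and psi as descriptors can be checked numerically. The following is a minimal sketch, not the CATS implementation: the function `dihedral`, the helper `rotation_zx`, and the four example coordinates are all hypothetical and chosen only for illustration. It computes a signed dihedral angle from four points and confirms that the value is unchanged after the points are rigidly rotated and translated.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))

def rotation_zx(theta_z, theta_x):
    """Proper rotation: rotate about x by theta_x, then about z by theta_z."""
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return rz @ rx

# Four illustrative points (not real backbone atoms)
points = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
])

# Rigid-body transform: rotate every point, then shift by a constant vector
rot = rotation_zx(0.7, -1.1)
shift = np.array([5.0, -3.0, 2.5])
moved = points @ rot.T + shift

angle_before = dihedral(*points)
angle_after = dihedral(*moved)
print(angle_before, angle_after)
```

An RMSD between two frames, by contrast, depends on how the frames are superimposed, which is one reason internal coordinates such as dihedral angles are convenient clustering descriptors.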
We have developed a computational method for atomistically refining the structural ensemble of intrinsically disordered peptides (IDPs), guided by experimental measurements from circular dichroism (CD) spectroscopy. A major challenge of this approach is the deconvolution of experimental CD spectra into the secondary structure features of the IDP ensemble. Currently available algorithms for CD deconvolution were designed to analyze the spectra of proteins with stable secondary structures. Our work aims to minimize bias in the peptide deconvolution analysis by implementing a non-negative linear least-squares fitting algorithm in conjunction with a CD reference data set that contains soluble and denatured proteins (SDP48). According to a validation analysis of a set of protein spectra with Protein Data Bank entries, the non-negative linear least-squares method yields better deconvolution results for proteins with higher disordered content than currently available methods. We subsequently used this analysis to deconvolute our experimental CD data and to refine our computational model of the peptide secondary structure ensemble produced by all-atom molecular dynamics simulations with implicit solvent. We applied this approach to determine the ensemble structures of a set of short IDPs that mimic the calmodulin binding domain of calcium/calmodulin-dependent protein kinase II and its 1-amino-acid and 3-amino-acid mutants. To our knowledge, our study offers a novel way to solve the ensemble secondary structures of IDPs in solution, which is important for advancing the understanding of their roles in regulating signaling pathways through the formation of complexes with multiple partners.
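The idea of non-negative linear least-squares fitting can be illustrated on synthetic data. The sketch below is an assumption-laden toy, not the authors' pipeline and not the SDP48 reference set: the matrix `basis` holds three made-up basis spectra (random numbers standing in for helix, strand, and disordered components), and `true_fractions` is an invented composition. It uses `scipy.optimize.nnls` to recover non-negative weights from a spectrum built as a linear combination of the basis columns.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

n_wavelengths = 60   # number of sampled wavelengths (illustrative)
n_components = 3     # e.g. helix, strand, disordered (illustrative labels)

# Hypothetical basis spectra, one column per secondary structure class.
# Real basis spectra would be derived from a reference data set.
basis = rng.normal(size=(n_wavelengths, n_components))

# Invented composition used to build a noise-free synthetic spectrum
true_fractions = np.array([0.2, 0.1, 0.7])
spectrum = basis @ true_fractions

# Solve min ||basis @ c - spectrum||_2 subject to c >= 0
coeffs, residual_norm = nnls(basis, spectrum)

# Normalize so that the weights can be read as fractions summing to 1
fractions = coeffs / coeffs.sum()
print(fractions, residual_norm)
```

The non-negativity constraint keeps every fitted weight physically interpretable as a population, which an unconstrained least-squares fit does not guarantee when the measured spectrum is noisy.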
Calmodulin (CaM) is a calcium-binding protein that, upon binding calcium, transduces signals to downstream proteins through target binding in a time-dependent manner. Understanding how target binding tunes CaM’s affinity for calcium ions (Ca2+), or vice versa, may provide insight into how Ca2+-CaM selects its target proteins. However, modeling Ca2+-CaM in molecular simulations is challenging: its central linker region undergoes gross structural changes, while its two lobes are relatively rigid because Ca2+ binds tightly to the calcium-binding loops, where each loop forms a pentagonal bipyramidal coordination geometry with Ca2+. This feature, which underlies the reciprocal relation between Ca2+ binding and target binding of CaM, has yet to be considered in structural modeling. Here, we present a coarse-grained model, based on the Associative memory, Water mediated, Structure, and Energy Model (AWSEM) protein force field, to investigate the salient features of CaM. In particular, we optimized the force fields of CaM and of the Ca2+ ions, using the coordination chemistry of Ca2+ in the calcium-binding loops, to match experimental observations. We present a “community model” of CaM that is capable of sampling various conformations of CaM, incorporating various calcium-binding states, and carrying the memory of binding with various targets. This model lays the foundation for future studies of the reciprocal relation between target binding and Ca2+ binding.