The power of structural information for informing biological mechanisms is clear for stable folded macromolecules, but similar structure–function insight is more difficult to obtain for highly dynamic systems such as intrinsically disordered proteins (IDPs), which must be described as structural ensembles. Here, we present IDPConformerGenerator, a flexible, modular open-source software platform for generating large and diverse ensembles of disordered protein states that builds conformers that obey geometric, steric, and other physical restraints on the input sequence. IDPConformerGenerator samples backbone phi (φ), psi (ψ), and omega (ω) torsion angles of relevant sequence fragments from loops and secondary structure elements extracted from folded protein structures in the RCSB Protein Data Bank and builds side chains from robust Monte Carlo algorithms using expanded rotamer libraries. IDPConformerGenerator has many user-defined options enabling variable fractional sampling of secondary structures, supports Bayesian models for assessing the consistency of IDP ensembles with experimental data, and introduces a machine learning approach to transform between internal and Cartesian coordinates with reduced error. IDPConformerGenerator will facilitate the characterization of disordered proteins to ultimately provide structural insights into these states that have key biological functions.
This paper presents a systematic study of applying composite method approximations with locally dense basis sets (LDBS) to efficiently calculate NMR shielding constants in small and medium-sized molecules. The pcSseg-n series of basis sets is shown to have similar accuracy to the pcS-n series when n ≥ 1 and can slightly reduce computational costs. We identify two different LDBS partition schemes that perform very effectively for density functional calculations. We select a large subset of the recent NS372 database containing 290 H, C, N, and O shielding values evaluated by reference methods on 106 molecules to carefully assess methods of high, medium, and low computational cost and to make practical recommendations. Our assessment covers conventional electronic structure methods (density functional theory and wave function) with global basis calculations, as well as their use in one of the satisfactory LDBS approaches, and a range of composite approaches, also with and without LDBS. Altogether 99 methods are evaluated. On this basis, we recommend different methods to reach three different levels of accuracy and time requirements across the four nuclei considered.
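The abstract above does not spell out the composite formula. A common form of such an approximation, assumed here for illustration, adds a basis-set correction computed with a cheap method to an expensive-method result obtained in a small basis. The function name and the numeric values below are hypothetical and are not taken from the paper.

```python
def composite_shielding(high_small: float, low_large: float, low_small: float) -> float:
    """Estimate a high-level/large-basis shielding constant (ppm).

    Assumed additive scheme:
        sigma(high, large) ~ sigma(high, small) + [sigma(low, large) - sigma(low, small)]
    """
    basis_correction = low_large - low_small  # basis-set effect seen by the cheap method
    return high_small + basis_correction


# Hypothetical 13C shieldings in ppm, chosen only to show the arithmetic.
estimate = composite_shielding(high_small=190.0, low_large=185.5, low_small=187.0)
```

With a locally dense basis set, the "large" basis would be placed only on the atoms of interest, which lowers the cost of the `low_large` term further.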
Intrinsically disordered proteins and unfolded proteins have fluctuating conformational ensembles that are fundamental to their biological function and impact protein folding, stability, and misfolding. Despite the importance of protein dynamics and conformational sampling, time-dependent data types are not fully exploited when defining and refining disordered protein ensembles. Here we introduce a computational framework using an elastic network model and normal-mode displacements to generate a dynamic disordered ensemble consistent with NMR-derived dynamics parameters, including transverse R₂ relaxation rates and Lipari–Szabo order parameters (S² values). We illustrate our approach using the unfolded state of the drkN SH3 domain to show that the dynamical ensembles give better agreement than a static ensemble for a wide range of experimental validation data including NMR chemical shifts, J-couplings, nuclear Overhauser effects, paramagnetic relaxation enhancements, residual dipolar couplings, hydrodynamic radii, single-molecule fluorescence Förster resonance energy transfer, and small-angle X-ray scattering.
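To make the idea of an elastic network model with normal-mode displacements concrete, here is a minimal sketch using the standard anisotropic network model (ANM) Hessian. It is not the authors' implementation: the cutoff, spring constant, helper names, and the toy helical Cα trace are all assumptions chosen for illustration.

```python
import numpy as np


def anm_hessian(coords: np.ndarray, cutoff: float = 15.0, gamma: float = 1.0) -> np.ndarray:
    """Build the 3N x 3N ANM Hessian for N beads (coords has shape (N, 3))."""
    n = coords.shape[0]
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = float(d @ d)
            if r2 > cutoff ** 2:
                continue  # beads farther apart than the cutoff are not connected
            block = -gamma * np.outer(d, d) / r2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal blocks hold minus the sum of the off-diagonal blocks.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def displace_along_mode(coords: np.ndarray, hessian: np.ndarray,
                        mode_index: int = 6, amplitude: float = 1.0) -> np.ndarray:
    """Shift coordinates along one normal mode (indices 0-5 are rigid-body modes)."""
    _, eigenvectors = np.linalg.eigh(hessian)  # eigenvalues come back in ascending order
    mode = eigenvectors[:, mode_index].reshape(-1, 3)
    return coords + amplitude * mode


# Toy input (hypothetical): 12 C-alpha positions on an idealized helix, in angstroms.
angles = np.deg2rad(100.0) * np.arange(12)
toy_coords = np.column_stack([2.3 * np.cos(angles), 2.3 * np.sin(angles), 1.5 * np.arange(12)])

toy_hessian = anm_hessian(toy_coords)
toy_eigenvalues = np.linalg.eigvalsh(toy_hessian)
displaced_coords = displace_along_mode(toy_coords, toy_hessian)
```

The six lowest eigenvalues are numerically zero (rigid translations and rotations), so the softest internal motion is the seventh mode. Repeating the displacement over several low-frequency modes and amplitudes is one simple way to turn a static conformer into a family of related structures.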
The structural characterization of proteins with disorder requires a computational approach backed by experiments to model their diverse and dynamic structural ensembles. The selection of conformational ensembles consistent with solution experiments of disordered proteins highly depends on the initial pool of conformers, with currently available tools limited by conformational sampling. We have developed a Generative Recurrent Neural Network (GRNN) that uses supervised learning to bias the probability distributions of torsions to take advantage of experimental data types such as nuclear magnetic resonance J-couplings, nuclear Overhauser effects, and paramagnetic relaxation enhancements. We show that updating the generative model parameters according to the reward feedback on the basis of the agreement between experimental data and probabilistic selection of torsions from learned distributions provides an alternative to existing approaches that simply reweight conformers of a static structural pool for disordered proteins. Instead, the biased GRNN, DynamICE, learns to physically change the conformations of the underlying pool of the disordered protein to those that better agree with experiments.
We developed and implemented a method-independent, fully numerical, finite difference approach to calculating nuclear magnetic resonance shieldings, using gauge-including atomic orbitals. The resulting capability can be used to explore non-standard methods, given only the energy as a function of finite applied magnetic fields and nuclear spins. For example, standard second-order Møller–Plesset theory (MP2) has well-known efficacy for ¹H and ¹³C shieldings and known limitations for other nuclei such as ¹⁵N and ¹⁷O. It is, therefore, interesting to seek methods that offer good accuracy for ¹⁵N and ¹⁷O shieldings without greatly increased compute costs, as well as exploring whether such methods can further improve ¹H and ¹³C shieldings. Using a small molecule test set of 28 species, we assessed two alternatives: κ regularized MP2 (κ-MP2), which provides energy-dependent damping of large amplitudes, and MP2.X, which includes a variable fraction, X, of third-order correlation (MP3). The aug-cc-pVTZ basis was used, and coupled cluster with singles and doubles and perturbative triples [CCSD(T)] results were taken as reference values. Our κ-MP2 results reveal significant improvements over MP2 for ¹³C and ¹⁵N, with the optimal κ value being element-specific. κ-MP2 with κ = 2 offers a 30% rms error reduction over MP2. For ¹⁵N, κ-MP2 with κ = 1.1 provides a 90% error reduction vs MP2 and a 60% error reduction vs CCSD. On the other hand, MP2.X with a scaling factor of 0.6 outperformed CCSD for all heavy nuclei. These results can be understood as providing renormalization of doubles amplitudes to partially account for neglected triple and higher substitutions and offer promising opportunities for future applications.
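A shielding component is the mixed second derivative of the energy with respect to the applied magnetic field and the nuclear magnetic moment, evaluated at zero field and zero moment. The sketch below shows a generic four-point central difference for that derivative. It is not the authors' code: the step sizes, function names, and the toy energy model (including its coefficients) are hypothetical and only demonstrate the stencil.

```python
from typing import Callable


def mixed_second_derivative(energy: Callable[[float, float], float],
                            h_field: float = 1e-3,
                            h_moment: float = 1e-3) -> float:
    """Central-difference estimate of d2E / (dB dm) at B = 0, m = 0.

    `energy(field, moment)` can be any black-box energy evaluation, which is
    what makes the approach independent of the electronic structure method.
    """
    e_pp = energy(+h_field, +h_moment)
    e_pm = energy(+h_field, -h_moment)
    e_mp = energy(-h_field, +h_moment)
    e_mm = energy(-h_field, -h_moment)
    return (e_pp - e_pm - e_mp + e_mm) / (4.0 * h_field * h_moment)


def toy_energy(field: float, moment: float) -> float:
    """Hypothetical model: a bilinear coupling of 30.0 plus higher-order terms."""
    return -1.0 + 30.0 * field * moment + 0.5 * field ** 2 + 2.0 * field ** 3 * moment


toy_shielding = mixed_second_derivative(toy_energy)
```

The stencil is exact for the bilinear term, and the cubic term in the toy model contributes an error that scales with the square of the field step. In a real calculation the step sizes have to balance this truncation error against numerical noise in the converged energies.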