Computational methods to predict protein structure from nuclear magnetic resonance (NMR) restraints that only require assignment of backbone signals hold great potential for studying larger proteins and complexes. Additionally, computational methods designed to work with sparse data add atomic detail that is missing in the experimental restraints, allowing application to systems that are difficult to investigate. While specific frameworks in the Rosetta macromolecular modeling suite support the use of certain NMR restraint types, use of all commonly measured restraint types together is precluded. Here, we introduce a comprehensive framework into Rosetta that unifies CS-Rosetta, PCS-Rosetta and RosettaNMR and, in addition to backbone chemical shifts and nuclear Overhauser effect distance restraints, leverages NMR restraints derived from paramagnetic labeling. Specifically, RosettaNMR incorporates pseudocontact shifts, residual dipolar couplings, and paramagnetic relaxation enhancements, measured at multiple tagging sites. We further showcase the generality of RosettaNMR for various modeling challenges and benchmark it on 28 structure prediction cases, eight symmetric assemblies, two protein-protein and three protein-ligand docking examples. Paramagnetic restraints generated more accurate models for 85% of the benchmark proteins and, when combined with chemical shifts, sampled high-accuracy models (≤ 2 Å) in 50% of the cases.
Significance Statement
Computational methods such as Rosetta can assist NMR structure determination by employing efficient conformational search algorithms alongside physically realistic energy functions to model protein structure from sparse experimental data. We have developed a framework in Rosetta that leverages paramagnetic NMR data in addition to chemical shift and nuclear Overhauser effect restraints and extends RosettaNMR calculations to the prediction of symmetric assemblies, protein-protein and protein-ligand complexes. RosettaNMR generated high-accuracy models (≤ 2 Å) in 50% of cases for a benchmark set of 28 monomeric and eight symmetric proteins and predicted protein-protein and protein-ligand interfaces with up to 1 Å accuracy. The method expands Rosetta's rich toolbox for integrative data-driven modeling and promises to be broadly useful in structural biology.