We present a novel structure determination approach that exploits the global orientational restraints from RDCs to resolve ambiguous NOE assignments. Unlike traditional approaches that bootstrap the initial fold from ambiguous NOE assignments, we start by using RDCs to compute accurate secondary structure element (SSE) backbones at the beginning of structure calculation. Our structure determination package, called RDC-PANDA (RDC-based SSE PAcking with NOEs for Structure Determination and NOE Assignment), consists of three modules: (1) RDC-EXACT; (2) PACKER; and (3) HANA (HAusdorff-based NOE Assignment). RDC-EXACT computes the global optimal solution of backbone dihedral angles for each secondary structure element by exactly solving a system of quartic RDC equations derived by Wang and Donald (2004a,b), and systematically searching over the roots, each of which is a backbone dihedral ϕ-or ψ-angle consistent with the RDC data. Using a small number of unambiguous inter-SSE NOEs extracted using only chemical shift information, PACKER performs a systematic search for the core structure, including all SSE backbone conformations. HANA uses a Hausdorff-based scoring function to measure the similarity between the experimental spectra and the back-computed NOE pattern for each side-chain from a statistically-diverse rotamer library, and drives the selection of optimal position-specific rotamers for filtering ambiguous NOE assignments. Finally, a local minimization approach is used to compute the loops and refine side-chain conformations by fixing the core structure as a rigid body while allowing movement of loops and side-chains. RDC-PANDA was applied to NMR data for the FF Domain 2 of human transcription elongation factor CA150 (RNA polymerase II C-terminal domain interacting protein), human ubiquitin, the ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta (pol η UBZ), and the human Set2-Rpb1 interacting domain (hSRI). These results demonstrated the efficiency and accuracy of our algorithm, and show that RDC-PANDA can be successfully applied for highresolution protein structure determination using only a limited set of NMR data by first computing RDC-defined backbones.
High-throughput structure determination based on solution Nuclear Magnetic Resonance (NMR) spectroscopy plays an important role in structural genomics. One of the main bottlenecks in NMR structure determination is the interpretation of NMR data to obtain a sufficient number of accurate distance restraints by assigning nuclear Overhauser effect (NOE) spectral peaks to pairs of protons. The difficulty in automated NOE assignment mainly lies in the ambiguities arising both from the resonance degeneracy of chemical shifts and from the uncertainty due to experimental errors in NOE peak positions. In this paper we present a novel NOE assignment algorithm, called HAusdorff-based NOE Assignment (HANA), that starts with a high-resolution protein backbone computed using only two residual dipolar couplings (RDCs) per residue 37, 39 , employs a Hausdorff-based pattern matching technique to deduce similarity between experimental and back-computed NOE spectra for each rotamer from a statistically diverse library, and drives the selection of optimal position-specific rotamers for filtering ambiguous NOE assignments. Our algorithm runs in time O(tn 3 +tn log t), where t is the maximum number of rotamers per residue and n is the size of the protein. Application of our algorithm on biological NMR data for three proteins, namely, human ubiquitin, the zinc finger domain of the human DNA Y-polymerase Eta (pol η) and the human Set2-Rpb1 interacting domain (hSRI) demonstrates that our algorithm overcomes spectral noise to achieve more than 90% assignment accuracy. Additionally, the final structures calculated using our automated NOE assignments have backbone RMSD < 1.7 Å and all-heavy-atom RMSD < 2.5 Å from reference structures that were determined either by X-ray crystallography or traditional NMR approaches. These results show that our NOE assignment algorithm can be successfully applied to protein NMR spectra to obtain high-quality structures.
Protein loops often play important roles in biological functions. Modeling loops accurately is crucial to determining the functional specificity of a protein. Despite the recent progress in loop prediction approaches, which led to a number of algorithms over the past decade, few rigorous algorithmic approaches exist to model protein loops using global orientational restraints, such as those obtained from residual dipolar coupling (RDC) data in solution NMR spectroscopy. In this article, we present a novel, sparse data, RDC-based algorithm, which exploits the mathematical interplay between RDC-derived sphero-conics and protein kinematics, and formulates the loop structure determination problem as a system of low-degree polynomial equations that can be solved exactly, in closed-form. The polynomial roots, which encode the candidate conformations, are searched systematically, using provable pruning strategies that triage the vast majority of conformations, to enumerate or prune all possible loop conformations consistent with the data; therefore, completeness is ensured. Results on experimental RDC datasets for four proteins, including human ubiquitin, FF2, DinI and GB3, demonstrate that our algorithm can compute loops with higher accuracy, a 3- to 6-fold improvement in backbone RMSD, versus those obtained by traditional structure determination protocols on the same data. Excellent results were also obtained on synthetic RDC datasets for protein loops of length 4, 8 and 12 used in previous studies. These results suggest that our algorithm can be successfully applied to determine protein loop conformations, and hence will be useful in high-resolution protein backbone structure determination, including loops, from sparse NMR data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.