We have developed, implemented, and assessed an efficient protocol for
the prediction of NMR chemical shifts of large
nucleic acids using our molecules-in-molecules (MIM) fragment-based
quantum chemical approach. To assess its performance,
MIM-NMR calculations are calibrated on a test set of three nucleic
acids whose structures are derived from solution-phase NMR studies.
For DNA systems with multiple conformers, the one-layer MIM method
with trimer fragments (MIM1trimer) is benchmarked for identifying
the lowest-energy structure, with an average error of only 0.80 kcal/mol
relative to unfragmented full-molecule calculations. The MIM1-NMRdimer calibration against unfragmented full-molecule
calculations shows a mean absolute deviation (MAD) of 0.06 and 0.11
ppm for 1H and 13C nuclei, respectively, and
its performance with respect to experimental NMR chemical shifts is
comparable to that of the more expensive MIM1-NMR and MIM2-NMR methods with
trimer subsystems. To compare with the experimental chemical shifts,
a standard protocol is derived using DNA systems with Protein Data
Bank (PDB) IDs 1SY8, 1K2K, and 1KR8. The structures are
minimized using a hybrid mechanics/semiempirical approach, and the minimized
structures are used for computations in solution with implicit and explicit–implicit
solvation models in our MIM1-NMRdimer methodology. To demonstrate
the applicability of our protocol, we tested it on seven nucleic acids,
including structures with nonstandard residues, heteroatom substitutions
(F and B atoms), and side chain mutations, ranging in size from
∼300 to 1100 atoms. The major improvement in the MIM1-NMRdimer predictions comes from structural minimizations
and implicit solvation effects. A significant improvement with the
explicit–implicit solvation model is observed only for two
smaller nucleic acid systems (1KR8 and 7NBK), where the expensive first solvation
shell is replaced by the microsolvation model, in which a single water
molecule is added for each solvent-exposed amino and imino proton,
along with the implicit solvation. Overall, our target accuracy of
∼0.2–0.3 ppm for 1H and ∼2–3
ppm for 13C has been achieved for large nucleic acids.
The proposed MIM-NMR approach is accurate and cost-effective (linear
scaling with system size), and it can aid in the structural assignments
of a wide range of complex biomolecules.
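
The mean absolute deviation (MAD) used above as the calibration metric can be illustrated with a minimal Python sketch. The function name and the chemical shift values are hypothetical placeholders chosen for illustration, not data or code from this work; the sketch only assumes that fragment-based and reference shifts are listed in the same nucleus order.

```python
from typing import Sequence


def mean_absolute_deviation(predicted: Sequence[float],
                            reference: Sequence[float]) -> float:
    """Return the mean of |predicted_i - reference_i| over all nuclei (ppm)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one chemical shift is required")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical 1H shifts in ppm (illustrative values only, not results
# from the paper): fragment-based prediction vs. full-molecule reference.
fragment_shifts = [7.50, 8.00, 5.50, 2.00]
full_shifts = [7.25, 8.25, 5.50, 2.50]

mad = mean_absolute_deviation(fragment_shifts, full_shifts)
print(f"MAD = {mad:.2f} ppm")
```

The same function applies unchanged when the reference list holds experimental shifts instead of unfragmented full-molecule values.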