The capability of current force fields to reproduce RNA structural
dynamics is limited. Several methods have been developed to take advantage
of experimental data in order to enforce agreement with experiments.
Here, we extend an existing framework that allows arbitrarily chosen
force-field correction terms to be fitted by quantifying the
discrepancy between observables back-calculated from simulation and
the corresponding experiments. We apply a robust regularization protocol
to avoid overfitting and additionally introduce and compare a number
of different regularization strategies, namely, L1, L2, Kish size,
relative Kish size, and relative entropy penalties. The training set
includes a GACC tetramer as well as more challenging systems, namely,
gcGAGAgc and gcUUCGgc RNA tetraloops. Specific intramolecular hydrogen
bonds in the AMBER RNA force field are corrected with automatically
determined parameters that we call gHBfix_opt. A validation
involving a separate simulation of a system present in the training
set (gcUUCGgc) and new systems not seen during training (CAAU and
UUUU tetramers) displays improvements regarding the native population
of the tetraloop as well as good agreement with NMR experiments for
tetramers when using the new parameters. Then, we simulate folded
RNAs (a kink–turn and L1 stalk rRNA) including hydrogen bond
types not sufficiently present in the training set. This allows a
final modification of the parameter set, named gHBfix21, which we
suggest is applicable to a wider range of RNA systems.
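
To make two of the regularization quantities named above concrete, the following is a minimal sketch of how a force-field correction reweights simulation frames, and how the Kish size and relative entropy of the resulting weights can be computed. It uses the standard textbook definitions; the exact functional forms, normalizations, and names used in this work may differ. The function names (`reweight`, `kish_size`, `relative_entropy`) and the convention that correction energies are given in units of kT are assumptions made for illustration.

```python
import numpy as np

def reweight(delta_e, w0=None):
    """Boltzmann-reweight frames by a per-frame correction energy.

    delta_e: correction energy of each frame, assumed to be in units of kT.
    w0: original frame weights (assumed strictly positive); uniform if None.
    Returns normalized weights w_i proportional to w0_i * exp(-delta_e_i).
    """
    delta_e = np.asarray(delta_e, dtype=float)
    n = delta_e.size
    if w0 is None:
        w0 = np.full(n, 1.0 / n)
    else:
        w0 = np.asarray(w0, dtype=float)
        w0 = w0 / w0.sum()
    log_w = np.log(w0) - delta_e
    log_w -= log_w.max()  # shift for numerical stability before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def kish_size(w):
    """Kish effective sample size: (sum w)^2 / sum w^2, between 1 and len(w)."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

def relative_entropy(w, w0):
    """Kullback-Leibler divergence sum_i w_i ln(w_i / w0_i) for normalized weights."""
    w = np.asarray(w, dtype=float)
    w0 = np.asarray(w0, dtype=float)
    mask = w > 0  # frames with zero weight contribute nothing
    return float(np.sum(w[mask] * np.log(w[mask] / w0[mask])))
```

With a zero correction the weights stay uniform, the Kish size equals the number of frames, and the relative entropy is zero. A large correction concentrates weight on a few frames, lowering the Kish size and raising the relative entropy, which is why either quantity can act as a penalty against corrections that stray too far from the original ensemble.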