<div>
<p>Rapid and accurate prediction of
reactivity descriptors of transition metal (TM) complexes is a major challenge
for contemporary quantum chemistry.
The recently developed GFN2-xTB method, based on density functional tight-binding
theory (DFT-B), is suitable for high-throughput
calculation of geometries and thermochemistry for TM complexes, albeit with
moderate accuracy. Herein we present a data-augmented approach that substantially improves
the accuracy of GFN2-xTB for the prediction of thermochemical properties using
pK<sub>a</sub> values of TM hydrides as a representative model example. We
constructed a comprehensive database for ca. 200 TM hydride complexes featuring
the experimentally measured pK<sub>a</sub> values as well as the GFN2-xTB-optimized
geometries and various computed electronic and energetic descriptors. The
GFN2-xTB results were further refined and validated by DFT calculations with
the hybrid PBE0 functional. Our results show that although GFN2-xTB
performs well in most cases, it fails to adequately describe TM complexes
featuring multicarbonyl and multihydride ligand environments. The dataset was
analyzed with ordinary least squares (OLS) fitting and was used to construct
an automated machine learning (AutoML) approach for the rapid estimation of pK<sub>a</sub>
of TM hydride complexes. The results obtained show a high predictive power of
the very fast AutoML model (RMSE ~ 2.7) comparable to that of the much slower
DFT calculations (RMSE ~ 3). The presented data-augmented quantum chemistry-based
approach is promising for high-throughput computational screening workflows of
homogeneous TM-based catalysts.</p>
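<p>As an illustration of the kind of analysis summarized above, the following is a minimal sketch of an OLS fit of pK<sub>a</sub> against computed descriptors, scored by RMSE. It is not the authors' code: the helper names (<code>fit_ols</code>, <code>predict</code>, <code>rmse</code>), the number of descriptors, and all data are assumptions, and the data are randomly generated placeholders rather than values from the database.</p>

```python
import numpy as np


def fit_ols(X, y):
    """Fit y ~ b0 + X @ b by ordinary least squares; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the linear model for descriptor matrix X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic placeholder data (NOT taken from the paper): 200 hypothetical
# complexes, 3 hypothetical descriptors, and noisy "experimental" pKa values.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([15.0, 4.0, -2.0, 1.0])  # assumed intercept and slopes
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=2.0, size=200)

coef = fit_ols(X, y)
fit_rmse = rmse(y, predict(coef, X))
print("fitted coefficients:", np.round(coef, 2))
print("in-sample RMSE:", round(fit_rmse, 2))
```

<p>In practice the error would be estimated on held-out complexes rather than in-sample, so that the reported RMSE reflects predictive power and not merely quality of fit.</p>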
</div>
<br>