Spectroscopic properties of molecules are of great importance for
describing the molecular response to UV/vis electromagnetic radiation.
Computationally expensive ab initio
(e.g., multiconfigurational SCF, coupled cluster) or TDDFT methods
are commonly used by the quantum chemistry community to compute these
properties. In this work, we propose a (supervised) Machine Learning
approach to model the absorption spectra of organic molecules. Several
supervised ML methods have been tested, such as Kernel Ridge Regression
(KRR), Multilayer Perceptron Neural Networks (MLP), and Convolutional Neural
Networks [J. Chem. Phys. 2015, 143, 084111; Adv. Sci. 2019, 6, 1801367].
The use of only geometrical-atomic
number descriptors (e.g., Coulomb Matrix) proved to be insufficient
for accurate training [J. Chem. Phys. 2015, 143, 084111]. Inspired by the
TDDFT theory, we propose to use a set of electronic descriptors obtained
from low-cost DFT methods: orbital energy differences
($\Delta\epsilon_{ia} = \epsilon_a - \epsilon_i$), transition dipole moments between
occupied and unoccupied Kohn–Sham orbitals
($\langle \phi_i | \mathbf{r} | \phi_a \rangle$), and, when relevant, the
charge-transfer character of monoexcitations ($R_{ia}$).
We demonstrate that, with these electronic descriptors and Neural
Networks, we can not only predict the density of excited states
but also obtain a very good estimate of the absorption spectrum and of the
charge-transfer character of the electronic excited states, reaching
results close to chemical accuracy (∼2 kcal/mol or ∼0.1 eV).
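
As an illustration of the first two electronic descriptors named above, the following minimal sketch builds the orbital energy differences and the transition dipole magnitudes for every occupied-to-unoccupied orbital pair. It is not the implementation used in this work: the function name `electronic_descriptors`, the array layout, and the toy numbers are assumptions made for the example, and the charge-transfer descriptor is omitted because its definition is not given here.

```python
import numpy as np


def electronic_descriptors(orbital_energies, dipole_mo, n_occ):
    """Hypothetical helper: descriptors for each occupied->unoccupied pair (i, a).

    orbital_energies: shape (n_orb,), Kohn-Sham orbital energies.
    dipole_mo: shape (3, n_orb, n_orb), dipole integrals <phi_p|r|phi_q>
               already transformed to the molecular-orbital basis (assumed input).
    n_occ: number of occupied orbitals.
    """
    eps = np.asarray(orbital_energies, dtype=float)
    dip = np.asarray(dipole_mo, dtype=float)

    eps_occ, eps_virt = eps[:n_occ], eps[n_occ:]
    # delta_eps[i, a] = eps_a - eps_i, via broadcasting over the two index sets
    delta_eps = eps_virt[None, :] - eps_occ[:, None]

    # Occupied-unoccupied block of the dipole integrals, shape (3, n_occ, n_virt)
    mu = dip[:, :n_occ, n_occ:]
    # Magnitude of the transition dipole vector for each (i, a) pair
    mu_norm = np.linalg.norm(mu, axis=0)
    return delta_eps, mu_norm


# Toy data (made up): 2 occupied and 2 unoccupied orbitals.
eps = np.array([-0.5, -0.3, 0.1, 0.4])
dip = np.zeros((3, 4, 4))
dip[0, 1, 2] = dip[0, 2, 1] = 0.8  # x component coupling orbital 1 and orbital 2

delta_eps, mu_norm = electronic_descriptors(eps, dip, n_occ=2)
print(delta_eps)
print(mu_norm)
```

Each (i, a) pair then contributes one energy difference and one dipole magnitude, which could be flattened into a feature vector for a regression model; how the features are ordered, truncated, or combined with the charge-transfer descriptor is a design choice left open in this sketch.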