Many deep learning (DL)-based molecular generative models have
been proposed to design novel molecules. These models may perform
well on benchmarks, but they usually do not take real-world constraints
into account, such as the available training data, synthetic accessibility,
and scaffold diversity in drug discovery. In this study, we propose a new algorithm,
ChemistGA, that combines a traditional heuristic algorithm
with DL: the crossover operation of the traditional genetic algorithm
(GA) is redefined using DL, and an innovative
backcrossing operation is implemented to generate desired molecules.
Our results clearly show that ChemistGA not only retains the strength
of the traditional GA but also greatly enhances the synthetic accessibility
and success rate of the generated molecules with desired properties.
Calculations on two benchmarks show that ChemistGA achieves
impressive performance relative to state-of-the-art baselines, and it
opens a new avenue for the application of generative models to real-world
drug discovery scenarios.