Machine learning methods are attracting considerable attention from the
pharmaceutical industry for use in drug discovery and applications
beyond. In recent studies, we and others have applied multiple machine
learning algorithms and modeling metrics and, in some cases, compared
molecular descriptors to build models for individual targets or properties
on a relatively small scale. Several research groups have used large
numbers of datasets from public databases such as ChEMBL
to evaluate machine learning methods of interest to them. The largest
of these studies used on the order of 1400 datasets. We have
now extracted well over 5000 datasets from ChEMBL and used them, with the
ECFP6 fingerprint, to compare our proprietary software Assay
Central with random forest, k-nearest neighbors, support vector classification,
naïve Bayesian, AdaBoosted decision trees, and deep neural
networks (three layers). Model performance was assessed using an array
of fivefold cross-validation metrics including area-under-the-curve,
F1 score, Cohen’s kappa, and Matthews correlation coefficient.
Based on ranked normalized scores for the metrics or datasets, all
methods appeared comparable, while the distance-from-the-top measure indicated
that Assay Central and support vector classification were comparable to each other.
Unlike in prior studies, which have placed considerable emphasis on deep
neural networks (deep learning), no advantage was seen for deep learning in this case.
If anything, Assay Central may have been at a slight advantage, as
the activity cutoff for each of the over 5000 datasets representing
over 570,000 unique compounds was based on Assay Central performance,
although support vector classification seems to be a strong competitor.
We also applied Assay Central to perform prospective predictions for
the toxicity targets PXR and hERG to further validate these models.
This work appears to be the largest scale comparison of these machine
learning algorithms to date. Future studies will likely evaluate additional
databases, descriptors, and machine learning algorithms and further
refine the methods for evaluating and comparing such models.
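
As an illustration of the evaluation metrics named above, the following is a minimal sketch in plain Python, not the implementation used in this work. It assumes binary labels (1 = active, 0 = inactive) and assumes that "area-under-the-curve" refers to the area under the receiver operating characteristic curve. All function names (`confusion_counts`, `f1_score`, `cohen_kappa`, `matthews_corrcoef`, `roc_auc`, `kfold_indices`) are hypothetical helpers introduced for this example, and the labels and scores are toy values, not data from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = active."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, tn, fp, fn):
    """Observed agreement corrected for agreement expected by chance."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Correlation between predicted and true binary labels."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random active outscores a random inactive.

    Ties count as half a win. Quadratic in dataset size, so this is
    only suitable for small illustrative examples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k=5):
    """Split indices 0..n-1 into k held-out folds (no shuffling here)."""
    return [list(range(i, n, k)) for i in range(k)]


# Toy example: hypothetical labels, predicted classes, and scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

counts = confusion_counts(y_true, y_pred)
print("F1:", f1_score(*counts))
print("Cohen's kappa:", cohen_kappa(*counts))
print("MCC:", matthews_corrcoef(*counts))
print("ROC AUC:", roc_auc(y_true, scores))
```

In a fivefold setting, each fold returned by `kfold_indices` would serve once as the held-out set while a model is trained on the remaining four, and the metrics would be computed on the held-out predictions. In practice the indices would typically be shuffled, and often stratified by class, before splitting.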