The generation and prioritization of new molecules form the core
of the drug design process. Matched molecular series analysis
(MMSA) has recently been proposed as a formal approach that captures
both of these key elements of design. In order to better understand
the power of MMSA and its specific limitations, we here evaluate its
performance as an ADME property prediction tool. We use four large
and diverse in-house data sets: logD, microsomal clearance,
CYP2C9 inhibition, and CYP3A4 inhibition. MMSA follows the concept of
parallel structure–activity relationships (SAR): if two series with
identical substituents on different scaffolds show similar property
profiles, SAR from one series can be transferred to the other.
We test four different similarity metrics to identify pairs
of molecular series where information can be transferred. We find
that the best prediction performance is achieved by a combination
of centered root-mean-square deviation (cRMSD) and a network score
approach previously published by Keefer et al. However, cRMSD alone
strikes the best balance between accuracy and the number of predictions
that can be made. We identify statistical metrics that allow us to estimate
when MMSA predictions are likely to be reliable, similar to the well-known
applicability domain concept in machine learning. MMSA achieves a prediction
accuracy comparable to that of a standard machine-learning model and of
matched molecular pair analysis. In contrast to machine learning, however,
it is easy to trace where MMSA predictions come from.
Finally, to prospectively test the power of MMSA, we retested compounds
that were strong outliers in the initial predictions and show how
the MMSA model can help to identify erroneous data points.
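
To make the parallel-SAR idea concrete, the following is a minimal sketch of how a cRMSD comparison and a simple SAR transfer between two series could look. It is an illustration under stated assumptions, not the implementation evaluated here: cRMSD is taken to be the root-mean-square deviation of the two mean-centered property profiles, the prediction is assumed to be the reference value shifted by the difference of the series means, and the function names, the max_crmsd threshold, and all property values are hypothetical.

```python
from math import sqrt
from statistics import mean


def crmsd(profile_a, profile_b):
    """Centered RMSD of two equally ordered property profiles.

    Each profile is mean-centered first, so a constant offset between
    the two scaffolds does not count as dissimilarity (assumed definition).
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_a, mean_b = mean(profile_a), mean(profile_b)
    squared = [((a - mean_a) - (b - mean_b)) ** 2
               for a, b in zip(profile_a, profile_b)]
    return sqrt(sum(squared) / len(squared))


def transfer_prediction(reference, query, substituent, max_crmsd=0.3):
    """Predict the query-series value for `substituent` from a reference series.

    `reference` and `query` map substituent labels to measured values.
    Returns None when the series share too few substituents, when the
    reference lacks the substituent, or when the shared profiles are too
    dissimilar (cRMSD above the hypothetical `max_crmsd` threshold).
    """
    shared = sorted(set(reference) & set(query))
    if len(shared) < 2 or substituent not in reference:
        return None
    ref_vals = [reference[s] for s in shared]
    qry_vals = [query[s] for s in shared]
    if crmsd(ref_vals, qry_vals) > max_crmsd:
        return None
    # Assumed transfer rule: reference value plus the mean offset
    # between the two series on their shared substituents.
    return reference[substituent] + mean(qry_vals) - mean(ref_vals)


# Made-up logD values for two scaffolds carrying the same substituents.
reference_series = {"H": 1.0, "F": 1.2, "Cl": 1.8, "Me": 1.5}
query_series = {"H": 2.0, "F": 2.2, "Cl": 2.8}

similarity = crmsd([1.0, 1.2, 1.8], [2.0, 2.2, 2.8])
predicted_me = transfer_prediction(reference_series, query_series, "Me")
```

In this toy case the two profiles differ only by a constant offset, so the cRMSD is essentially zero and the reference value for the methyl analogue is shifted by that offset. Because the prediction is built from one named reference series, the source of each predicted value stays visible.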