Alchemical relative binding free energy calculations have recently
found important applications in drug optimization. A series of congeneric
compounds is generated from a preidentified lead compound, and their
relative binding affinities to a protein are assessed in order to
optimize candidate drugs. While methods based on equilibrium thermodynamics
have been extensively studied, an approach based on nonequilibrium
methods has recently been reported together with claims of its superiority.
However, these claims pay insufficient attention to the basis and
reliability of both methods. Here we report a comparative study of
the two approaches across a large data set, comprising more than 500
ligand transformations spanning in excess of 300 ligands binding to
a set of 14 diverse protein targets. Ensemble methods are essential
to quantify the uncertainty in these calculations, not only for the
reasons already established in the equilibrium approach but also to
ensure that the nonequilibrium calculations reside within their domain
of validity. If and only if ensemble methods are applied, we find
that the nonequilibrium method can achieve accuracy and precision
comparable to those of the equilibrium approach. Compared to the equilibrium
method, the nonequilibrium approach can reduce computational costs
but introduces higher computational complexity and longer wall-clock
times. There are, however, cases where the standard length of a nonequilibrium
transition is not sufficient, necessitating a rerun of the
entire set of transitions. This significantly increases the computational
cost and proves to be highly inconvenient during large-scale applications.
Our findings provide a key set of recommendations that should be adopted
for the reliable implementation of nonequilibrium approaches to relative
binding free energy calculations in ligand-protein systems.
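
As an illustration of what an ensemble-based uncertainty estimate can look like in the nonequilibrium setting, the following minimal sketch may help. It is not the protocol used in this study: the abstract does not state which estimator is applied, so the Jarzynski equality is used here only as one standard nonequilibrium estimator, and the function names, the temperature, and the work values are all hypothetical. Each replica yields its own free energy estimate from its set of transition work values, and the spread across replicas gives the uncertainty.

```python
import math
import statistics

# Assumption: T = 298.15 K, energies in kcal/mol (not taken from the source).
KT_KCAL_PER_MOL = 0.0019872041 * 298.15


def jarzynski_free_energy(work_values, kt=KT_KCAL_PER_MOL):
    """Jarzynski estimate: dG = -kT * ln( mean( exp(-W / kT) ) ).

    A log-sum-exp shift is used so that large |W / kT| does not overflow.
    """
    scaled = [-w / kt for w in work_values]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kt * (shift + math.log(mean_exp))


def ensemble_summary(replica_estimates):
    """Mean and standard error of the mean over independent replicas."""
    mean = statistics.mean(replica_estimates)
    sem = statistics.stdev(replica_estimates) / math.sqrt(len(replica_estimates))
    return mean, sem


if __name__ == "__main__":
    # Hypothetical work values (kcal/mol), one list per replica; made up
    # purely to exercise the functions, not data from the study.
    replicas = [
        [1.2, 0.9, 1.5, 1.1],
        [1.0, 1.4, 0.8, 1.3],
        [1.6, 1.1, 1.0, 1.2],
    ]
    estimates = [jarzynski_free_energy(w) for w in replicas]
    mean, sem = ensemble_summary(estimates)
    print(f"dG = {mean:.2f} +/- {sem:.2f} kcal/mol over {len(estimates)} replicas")
```

A large spread between replica estimates, or a Jarzynski estimate lying far below the mean work, would be one sign that the transitions are too short for the estimator to be trusted.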