We present a novel computational approach for predicting human
pharmacokinetics (PK) that addresses the challenges of early-stage
drug design. Our study introduces and describes a large-scale data
set of 11 clinical PK end points, encompassing over 2700 unique chemical
structures, for training machine learning models. To that end, multiple
advanced training strategies are compared, including the integration
of in vitro data and a novel self-supervised pretraining task. In
addition to the predictions, our final model provides meaningful epistemic
uncertainties for every data point. This allows us to successfully
identify regions of exceptional predictive performance, with an absolute
average fold error (AAFE/geometric mean fold error) of less than 2.5
across multiple end points. Together, these advancements represent
a significant leap toward actionable PK predictions, which can be
used early in the drug design process to expedite development
and reduce reliance on nonclinical studies.
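
For readers unfamiliar with the metric, the following is a minimal sketch of how AAFE is commonly computed, assuming the standard definition AAFE = 10^(mean |log10(predicted/observed)|). The function name `aafe` and the example values are hypothetical illustrations, not taken from the study's code or data.

```python
import math


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|).

    Assumes the commonly used definition; all values must be positive.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Absolute log10 ratio treats over- and under-prediction symmetrically.
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))


# Hypothetical values: one 2-fold over-prediction, one 2-fold under-prediction.
example = aafe([2.0, 5.0], [1.0, 10.0])
```

Under this definition, a perfect prediction gives an AAFE of 1, and an AAFE below 2.5 means predictions are, on geometric average, within 2.5-fold of the observed values.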