ACKNOWLEDGEMENTS
The authors would like to thank Dr. Farhad Forouhar, Columbia University, for kindly offering technical advice on structure visualisation.

AUTHORS' CONTRIBUTIONS
AK devised the methodology, performed the analysis and wrote the manuscript. CB and EV critically reviewed the manuscript and provided suggestions; MH critically reviewed and edited the manuscript. All authors read and approved the final manuscript.
ABBREVIATIONS: HTS, high-throughput screening; PLI, target protein-ligand interactions; AIC, Akaike information criterion; BIC, Bayesian information criterion

KEYWORDS: Drug discovery; Dihedral angles; B-factor; Machine learning; Gaussian mixture models; Conformation distal information; Protein site characterisation

Abstract
Target evaluation is at the centre of rational drug design and biologics development. To successfully engineer antibodies, T-cell receptors or small molecules, it is necessary to identify and characterise potential binding or contact sites on therapeutically relevant target proteins. Currently, there are numerous challenges in achieving better docking precision and in characterising relevant sites. We devised a first-of-its-kind in silico protein fingerprinting approach based on dihedral angle and B-factor distribution to probe binding sites and sites of structural importance. In addition, we showed that entire protein regions or individual structural subsets can be profiled using our derived fi-score, which is based on amino acid dihedral angle and B-factor distribution. We further described a method to assess the structural profile and extract information on sites of importance using machine learning Gaussian mixture models. In combination, these biophysical analytical methods could potentially help to classify and systematically analyse not only targets but also drug candidates that bind to specific sites, which would greatly improve the pre-screening stage, target selection and drug repurposing efforts in finding other matching targets.
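
To illustrate the general idea of clustering residues with a Gaussian mixture model, the following minimal sketch fits mixtures to per-residue features and compares them by information criteria. It is not the authors' implementation: the feature values are synthetic, the choice of raw (phi, psi, B-factor) triplets as input is an assumption (the fi-score itself is not defined in this section and is not reproduced here), and the use of AIC and BIC for choosing the number of components is inferred only from their appearance in the abbreviations list. The sketch also ignores the periodicity of dihedral angles, which a real analysis would need to handle.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic per-residue features: phi, psi (degrees) and B-factor.
# These values are invented for illustration, not taken from any structure.
helix_like = rng.normal(loc=[-60.0, -45.0, 20.0], scale=[10.0, 10.0, 5.0], size=(200, 3))
sheet_like = rng.normal(loc=[-120.0, 130.0, 40.0], scale=[10.0, 10.0, 5.0], size=(200, 3))
features = np.vstack([helix_like, sheet_like])

# Fit mixtures with 1 to 5 components and record both information criteria.
candidate_k = range(1, 6)
models = [
    GaussianMixture(n_components=k, covariance_type="full", random_state=0).fit(features)
    for k in candidate_k
]
bic = [m.bic(features) for m in models]
aic = [m.aic(features) for m in models]

# Keep the model with the lowest BIC and assign each residue to a component.
best_model = models[int(np.argmin(bic))]
best_k = best_model.n_components
labels = best_model.predict(features)

print("components chosen by BIC:", best_k)
```

Lower AIC or BIC indicates a better trade-off between fit and model complexity; BIC penalises extra components more heavily, so it tends to select the smaller model when the two criteria disagree.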