Prediction of P-glycoprotein substrate specificity (S(PGP)) can be viewed as a constituent part of a compound's "pharmaceutical profiling" in drug design. The task is difficult because of several factors that have given rise to many contradictory opinions: (i) the disparity between the S(PGP) values obtained in different assays, (ii) the confusion between Pgp substrates and inhibitors, (iii) the confusion between lipophilicity and amphiphilicity of Pgp substrates, and (iv) the dilemma of describing class-specific relationships when Pgp has no binding sites of high ligand specificity. In this work, we compiled S(PGP) data for 1000 compounds. All data were represented in a binary format, assigning S(PGP) = 1 for substrates and S(PGP) = 0 for non-substrates. Each value was ranked according to the reliability of the experimental assay. Two data sets were considered. Set 1 included 220 compounds with S(PGP) from polarized transport across MDR1-transfected cell monolayers. Set 2 included the entire list of 1000 compounds, with S(PGP) values of generally lower reliability. Both sets were analysed using a stepwise classification structure-activity relationship (C-SAR) method, leading to the derivation of simple rules for crude estimation of S(PGP) values. The obtained rules are based on the following factors: (i) the compound's size, expressed through molecular weight or volume, (ii) H-bond accepting capacity, given by Abraham's β (which can be crudely approximated by the sum of O and N atoms), and (iii) ionization, given by the acid and base pKa values. Very roughly, S(PGP) can be estimated by the "rule of fours". Compounds with (N + O) ≥ 8, MW > 400 and acid pKa > 4 are likely to be Pgp substrates, whereas compounds with (N + O) ≤ 4, MW < 400 and base pKa < 8 are likely to be non-substrates. The obtained results support the view that Pgp functioning can be compared to a complex "mini-pharmacokinetic" system with fuzzy specificity.
This system can be described by a probabilistic version of Abraham's solvation equation, suggesting a certain similarity between Pgp transport and chromatographic retention. The chromatographic model does not work in the case of "marginal" compounds with properties close to the "global" physicochemical cut-offs. In the latter case various class-specific rules must be considered. These can be associated with the "amphiphilicity" and "biological similarity" of compounds. The definition of class-specific effects entails construction of the knowledge base that can be very useful in ADME profiling of new drugs.
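The "rule of fours" quoted above can be illustrated with a minimal sketch. The function name, the three return labels, and the handling of compounds that lack an ionizable acid or base group (a missing pKa is treated as satisfying the corresponding cut-off) are assumptions made for illustration only; they are not taken from the original work, which also stresses that compounds near the cut-offs need class-specific rules.

```python
from typing import Optional


def rule_of_fours(
    n_plus_o: int,
    mol_weight: float,
    acid_pka: Optional[float] = None,
    base_pka: Optional[float] = None,
) -> str:
    """Crude Pgp substrate estimate from the cut-offs quoted in the abstract.

    Assumption (not from the source): a pKa of None means the compound has
    no such ionizable group and is treated as passing that pKa criterion.
    """
    # Substrate-like: many H-bond acceptors, large, and not a strong acid.
    acid_ok = acid_pka is None or acid_pka > 4
    if n_plus_o >= 8 and mol_weight > 400 and acid_ok:
        return "substrate"

    # Non-substrate-like: few H-bond acceptors, small, and not a strong base.
    base_ok = base_pka is None or base_pka < 8
    if n_plus_o <= 4 and mol_weight < 400 and base_ok:
        return "non-substrate"

    # "Marginal" compounds close to the cut-offs: the global rule is silent.
    return "undetermined"


# Hypothetical descriptor values, chosen only to exercise each branch.
print(rule_of_fours(10, 800.0, base_pka=8.5))
print(rule_of_fours(2, 180.0, acid_pka=3.5))
print(rule_of_fours(6, 400.0))
```

The "undetermined" branch corresponds to the "marginal" compounds described above, for which the global physicochemical cut-offs do not decide the outcome.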
This study presents a new type of acute toxicity (LD(50)) prediction that enables automated assessment of the reliability of predictions (which is synonymous with the assessment of the Model Applicability Domain as defined by the Organization for Economic Cooperation and Development). The analysis involved nearly 75,000 compounds from six animal systems (acute rat toxicity after oral and intraperitoneal administration; acute mouse toxicity after oral, intraperitoneal, intravenous, and subcutaneous administration). Fragmental Partial Least Squares (PLS) with 100 bootstraps yielded baseline predictions that were automatically corrected for non-linear effects in local chemical spaces, a combination called the Global, Adjusted Locally According to Similarity (GALAS) modelling methodology. Each prediction obtained in this manner is provided with a reliability index value that depends on both the compound's similarity to the training set (which accounts for similar trends in LD(50) variations within multiple bootstraps) and the consistency of experimental results with regard to the baseline model in the local chemical environment. The actual performance of the Reliability Index (RI) was demonstrated by its good (and uniform) correlations with Root Mean Square Error (RMSE) in all validation sets, thus providing a quantitative assessment of the Model Applicability Domain. The obtained models can be used for compound screening in the early stages of drug development and for prioritizing compounds for experimental in vitro testing or later in vivo animal acute toxicity studies.
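The idea of a global baseline that is adjusted locally according to similarity, together with a reliability index built from similarity and local consistency, can be illustrated with a toy sketch. This is not the published GALAS algorithm: the function name, the similarity-weighted mean correction, and the specific reliability formula (best similarity multiplied by 1 / (1 + weighted residual spread)) are all assumptions chosen only to show the two ingredients named in the abstract.

```python
from typing import List, Tuple


def locally_adjusted_prediction(
    baseline: float,
    neighbours: List[Tuple[float, float]],
) -> Tuple[float, float]:
    """Adjust a baseline prediction using similar training compounds.

    neighbours holds (similarity, residual) pairs, where similarity lies in
    [0, 1] and residual = experimental value - baseline model prediction.
    Returns (adjusted prediction, toy reliability index in [0, 1]).
    All formulas here are illustrative assumptions, not the GALAS equations.
    """
    total_weight = sum(sim for sim, _ in neighbours)
    if total_weight == 0:
        # No similar compounds: keep the baseline and report zero reliability.
        return baseline, 0.0

    # Local correction: similarity-weighted mean of the neighbours' residuals.
    delta = sum(sim * res for sim, res in neighbours) / total_weight

    # Similarity term: how close the nearest training compound is.
    best_similarity = max(sim for sim, _ in neighbours)

    # Consistency term: do the neighbours deviate from the baseline
    # in the same way? A small spread means high consistency.
    spread = (
        sum(sim * (res - delta) ** 2 for sim, res in neighbours) / total_weight
    ) ** 0.5
    consistency = 1.0 / (1.0 + spread)

    return baseline + delta, best_similarity * consistency


# Hypothetical values: two near-identical neighbours with the same residual.
prediction, reliability = locally_adjusted_prediction(2.0, [(1.0, 0.5), (1.0, 0.5)])
print(prediction, reliability)
```

In this toy version, neighbours that disagree with one another lower the reliability even when they are very similar to the query compound, which mirrors the abstract's point that the index depends on both similarity and local consistency.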
Fragmental methods (FMs) have great potential in many practical areas related to the design of new lead compounds. Advanced Algorithm Builder™ (AAB) is a new software system that employs FMs in (i) building QSPR, QSAR and SAR models, (ii) converting them to custom (in-house) algorithms and screening filters, and (iii) predicting physical properties and biological activities for new compounds. This review demonstrates how FMs and AAB can be used to substantiate our intuition, interpret observations, validate hypotheses and obtain new algorithms for predicting physical properties and biological activities. Applications for practical and theoretical chemists in the design of new lead compounds are discussed.