Three multivariate statistical methods (PLS discriminant analysis, rule-based methods, and Bayesian classification) were applied to multidimensional scoring data from four target proteins: estrogen receptor alpha (ERalpha), matrix metalloprotease 3 (MMP3), factor Xa (fXa), and acetylcholine esterase (AChE). The purpose was to build classifiers able to discriminate between active and inactive compounds, given a structure-based virtual screen. Seven different scoring functions were used to generate the scoring matrices. The classifiers were compared to classical consensus scoring and to single scoring functions. The classifiers performed better than both, with rule-based methods being most effective. The precision in predicting active compounds is about 90% for three of the targets and about 25% for acetylcholine esterase. On the basis of these results, a new two-stage approach is suggested for structure-based virtual screening where limited activity information is available.
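To make the baseline and the reported metric concrete, the following is a minimal sketch of one common consensus-scoring variant (averaging each compound's rank over all scoring functions) together with a precision calculation. The abstract does not say which consensus scheme was used, so the rank-averaging choice, the "lower score is better" convention, the function names, and the toy scores and activity labels are all assumptions made for illustration.

```python
from statistics import mean

def ranks(scores):
    """Rank one scoring function's column; rank 1 is the lowest (assumed best) score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result

def consensus_rank(score_matrix):
    """Average each compound's rank across all scoring functions."""
    per_function = [ranks(list(column)) for column in zip(*score_matrix)]
    return [mean(compound) for compound in zip(*per_function)]

def top_n_mask(consensus, n):
    """Flag the n compounds with the best (lowest) consensus rank as predicted active."""
    chosen = set(sorted(range(len(consensus)), key=lambda i: consensus[i])[:n])
    return [i in chosen for i in range(len(consensus))]

def precision(predicted_active, is_active):
    """Fraction of predicted actives that are truly active: TP / (TP + FP)."""
    tp = sum(1 for p, a in zip(predicted_active, is_active) if p and a)
    fp = sum(1 for p, a in zip(predicted_active, is_active) if p and not a)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical toy data: 6 compounds (rows) x 3 scoring functions (columns).
score_matrix = [
    [-9.0, -8.5, -7.0],
    [-8.0, -9.0, -6.5],
    [-5.0, -4.0, -6.0],
    [-7.5, -3.0, -8.0],
    [-4.0, -5.0, -3.0],
    [-8.5, -7.0, -5.0],
]
is_active = [True, True, False, False, False, True]  # made-up labels

consensus = consensus_rank(score_matrix)
predicted = top_n_mask(consensus, n=4)
print(precision(predicted, is_active))  # 0.75
```

A trained classifier would replace `consensus_rank` and `top_n_mask` with a model fitted on the same score matrix, while `precision` would be computed in the same way.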
Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined significance level (one minus the chosen confidence level). The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, at almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
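The out-of-bag idea can be sketched in a few lines. The example below is not the paper's implementation: to stay dependency-free it bags simple least-squares lines on one feature as a stand-in for random-forest trees, uses the absolute error as the (non-normalized) nonconformity score, and runs on synthetic data. All function names, the data, and the parameter values are assumptions for illustration.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares line; a deliberately simple stand-in for a forest tree."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

def fit_bagged(xs, ys, n_models, rng):
    """Train each model on a bootstrap sample and remember which rows it saw."""
    n = len(xs)
    ensemble = []
    for _ in range(n_models):
        bag = [rng.randrange(n) for _ in range(n)]
        model = fit_line([xs[i] for i in bag], [ys[i] for i in bag])
        ensemble.append((model, set(bag)))
    return ensemble

def oob_scores(xs, ys, ensemble):
    """Sorted absolute errors, each row predicted only by models that never saw it."""
    scores = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        preds = [predict_line(m, x) for m, bag in ensemble if i not in bag]
        if preds:  # skip a row that happened to be in every bag
            scores.append(abs(y - sum(preds) / len(preds)))
    return sorted(scores)

def interval(x, ensemble, scores, confidence):
    """Ensemble prediction plus/minus the calibrated nonconformity score."""
    k = math.ceil(confidence * (len(scores) + 1))
    half_width = scores[k - 1] if k <= len(scores) else math.inf
    centre = sum(predict_line(m, x) for m, _ in ensemble) / len(ensemble)
    return centre - half_width, centre + half_width

# Synthetic data (assumption): y = 2x + 1 plus unit Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

ensemble = fit_bagged(xs, ys, n_models=50, rng=rng)
scores = oob_scores(xs, ys, ensemble)
low90, high90 = interval(5.0, ensemble, scores, confidence=0.9)
low50, high50 = interval(5.0, ensemble, scores, confidence=0.5)
print(f"90% interval: [{low90:.2f}, {high90:.2f}]")
print(f"50% interval: [{low50:.2f}, {high50:.2f}]")
```

Because the calibration scores come from out-of-bag predictions, every training row contributes both to fitting and to calibration, which is the practical advantage the abstract points to. Lowering the confidence level selects a smaller score and therefore a tighter interval.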