2012
DOI: 10.1007/s10994-012-5287-6
Efficient cross-validation for kernelized least-squares regression with sparse basis expansions

Abstract: We propose an efficient algorithm for calculating hold-out and cross-validation (CV) type of estimates for sparse regularized least-squares predictors. Holding out H data points with our method requires O(min(H²n, Hn²)) time provided that a predictor with n basis vectors is already trained. In addition to holding out training examples, also some of the basis vectors used to train the sparse regularized least-squares predictor with the whole training set can be removed from the basis vector set used in the…
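To illustrate the kind of shortcut the abstract refers to, the sketch below shows the classical single-hold-out (H = 1, i.e. leave-one-out) identity for plain ridge regression: the LOO residuals can be read off from one trained model via the hat-matrix diagonal, with no retraining. This is only a minimal NumPy illustration of the general idea, not the paper's algorithm, which extends hold-out computation to H points and to sparse basis expansions.

```python
import numpy as np

def loo_residuals_ridge(X, y, lam):
    """Exact leave-one-out residuals for ridge regression in one pass.

    Classical shortcut: e_i = (y_i - yhat_i) / (1 - h_ii), where h_ii is
    the i-th diagonal entry of the hat matrix
    H = X (X^T X + lam*I)^{-1} X^T. This is the single-hold-out special
    case; the paper generalizes such computations to holding out H points
    and to sparse basis vector sets.
    """
    n, d = X.shape
    # (X^T X + lam*I)^{-1} X^T, computed via a linear solve for stability
    A = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    hat = X @ A                      # hat (smoother) matrix
    yhat = hat @ y                   # in-sample predictions
    return (y - yhat) / (1.0 - np.diag(hat))
```

The result matches retraining the model n times with one point held out each time, but costs only a single fit plus O(n) post-processing.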

Cited by 11 publications (4 citation statements)
References 28 publications
“…The observation that new drug targets are easier to predict than new targeted compounds is consistent with previous work [ 8 ]. Future improvements in the experimental drug–target bioactivity data coverage and quality, both in the individual profiling studies that focus on specific drug and target families, such as kinase inhibitors [ 17 , 18 ], as well as in the general drug and target databases, such as ChEMBL [ 37 ], could make it possible to start developing in silico prediction tools that can generalize beyond the training data and can be used, for instance, for prioritization of the most potential drug or target panels for experimental validation in human assays in vivo .…”
Section: Discussion
confidence: 99%
“…For example, the recent study by van Laarhoven et al [ 17 ] showed that a regularized least-squares (RLS) model was able to predict binary drug–target interactions at almost perfect prediction accuracies when evaluated using a simple LOO-CV. Although RLS has proven to be an effective model in many applications [ 18 , 19 ], we argue that a part of this superior predictive power can be attributed to the oversimplified formulation of the drug–target prediction problem, as well as unrealistic evaluation of the model performance. Another source of potential bias is that simple cross-validation (CV) cannot evaluate the effect of adjusting the model parameters, and may therefore easily lead to selection bias and overoptimistic prediction results [ 20–22 ].…”
Section: Introduction
confidence: 99%
“…Regularized least-squares (RLS) is an efficient model used in different types of applications (Pahikkala et al, 2012a,b). Van Laarhoven et al (2011) used RLS for the binary prediction of DTIs and achieved outstanding performance.…”
Section: Computational Prediction of Drug–Target Binding Affinities
confidence: 99%
“…When this is not computationally feasible, one may approximate full LPO-SCV by randomly sampling a subset of all the possible pairs. Further, for ridge regression classifiers, fast LPO-SCV can be implemented using the fast holdout algorithms (Pahikkala et al 2012) implemented in the RLScore open source library (Pahikkala and Airola 2016).…”
Section: Spatial Leave-Pair-Out CV
confidence: 99%