Estimation of prediction accuracy is important when our aim is prediction. The training error is an easy estimate of prediction error, but it has a downward bias. On the other hand, K-fold cross-validation has an upward bias. The upward bias may be negligible in leave-one-out crossvalidation, but it sometimes cannot be neglected in 5-fold or 10-fold cross-validation, which are favored from a computational standpoint. Since the training error has a downward bias and K-fold cross-validation has an upward bias, there will be an appropriate estimate in a family that connects the two estimates. In this paper, we investigate two families that connect the training error and K-fold crossvalidation.
Background: Proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. However, as mass spectrometry data are high-dimensional, identification of biomarkers is a key problem.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.