Probabilistic predictions for regression problems are more popular than point predictions and interval predictions, since they provide more information about the test labels. The conformal predictive system is a recently proposed non-parametric method for producing reliable probabilistic predictions, but it is computationally inefficient due to its learning process. To build a faster conformal predictive system and make full use of the training data, this paper proposes a predictive system based on the locally weighted jackknife prediction approach. The theoretical property of the proposed method is proved under some regularity assumptions in the asymptotic setting, which extends our earlier theoretical research from interval predictions to probabilistic predictions. In the experimental section, our method is implemented based on our theoretical analysis and compared with other predictive systems on 20 public data sets. We compare the continuous ranked probability scores of the predictive distributions and the performance of the derived prediction intervals, and the better performance of our proposed method is confirmed with Wilcoxon tests. The experimental results demonstrate that the proposed predictive system is not only empirically valid, but also provides more information than the other predictive systems in the comparison.
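The general construction can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the exact algorithm analysed in the paper: it refits a base regressor in leave-one-out (jackknife) fashion, normalizes each residual by a difficulty estimate, and returns the resulting empirical predictive CDF. The choice of Ridge as the base model and a k-NN fit on absolute residuals as the difficulty model are assumptions made only for this example.

```python
# Sketch of a jackknife-style conformal predictive distribution with locally
# weighted (normalized) residuals. Assumptions: scikit-learn style regressors,
# plain leave-one-out refits, k-NN on absolute residuals as the sigma model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

def jackknife_cps(X, y, x_test, base=Ridge):
    n = len(y)
    points = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i           # leave observation i out
        mu = base().fit(X[mask], y[mask])  # refit the base model without point i
        # difficulty (sigma) model: k-NN regression on absolute LOO residuals
        sigma = KNeighborsRegressor(n_neighbors=5).fit(
            X[mask], np.abs(y[mask] - mu.predict(X[mask])))
        s_i = max(sigma.predict(X[[i]])[0], 1e-8)
        c_i = (y[i] - mu.predict(X[[i]])[0]) / s_i         # normalized residual
        s_t = max(sigma.predict(x_test[None, :])[0], 1e-8)
        points[i] = mu.predict(x_test[None, :])[0] + c_i * s_t  # candidate label
    points.sort()
    # empirical predictive CDF: Q(y) = #{i : point_i <= y} / (n + 1)
    return lambda yv: np.searchsorted(points, yv, side="right") / (n + 1)

# usage on toy data
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=60)
Q = jackknife_cps(X, y, X[0])
print(Q(y[0]))   # predicted probability mass at or below the true label
```

The n leave-one-out refits are what makes full-jackknife approaches expensive; speeding this step up is the motivation for the faster system described above.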
Distribution regression is the regression setting in which the input objects are distributions. Many machine learning problems can be analysed in this framework, such as multi-instance learning and learning from noisy data. This paper builds a conformal predictive system (CPS) for distribution regression, where the system's prediction for a test input is a cumulative distribution function (CDF) of the corresponding test label. The CDF output by a CPS provides useful information about the test label, as it can be used to estimate the probability of any event related to the label and can be transformed into prediction intervals and point predictions via the corresponding quantiles. Furthermore, a CPS has the property of validity, as the predictive CDFs and the prediction intervals are statistically compatible with the realizations. This property is desirable for many risk-sensitive applications, such as weather forecasting. To the best of our knowledge, this is the first work to extend the learning framework of CPS to distribution regression problems. We first embed the input distributions into a reproducing kernel Hilbert space using kernel mean embedding approximated by random Fourier features, and then build a fast CPS on top of the embeddings. While inheriting the property of validity from the CPS learning framework, our algorithm is simple, easy to implement and fast. The proposed approach is tested on synthetic data sets and applied to the problem of statistical postprocessing of ensemble forecasts, which demonstrates the effectiveness of our algorithm for distribution regression problems.
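As a concrete illustration of the embedding step, the sketch below approximates the kernel mean embedding of each input distribution with random Fourier features for an RBF kernel. It assumes each input distribution is observed through a bag of i.i.d. samples; the function name, feature dimension and bandwidth are hypothetical choices for the example, not values taken from the paper. The resulting fixed-length vectors can then be fed to any fast regressor or CPS.

```python
# Sketch of kernel mean embedding approximated by random Fourier features (RFF)
# for an RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 * sigma^2)).
import numpy as np

def rff_mean_embedding(bags, d, D=200, sigma=1.0, seed=0):
    """Map each bag of samples (an n_i x d array) to a D-dim mean embedding."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / sigma, size=(d, D))   # spectral frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)        # random phases
    feats = lambda S: np.sqrt(2.0 / D) * np.cos(S @ W + b)    # RFF feature map
    return np.vstack([feats(S).mean(axis=0) for S in bags])   # average per bag

# usage: two toy input distributions, each observed through 100 samples
rng = np.random.default_rng(1)
bags = [rng.normal(loc=m, size=(100, 2)) for m in (0.0, 3.0)]
Z = rff_mean_embedding(bags, d=2)
print(Z.shape)   # (2, 200): one embedding vector per input distribution
```

Averaging the random features over each bag approximates the mean of the kernel feature map under the input distribution, which is what makes the downstream regression step both fast and independent of the bag sizes.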