2019
DOI: 10.1016/j.knosys.2018.11.002

Understanding and comparing scalable Gaussian process regression for big data

Abstract: As a non-parametric Bayesian model that produces informative predictive distributions, the Gaussian process (GP) has been widely used in various fields, such as regression, classification, and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper is devoted to investigating the methodolo…

Cited by 24 publications (9 citation statements). References 29 publications (58 reference statements).

“…For example, though it has a remarkably low complexity of O(m³), the subset-of-data approach cannot be expected to perform well as n increases. In terms of model capability, global approximations are capable of capturing global patterns (long-term spatial correlations) but often filter out local patterns due to the limited global inducing set. In contrast, owing to their local nature, local approximations favor capturing local patterns (non-stationary features), enabling them to outperform global approximations on complicated tasks; see the solar example in [38]. The drawback, however, is that they ignore global patterns, risking discontinuous predictions and local over-fitting.…”
Section: Introduction
confidence: 99%
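As a concrete illustration of the subset-of-data baseline mentioned above, here is a minimal NumPy sketch that fits an exact GP on a random subset of m points, so the dominant Cholesky factorization costs O(m³) rather than O(n³). The RBF kernel, the function names, and the default m = 200 are illustrative assumptions, not code from the surveyed paper.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: variance * exp(-||x - x'||^2 / (2 * lengthscale^2)).
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def subset_of_data_gp(X, y, X_test, m=200, noise=1e-2, seed=0):
    # Fit an exact GP on m randomly chosen training points and discard the rest.
    # Training cost drops from O(n^3) to O(m^3), but accuracy stalls as n grows
    # because the remaining n - m points carry information that is never used.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    Xm, ym = X[idx], y[idx]
    K = rbf_kernel(Xm, Xm) + noise * np.eye(len(Xm))      # m x m Gram matrix
    L = np.linalg.cholesky(K)                             # O(m^3) factorization
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ym))
    Ks = rbf_kernel(X_test, Xm)
    mean = Ks @ alpha                                     # predictive mean
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(X_test, X_test).diagonal() - np.sum(v**2, axis=0)
    return mean, var                                      # latent predictive variance
```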
“…When sharing hyperparameters, the local structure itself may still yield good hyperparameter estimates that capture some local patterns [38]. Wang et al. [157] successfully trained an MVM-based exact GP on over a million data points in three days using eight GPUs.…”
confidence: 99%
“…When GPR is applied to practical problems, it can provide a confidence interval along with the predictive mean, which strengthens the validity of the prediction results. In addition, because GPR can quantitatively model Gaussian noise, it achieves excellent prediction accuracy [48,49]. Owing to this good predictive ability, GPR has been widely used for data-driven modeling of various problems in industry [50-53], so it is also adopted as a candidate scheme in this paper.…”
Section: Gaussian Process Regression (GPR)
confidence: 99%
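To make the quoted point concrete, here is a minimal sketch using scikit-learn's GaussianProcessRegressor (the toy data and kernel settings are illustrative assumptions): the model returns a predictive mean and standard deviation, from which a 95% confidence interval follows, and the added WhiteKernel term models the Gaussian observation noise explicitly.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy 1-D data: noisy observations of a smooth function (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(80, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# RBF kernel plus a WhiteKernel term so the Gaussian observation noise
# is estimated explicitly, as the quoted passage emphasizes.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# The predictive mean and standard deviation yield a 95% confidence interval.
X_test = np.linspace(0.0, 5.0, 100).reshape(-1, 1)
mean, std = gpr.predict(X_test, return_std=True)
lower, upper = mean - 1.96 * std, mean + 1.96 * std
```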
“…Recently, there has been an increasing trend in the development of scalable GPs, which fall into two core categories: global approximation and local approximation [25]. As the representative global approximation, sparse approximation uses m (m ≪ n) global inducing pairs {X_m, f_m} to optimally summarize the training data by approximating the prior [26] or the posterior [27], resulting in a complexity of O(nm²).…”
Section: Introduction
confidence: 99%
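As a sketch of the sparse-approximation idea summarized in this quote, the following assumes a subset-of-regressors (SoR) style predictive mean with an RBF kernel and m fixed inducing inputs Z; the names are illustrative, and a full implementation would also optimize Z and the kernel hyperparameters rather than treat them as given.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel, as in the sketches above.
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def sor_predictive_mean(X, y, X_test, Z, noise=1e-2):
    # Subset-of-regressors sparse GP: m inducing inputs Z summarize the n
    # training points. The dominant cost is forming Kmn @ Kmn.T, i.e. O(n m^2),
    # compared with O(n^3) for the exact GP.
    Kmm = rbf_kernel(Z, Z)                             # m x m
    Kmn = rbf_kernel(Z, X)                             # m x n
    A = Kmm + (Kmn @ Kmn.T) / noise                    # m x m system, O(n m^2) to form
    L = np.linalg.cholesky(A + 1e-8 * np.eye(len(Z)))  # jitter for stability
    b = np.linalg.solve(L.T, np.linalg.solve(L, Kmn @ y)) / noise
    return rbf_kernel(X_test, Z) @ b                   # predictive mean at X_test
```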