2015
DOI: 10.1007/978-3-319-25138-7_16
Recent Advances in Scaling Up Gaussian Process Predictive Models for Large Spatiotemporal Data

Abstract: The expressive power of Gaussian process (GP) models comes at a cost of poor scalability in the size of the data. To improve their scalability, this paper presents an overview of our recent progress in scaling up GP models for large spatiotemporally correlated data through parallelization on clusters of machines, online learning, and nonmyopic active sensing/learning.
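The scalability bottleneck the abstract refers to is the cubic cost of exact GP inference in the number of observations. A minimal sketch (using a hypothetical RBF kernel and synthetic data, not the paper's own models or datasets) makes the bottleneck concrete:

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, Xstar, noise=1e-2):
    # Exact GP posterior mean: factorizing the n x n kernel matrix
    # costs O(n^3) time and O(n^2) memory -- the scalability problem
    # that parallel, online, and sparse variants aim to avoid.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)                 # O(n^3) bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf(Xstar, X) @ alpha              # posterior mean at Xstar

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
mu = gp_predict(X, y, X)                      # fits the training data closely
```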

Cited by 11 publications (7 citation statements)
References 20 publications
“…We will also consider our outsourced setting in the active learning context (Cao et al, 2013;Hoang et al, 2014a;Low et al, 2008;2009;Ouyang et al, 2014;Zhang et al, 2016). For applications with a huge budget of function evaluations, we like to couple PO-GP-UCB with the use of distributed/decentralized (Chen et al, 2012;2013a;Hoang et al, 2016;2019b;a;Low et al, 2015;Ouyang & Low, 2018) or online/stochastic (Hoang et al, 2015;Low et al, 2014b;Xu et al, 2014;Teng et al, 2020;Yu et al, 2019a;…”
Section: Discussion
confidence: 99%
“…As a result, they incur linear time in the data size that is still prohibitively expensive for training with big data (i.e., million-sized datasets). To scale up to big data, parallel [3]- [5] and online [6], [7] variants of several of these SGPR models have been developed for prediction (by assuming known hyperparameters) but not hyperparameter learning.…”
Section: Introduction
confidence: 99%
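The linear-time sparse GP regression (SGPR) variants this statement mentions replace the full kernel matrix with one built from a small set of m inducing inputs. The following is a minimal subset-of-regressors sketch under assumed settings (evenly spaced inducing inputs, RBF kernel), not the exact SGPR models cited; its cost is O(nm^2), linear in the data size n:

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def sparse_gp_predict(X, y, Xstar, Z, noise=1e-2):
    # Subset-of-regressors approximation with m inducing inputs Z:
    # only an m x m system is solved, so the cost is O(n m^2)
    # instead of the O(n^3) of exact GP regression.
    Kzz = rbf(Z, Z)
    Kzx = rbf(Z, X)
    A = noise * Kzz + Kzx @ Kzx.T             # m x m system
    w = np.linalg.solve(A, Kzx @ y)
    return rbf(Xstar, Z) @ w                  # approximate posterior mean

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 15)[:, None]           # m = 15 inducing inputs
mu = sparse_gp_predict(X, y, X, Z)
```

With only 15 inducing inputs, the approximate posterior mean still recovers the underlying function well, illustrating why such approximations trade little accuracy for a large drop in cost.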
“…However, such algorithms fall short of achieving the truly decentralized GP fusion necessary for scaling up to a massive number of agents grounded in the real world (e.g., traffic sensing, modeling, and prediction by autonomous vehicles cruising in urban road networks (Chen et al 2015;Low et al 2015a;Hoang et al 2014;Min and Wynter 2011;Ouyang et al 2014;Wang and Papageorgiou 2005;Work et al 2010), distributed inference on a network of IoTs, surveillance cameras and mobile devices/robots (Kang and Larkin 2016;Natarajan et al 2014;Hoang et al 2018b;Zhang et al 2016)) due to the following critical issues: (a) An obvious limitation is the single point(s) of failure with the server agent(s) whose computational and communication capabilities must be superior and robust (e.g., against transmission loss); (b) different GP inference agents are likely to gather data of varying behaviors and correlation structure from possibly separate localities of the input domain (e.g., spatiotemporal) and would therefore incur considerable information loss due to summarization based on a common set of fixed/known GP hyperparameter settings and inducing inputs, especially when the inducing inputs are few and far from the data (in the correlation sense); and (c) like distributed GP models, distributed GP fusion algorithms implicitly assume a one-time processing of a fixed set of data and would hence repeat the entire fusion process involving all local data gathered by the agents whenever new batches of streaming data arrive, which is prohibitively expensive. 
To overcome these limitations, this paper presents a novel Collective Online Learning of GPs (COOL-GP) framework for enabling a massive number of agents to simultaneously perform (a) efficient online updates of their GP models using their local streaming data with varying correlation structures and (b) decentralized fusion of their resulting online GP models with different learned hyperparameter settings and inducing inputs residing in the original input domain.…”
Section: Introduction
confidence: 99%
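The online-update-plus-fusion idea in the COOL-GP statement can be illustrated with the sparse approximation above: its sufficient statistics are additive over data batches, so streaming batches are absorbed without revisiting old data, and summing the statistics of several agents fuses their models. This is a hypothetical sketch in the spirit of, but not identical to, the cited COOL-GP framework (which also learns hyperparameters and inducing inputs per agent):

```python
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

class OnlineSparseGP:
    # The subset-of-regressors statistics (Kzx Kxz and Kzx y) are
    # additive over batches: each streaming batch is absorbed in
    # O(batch * m^2) time, and summing two agents' statistics
    # fuses their models exactly.
    def __init__(self, Z, noise=1e-2):
        self.Z, self.noise = Z, noise
        m = len(Z)
        self.A = np.zeros((m, m))
        self.b = np.zeros(m)

    def update(self, X, y):
        Kzx = rbf(self.Z, X)
        self.A += Kzx @ Kzx.T
        self.b += Kzx @ y

    def predict(self, Xstar):
        w = np.linalg.solve(self.noise * rbf(self.Z, self.Z) + self.A, self.b)
        return rbf(Xstar, self.Z) @ w

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0])
Z = np.linspace(-3, 3, 10)[:, None]
batch = OnlineSparseGP(Z)
batch.update(X, y)                  # one-shot training on all data
stream = OnlineSparseGP(Z)
stream.update(X[:50], y[:50])       # two streaming batches give the
stream.update(X[50:], y[50:])       # same model as one-shot training
```

Because the statistics add exactly, the streamed model and the one-shot model make identical predictions; this is the property that lets new batches arrive without repeating the whole fusion process.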