2021
DOI: 10.48550/arxiv.2105.11535
Preprint

Scalable Cross Validation Losses for Gaussian Process Models

Abstract: We introduce a simple and scalable method for training Gaussian process (GP) models that exploits cross-validation and nearest neighbor truncation. To accommodate binary and multi-class classification we leverage Pólya-Gamma auxiliary variables and variational inference. In an extensive empirical comparison with a number of alternative methods for scalable GP regression and classification, we find that our method offers fast training and excellent predictive performance. We argue that the good predictive perfo…
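To make the abstract's core idea concrete, below is a minimal sketch (not the authors' implementation) of a nearest-neighbor-truncated leave-one-out cross-validation loss for GP regression with a Gaussian likelihood. The RBF kernel, the Euclidean neighbor search via scipy's cKDTree, and the hyperparameter names are illustrative assumptions; the classification case described in the abstract additionally relies on Pólya-Gamma augmentation and variational inference, which this sketch omits.

```python
# Minimal sketch (an illustration, not the paper's code) of a nearest-neighbor
# truncated leave-one-out CV loss for GP regression. Hyperparameters would be
# fit by minimizing this loss with any standard optimizer.
import numpy as np
from scipy.spatial import cKDTree

def rbf_kernel(A, B, lengthscale, variance):
    # Squared-exponential kernel between row vectors of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def nn_loo_cv_loss(X, y, lengthscale=1.0, variance=1.0, noise=0.1, m=16):
    """Negative sum of log p(y_i | y_{N(i)}), where N(i) are the m nearest
    neighbors of x_i (excluding x_i itself)."""
    assert len(X) > m, "need more points than neighbors"
    tree = cKDTree(X)
    _, idx = tree.query(X, k=m + 1)             # first hit is the point itself
    loss = 0.0
    for i in range(len(X)):
        nbrs = idx[i, 1:]                       # drop the point itself
        Xn, yn = X[nbrs], y[nbrs]
        Knn = rbf_kernel(Xn, Xn, lengthscale, variance) + noise * np.eye(m)
        kin = rbf_kernel(X[i:i + 1], Xn, lengthscale, variance)[0]
        sol = np.linalg.solve(Knn, kin)
        mu = sol @ yn                            # truncated LOO predictive mean
        var = variance + noise - sol @ kin       # truncated LOO predictive variance
        loss += 0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return loss
```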

Cited by 2 publications (4 citation statements)
References 14 publications
“…This supports our numerical and theoretical evidence of the advantages of nearby sampling in Section 3.3. Third, Vecchia does perform well overall in terms of prediction performance, which is contrary to the finding in Jankowiak and Pleiss (2021). Most likely, the heuristic MMD ordering we adopted offers significant improvement in model approximation over the default coordinate-based ordering (Guinness, 2018).…”
Section: Case Studies (mentioning)
confidence: 66%
“…This is because F's preference for sparsity is heavily influenced by hyperparameter c, as there is a fixed variable inclusion cost of log(1/c) (see the appendix), making model weights overly dependent on c. Moreover, the LOO-LPD is more robust against overfitting and model mis-specification than the log evidence (Jankowiak and Pleiss, 2021; Gelman et al., 2014).…”
Section: Approximating Posterior Model Weights With Leave-one-out Log... (mentioning)
confidence: 99%
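For reference, the leave-one-out log predictive density (LOO-LPD) that this statement contrasts with the log evidence is conventionally defined as a sum of per-point predictive log densities; the display below is the standard textbook form, not necessarily the citing paper's exact notation.

```latex
\mathrm{LOO\text{-}LPD} \;=\; \sum_{i=1}^{n} \log p\!\left(y_i \mid y_{-i}, X\right),
\qquad y_{-i} := (y_1, \dots, y_{i-1}, y_{i+1}, \dots, y_n)
```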
“…To compute the LOO-LPD in efficient O(n^3) time in small datasets we use the Bürkner et al. (2021) algorithm, which uses rank-1 updates to K_XX. In large datasets we nearest-neighbour truncate the LOO-LPD following the success of this approach in Jankowiak and Pleiss (2021). That is, for each (y_i, x_i), we condition on m nearest neighbours of x_i found using the distance metric in the kernel.…”
Section: Approximating Posterior Model Weights With Leave-one-out Log... (mentioning)
confidence: 99%
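As a companion to the nearest-neighbor sketch after the abstract, here is a minimal sketch of the exact O(n^3) LOO-LPD for GP regression with a Gaussian likelihood on small datasets. It uses the standard closed-form leave-one-out identities built from the inverse of the noisy kernel matrix rather than the rank-1 updates of Bürkner et al. (2021); the zero-mean assumption and the function name are illustrative only.

```python
# Exact LOO-LPD sketch (not the Bürkner et al. (2021) algorithm), using the
# standard closed-form leave-one-out identities for a Gaussian likelihood:
#   mu_{-i} = y_i - [A y]_i / A_ii,   sigma^2_{-i} = 1 / A_ii,
# where A = (K_XX + noise * I)^{-1}.
import numpy as np

def exact_loo_lpd(K, y, noise):
    """Sum_i log p(y_i | y_{-i}) for a zero-mean GP with kernel matrix K."""
    n = len(y)
    A = np.linalg.inv(K + noise * np.eye(n))   # O(n^3) step
    a_diag = np.diag(A)
    mu_loo = y - (A @ y) / a_diag              # leave-one-out predictive means
    var_loo = 1.0 / a_diag                     # leave-one-out predictive variances
    return -0.5 * np.sum(np.log(2 * np.pi * var_loo)
                         + (y - mu_loo) ** 2 / var_loo)
```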