2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2018.00140
A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression

Abstract: We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Using hierarchical matrix approximations for the kernel matrix, the memory requirements, the number of floating-point operations, and the execution time are drastically reduced compared to standard dense linear algebra routines. We consider both the general H-matrix hierarchical format as well as Hierarchically Semi-Separable (HSS) matrices. Furthermore, we investigate the impact of several preprocessing and cluster…
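For context, a minimal dense-baseline sketch of kernel ridge regression may help: it is exactly the O(n²)-storage, O(n³)-solve computation that the hierarchical formats studied in the paper are designed to avoid. The Gaussian kernel, function names, and parameters below are illustrative assumptions, not the paper's code.

```python
# Minimal dense-baseline sketch of kernel ridge regression (KRR).
# This is the O(n^2)-storage / O(n^3)-solve routine that hierarchical
# (H / HSS) kernel-matrix approximations aim to replace. All names and
# the Gaussian-kernel choice are illustrative assumptions.
import numpy as np

def gaussian_kernel(X, Y, h):
    """Dense Gaussian kernel: K[i, j] = exp(-||x_i - y_j||^2 / (2 h^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * h * h))

def krr_fit(X, y, h=1.0, lam=1e-3):
    """Solve (K + lam * I) alpha = y with a dense factorization."""
    K = gaussian_kernel(X, X, h)                     # n x n dense kernel matrix
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, X_test, alpha, h=1.0):
    return gaussian_kernel(X_test, X_train, h) @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
alpha = krr_fit(X, y)
print(krr_predict(X, X[:5], alpha))
```

Both the n × n kernel matrix and the dense solve become the bottleneck as n grows, which is what motivates the H and HSS approximations the abstract describes.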

Cited by 17 publications (10 citation statements) · References 30 publications
“…While we presented a preliminary analysis of the multiplicative update scheme's convergence behavior for a special case, future work is necessary for a thorough determination of the algorithm's region of convergence. There is also strong practical interest in the adaptation of the method to GPR extensions such as those based on non-Gaussian likelihoods and Nyström [36,37] or hierarchical low-rank approximations [38,39,40], as well as in…

Table 1: Label Noise - Rate/Level: the percentage of corrupted labels and the ratio between the noise and the standard deviation of the pristine labels; R²: the coefficient of determination between the inferred and actual label noise; AUC: area under the ROC curve of a 'noisy label' classifier that thresholds the learned σ_i; Precision at Recall Level: precision of the classifier at specified recall levels. Regression accuracy - plain/basic/full: Σ = 0, σI, diag(σ), respectively.…”
Section: Discussion
Mentioning · Confidence: 99%
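The Nyström approximation cited in this quote ([36,37]) admits a compact sketch: approximate the full n × n kernel matrix from a small set of landmark columns. The version below is a generic illustration under assumed names, not the method of the quoted paper.

```python
# Hedged sketch of the Nystrom kernel approximation: build a rank-m
# factor from m << n landmark columns, so K ~= C @ pinv(W) @ C.T is
# stored in O(n m) instead of O(n^2). Names are illustrative assumptions.
import numpy as np

def nystrom(X, kernel, m, rng):
    """Low-rank factor L with K ~= L @ L.T, built from m random landmarks."""
    idx = rng.choice(len(X), size=m, replace=False)
    C = kernel(X, X[idx])               # n x m slice of the kernel matrix
    W = C[idx]                          # m x m landmark block
    vals, vecs = np.linalg.eigh(W)      # symmetric square root of pinv(W)
    keep = vals > 1e-10 * vals.max()
    return C @ (vecs[:, keep] / np.sqrt(vals[keep]))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
k = lambda A, B: np.exp(-((A[:, None] - B[None]) ** 2).sum(-1))
L = nystrom(X, k, m=50, rng=rng)        # K ~= L @ L.T at rank <= 50
```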
“…Morton orderings and other space-filling curves have also been used to generate tilings for matrices in low spatial dimensions [24,17]. In higher-dimensional spaces, such as the feature spaces that appear in machine learning applications, approximate nearest neighbors [20,38,35] are computed based on random projection trees. These are generalizations of KD-trees, in which the direction of the median split is randomized and is not one of the coordinate dimensions.…”
Section: Related Work
Mentioning · Confidence: 99%
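The random-projection-tree construction this quote describes is simple to sketch: split the point set at the median of projections onto a random unit direction, then recurse. The toy recursion below (names and leaf-size policy are assumptions, not those of [20,38,35]) illustrates the idea.

```python
# Hedged sketch of a random projection tree: unlike a KD-tree, each
# node splits at the median along a *random* unit direction rather
# than a coordinate axis.
import numpy as np

def rp_tree(points, indices, leaf_size, rng):
    """Recursively split `indices` into leaves of at most `leaf_size` points."""
    if len(indices) <= leaf_size:
        return indices                      # leaf: a cluster of nearby points
    d = rng.normal(size=points.shape[1])
    d /= np.linalg.norm(d)                  # random unit direction
    proj = points[indices] @ d
    order = indices[np.argsort(proj)]       # median split along the projection
    half = len(order) // 2
    return (rp_tree(points, order[:half], leaf_size, rng),
            rp_tree(points, order[half:], leaf_size, rng))

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))              # high-dimensional feature space
tree = rp_tree(X, np.arange(256), leaf_size=32, rng=rng)
```

The resulting leaves give the clusters from which hierarchical matrix tilings of the kernel matrix can be built, which is what connects this construction to the paper.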
“…The aim of this work is to propose and analyse the use of the Hierarchically Semi-Separable (HSS) matrix representation [6] for the solution of large-scale kernel SVMs. Indeed, the use of HSS approximations of kernel matrices has already been investigated in [9,33] for the solution of large-scale kernel regression problems. The main reason for the choice of the HSS structure in this context can be summarised as follows:
• using the STRUctured Matrix PACKage (STRUMPACK) [34], it is possible to obtain HSS approximations of the kernel matrices without the need to store or compute the whole matrix K explicitly. Indeed, for kernel-matrix approximations, STRUMPACK uses a partially matrix-free strategy (see [9]), essentially based on adaptive randomized sampling, which requires only a black-box matrix-times-vector multiplication routine and access to selected elements of the kernel matrix;…”
Section: Contribution
Mentioning · Confidence: 99%
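The "black-box matrix-times-vector" ingredient mentioned in this quote can be illustrated with a generic randomized range finder in the style of Halko et al.; this is only a hedged sketch of the sampling idea, not STRUMPACK's actual adaptive HSS compression.

```python
# Hedged illustration of the "partially matrix-free" ingredient: randomized
# sampling that touches the matrix only through a black-box matvec (element
# access, also needed for HSS, is not shown). This is a generic randomized
# range finder, not STRUMPACK's algorithm.
import numpy as np

def randomized_range(matvec, n, rank, rng, oversample=10):
    """Orthonormal Q approximately spanning the range of A, using only
    rank + oversample applications of the black-box matvec."""
    Omega = rng.normal(size=(n, rank + oversample))   # random test vectors
    Y = np.column_stack([matvec(Omega[:, j]) for j in range(Omega.shape[1])])
    Q, _ = np.linalg.qr(Y)
    return Q[:, :rank]

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 2))
K = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))    # stand-in dense kernel,
                                                      # formed here only for the demo
Q = randomized_range(lambda v: K @ v, n, rank=30, rng=rng)
err = np.linalg.norm(K - Q @ (Q.T @ K)) / np.linalg.norm(K)
print(f"relative range-approximation error: {err:.2e}")
```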