2020
DOI: 10.1609/aaai.v34i04.6147

Divide-and-Conquer Learning with Nyström: Optimal Rate and Algorithm

Abstract: Kernel Regularized Least Squares (KRLS) is a fundamental learner in machine learning. However, its high time and space requirements prevent it from scaling to large datasets. We therefore propose DC-NY, a novel algorithm that combines the divide-and-conquer method, Nyström approximation, conjugate gradient, and preconditioning to scale up KRLS; it matches the accuracy of exact KRLS while achieving the lowest time and space complexity among state-of-the-art approximate KRLS estimators. We present a theoretical analysis …
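The abstract describes DC-NY only at a high level. Below is a minimal, illustrative sketch of the divide-and-conquer Nyström idea it builds on: partition the data across p local learners, fit a Nyström-approximated KRLS estimator on each partition, and average the resulting predictions. The function names, the RBF kernel, the uniform landmark sampling, and the direct linear solve are all assumptions made for illustration; the paper's actual DC-NY additionally solves each local system with preconditioned conjugate gradient rather than a direct factorization.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def nystrom_krls_fit(X, y, m, lam, gamma=1.0, rng=None):
    # Nystrom-approximated KRLS on one partition: restrict the estimator
    # to the span of m uniformly sampled landmarks and solve the reduced
    # m x m regularized system instead of the full n x n one.
    rng = rng if rng is not None else np.random.default_rng()
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    Z = X[idx]                       # landmark points
    Knm = rbf_kernel(X, Z, gamma)    # n x m cross-kernel
    Kmm = rbf_kernel(Z, Z, gamma)    # m x m landmark kernel
    # Normal equations: (Knm^T Knm + lam * n * Kmm) alpha = Knm^T y.
    A = Knm.T @ Knm + lam * len(X) * Kmm
    alpha = np.linalg.solve(A + 1e-10 * np.eye(len(Z)), Knm.T @ y)
    return Z, alpha

def dc_nystrom_krls(X, y, p, m, lam, gamma=1.0, seed=0):
    # Divide-and-conquer: split the data into p disjoint subsets, fit a
    # Nystrom KRLS estimator on each, and average their predictions.
    rng = np.random.default_rng(seed)
    models = [nystrom_krls_fit(Xs, ys, m, lam, gamma, rng)
              for Xs, ys in zip(np.array_split(X, p), np.array_split(y, p))]
    def predict(Xt):
        return np.mean([rbf_kernel(Xt, Z, gamma) @ a for Z, a in models], axis=0)
    return predict

# Toy usage: recover a noisy sine curve with p = 4 local estimators.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
predict = dc_nystrom_krls(X, y, p=4, m=50, lam=1e-3)
print(predict(X[:5]))
```

Averaging p local Nyström estimators is what drives the complexity down: each partition works with only N/p points and m landmarks, so no N × N kernel matrix is ever formed.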



Cited by 5 publications (8 citation statements). References 20 publications.
“…Compared with (Liu, Liu, and Wang 2021; Yin et al. 2021; Lin, Wang, and Zhou 2020; Yin et al. 2020a): N^2, N, N^0.5, N^0.5 / Exp; DKRR (Lin, Wang, and Zhou 2020): N^2.25, N^1.5, N^0.75, N^0.25 / Pro; DKRR-CM (Lin, Wang, and Zhou 2020): N…”
Section: Compared With the Related Work (mentioning)
confidence: 97%
“…The representative distributed KRR includes DKRR (Guo, Lin, and Shi 2019; Chang, Lin, and Zhou 2017; Lin, Guo, and Zhou 2017; Zhang, Duchi, and Wainwright 2015, 2013) based on divide-and-conquer, DKRR-RF (Li, Liu, and Wang 2019) based on DKRR and random features (Rudi, Camoriano, and Rosasco 2016), and DKRR-NY-PCG (Yin et al. 2020a) based on DKRR and Nyström-PCG (Rudi, Carratino, and Rosasco 2017), which derive the optimal learning rate in expectation. However, they face a restrictive limitation on the number of local processors p: to derive the optimal learning rate, p must be restricted to a constant in the most popular case (ζ = 1/2, γ = 1).…”
Section: Related Work (mentioning)
confidence: 99%
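For context on the DKRR baseline this statement compares against: in plain divide-and-conquer KRR, each of the p local processors solves an exact KRLS system on its own N/p-point subset, and the global estimate is the average of the local ones, which is why the theory restricts how large p may grow. A minimal sketch under assumed conventions (RBF kernel, uniform contiguous splits; the function name is hypothetical):

```python
import numpy as np

def dkrr_fit_predict(X, y, Xt, p, lam, gamma=1.0):
    # Plain divide-and-conquer KRR (no Nystrom): each of the p local
    # processors solves an exact (n x n) KRR system on its own subset,
    # n = N / p, and the global prediction averages the p local ones.
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    preds = []
    for Xs, ys in zip(np.array_split(X, p), np.array_split(y, p)):
        n = len(Xs)
        alpha = np.linalg.solve(rbf(Xs, Xs) + lam * n * np.eye(n), ys)
        preds.append(rbf(Xt, Xs) @ alpha)
    return np.mean(preds, axis=0)
```

Each local solve still costs O((N/p)^3) time, which is why the approaches cited above (random features, Nyström-PCG) replace the exact local solve to push the complexity down further.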