2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2017.7953042

Parsimonious Online Learning with Kernels via sparse projections in function space

Abstract: Despite their attractiveness, popular perception is that techniques for nonparametric function approximation do not scale to streaming data due to an intractable growth in the amount of storage they require. To solve this problem in a memory-affordable way, we propose an online technique based on functional stochastic gradient descent in tandem with supervised sparsification based on greedy function subspace projections. The method, called parsimonious online learning with kernels (POLK), provides a controllab…
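As a rough illustration of the two ingredients the abstract names, the sketch below pairs a functional stochastic gradient step (each new sample becomes a kernel center) with a greedy, destructive pruning pass that removes centers whose deletion changes the function by less than a budget eps in Hilbert norm. The Gaussian kernel, squared loss, the POLKSketch class, and the eps budget are illustrative assumptions, not the authors' reference implementation or the exact projection routine of the paper.

```python
# Minimal sketch of a POLK-style update under the assumptions stated above.
import numpy as np

def gaussian_kernel(A, B, bw=1.0):
    """Gram block k(a, b) = exp(-||a - b||^2 / (2 bw^2)) for rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bw ** 2))

class POLKSketch:
    def __init__(self, dim, bw=1.0, step=0.1, eps=1e-3):
        self.D = np.zeros((0, dim))   # kernel dictionary (centers, one per row)
        self.w = np.zeros(0)          # expansion weights
        self.bw, self.step, self.eps = bw, step, eps

    def predict(self, x):
        if self.w.size == 0:
            return 0.0
        return float((gaussian_kernel(x[None, :], self.D, self.bw) @ self.w)[0])

    def update(self, x, y):
        # Functional SGD step: the new sample becomes a dictionary element.
        err = self.predict(x) - y               # squared-loss gradient factor
        self.D = np.vstack([self.D, x])
        self.w = np.append(self.w, -self.step * err)
        self._prune()

    def _prune(self):
        # Greedy projection: drop a center if removing it (and refitting the
        # remaining weights) changes the function by less than eps in Hilbert norm.
        changed = True
        while changed and self.w.size > 1:
            changed = False
            K = gaussian_kernel(self.D, self.D, self.bw)
            f_norm2 = self.w @ K @ self.w
            for j in range(self.w.size):
                keep = np.arange(self.w.size) != j
                Kk = K[np.ix_(keep, keep)]
                # Least-squares refit of weights onto the reduced dictionary.
                w_new, *_ = np.linalg.lstsq(Kk, K[keep] @ self.w, rcond=None)
                err2 = f_norm2 - 2 * w_new @ (K[keep] @ self.w) + w_new @ Kk @ w_new
                if err2 < self.eps ** 2:
                    self.D, self.w = self.D[keep], w_new
                    changed = True
                    break

# Streaming usage: one update per incoming (x, y) pair.
model = POLKSketch(dim=1, bw=0.3, step=0.3, eps=1e-2)
rng = np.random.default_rng(0)
for _ in range(200):
    x = rng.uniform(-1, 1, size=1)
    model.update(x, np.sin(3 * x[0]))
print("retained model order:", model.w.size)
```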

Cited by 40 publications (85 citation statements). References 36 publications.
“…We define the model order as the number of data points M_{i,t} in the dictionary of agent i at time t (the number of columns of X_t). FSGD is such that M_{i,t} = t − 1, and hence grows unbounded with the iteration index t. Next we address this intractable memory growth so that we may execute stochastic descent through low-dimensional projections of the stochastic gradient, inspired by [31]. First, we clarify the motivation for the choice of the penalty function (9).…”
Section: A. Functional Stochastic Gradient Method
Citation type: mentioning, confidence: 99%
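The quoted statement is about the memory growth of plain functional SGD: without sparsification, every sample becomes a kernel center, so the model order tracks the iteration index. A minimal sketch, assuming a Gaussian kernel, squared loss, and synthetic 1-D data (none of which come from the citing paper):

```python
# Illustration of the memory issue described in the quote: with plain
# functional SGD every past sample becomes a kernel center, so the model
# order grows linearly with t. Kernel, step size, and data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
step, bw = 0.5, 0.5
centers, weights = [], []          # the kernel dictionary and its weights

def f(x):                          # current kernel expansion
    return sum(w * np.exp(-(x - c) ** 2 / (2 * bw ** 2))
               for c, w in zip(centers, weights))

for t in range(1, 201):
    x = rng.uniform(-1, 1)
    y = np.sin(3 * x) + 0.1 * rng.standard_normal()
    grad = f(x) - y                # squared-loss derivative at the new sample
    centers.append(x)              # FSGD: one new center per sample ...
    weights.append(-step * grad)   # ... so the model order equals the sample count
print("model order after 200 samples:", len(centers))   # prints 200
```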
“…κ(d_M, ·)], and K_{D,D} as the resulting kernel matrix from this dictionary. We enforce function parsimony by selecting dictionaries D_i with M_{i,t} ≪ O(t) for each i [31].…”
Section: B. Sparse Subspace Projections
Citation type: mentioning, confidence: 99%
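For concreteness, the sketch below builds the objects named in the quote for a small synthetic dictionary: a dictionary D of M centers, the kernel matrix K_{D,D}, and the coefficient vector obtained when a new kernel function κ(x, ·) is projected onto the span of the dictionary. The Gaussian kernel, the dimensions, and the small ridge term are assumptions for illustration only.

```python
# Dictionary D, kernel matrix K_{D,D}, and a projection onto span{k(d_m, .)}.
import numpy as np

def kernel(a, b, bw=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * bw ** 2))

M, dim = 5, 2                        # parsimony: keep M much smaller than t
D = np.random.default_rng(1).normal(size=(M, dim))           # elements d_1..d_M (rows)
K_DD = np.array([[kernel(di, dj) for dj in D] for di in D])  # K_{D,D}, an M x M matrix

x = np.zeros(dim)
kappa_Dx = np.array([kernel(d, x) for d in D])   # [k(d_1, x), ..., k(d_M, x)]
# Coefficients of the best approximation of k(x, .) in the span of the dictionary
# (a tiny ridge term keeps the solve stable for nearly collinear centers).
w = np.linalg.solve(K_DD + 1e-8 * np.eye(M), kappa_Dx)
```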
“…So far we have shown that the complexity of the formulation can be reduced by moving centers in addition to moving the width. To further explore the effect of kernel centers on the complexity of the solution, we compare the performance of (PII′) to that of kernel orthogonal matching pursuit (KOMP) with pre-fitting (see [9], [30]) for a simulated signal as in (23). KOMP takes an initial function and a set of sample points and tries to estimate it by a parsimonious function of lower complexity.…”
Section: A. Examining the Complexity of the Solution
Citation type: mentioning, confidence: 99%
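A minimal sketch of KOMP with pre-fitting in the sense the quote uses it: greedily add the candidate center most correlated with the current residual, then refit all weights jointly by least squares before computing the next residual. The target signal, kernel bandwidth, and stopping tolerance are hypothetical and are not the setup of (PII′) or equation (23) in the citing paper.

```python
# Kernel orthogonal matching pursuit with pre-fitting (joint refit per step).
import numpy as np

def kernel(A, B, bw=0.3):
    # Gaussian kernel Gram block for 1-D inputs.
    return np.exp(-((A[:, None] - B[None, :]) ** 2) / (2 * bw ** 2))

X = np.linspace(-1, 1, 100)            # sample points
target = np.sin(4 * X)                 # function values to approximate (hypothetical)
cols = kernel(X, X)                    # candidate kernel columns, one per sample point

selected, residual = [], target.copy()
while np.linalg.norm(residual) > 0.1 * np.linalg.norm(target) and len(selected) < 20:
    # Greedy step: pick the candidate whose kernel column best matches the residual.
    scores = np.abs(cols.T @ residual) / np.linalg.norm(cols, axis=0)
    scores[selected] = -np.inf                      # never pick a center twice
    selected.append(int(np.argmax(scores)))
    # Pre-fitting: refit all weights jointly on the selected centers.
    Phi = kernel(X, X[np.array(selected)])
    w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    residual = target - Phi @ w

print(f"KOMP kept {len(selected)} of {len(X)} candidate centers")
```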