2016
DOI: 10.1137/140979861

Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization

Abstract: We consider the problem of sparse coding, where each sample consists of a sparse linear combination of a set of dictionary atoms, and the task is to learn both the dictionary elements and the mixing coefficients. Alternating minimization is a popular heuristic for sparse coding, where the dictionary and the coefficients are estimated in alternate steps, keeping the other fixed. Typically, the coefficients are estimated via ℓ1 minimization, keeping the dictionary fixed, and the dictionary is estimated through …
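
To make the alternating scheme described in the abstract concrete, here is a minimal sketch, assuming the standard formulation Y ≈ A·X with a d×n data matrix Y, an overcomplete d×r dictionary A (r > d), and sparse r×n coefficients X. The coefficient step approximates the ℓ1-regularized problem with a few ISTA (soft-thresholding) iterations, and the dictionary step is least squares followed by column renormalization. All names, step sizes, and thresholds below are illustrative assumptions, not the paper's exact procedure or initialization.

import numpy as np

def soft_threshold(z, t):
    # Entrywise soft-thresholding: the proximal operator of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_step(Y, A, lam, n_ista=50):
    # Approximate argmin_X 0.5*||Y - A X||_F^2 + lam*||X||_1 via ISTA.
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient
    X = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(n_ista):
        grad = A.T @ (A @ X - Y)
        X = soft_threshold(X - grad / L, lam / L)
    return X

def dictionary_step(Y, X, eps=1e-12):
    # Least-squares dictionary update, then renormalize atoms to unit norm.
    A = Y @ np.linalg.pinv(X)
    norms = np.maximum(np.linalg.norm(A, axis=0), eps)
    return A / norms

def alternating_minimization(Y, A0, lam=0.1, n_iters=30):
    # Alternate the two estimation steps, starting from an initial dictionary A0.
    A = A0.copy()
    for _ in range(n_iters):
        X = sparse_code_step(Y, A, lam)
        A = dictionary_step(Y, X)
    return A, X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, r, n, s = 20, 30, 500, 3               # dimensions and per-sample sparsity
    A_true = rng.standard_normal((d, r))
    A_true /= np.linalg.norm(A_true, axis=0)
    X_true = np.zeros((r, n))
    for j in range(n):                        # each sample mixes s random atoms
        idx = rng.choice(r, s, replace=False)
        X_true[idx, j] = rng.standard_normal(s)
    Y = A_true @ X_true
    A0 = A_true + 0.1 * rng.standard_normal((d, r))   # perturbed initialization
    A_hat, X_hat = alternating_minimization(Y, A0)
    print("relative residual:", np.linalg.norm(Y - A_hat @ X_hat) / np.linalg.norm(Y))

The perturbed initialization mirrors the kind of "good starting point" the paper's guarantees assume; from an arbitrary random dictionary the same loop can stall at a spurious stationary point.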

Cited by 81 publications (173 citation statements). References 33 publications.
“…The question of when a nonnegative polynomial has such a "certificate of nonnegativity" was studied by Hilbert, who realized this doesn't always hold and asked (as his 17th problem) whether a nonnegative polynomial is always a sum of squares of rational functions. 1 The book chapter [20] is a good source for several of the known upper and lower bounds, although it does not contain some of the more recent ones. 2 While it is common in the TCS community to use Lasserre to describe the primal version of this SDP and Sum-of-Squares (SOS) to describe the dual, in this paper we use the more descriptive SOS name for both programs.…”
Section: The Sum-of-Squares Hierarchy (mentioning)
confidence: 99%
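
As a concrete illustration of the certificate question raised in this excerpt (a standard textbook example, not taken from the cited works): the first polynomial below is nonnegative with an explicit sum-of-squares certificate, whereas the Motzkin polynomial m(x, y) is nonnegative (by the AM-GM inequality applied to x^4y^2, x^2y^4, and 1) yet is provably not a sum of squares of polynomials; by Artin's resolution of Hilbert's 17th problem it is nevertheless a sum of squares of rational functions.

\[
  x^2 - 2xy + 2y^2 \;=\; (x - y)^2 + y^2 \;\ge\; 0,
  \qquad
  m(x, y) \;=\; x^4 y^2 + x^2 y^4 - 3 x^2 y^2 + 1 \;\ge\; 0 .
\]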
“…There are several strong lower bounds (also known as integrality gaps) for these hierarchies, in particular showing that ω(1) levels (and often even n^{Ω(1)} or Ω(n) levels) of many such hierarchies can't improve by much on the known polynomial-time approximation guarantees for many NP-hard problems, including SAT, Independent-Set, Max-Cut and more [28,27,5,21,47,52,19,14,15]. Unfortunately, there are many fewer positive results, and many of them only show that these hierarchies can match the performance of previously known (and often more efficient) methods, or give algorithms that can be converted into something much more combinatorial, rather than using hierarchies to get genuinely new algorithmic results.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, there has been a resurgence of interest in methods based on alternating minimization, as numerous authors have shown that alternating minimization (suitably initialized, and under a few technical assumptions) provably converges to the global minimum for a range of problems including matrix completion [Kes12, JNS13, Har13], robust PCA [NNS+14], and dictionary learning [AAJN13].…”
Section: Gordon's Generalized (mentioning)
confidence: 99%
“…We then show that, despite being a nonconvex objective, all local minima are global minima, under minimal conditions. We avoid the need for careful initialization strategies needed for previous optimality results for sparse coding [Agarwal et al., 2014; Arora et al., 2015], using recent results for more general dictionary learning settings [Haeffele and Vidal, 2015; Le and White, 2017], particularly by extending beyond smooth regularizers using Γ-convergence. Using this insight, we provide a simple alternating proximal gradient algorithm and demonstrate the utility of learning supervised sparse coding representations versus unsupervised sparse coding and a variety of tile-coding representations.…”
Section: Introduction (mentioning)
confidence: 99%
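
The alternating proximal gradient algorithm mentioned in the last excerpt can be sketched generically as follows. This is a minimal illustration of the general technique (one proximal gradient step per block per pass), not the cited papers' exact method; the objective, step sizes, and unit-norm atom constraint are assumptions made for the sketch.

import numpy as np

def alt_prox_grad_pass(Y, A, X, lam=0.1):
    # One pass for  min_{A,X} 0.5*||Y - A X||_F^2 + lam*||X||_1,
    # with each dictionary atom constrained to the unit Euclidean ball.

    # Block X: gradient step on the smooth term, then the l1 proximal operator
    # (soft-thresholding), with step size 1/L_x.
    L_x = np.linalg.norm(A, 2) ** 2 + 1e-12
    X = X - (A.T @ (A @ X - Y)) / L_x
    X = np.sign(X) * np.maximum(np.abs(X) - lam / L_x, 0.0)

    # Block A: gradient step, then project each atom onto the unit ball
    # (the proximal operator of the norm-ball indicator function).
    L_a = np.linalg.norm(X, 2) ** 2 + 1e-12
    A = A - ((A @ X - Y) @ X.T) / L_a
    norms = np.maximum(np.linalg.norm(A, axis=0), 1.0)
    A = A / norms
    return A, X

Repeating alt_prox_grad_pass until the iterates stabilize gives the kind of simple alternating scheme the excerpt describes; a supervised variant would typically add a prediction-loss term to the smooth part of the objective.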