2012
DOI: 10.1137/100802001
Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems

Abstract: In this paper we propose new methods for solving huge-scale optimization problems. For problems of this size, even the simplest full-dimensional vector operations are very expensive. Hence, we propose to apply an optimization technique based on random partial update of decision variables. For these methods, we prove the global estimates for the rate of convergence. Surprisingly enough, for certain classes of objective functions, our results are better than the standard worst-case bounds for deterministic algorithms…
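The technique the abstract alludes to updates a single randomly chosen coordinate (or block) per iteration instead of the full decision vector. A minimal sketch of such a scheme, assuming a cheap per-coordinate gradient oracle grad_i and coordinatewise Lipschitz constants L (all names here are illustrative, not taken from the paper):

```python
import numpy as np

def random_coordinate_descent(grad_i, L, x0, iters, rng=None):
    """Minimize a smooth function by updating one random coordinate per step.

    grad_i(x, i) -> float : i-th partial derivative at x (a cheap oracle)
    L[i]                  : Lipschitz constant of the i-th partial derivative
    Each iteration touches a single entry of x, so no full-dimensional
    vector operation is ever performed.
    """
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    n = x.size
    for _ in range(iters):
        i = rng.integers(n)              # pick a coordinate uniformly at random
        x[i] -= grad_i(x, i) / L[i]      # 1/L_i gradient step on x_i only
    return x
```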

Cited by 1,012 publications (1,286 citation statements)
References 4 publications
“…For this particular choice (non-asymptotic) convergence rates were only recently derived in [2], although the convergence of the method was extensively studied in the literature under various assumptions [13,3]. Instead of using a deterministic cyclic order, randomized strategies were proposed in [14,12,16] for choosing a block to update at each iteration of the BCGD method. At iteration k, an index i_k is generated randomly according to the probability distribution vector p ∈ ∆_c.…”
Section: Randomized Block Coordinate Gradient Descent
mentioning, confidence: 99%
“…At iteration k, an index i_k is generated randomly according to the probability distribution vector p ∈ ∆_c. In [14] the distribution vector was chosen as…”
Section: Randomized Block Coordinate Gradient Descent
mentioning, confidence: 99%
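The formula itself is cut off in the excerpt; in [14] (the paper under review) the sampling probabilities are taken proportional to powers of the coordinatewise Lipschitz constants, p_α(i) = L_i^α / Σ_j L_j^α, with α = 0 recovering uniform sampling. A minimal sketch of such a sampler (the helper name is illustrative):

```python
import numpy as np

def block_sampler(L, alpha=1.0, rng=None):
    """Sample block indices with probability p_i proportional to L_i**alpha.

    alpha = 0 gives uniform sampling over the c blocks; alpha = 1 favors
    blocks with larger coordinatewise Lipschitz constants L_i.  The vector
    p lies in the simplex Delta_c, matching p in the excerpt above.
    """
    rng = rng or np.random.default_rng()
    p = np.asarray(L, dtype=float) ** alpha
    p /= p.sum()                         # normalize onto the simplex
    return lambda: int(rng.choice(len(p), p=p))

# e.g. sample = block_sampler([1.0, 4.0, 9.0], alpha=1.0); i_k = sample()
```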
“…Randomized coordinate descent has been shown to be competitive with the classical gradient descent method, in the sense that it requires less work per iteration, but a comparable number of iterations to converge [8]. In this section, we demonstrate that a similar property holds for asynchronous incremental block-coordinate descent: if the amount of work required to evaluate a partial gradient is proportional to its block size, then incremental block-coordinate descent can always be expected to be more efficient than a corresponding incremental gradient descent algorithm.…”
Section: Efficiency Comparison With Asynchronous Incremental Gradient…
mentioning, confidence: 99%
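The claim in this excerpt is easy to probe numerically: counting one partial-gradient evaluation as one unit of work (the proportional-to-block-size assumption above), full gradient descent spends n units per iteration while coordinate descent spends one. A toy comparison on a convex quadratic, purely illustrative and not the analysis from [8]:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T / n + np.eye(n)            # positive definite Hessian
b = rng.standard_normal(n)
L_full = np.linalg.eigvalsh(A)[-1]     # global Lipschitz constant of the gradient
L_coord = np.diag(A)                   # coordinatewise Lipschitz constants A_ii

def f(x):
    return 0.5 * x @ A @ x - b @ x

# Full gradient descent: n partial-gradient units of work per iteration.
x = np.zeros(n)
for _ in range(500):
    x -= (A @ x - b) / L_full
print("gradient descent  :", f"work={500 * n}", f"f={f(x):.6f}")

# Random coordinate descent: 1 unit per iteration, same total work budget.
x = np.zeros(n)
for _ in range(500 * n):
    i = rng.integers(n)
    x[i] -= (A[i] @ x - b[i]) / L_coord[i]
print("coordinate descent:", f"work={500 * n}", f"f={f(x):.6f}")
```

Under the matched work budget, the two methods typically reach objective values of the same order, which is the sense of "competitive" in the excerpt: cheaper iterations compensate for needing more of them.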