1967
DOI: 10.1090/s0025-5718-1967-0223073-7
On the relative efficiencies of gradient methods

Abstract: A comparison is made among various gradient methods for maximizing a function, based on a characterization by Crockett and Chernoff of the class of these methods. By defining the “efficiency” of a gradient step in a certain way, it becomes easy to compare the efficiencies of different schemes with that of Newton’s method, which can be regarded as a particular gradient scheme. For quadratic functions, it is shown that Newton’s method is the most efficient (a conclusion which may be approximately true for nonquadratic …)
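The comparison in the abstract can be made concrete with a small sketch (an illustration added here, not the paper's exact formalism): on a concave quadratic, a steepest-ascent step with exact line search gains less per iteration than a Newton step, which for a quadratic reaches the maximizer in a single move.

```python
# Minimal sketch: compare the per-step gain of a steepest-ascent step with a
# Newton step on the concave quadratic f(x) = -0.5 x'Ax + b'x. The matrix A
# and starting point are arbitrary illustrative choices.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
b = np.array([1.0, -1.0])

def f(x):
    return -0.5 * x @ A @ x + b @ x

def grad(x):
    return b - A @ x

x = np.array([2.0, 2.0])
g = grad(x)

# Steepest-ascent step with an exact line search along g.
t = (g @ g) / (g @ A @ g)
x_grad = x + t * g

# Newton step: for a quadratic it jumps straight to the maximizer A^{-1} b.
x_newton = x + np.linalg.solve(A, g)

print("gain, gradient step:", f(x_grad) - f(x))
print("gain, Newton step:  ", f(x_newton) - f(x))
```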

Cited by 115 publications (19 citation statements) · References 4 publications. The citation statements below are ordered by relevance.
“…In the numerical experiments given below, H_k = H(x_k, y_k) + E_k, where E_k is a positive-semidefinite modification chosen to ensure that the inertia of the regularized equations (5.5) is (n, m, 0). If the inertia is correct, then E_k = 0; otherwise E_k is defined implicitly by modifying the eigenvalues associated with the spectral decomposition of H(x_k, y_k) (see Greenstadt [31]). Other, more practical approaches include: (i) modifying an inertia-controlling factorization of the KKT matrix [19, 21]; (ii) using a positive-definite quasi-Newton approximation to H(x_k, y_k) [24, 29, 30, 44]; and (iii) adding increasing positive multiples of the identity matrix to H(x_k, y_k) until the inertia is correct [53].…”
Section: Primal-Dual SQP Methods (mentioning)
confidence: 99%
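As a rough illustration of the eigenvalue-modification idea referenced in that excerpt (a sketch under assumptions, not the implementation of the cited papers), E_k can be formed implicitly by taking the spectral decomposition of the Hessian and replacing negative or tiny eigenvalues; the threshold `delta` below is an assumed illustrative value.

```python
# Hedged sketch of a Greenstadt-style eigenvalue modification: decompose a
# symmetric (possibly indefinite) Hessian and replace eigenvalues by their
# absolute values, flooring tiny ones at `delta`, so the result is safely
# positive definite.
import numpy as np

def greenstadt_modify(H, delta=1e-8):
    """Return a positive-definite modification of symmetric H (i.e. H + E)."""
    w, V = np.linalg.eigh(H)                      # H = V diag(w) V^T
    w_mod = np.where(np.abs(w) < delta, delta, np.abs(w))
    return V @ np.diag(w_mod) @ V.T

H = np.array([[2.0, 0.0], [0.0, -1.0]])           # indefinite example
H_pd = greenstadt_modify(H)
print(np.linalg.eigvalsh(H_pd))                   # all eigenvalues now positive
```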
“…Other available search algorithms incorporate the derivatives of the objective function, such as the gradient descent or conjugate gradient method (Fletcher & Reeves, 1963; Hestenes & Stiefel, 1952; Hestenes, 1969) and Newton's method or quasi-Newton methods (Greenstadt, 1967; Spang, 1962). Although derivative-based methods are often faster and more reliable than direct-search algorithms, they are liable to terminate far from the true solution if the objective function is ill-conditioned (Corana, Marchesi, Martini, & Ridella, 1987).…”
Section: Selecting a Search Algorithm (mentioning)
confidence: 99%
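For a concrete illustration of that comparison (added here, not taken from the cited papers), SciPy's `minimize` can run a direct-search method and several derivative-based methods on the same ill-conditioned objective, such as the Rosenbrock function.

```python
# Compare a direct-search method (Nelder-Mead) with derivative-based methods
# on the Rosenbrock function, whose narrow curved valley makes it a standard
# example of ill-conditioning.
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x0 = np.array([-1.2, 1.0])
for method in ("Nelder-Mead", "CG", "BFGS", "Newton-CG"):
    jac = rosen_der if method != "Nelder-Mead" else None
    res = minimize(rosen, x0, method=method, jac=jac)
    print(f"{method:12s} iterations={res.nit:4d}  f={res.fun:.3e}")
```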
“…The tobit model is a truncated variable model with equation (2) replaced by Y_i = T_i if RHS_i < T_i, and requires that we know both which observations are truncated and the value of the threshold T_i for at least those truncated observations. In the censored model the actual value of the threshold will not generally be known for any observations. As in the tobit model, the threshold censoring results in a non-zero expectation of the disturbance term within the subset of non-censored observations, so that least squares will yield biased parameter estimates.…”
Section: If RHS (mentioning)
confidence: 99%
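To make the excerpt's point concrete, the sketch below writes out a standard left-censored Tobit log-likelihood with a known threshold T; the function name and parameterization are illustrative choices added here, not the cited paper's model.

```python
# Standard Tobit log-likelihood with known left-censoring threshold T:
# censored observations contribute a probability-mass term (which is why the
# threshold must be known), non-censored observations a normal density term.
import numpy as np
from scipy.stats import norm

def tobit_loglik(params, y, X, T=0.0):
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    mu = X @ beta
    censored = y <= T
    ll = np.where(
        censored,
        norm.logcdf((T - mu) / sigma),               # mass at the threshold
        norm.logpdf((y - mu) / sigma) - log_sigma,   # density for observed y
    )
    return ll.sum()
```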
“…First, the log likelihood is not concave over a wide range of the parameter space, so the matrix of second derivatives may not be negative definite, as is required for convergence of the Newton algorithm, at any arbitrary set of initial values for the coefficients. A modification to the Hessian matrix such as the one proposed by Greenstadt [2] thus proved necessary. Second, a pattern often observed in the iterative maximization was that the coefficients appeared to be moving in the right direction, but the steps taken were so large that eventually the maximum was overstepped, with the variance terms driven out of the parameter space, resulting in a failure of the procedure.…”
mentioning
confidence: 99%
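A minimal sketch combining the two remedies described in that excerpt, assuming a Greenstadt-style Hessian modification plus step halving to avoid overstepping the maximum; `loglik`, `grad`, and `hess` are assumed user-supplied callables, not part of the cited work.

```python
# One modified-Newton ascent step: force the Hessian to be negative definite
# (so the Newton direction is an ascent direction), then halve the step until
# the log likelihood actually increases.
import numpy as np

def modified_newton_step(theta, loglik, grad, hess, delta=1e-8, max_halvings=30):
    H = hess(theta)
    w, V = np.linalg.eigh(H)
    # For maximization we want a negative-definite Hessian approximation.
    w_mod = -np.where(np.abs(w) < delta, delta, np.abs(w))
    H_mod = V @ np.diag(w_mod) @ V.T
    direction = -np.linalg.solve(H_mod, grad(theta))
    f0 = loglik(theta)
    step = 1.0
    for _ in range(max_halvings):
        trial = theta + step * direction
        if loglik(trial) > f0:
            return trial
        step *= 0.5           # step too large: back off to avoid overstepping
    return theta
```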