We consider the problem of minimizing the sum of a smooth function h with a bounded Hessian and a nonsmooth function. We assume that the latter function is the composition of a proper closed function P and a surjective linear map M, with the proximal mappings of τP, τ > 0, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we study two types of splitting methods for solving this nonconvex optimization problem: the alternating direction method of multipliers and the proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that, if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then the cluster point is a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under the additional assumption that the functions h and P are semi-algebraic. Furthermore, we give simple sufficient conditions that guarantee boundedness of the sequence generated. These conditions are satisfied for a wide range of applications, including the least squares problem with ℓ_{1/2} regularization. Finally, when M is the identity, so that the proximal gradient algorithm can be applied efficiently, we show that any cluster point is stationary under a slightly more flexible constant step-size rule than what is known in the literature for nonconvex h.
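To make the splitting structure concrete, the following is a minimal sketch of a direct ADMM adaptation for this model, specialized to a least squares term h(x) = 0.5||Ax − b||² and, purely for illustration, P = λ||·||_1 (whose proximal mapping is soft-thresholding); the ℓ_{1/2} proximal mapping mentioned in the abstract would be substituted in its place. The names A, b, M, lam, and beta are placeholders, and the penalty parameter beta is assumed to be chosen sufficiently large, as the abstract requires.

```python
import numpy as np

def soft_threshold(v, t):
    """Prox of t*||.||_1, used here as a simple stand-in prox for P."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_sketch(A, b, M, lam, beta=10.0, max_iter=500, tol=1e-8):
    """ADMM sketch for min_x 0.5*||Ax - b||^2 + P(Mx), with y = Mx split off."""
    n = A.shape[1]
    p = M.shape[0]
    x = np.zeros(n)
    y = np.zeros(p)
    u = np.zeros(p)                        # scaled multiplier
    H = A.T @ A + beta * (M.T @ M)         # x-update is a linear solve for quadratic h
    for _ in range(max_iter):
        x = np.linalg.solve(H, A.T @ b + beta * M.T @ (y - u))
        y = soft_threshold(M @ x + u, lam / beta)   # prox of (1/beta)*P at Mx + u
        r = M @ x - y                       # primal residual
        u = u + r                           # multiplier update
        if np.linalg.norm(r) <= tol:
            break
    return x
```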
We adapt the Douglas-Rachford (DR) splitting method to solve nonconvex feasibility problems by studying this method for a class of nonconvex optimization problems. While the convergence properties of the method for convex problems have been well studied, far less is known in the nonconvex setting. In this paper, for the direct adaptation of the method to minimizing the sum of a proper closed function g and a smooth function f with a Lipschitz continuous gradient, we show that if the step-size parameter is smaller than a computable threshold and the sequence generated has a cluster point, then the cluster point is a stationary point of the optimization problem. Convergence of the whole sequence and a local convergence rate are also established under the additional assumption that f and g are semi-algebraic. We also give simple sufficient conditions guaranteeing boundedness of the sequence generated. We then apply our nonconvex DR splitting method to the problem of finding a point in the intersection of a closed convex set C and a general closed set D, by minimizing the squared distance to C over D. We show that if either set is bounded and the step-size parameter is smaller than a computable threshold, then the sequence generated by the DR splitting method is bounded. Consequently, the sequence has cluster points that are stationary for this optimization problem, and the whole sequence converges under the additional assumption that C and D are semi-algebraic. We achieve these results via a new merit function constructed specifically for the DR splitting method. Our preliminary numerical results indicate that the DR splitting method usually outperforms the alternating projection method in finding a sparse solution of a linear system, in terms of both solution quality and number of iterations.
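Below is a minimal sketch, under stated assumptions, of one common form of the DR iteration applied to the sparse-linear-system example from the abstract: f(x) = 0.5·dist(x, C)² with C = {x : Ax = b} (A assumed to have full row rank) and g the indicator of the nonconvex sparsity set D = {x : ||x||_0 ≤ s}. The step-size gamma is a placeholder and would need to satisfy the computable threshold from the paper; the update ordering shown is one standard variant, not necessarily the authors' exact scheme.

```python
import numpy as np

def project_affine(z, A, b):
    """Projection onto C = {x : Ax = b} (assumes A has full row rank)."""
    return z - A.T @ np.linalg.solve(A @ A.T, A @ z - b)

def project_sparse(z, s):
    """Projection onto D = {x : ||x||_0 <= s}: keep the s largest-magnitude entries."""
    out = np.zeros_like(z)
    idx = np.argsort(np.abs(z))[-s:]
    out[idx] = z[idx]
    return out

def dr_sparse_solution(A, b, s, gamma=0.5, max_iter=1000, tol=1e-10):
    """DR sketch for min_x 0.5*dist(x, C)^2 + indicator_D(x)."""
    n = A.shape[1]
    z = np.zeros(n)
    x = np.zeros(n)
    for _ in range(max_iter):
        # prox of gamma*0.5*dist(., C)^2 has this closed form for convex C
        y = (z + gamma * project_affine(z, A, b)) / (1.0 + gamma)
        x = project_sparse(2.0 * y - z, s)      # prox of the indicator of D
        z_new = z + (x - y)
        if np.linalg.norm(z_new - z) <= tol:
            z = z_new
            break
        z = z_new
    return x
```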
In this paper, we study the Kurdyka-Lojasiewicz (KL) exponent, an important quantity for analyzing the convergence rate of first-order methods. Specifically, we develop various calculus rules to deduce the KL exponent of new (possibly nonconvex and nonsmooth) functions formed from functions with known KL exponents. In addition, we show that the well-studied Luo-Tseng error bound, together with a mild assumption on the separation of stationary values, implies that the KL exponent is 1/2. The Luo-Tseng error bound is known to hold for a large class of concrete structured optimization problems, and thus we deduce the KL exponent of a large class of functions whose exponents were previously unknown. Building upon this and the calculus rules, we are then able to show that for many convex or nonconvex optimization models arising in applications such as sparse recovery, the objective function's KL exponent is 1/2. This includes the least squares problem with smoothly clipped absolute deviation (SCAD) regularization or minimax concave penalty (MCP) regularization, and the logistic regression problem with ℓ_1 regularization. Since many existing local convergence rate analyses for first-order methods in the nonconvex setting rely on the KL exponent, our results enable us to obtain explicit convergence rates for various first-order methods when they are applied to a large variety of practical optimization models. Finally, we further illustrate how our results can be applied to establish local linear convergence of the proximal gradient algorithm and the inertial proximal algorithm with constant step-sizes for some specific models that arise in sparse recovery.
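As an illustration of one model covered by these results, the following is a minimal sketch of the proximal gradient algorithm with a constant step size for ℓ_1-regularized logistic regression, a problem whose objective has KL exponent 1/2 according to the abstract, so that local linear convergence of this iteration is expected. The names A, y, and lam are placeholders; the step size 1/L uses the standard Lipschitz bound for the logistic loss gradient.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def prox_l1(v, t):
    """Proximal mapping of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad_logreg(A, y, lam, max_iter=2000, tol=1e-10):
    """Proximal gradient with constant step size for
       min_x  sum_i log(1 + exp(-y_i * a_i^T x)) + lam*||x||_1,
       with labels y_i in {-1, +1}."""
    m, n = A.shape
    L = 0.25 * np.linalg.norm(A, 2) ** 2   # Lipschitz constant bound for the logistic loss gradient
    step = 1.0 / L
    x = np.zeros(n)
    for _ in range(max_iter):
        margins = y * (A @ x)
        grad = -A.T @ (y * sigmoid(-margins))   # gradient of the smooth logistic loss
        x_new = prox_l1(x - step * grad, step * lam)
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x
```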
The trust-region problem, which minimizes a nonconvex quadratic function over a ball, is a key subproblem in trust-region methods for solving nonlinear optimization problems. It enjoys many attractive properties, such as an exact semi-definite linear programming relaxation (SDP-relaxation) and strong duality. Unfortunately, such properties do not, in general, hold for an extended trust-region problem having extra linear constraints. This paper shows that two useful and powerful features of the classical trust-region problem continue to hold for an extended trust-region problem with linear inequality constraints under a new dimension condition. First, we establish that this class of extended trust-region problems admits an exact SDP-relaxation, which holds without the Slater constraint qualification. This is achieved by proving that a system of quadratic and affine functions involved in the model satisfies a range-convexity property whenever the dimension condition is fulfilled. Second, we show that the dimension condition, together with the Slater condition, ensures that a set of combined first- and second-order Lagrange multiplier conditions is necessary and sufficient for global optimality of the extended trust-region problem, and consequently for strong duality. Through simple examples we also provide an insightful account of our development from SDP-relaxation to strong duality. Finally, we show that the dimension condition is easily satisfied for the extended trust-region model that arises from the reformulation of a robust least squares problem (LSP), as well as a robust second-order cone programming problem (SOCP), as an equivalent semi-definite linear programming problem. This leads us to conclude that, under mild assumptions, solving a robust LSP or SOCP under matrix-norm uncertainty or polyhedral uncertainty is equivalent to solving a semi-definite linear programming problem, and so their solutions can be validated in polynomial time.
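For reference, the following is a minimal sketch of how the classical trust-region subproblem (no extra linear constraints) min 0.5·xᵀQx + cᵀx subject to ||x|| ≤ r can be solved via an eigendecomposition of Q and bisection on the Lagrange multiplier; it does not implement the paper's SDP-relaxation of the extended problem, and the degenerate "hard case" is deliberately omitted from this sketch.

```python
import numpy as np

def trust_region_subproblem(Q, c, r, tol=1e-10, max_bisect=200):
    """Classical trust-region subproblem: min 0.5*x^T Q x + c^T x  s.t. ||x|| <= r.
       Eigendecomposition plus bisection on the multiplier; hard case not handled."""
    w, V = np.linalg.eigh(Q)               # Q = V diag(w) V^T, w sorted ascending
    ct = V.T @ c

    def x_of(lam):
        # Stationary point of the shifted system (Q + lam*I) x = -c
        return -V @ (ct / (w + lam))

    # Interior solution works only if Q is positive definite and the norm fits.
    if w[0] > 0:
        x = x_of(0.0)
        if np.linalg.norm(x) <= r:
            return x

    # Otherwise the ball constraint is active: find lam > max(0, -w_min)
    # with ||x(lam)|| = r by bisection (||x(lam)|| decreases in lam).
    lo = max(0.0, -w[0]) + 1e-12
    hi = lo + 1.0
    while np.linalg.norm(x_of(hi)) > r:
        hi *= 2.0
    for _ in range(max_bisect):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > r:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol:
            break
    return x_of(hi)
```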