Abstract. In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) inequality is actually weaker than the main conditions that have been explored to show linear convergence rates without strong convexity over the last 25 years. We also use the PL inequality to give new analyses of randomized and greedy coordinate descent methods, sign-based gradient descent methods, and stochastic gradient methods in the classic setting (with decreasing or constant step-sizes) as well as the variance-reduced setting. We further propose a generalization that applies to proximal-gradient methods for non-smooth optimization, leading to simple proofs of linear convergence of these methods. Along the way, we give simple convergence results for a wide variety of problems in machine learning: least squares, logistic regression, boosting, resilient backpropagation, L1-regularization, support vector machines, stochastic dual coordinate ascent, and stochastic variance-reduced gradient methods.
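As a rough illustration of the claim above, the following sketch runs gradient descent on a rank-deficient least-squares problem, which is not strongly convex but does satisfy the PL inequality (1/2)||∇f(x)||² ≥ μ (f(x) − f*). It compares the observed optimality gap with the standard bound f(x_k) − f* ≤ (1 − μ/L)^k (f(x_0) − f*) for step size 1/L. The function names, the random test problem, and the choice of μ as the smallest nonzero eigenvalue of AᵀA are illustrative assumptions, not code from the paper.

```python
import numpy as np


def least_squares_constants(A, b, rel_tol=1e-10):
    """Return (L, mu, f_star) for f(x) = 0.5 * ||Ax - b||^2.

    L is the largest eigenvalue of A^T A, mu the smallest nonzero one
    (assumed here to serve as the PL constant), and f_star the optimal value.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    keep = s > rel_tol * s[0]           # numerically nonzero singular values
    L = float(s[0] ** 2)
    mu = float(np.min(s[keep]) ** 2)
    proj = U[:, keep] @ (U[:, keep].T @ b)   # projection of b onto range(A)
    f_star = 0.5 * float(np.sum((b - proj) ** 2))
    return L, mu, f_star


def gradient_descent_gaps(A, b, steps=100):
    """Run gradient descent with step 1/L and record f(x_k) - f_star."""
    L, mu, f_star = least_squares_constants(A, b)
    x = np.zeros(A.shape[1])
    gaps = []
    for _ in range(steps):
        r = A @ x - b
        gaps.append(0.5 * float(r @ r) - f_star)
        x = x - (A.T @ r) / L           # gradient of f is A^T (Ax - b)
    return np.array(gaps), L, mu


def demo(seed=0):
    """Hypothetical test problem: a 20 x 10 matrix of rank 5."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 10))
    b = rng.standard_normal(20)
    gaps, L, mu = gradient_descent_gaps(A, b)
    bound = gaps[0] * (1.0 - mu / L) ** np.arange(len(gaps))
    return gaps, bound
```

Calling `demo()` returns the observed gaps and the PL bound as arrays, which can be compared elementwise or plotted on a log scale to see the linear rate.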
Recently much progress has been made in applying field theory methods, first developed to study X-ray edge singularities, to interacting one-dimensional systems in order to include band curvature effects and study edge singularities at arbitrary momentum. Finding experimental confirmations of this theory remains an open challenge. Here we point out that spin chains with uniform Dzyaloshinskii-Moriya (DM) interactions provide an opportunity to test these theories, since these interactions may be exactly eliminated by a gauge transformation which shifts the momentum. However, this requires an extension of these X-ray edge methods to the transverse spectral function of the XXZ spin chain in a magnetic field, which we provide.
By use of the above unitary transformations we have Eq. (1.7), where the new exchange coupling and anisotropy parameters are given by [equations omitted in source]. The electron spin resonance (ESR) absorption intensity, in standard Faraday configuration, is proportional to the transverse spectral function at q = 0, since the wave-vector of microwave photons is much less than the inverse lattice spacing. After the gauge transformation, the ESR intensity is therefore proportional to S^{+-} and S^{-+}, for the Hamiltonian of Eq. (1.1) at q = α. [By using circularly polarized microwave radiation both S^{+-}(α, ω) and S^{-+}(α, ω) could be measured separately.] Thus the edge singularities predicted by X-ray edge methods at a nonzero wave-vector α given by Eq. (1.6) are directly measured by ESR. ESR on spin chain compounds with uniform DM interactions therefore would provide a powerful probe of the new bosonization predictions. Quasi-1D spin-1/2 antiferromagnetic insulators containing DM interactions with a uniform component include Cs2CuCl4 [19,20] and KCuGaF6 [21].
This provides a strong motivation to extend the X-ray edge methods to study edge singularities in the transverse spectral functions of the XXZ chain in a magnetic field. In the next section we review results on the transverse spectral function using standard bosonization and then show that band curvature effects (in the equivalent fermion model) render these results invalid close to edge singularities. In Sec. III we apply X-ray edge methods to the model, obtaining new results on the leading edge singularities. In Sec. IV sub-dominant singularities are discussed. Sec. V discusses ESR with uniform DM interactions, based partly on the results of Sec. III. Sec. VI contains conclusions and open questions.
We propose a novel method for reducing the number of variables in quadratic unconstrained binary optimization problems, using a quantum annealer (or any sampler) to fix the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are usually much easier for the quantum annealer to solve, due to their being smaller and consisting of disconnected components. This approach significantly increases the success rate and number of observations of the best known energy value in samples obtained from the quantum annealer, when compared with calling the quantum annealer without the method, even when using fewer annealing cycles. Use of the method results in a considerable improvement in success metrics even for problems with high-precision couplers and biases, which are more challenging for the quantum annealer to solve. The results are further enhanced by applying the method iteratively and combining it with classical pre-processing. We present results for both Chimera graph-structured problems and embedded problems from a real-world application.
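The variable-fixing idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not specify the selection rule, so the sketch assumes that the lowest-energy fraction of samples is kept and that a variable is fixed when those samples agree on its value. The names `fix_persistent_variables`, `reduce_qubo`, `elite_fraction`, and `agreement` are hypothetical, and the QUBO energy is taken to be E(x) = xᵀQx with binary x.

```python
import numpy as np


def qubo_energy(Q, x):
    """Energy x^T Q x of a binary vector x."""
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)


def fix_persistent_variables(samples, energies, elite_fraction=0.1, agreement=1.0):
    """Pick variables whose value agrees across the lowest-energy samples.

    Assumed rule (hypothetical): keep the best `elite_fraction` of samples and
    fix variable i to 1 if its mean is >= agreement, or to 0 if its mean is
    <= 1 - agreement. Returns a dict {variable index: fixed value}.
    """
    samples = np.asarray(samples)
    energies = np.asarray(energies)
    n_elite = max(1, int(np.ceil(elite_fraction * len(samples))))
    elite = samples[np.argsort(energies)[:n_elite]]
    means = elite.mean(axis=0)
    fixed = {}
    for i, m in enumerate(means):
        if m >= agreement:
            fixed[i] = 1
        elif m <= 1.0 - agreement:
            fixed[i] = 0
    return fixed


def reduce_qubo(Q, fixed):
    """Substitute fixed values into Q, returning (Q_reduced, free_indices, offset).

    For binary x, the terms coupling free and fixed variables are linear in the
    free variables, so they are folded onto the diagonal of the reduced matrix.
    """
    n = Q.shape[0]
    free = [i for i in range(n) if i not in fixed]
    fixed_idx = list(fixed)
    v = np.array([fixed[i] for i in fixed_idx], dtype=float)
    Q_reduced = Q[np.ix_(free, free)].astype(float)
    linear = (Q[np.ix_(free, fixed_idx)] + Q[np.ix_(fixed_idx, free)].T) @ v
    Q_reduced[np.diag_indices_from(Q_reduced)] += linear
    offset = float(v @ Q[np.ix_(fixed_idx, fixed_idx)] @ v)
    return Q_reduced, free, offset
```

The reduced problem would then be passed back to the sampler, and a solution of the full problem is recovered by combining the sampler's values for `free_indices` with the fixed values and adding `offset` to the reduced energy.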
We present and apply a general-purpose, multi-start algorithm for improving the performance of low-energy samplers used for solving optimization problems. The algorithm iteratively fixes the value of a large portion of the variables to values that have a high probability of being optimal. The resulting problems are smaller and less connected, and samplers tend to give better low-energy samples for these problems. The algorithm is trivially parallelizable, since each start in the multi-start algorithm is independent, and could be applied to any heuristic solver that can be run multiple times to give a sample. We present results for several classes of hard problems solved using simulated annealing, path-integral quantum Monte Carlo, parallel tempering with isoenergetic cluster moves, and a quantum annealer, and show that the success metrics and the scaling are improved substantially. When combined with this algorithm, the quantum annealer's scaling was substantially improved for native Chimera graph problems. In addition, with this algorithm the scaling of the time to solution of the quantum annealer is comparable to the Hamze-de Freitas-Selby algorithm on the weak-strong cluster problems introduced by Boixo et al. Parallel tempering with isoenergetic cluster moves was able to consistently solve 3D spin glass problems with 8000 variables when combined with our method, whereas without our method it could not solve any.