We formulate the metric learning problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the Mahalanobis distance function. Via a surprising equivalence, we show that this problem can be solved as a low-rank kernel learning problem. Specifically, we minimize the Burg divergence of a low-rank kernel to an input kernel, subject to pairwise distance constraints. Our approach has several advantages over existing methods. First, we present a natural information-theoretic formulation for the problem. Second, the algorithm utilizes the methods developed by Kulis et al. [6], which do not involve any eigenvector computation; in particular, our method runs faster than most existing techniques. Third, the formulation offers insights into connections between metric learning and kernel learning.
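For reference, the Burg (LogDet) divergence between a learned kernel K and an input kernel K0, which the formulation above minimizes subject to distance constraints, is tr(K K0^{-1}) - log det(K K0^{-1}) - n. A minimal numpy sketch of this quantity (the function name is illustrative, not from the paper):

```python
import numpy as np

def burg_divergence(K, K0):
    """Burg (LogDet) divergence D(K, K0) = tr(K K0^{-1}) - log det(K K0^{-1}) - n
    for symmetric positive-definite matrices K and K0."""
    n = K.shape[0]
    M = K @ np.linalg.inv(K0)
    _, logdet = np.linalg.slogdet(M)
    return np.trace(M) - logdet - n
```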
Alternating minimization is a widely applicable and empirically successful approach for finding low-rank matrices that best fit the given data. For example, for the problem of low-rank matrix completion, this method is believed to be one of the most accurate and efficient, and it formed a major component of the winning entry in the Netflix Challenge [17]. In the alternating minimization approach, the low-rank target matrix is written in bilinear form, i.e., X = UV^†; the algorithm then alternates between finding the best U and the best V. Typically, each alternating step in isolation is convex and tractable. However, the overall problem is non-convex and prone to local minima. In fact, there has been almost no theoretical understanding of when this approach yields a good result.

In this paper we present one of the first theoretical analyses of the performance of alternating minimization for matrix completion and the related problem of matrix sensing. For both of these problems, celebrated recent results have shown that they become well-posed and tractable once certain (now standard) conditions are imposed on the problem. We show that alternating minimization also succeeds under similar conditions. Moreover, compared to existing results, our paper shows that alternating minimization guarantees faster (in particular, geometric) convergence to the true matrix, while allowing a significantly simpler analysis.
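As a concrete, generic sketch of the alternating least-squares scheme described above (not the authors' exact procedure): each row of U is fit by least squares against the observed entries with V fixed, and vice versa. Initialization and stopping criteria are simplified, and all names are illustrative.

```python
import numpy as np

def alt_min_completion(M_obs, mask, k, n_iters=50):
    """Alternating minimization for matrix completion with target rank k.
    M_obs: matrix with observed entries (arbitrary values elsewhere);
    mask: boolean matrix marking the observed positions."""
    m, n = M_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((m, k))
    V = rng.standard_normal((n, k))
    for _ in range(n_iters):
        # Fix V: least-squares update of each row of U over its observed entries.
        for i in range(m):
            obs = mask[i]
            if obs.any():
                U[i] = np.linalg.lstsq(V[obs], M_obs[i, obs], rcond=None)[0]
        # Fix U: symmetric least-squares update of each row of V.
        for j in range(n):
            obs = mask[:, j]
            if obs.any():
                V[j] = np.linalg.lstsq(U[obs], M_obs[obs, j], rcond=None)[0]
    return U, V  # the completed matrix estimate is U @ V.T
```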
Phase retrieval problems involve solving linear equations, but with missing sign (or phase, for complex numbers) information. More than four decades after it was first proposed, the seminal error reduction algorithm of Gerchberg and Saxton [21] and Fienup [19] is still the popular choice for solving many variants of this problem. The algorithm is based on alternating minimization; i.e., it alternates between estimating the missing phase information and the candidate solution. Despite its wide usage in practice, no global convergence guarantees for this algorithm are known. In this paper, we show that a (resampling) variant of this approach converges geometrically to the solution of one such problem: finding a vector x from y and A, where y = |A^T x| and |z| denotes the vector of element-wise magnitudes of z, under the assumption that A is Gaussian.

Empirically, we demonstrate that alternating minimization performs similarly to recently proposed convex techniques for this problem (which are based on "lifting" to a convex matrix problem) in sample complexity and robustness to noise. However, it is much more efficient and can scale to large problems. Analytically, for a resampling version of alternating minimization, we show geometric convergence to the solution, and sample complexity that is off by only log factors from obvious lower bounds. We also establish close-to-optimal scaling for the case when the unknown vector is sparse. Our work represents the first theoretical guarantee for alternating minimization (albeit with resampling) for any variant of phase retrieval problems in the non-convex setting.
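The alternating scheme for y = |A^T x| can be sketched as follows. This is a simplified, non-resampled version with a random start (the analysis in the paper uses resampling and a spectral initialization), and all names are illustrative.

```python
import numpy as np

def altmin_phase(A, y, n_iters=100, seed=0):
    """A: n x m sensing matrix; y = |A.T @ x| for an unknown real vector x (m >= n)."""
    n, m = A.shape
    x = np.random.default_rng(seed).standard_normal(n)    # random start; the paper uses a spectral initialization
    for _ in range(n_iters):
        signs = np.sign(A.T @ x)                            # estimate the missing sign/phase information
        x = np.linalg.lstsq(A.T, signs * y, rcond=None)[0]  # least-squares update of the candidate solution
    return x
```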
PennyLane is a Python 3 software framework for optimization and machine learning of quantum and hybrid quantum-classical computations. The library provides a unified architecture for near-term quantum computing devices, supporting both qubit and continuous-variable paradigms. PennyLane's core feature is the ability to compute gradients of variational quantum circuits in a way that is compatible with classical techniques such as backpropagation. PennyLane thus extends the automatic differentiation algorithms common in optimization and machine learning to include quantum and hybrid computations. A plugin system makes the framework compatible with any gate-based quantum simulator or hardware. We provide plugins for Strawberry Fields, Rigetti Forest, Qiskit, and ProjectQ, allowing PennyLane optimizations to be run on publicly accessible quantum devices provided by Rigetti and IBM Q. On the classical front, PennyLane interfaces with accelerated machine learning libraries such as TensorFlow, PyTorch, and autograd. PennyLane can be used for the optimization of variational quantum eigensolvers, quantum approximate optimization, quantum machine learning models, and many other applications.
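A minimal, illustrative example of the workflow described above, using PennyLane's built-in simulator; the circuit, parameter values, and optimizer settings are arbitrary and chosen only to show the gradient and optimization interface.

```python
import pennylane as qml
from pennylane import numpy as np

# "default.qubit" is PennyLane's built-in simulator; a hardware plugin device could be used instead.
dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def circuit(params):
    qml.RX(params[0], wires=0)
    qml.RY(params[1], wires=0)
    return qml.expval(qml.PauliZ(0))

params = np.array([0.1, 0.2], requires_grad=True)
print(qml.grad(circuit)(params))            # gradient of the quantum circuit w.r.t. its parameters

opt = qml.GradientDescentOptimizer(stepsize=0.4)
for _ in range(100):
    params = opt.step(circuit, params)      # hybrid loop: classical optimizer, quantum evaluations
```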
We consider the problem of sparse coding, where each sample consists of a sparse linear combination of a set of dictionary atoms, and the task is to learn both the dictionary elements and the mixing coefficients. Alternating minimization is a popular heuristic for sparse coding, where the dictionary and the coefficients are estimated in alternate steps, keeping the other fixed. Typically, the coefficients are estimated via ℓ1 minimization, keeping the dictionary fixed, and the dictionary is estimated through least squares, keeping the coefficients fixed. In this paper, we establish local linear convergence for this variant of alternating minimization and show that the basin of attraction for the global optimum (corresponding to the true dictionary and the coefficients) is O(1/s^2), where s is the sparsity level in each sample and the dictionary satisfies the RIP. Combined with recent results on approximate dictionary estimation, this yields provable guarantees for exact recovery of both the dictionary elements and the coefficients, when the dictionary elements are incoherent.

The problem of sparse coding consists of unsupervised learning of the dictionary and the coefficient matrices. Thus, given only unlabeled data, we aim to learn the set of dictionary atoms or basis functions that provide a good fit to the observed data. Sparse coding is applied in a variety of domains. Sparse coding of natural images has yielded dictionary atoms which resemble the receptive fields of neurons in the visual cortex [26,27], and has also yielded localized dictionary elements on speech and video data [19,25].

An important strength of sparse coding is that it can incorporate overcomplete dictionaries, where the number of dictionary atoms r can exceed the observed dimensionality d. It has been argued that an overcomplete representation provides greater flexibility in modeling and more robustness to noise [19], which is crucial for encoding complex signals present in images, speech, and video. It has been shown that the performance of most machine learning methods employed downstream is critically dependent on the choice of data representation, and overcomplete representations are key to obtaining state-of-the-art prediction results [6].

On the downside, the problem of learning sparse codes is computationally challenging and is, in general, NP-hard [9]. In practice, heuristics based on alternating minimization are employed. At a high level, this consists of alternating steps, where the dictionary is kept fixed and the coefficients are updated, and vice versa. Such alternating minimization methods have enjoyed empirical success in a number of settings [18,10,2,20,35]. In this paper, we carry out a theoretical analysis of the alternating minimization procedure for sparse coding.
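A generic sketch of this alternating procedure (not the paper's exact algorithm): the coefficient step is an ℓ1-regularized regression with the dictionary fixed, and the dictionary step is a least-squares update with the coefficients fixed. scikit-learn's Lasso is used here for convenience, and the regularization strength, initialization, and iteration counts are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso

def alt_min_sparse_coding(Y, r, n_iters=20, alpha=0.1):
    """Y: d x N data matrix; r: number of dictionary atoms (possibly r > d, i.e., overcomplete)."""
    d, N = Y.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((d, r))
    D /= np.linalg.norm(D, axis=0)                      # unit-norm dictionary atoms
    X = np.zeros((r, N))
    for _ in range(n_iters):
        # Coefficient step: l1-regularized fit of all samples against the fixed dictionary.
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(D, Y)
        X = lasso.coef_.T                               # r x N coefficient matrix
        # Dictionary step: least-squares update with the coefficients fixed.
        D = Y @ X.T @ np.linalg.pinv(X @ X.T)
        D /= np.linalg.norm(D, axis=0) + 1e-12          # renormalize the atoms
    return D, X
```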