Learning Unitary Operators with Help From u(n)

Hyland, Stephanie L.; Rätsch, Gunnar

doi:10.1609/aaai.v31i1.10928

Cited by 10 publications

(4 citation statements)

References 9 publications


“…For example, recurrent neural networks may suffer from the exploding gradient problem. This can be prevented by constraining the operations to be unitary and much work has been done to efficiently parameterize the unitary group [114,115]. PQC models have the advantage of naturally implementing unitary operations on an exponentially large vector space.…”

Section: Reference (mentioning)

confidence: 99%

Parameterized quantum circuits as machine learning models

Benedetti, Lloyd, Sack, et al. 2019. Quantum Sci. Technol.


Hybrid quantum-classical systems make it possible to utilize existing quantum computers to their fullest extent. Within this framework, parameterized quantum circuits can be thought of as machine learning models with remarkable expressive power. This Review presents components of these models and discusses their application to a variety of data-driven tasks such as supervised learning and generative modeling. With experimental demonstrations carried out on actual quantum hardware, and with software actively being developed, this rapidly growing field could become one of the first instances of quantum computing that addresses real world problems.


“…Various algorithms have then been developed to train orthogonal RNNs. Typically, these works propose parametrizations of the orthogonal/unitary group that lead to computationally cheap operations, see (Arjovsky, Shah, and Bengio 2016; Wisdom et al 2016; Jing et al 2017; Hyland and Rätsch 2017; Mhammedi et al 2017; Helfrich, Willmott, and Ye 2018; Lezcano-Casado and Martínez-Rubio 2019; Maduranga, Helfrich, and Ye 2019). However, most of these works are focused on algorithmic contributions and do not contain convergence analyses.…”

Section: Related Work (mentioning)

confidence: 99%

Coordinate Descent on the Orthogonal Group for Recurrent Neural Network Training

Massart, Abrol. 2022. AAAI


We address the poor scalability of learning algorithms for orthogonal recurrent neural networks via the use of stochastic coordinate descent on the orthogonal group, leading to a cost per iteration that increases linearly with the number of recurrent states. This contrasts with the cubic dependency of typical feasible algorithms such as stochastic Riemannian gradient descent, which prohibits the use of big network architectures. Coordinate descent rotates successively two columns of the recurrent matrix. When the coordinate (i.e., indices of rotated columns) is selected uniformly at random at each iteration, we prove convergence of the algorithm under standard assumptions on the loss function, stepsize and minibatch noise. In addition, we numerically show that the Riemannian gradient has an approximately sparse structure. Leveraging this observation, we propose a variant of our proposed algorithm that relies on the Gauss-Southwell coordinate selection rule. Experiments on a benchmark recurrent neural network training problem show that the proposed approach is a very promising step towards the training of orthogonal recurrent neural networks with big architectures.
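The column rotation described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits coordinate selection, step sizes, and gradients, and only shows that rotating two columns of an orthogonal matrix costs time linear in the number of rows and leaves the matrix orthogonal. The names `rotate_columns` and `orthogonality_error` are hypothetical and chosen for illustration.

```python
import math


def rotate_columns(W, i, j, theta):
    """Rotate columns i and j of W (a list of rows) in place by angle theta.

    Equivalent to right-multiplying W by a Givens rotation; touches only
    two entries per row, so the cost is linear in the number of rows.
    """
    c, s = math.cos(theta), math.sin(theta)
    for row in W:
        a, b = row[i], row[j]
        row[i] = c * a - s * b
        row[j] = s * a + c * b
    return W


def orthogonality_error(W):
    """Return the largest absolute entry of W^T W - I."""
    n = len(W[0])
    err = 0.0
    for p in range(n):
        for q in range(n):
            dot = sum(row[p] * row[q] for row in W)
            target = 1.0 if p == q else 0.0
            err = max(err, abs(dot - target))
    return err


# Start from the 4x4 identity (trivially orthogonal) and apply two rotations.
W = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
rotate_columns(W, 0, 2, 0.3)
rotate_columns(W, 1, 3, -1.1)
print(orthogonality_error(W))  # stays at floating-point rounding level
```

In a training loop, the angle would be derived from the loss gradient and the column pair from a selection rule such as the uniform or Gauss-Southwell rules the abstract mentions; both are left out of this sketch.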


“…During the development of neural networks, orthogonality was first shown to be useful in mitigating the vanishing or exploding gradients problem (Bengio, Simard, and Frasconi 1994), especially on recurrent neural networks (RNNs) (Pascanu, Mikolov, and Bengio 2013; Le, Jaitly, and Hinton 2015; Wisdom et al 2016; Arjovsky, Shah, and Bengio 2016; Jing et al 2017; Hyland and Rätsch 2017; Vorontsov et al 2017; Helfrich and Ye 2020). To improve the efficiency of the optimization algorithms with orthogonality, many techniques have been utilized, e.g., householder reflections (Mhammedi et al 2017), Cayley transform (Helfrich, Willmott, and Ye 2018; Maduranga, Helfrich, and Ye 2019), and exponential-map-based parameterization (Lezcano Casado 2019; Lezcano-Casado and Martínez-Rubio 2019).…”

Section: Related Work (mentioning)

confidence: 99%

Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs

Chang. 2022. AAAI


The optimization with orthogonality has been shown useful in training deep neural networks (DNNs). To impose orthogonality on DNNs, both computational efficiency and stability are important. However, existing methods utilizing Riemannian optimization or hard constraints can only ensure stability while those using soft constraints can only improve efficiency. In this paper, we propose a novel method, named Feedback Gradient Descent (FGD), to our knowledge, the first work showing high efficiency and stability simultaneously. FGD induces orthogonality based on the simple yet indispensable Euler discretization of a continuous-time dynamical system on the tangent bundle of the Stiefel manifold. In particular, inspired by a numerical integration method on manifolds called Feedback Integrators, we propose to instantiate it on the tangent bundle of the Stiefel manifold for the first time. In our extensive image classification experiments, FGD comprehensively outperforms the existing state-of-the-art methods in terms of accuracy, efficiency, and stability.
