2017
DOI: 10.1609/aaai.v31i1.10928
Learning Unitary Operators with Help From u(n)

Abstract: A major challenge in the training of recurrent neural networks is the so-called vanishing or exploding gradient problem. The use of a norm-preserving transition operator can address this issue, but parametrization is challenging. In this work we focus on unitary operators and describe a parametrization using the Lie algebra u(n) associated with the Lie group U(n) of n × n unitary matrices. The exponential map provides a correspondence between these spaces, and allows us to define a unitary matrix using n² real…
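The parametrization described in the abstract can be illustrated with a short sketch: an n × n skew-Hermitian matrix H (an element of the Lie algebra u(n)) is assembled from n² real parameters and mapped to a unitary matrix through the matrix exponential. The parameter layout and function names below are our own assumptions for illustration; this is not the authors' exact basis construction or training code.

```python
# Sketch: parametrize a unitary matrix via the Lie algebra u(n).
# u(n) is the set of n x n skew-Hermitian matrices (H^dagger = -H), and the
# matrix exponential maps u(n) into the unitary group U(n).
import numpy as np
from scipy.linalg import expm

def skew_hermitian_from_params(theta: np.ndarray, n: int) -> np.ndarray:
    """Build a skew-Hermitian matrix from n^2 real parameters.

    Assumed layout for this sketch:
      - n values        -> purely imaginary diagonal entries
      - n(n-1)/2 values -> real parts of the strictly upper triangle
      - n(n-1)/2 values -> imaginary parts of the strictly upper triangle
    """
    assert theta.size == n * n
    diag = theta[:n]
    off = n * (n - 1) // 2
    re, im = theta[n:n + off], theta[n + off:]

    H = np.zeros((n, n), dtype=complex)
    iu = np.triu_indices(n, k=1)
    H[iu] = re + 1j * im
    H = H - H.conj().T              # enforce H^dagger = -H off the diagonal
    H[np.diag_indices(n)] = 1j * diag
    return H

def unitary_from_params(theta: np.ndarray, n: int) -> np.ndarray:
    """Exponential map u(n) -> U(n): exp(H) is unitary when H is skew-Hermitian."""
    return expm(skew_hermitian_from_params(theta, n))

if __name__ == "__main__":
    n = 4
    theta = np.random.default_rng(0).normal(size=n * n)
    U = unitary_from_params(theta, n)
    # Unitarity check: U^dagger U should equal the identity up to numerical error.
    print(np.allclose(U.conj().T @ U, np.eye(n)))
```

In an RNN setting, theta would be the trainable parameters and U the recurrent transition matrix; keeping the exponential map and its gradient computationally manageable is the concern addressed by the parametrization literature cited in the statements below.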

Cited by 10 publications (4 citation statements)
References 9 publications
“…For example, recurrent neural networks may suffer from the exploding gradient problem. This can be prevented by constraining the operations to be unitary and much work has been done to efficiently parameterize the unitary group [114,115]. PQC models have the advantage of naturally implementing unitary operations on an exponentially large vector space.…”
Section: Reference
Mentioning, confidence: 99%
“…Various algorithms have then been developed to train orthogonal RNNs. Typically, these works propose parametrizations of the orthogonal/unitary group that lead to computationally cheap operations, see (Arjovsky, Shah, and Bengio 2016; Wisdom et al. 2016; Jing et al. 2017; Hyland and Rätsch 2017; Mhammedi et al. 2017; Helfrich, Willmott, and Ye 2018; Lezcano-Casado and Martínez-Rubio 2019; Maduranga, Helfrich, and Ye 2019). However, most of these works are focused on algorithmic contributions and do not contain convergence analyses.…”
Section: Related Work
Mentioning, confidence: 99%
“…During the development of neural networks, orthogonality was first shown to be useful in mitigating the vanishing or exploding gradients problem (Bengio, Simard, and Frasconi 1994), especially on recurrent neural networks (RNNs) (Pascanu, Mikolov, and Bengio 2013; Le, Jaitly, and Hinton 2015; Wisdom et al. 2016; Arjovsky, Shah, and Bengio 2016; Jing et al. 2017; Hyland and Rätsch 2017; Vorontsov et al. 2017; Helfrich and Ye 2020). To improve the efficiency of the optimization algorithms with orthogonality, many techniques have been utilized, e.g., Householder reflections (Mhammedi et al. 2017), the Cayley transform (Helfrich, Willmott, and Ye 2018; Maduranga, Helfrich, and Ye 2019), and exponential-map-based parameterization (Lezcano Casado 2019; Lezcano-Casado and Martínez-Rubio 2019).…”
Section: Related Work
Mentioning, confidence: 99%
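The Cayley transform named in the last statement admits an equally compact sketch: for a real skew-symmetric matrix A, the matrix Q = (I - A)(I + A)⁻¹ is orthogonal. The snippet below is our own minimal illustration of that fact, not code from any of the cited papers.

```python
# Minimal sketch of the Cayley transform: for real skew-symmetric A (A^T = -A),
# Q = (I - A)(I + A)^{-1} is orthogonal, since (I - A) and (I + A) commute and
# transpose into one another.
import numpy as np

def cayley(A: np.ndarray) -> np.ndarray:
    I = np.eye(A.shape[0])
    return (I - A) @ np.linalg.inv(I + A)

if __name__ == "__main__":
    M = np.random.default_rng(1).normal(size=(5, 5))
    A = M - M.T                              # make a skew-symmetric matrix
    Q = cayley(A)
    print(np.allclose(Q.T @ Q, np.eye(5)))   # orthogonality check
```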