Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.43

A Formal Hierarchy of RNN Architectures

Abstract: We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, which formally separates it from the related QRNN (Bradbury et al., 2016). We also show how these models' expressive …
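To make the notion of rational recurrence concrete, here is a minimal sketch, not the paper's construction: a recurrence is rational when each hidden unit's value over a string can be computed by a weighted finite-state automaton (WFSA), i.e., by multiplying per-symbol transition matrices. The `WFSA` class and the counting example below are illustrative assumptions.

```python
import numpy as np

# Illustrative WFSA (an assumption, not code from the paper): the
# score of a string is init @ A[w1] @ ... @ A[wn] @ final. A hidden
# unit is "rational" if some WFSA computes its value this way.

class WFSA:
    def __init__(self, init, transitions, final):
        self.init = init                # shape (k,): initial state weights
        self.transitions = transitions  # dict: symbol -> (k, k) matrix
        self.final = final              # shape (k,): final weights

    def score(self, string):
        state = self.init.copy()        # the recurrent state: k numbers
        for symbol in string:
            state = state @ self.transitions[symbol]  # one update per symbol
        return state @ self.final

# Example: a 2-state WFSA whose score counts occurrences of "a".
count_a = WFSA(
    init=np.array([1.0, 0.0]),
    transitions={
        "a": np.array([[1.0, 1.0],
                       [0.0, 1.0]]),    # seeing "a" bumps the counter
        "b": np.eye(2),                 # "b" leaves the count unchanged
    },
    final=np.array([0.0, 1.0]),
)

assert count_a.score("abab") == 2.0
```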

Cited by 38 publications (36 citation statements) · References 20 publications
“…But these theoretical complexities do not have a significant effect on real-world applications if parallel processing (e.g., a GPU) is used to run the matrix multiplications. Merrill et al. [122] described a useful range between narrow upper and lower bounds on the space complexity of various neural network models. The space complexity of RNN, CNN, and HAN is O(1) [122].…”
Section: Time-Space Complexities of Algorithms
confidence: 99%
“…Merrill et al. [122] described a useful range between narrow upper and lower bounds on the space complexity of various neural network models. The space complexity of RNN, CNN, and HAN is O(1) [122]. DL algorithms such as RNNs can use the hidden layer as a memory store to learn sequences.…”
Section: Time-Space Complexities of Algorithms
confidence: 99%
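As a hedged illustration of why a finite-precision RNN counts as O(1) space: the hidden state below is a fixed-size vector, independent of input length. The Elman-style cell and its dimensions are illustrative assumptions, not code from either paper.

```python
import numpy as np

# Illustrative Elman-style RNN cell (an assumption): the only memory
# carried across time steps is h, whose size d is fixed in advance,
# so under fixed numeric precision the memory used is O(1) in the
# input length.

d, v = 8, 5                       # hidden size, input (vocab) size
rng = np.random.default_rng(0)
W = rng.normal(size=(d, v))       # input-to-hidden weights
U = rng.normal(size=(d, d))       # hidden-to-hidden weights
b = np.zeros(d)

def run_rnn(inputs):
    h = np.zeros(d)               # fixed-size state: O(1) memory
    for x in inputs:              # inputs: sequence of one-hot vectors
        h = np.tanh(W @ x + U @ h + b)
    return h                      # same size no matter how long the input

seq = [np.eye(v)[i % v] for i in range(1000)]  # a length-1000 input
assert run_rnn(seq).shape == (d,)
```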
“…In our experiment, as illustrated in figure 3, we simulate a putative divergence of a phonotactic grammar into sub-modules by feeding a corpus of Japanese words into a dynamic probabilistic model that is allowed to fork into two submodels. The model follows Mayer (2020), whose one-layer RNN of finite precision has been shown to be unable to learn unattested patterns such as a^n b^n (Weiss et al., 2018; Merrill et al., 2020). Each cell h_i of the RNN is fed (a) a vector encoding of the input segment x_i and (b) the vector output of the previous hidden state h_{i-1}.…”
Section: The Experiments
confidence: 99%
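For context, a^n b^n is the canonical counter language: recognizing it exactly requires unbounded counting, beyond the O(1) memory of a finite-precision RNN. A minimal counter-based recognizer, illustrative rather than taken from the cited works:

```python
def is_anbn(s: str) -> bool:
    """Accept strings of the form a^n b^n (n >= 0) using one counter."""
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:            # an "a" after a "b" is out of order
                return False
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:         # more b's than a's so far
                return False
        else:
            return False
    # the counter holds an unbounded integer: O(log n) bits,
    # which exceeds a finite-precision RNN's O(1) memory
    return count == 0

assert is_anbn("aaabbb") and not is_anbn("aabbb") and not is_anbn("abab")
```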
“…Many recent works have explored the computational power of RNNs in practical settings. Several works (Merrill et al., 2020; Weiss et al., 2018) recently studied the ability of RNNs to recognize counter-like languages. The capability of RNNs to recognize strings of balanced parentheses has also been studied (Sennhauser and Berwick, 2018; Skachkova et al., 2018).…”
Section: Related Work
confidence: 99%
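Balanced-parentheses (Dyck-1) recognition is the same kind of counter problem: track nesting depth, reject if it ever goes negative, accept if it ends at zero. A minimal illustrative sketch, not from the cited studies:

```python
def is_balanced(s: str) -> bool:
    """Dyck-1 membership: one counter tracking nesting depth."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:         # a closer with no matching opener
                return False
        else:
            return False
    return depth == 0             # every opener was closed

assert is_balanced("(()())") and not is_balanced("())(")
```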