Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019)
DOI: 10.18653/v1/n19-1002

The emergence of number and syntax units in LSTM language models

Abstract: Recent work has shown that LSTMs trained on a generic language modeling objective capture syntax-sensitive generalizations such as long-distance number agreement. However, we have no mechanistic understanding of how they accomplish this remarkable feat. Some have conjectured that it depends on heuristics that do not truly take hierarchical structure into account. We present here a detailed study of the inner mechanics of number tracking in LSTMs at the single-neuron level. We discover that long-distance number informa…
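To make the single-neuron analysis described in the abstract concrete, below is a minimal sketch of how per-unit LSTM cell-state activations can be traced across a pair of number-agreement sentences. The model, vocabulary, and unit indices are illustrative only (an untrained toy LSTM in PyTorch); they are not the trained language model or the specific units reported in the paper.

```python
# Minimal sketch: tracing per-unit LSTM cell-state activations across a
# sentence, the kind of trace used to look for candidate "number units".
# Untrained toy model and toy vocabulary; a real analysis would use a
# pretrained word-level LSTM language model.
import torch
import torch.nn as nn

torch.manual_seed(0)
words = "<unk> the boy boys near car cars greets greet".split()
vocab = {w: i for i, w in enumerate(words)}

embed = nn.Embedding(len(vocab), 32)
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=2, batch_first=True)

def cell_trace(sentence):
    """Return cell-state activations with shape (layers, time, units)."""
    ids = torch.tensor([[vocab.get(w, 0) for w in sentence.split()]])
    h = torch.zeros(2, 1, 64)
    c = torch.zeros(2, 1, 64)
    states = []
    for t in range(ids.size(1)):                  # step word by word
        x = embed(ids[:, t:t + 1])
        _, (h, c) = lstm(x, (h, c))
        states.append(c.squeeze(1).clone())       # (layers, units)
    return torch.stack(states, dim=1)

sing = cell_trace("the boy near the cars greets")
plur = cell_trace("the boys near the car greet")

# A candidate number unit is one whose activation separates singular from
# plural subjects throughout the long-distance dependency.
diff = (sing - plur).abs().mean(dim=1)            # (layers, units)
layer, unit = divmod(diff.argmax().item(), diff.size(1))
print(f"largest singular/plural separation: layer {layer}, unit {unit}")
```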

Cited by 117 publications (165 citation statements)
References 18 publications
“…For example, while in humans the occurrence of an attractor within a prepositional phrase was found to elicit more agreement errors than its presence in a relative clause [4], recurrent neural network language models were found to show the opposite effect [35]. Similarly, recurrent neural models were found in Reference [33] to produce different error rates than humans on a variety of syntactic structures (but see References [26, 36]).…”
Section: Understanding Capacity Limitation in Light of Neural Language Models
confidence: 68%
“…In a recent work [26], we studied the processing of long-range dependencies in an NLM and found that a sparse neural circuit emerged in the neural language model during training, which was shown to carry grammatical-number agreement across long-range dependencies in various sentences. In this section, we briefly describe the main findings therein, and describe how a capacity limitation can further be derived.…”
Section: Understanding Capacity Limitation in Light of Neural Language Models
confidence: 99%
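The sparse circuit described above is typically confirmed with single-unit ablations. Below is a hedged sketch of what such an ablation looks like mechanically: zero one cell-state unit at every timestep and check whether the model's preference for the correctly inflected verb degrades. The model is an untrained toy LSTM cell, and all token and unit indices are placeholders, not the units identified in the cited work.

```python
# Sketch of a single-unit ablation: zero one cell-state unit at every step and
# compare the model's preference for the correctly inflected verb with and
# without the ablation. Untrained toy model; indices are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
V, E, H = 50, 16, 32
embed = nn.Embedding(V, E)
cell = nn.LSTMCell(E, H)
readout = nn.Linear(H, V)

def verb_logprob(token_ids, verb_id, ablate_unit=None):
    """Log-probability of `verb_id` after the prefix, optionally ablating one unit."""
    h = torch.zeros(1, H)
    c = torch.zeros(1, H)
    for tok in token_ids:
        h, c = cell(embed(torch.tensor([tok])), (h, c))
        if ablate_unit is not None:
            c[:, ablate_unit] = 0.0           # knock the unit out at every step
    return torch.log_softmax(readout(h), dim=-1)[0, verb_id]

prefix = [3, 7, 12, 7, 20]                    # e.g. "the boy near the cars"
sing_verb, plur_verb = 30, 31                 # e.g. "greets" / "greet"

intact = verb_logprob(prefix, sing_verb) - verb_logprob(prefix, plur_verb)
ablated = (verb_logprob(prefix, sing_verb, ablate_unit=5)
           - verb_logprob(prefix, plur_verb, ablate_unit=5))
print(f"preference for the correct form: intact {intact:.3f}, ablated {ablated:.3f}")
```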
“…as grammatical. Prior work has shown that LSTMs seem to use and maintain an encoding of the subject's number to process agreement (Lakretz et al., 2019); the fact that this asymmetry emerged from what appears to be an encoding model of agreement is therefore particularly intriguing and should motivate future work.…”
Section: Grammaticality Asymmetry
confidence: 99%
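One common way to test whether a network "maintains an encoding of the subject's number" is a diagnostic probe: a linear classifier trained to decode the subject's number from the hidden state at the verb position. The sketch below only illustrates the procedure on an untrained toy LSTM with made-up token ids; it is not the setup used in the cited studies.

```python
# Sketch of a diagnostic probe: can the subject's number be read out linearly
# from the LSTM's hidden state just before the verb? The states come from an
# untrained toy LSTM, so the probe merely demonstrates the procedure.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

torch.manual_seed(0)
embed = nn.Embedding(20, 16)
lstm = nn.LSTM(16, 32, batch_first=True)

def hidden_at_end(ids):
    x = embed(torch.tensor([ids]))
    out, _ = lstm(x)
    return out[0, -1].detach().numpy()        # hidden state before the verb

# Toy corpus: (token-id prefix, subject number label: 0 = singular, 1 = plural).
data = [([1, 2, 5, 1, 8], 0), ([1, 3, 5, 1, 7], 1),
        ([1, 2, 6, 1, 9], 0), ([1, 3, 6, 1, 9], 1),
        ([1, 2, 5, 1, 9], 0), ([1, 3, 5, 1, 8], 1)]

X = [hidden_at_end(ids) for ids, _ in data]
y = [label for _, label in data]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe training accuracy:", probe.score(X, y))
```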
“…Initial work had shown that RNNs do not really learn the structure-dependency of this construction (Linzen et al., 2016), but follow-up work showed that stronger techniques can yield more positive results (Gulordava et al., 2018), only to be promptly rebutted by work suggesting that the apparently positive results could be the artifact of a much simpler strategy, which takes advantage of the unnaturally simple structure of the examples and simply learns properties of the first word in the sentence (Kuncoro et al., 2018). Recent work by Lakretz et al. (2019), however, studies RNNs in more detail, looking at single neurons, and finds that individual neurons encode linguistically meaningful features very saliently, with behaviour over time that corresponds to the expected propagation of subject-verb number agreement information.…”
Section: Introduction
confidence: 99%