Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.158
A Systematic Assessment of Syntactic Generalization in Neural Language Models

Abstract: While state-of-the-art neural network models continue to achieve lower perplexity scores on language modeling benchmarks, it remains unknown whether optimizing for broad-coverage predictive performance leads to human-like syntactic knowledge. Furthermore, existing work has not provided a clear picture about the model properties required to produce proper syntactic generalizations. We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and…

Cited by 104 publications (135 citation statements)
References 44 publications
“…Much has been written about the ability of ANNs to learn number agreement (Linzen et al., 2016; Gulordava et al., 2018; Giulianelli et al., 2018), including their ability to maintain the dependency across different types of intervening material (Marvin and Linzen, 2018) and with coordinated noun phrases. Hu et al. (2020) find that model architecture, rather than training data size, may contribute most to performance on number agreement and related tasks. Focusing on RNN models, Lakretz et al. (2019) find evidence that number agreement is tracked by specific "number" units that work in concert with units that carry more general syntactic information, such as tree depth.…”
Section: Related Work (mentioning)
confidence: 88%
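The targeted-evaluation paradigm referenced in the statement above can be illustrated with a minimal pair: a model is credited with the generalization if it finds the grammatical verb less surprising than the ungrammatical one in the same context. The sketch below is a minimal illustration, not the cited papers' actual method; the context sentence and the log-probabilities are hypothetical stand-ins for scores a trained LM would produce.

```python
import math

# Hypothetical per-token probabilities an LM might assign to the verb
# following an agreement-attraction context ("keys" is plural, but the
# intervening "cabinet" is singular). Illustrative numbers only.
probs = {
    ("The keys to the cabinet", "are"): 0.12,  # grammatical continuation
    ("The keys to the cabinet", "is"): 0.05,   # ungrammatical continuation
}

def surprisal(context: str, word: str) -> float:
    """Surprisal in nats: -log P(word | context)."""
    return -math.log(probs[(context, word)])

def passes_minimal_pair(context: str, gram: str, ungram: str) -> bool:
    """Success criterion: the grammatical verb is less surprising."""
    return surprisal(context, gram) < surprisal(context, ungram)

print(passes_minimal_pair("The keys to the cabinet", "are", "is"))  # prints True
```

A full evaluation aggregates this pass/fail criterion over many items and constructions, which is what allows comparisons across architectures and training-set sizes.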
“…find that while LMs learn agreement phenomena at a similarly early stage, other phenomena require more data to learn. Finally, Hu et al. (2020) find that adopting architectures that build in linguistic bias, such as RNNGs (Dyer et al., 2016), has a bigger effect on the acceptability task than increasing training data from 1M to 40M words.…”
Section: Related Work (mentioning)
confidence: 96%
“…For instance, an RNN classifier is capable of representing any function, but prefers ones that focus mostly on local relationships within the input sequence (Dhingra et al., 2018; Ravfogel et al., 2019). Some recent work seeks to design neural architectures that build in desirable inductive biases (Dyer et al., 2016; Battaglia et al., 2018), or compares the immutable biases of different architectures (Hu et al., 2020). However, inductive biases can also be learned by biological (Harlow, 1949) and artificial systems alike (Lake et al., 2017).…”
Section: Inductive Bias (mentioning)
confidence: 99%
“…Recent work has suggested that LMs acquire abstract, often human-like, knowledge of syntax (e.g., Gulordava et al., 2018; Hu et al., 2020). Additionally, knowledge of the grammatical and referential links between a pronoun and its antecedent noun (reference) has been demonstrated for both transformer and long short-term memory architectures (Sorodoc et al., 2020).…”
Section: Introduction (mentioning)
confidence: 99%