Proceedings of the 25th Conference on Computational Natural Language Learning 2021
DOI: 10.18653/v1/2021.conll-1.8

Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network

Abstract: Inflectional morphology has long been a useful testing ground for broader questions about generalisation in language and the viability of neural network models as cognitive models of language. Here, in line with that tradition, we explore how recurrent neural networks acquire the complex German plural system and reflect upon how their strategy compares to human generalisation and rule-based models of this system. We perform analyses including behavioural experiments, diagnostic classification, representa…
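The abstract mentions diagnostic classification as one of the analyses. As a rough illustration of that technique (not the authors' actual setup; the probe choice, feature shapes, and label scheme below are assumptions), a minimal sketch trains a linear probe on recurrent hidden states to predict the plural class of each noun:

```python
# Minimal sketch of diagnostic classification over RNN hidden states.
# Shapes, the logistic-regression probe, and the label scheme are
# illustrative assumptions, not the paper's actual configuration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# hidden_states: one encoder hidden vector per noun, shape (n_nouns, hidden_dim)
# plural_classes: integer label per noun for its plural suffix
#                 (0: -e, 1: -er, 2: -(e)n, 3: -s, 4: zero marking)
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 128))    # placeholder activations
plural_classes = rng.integers(0, 5, size=1000)  # placeholder labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, plural_classes, test_size=0.2, random_state=0
)

# A linear probe: accuracy well above chance suggests the plural class
# is linearly decodable from the hidden representation.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

With real data, the placeholder arrays would be replaced by hidden states extracted from the trained network and the gold plural class of each noun.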

Cited by 8 publications (5 citation statements)
References 52 publications
“…We plot this behavior for models trained on increasing amounts of epochs in Figure 1B, from which it can be seen that this task follows a peculiar inverse scaling pattern (McKenzie et al., 2023). Exploring this pattern in more detail could provide an interesting direction for future research, connecting it to the rule learning of irregular forms in LMs (Dankers et al., 2021).…”
Section: Results
confidence: 99%
“…The German plural system, on the other hand, relies on the addition of a regular suffix (/-e/, /-er/, /-en/, /-s/ or /-ø/) to the singular form, which makes modeling the pluralization process closer to a classification problem. Dankers et al. (2021) employed a simple unidirectional recurrent encoder-decoder model based on Long Short-Term Memory (LSTM) layers with no attention mechanism, while the research reported by Beser (2021) compared the performance of a bidirectional LSTM model to a vanilla transformer model. The performance of both approaches depended heavily on the frequency of each suffix in a German corpus.…”
Section: Related Work
confidence: 99%
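The quoted passage describes the Dankers et al. (2021) model as a unidirectional LSTM encoder-decoder without attention. A minimal PyTorch sketch of that kind of architecture is shown below; the vocabulary handling, dimensions, and usage are assumptions for illustration, not the authors' actual configuration.

```python
# Minimal sketch of a unidirectional LSTM encoder-decoder without attention,
# of the kind the quoted passage attributes to Dankers et al. (2021).
# Sizes, vocabulary handling and training details are illustrative assumptions.
import torch
import torch.nn as nn

class Seq2SeqLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        # Encode the singular form (e.g. its character sequence).
        _, (h, c) = self.encoder(self.embed(src))
        # Decode the plural form conditioned only on the final encoder state
        # (no attention over encoder outputs).
        dec_out, _ = self.decoder(self.embed(tgt), (h, c))
        return self.out(dec_out)  # (batch, tgt_len, vocab_size) logits

# Toy usage with random character indices; a real setup would map the
# characters of singular/plural forms to indices and train with cross-entropy.
model = Seq2SeqLSTM(vocab_size=40)
src = torch.randint(0, 40, (8, 6))   # batch of singular forms
tgt = torch.randint(0, 40, (8, 7))   # teacher-forced plural targets
logits = model(src, tgt)
print(logits.shape)  # torch.Size([8, 7, 40])
```

Because the decoder sees only the final encoder state, the singular form must be compressed into a single vector, which is part of what makes the suffix-choice behaviour of such models interesting to probe.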
“…It has been argued that LMs acquire an abstract notion of word order that goes beyond mere n-gram co-occurrence statistics (Futrell and Levy, 2019; Kuribayashi et al., 2020; Merrill et al., 2024), a claim that we in this paper assess for large-scale LMs in the context of adjective order. Finally, numerous works have investigated the trade-off between memorization and generalization in LMs: it has been shown that larger LMs are able to memorize entire passages from the training data (Biderman et al., 2023a; Lesci et al., 2024; Prashanth et al., 2024), but generalization patterns for grammatical phenomena have also been shown to follow human-like generalization (Dankers et al., 2021; Hupkes et al., 2023; Alhama et al., 2023).…”
Section: Word Order In Language Models
confidence: 99%