2020
DOI: 10.1162/tacl_a_00304
Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks

Abstract: Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection. For both tasks, the training set is consistent with a generalization based on hierar…

Cited by 52 publications (70 citation statements). References 32 publications.
“…C Encoder-Decoder architecture and its mathematics In order to model the production of plurals for any given Maltese singulars we created a deep learning model in the form of an encoder-decoder neural network. This network consists of two neural networks, an encoder and a decoder (McCoy et al, 2020). These are fed singulars and plurals respectively.…”
Section: Discussion
Confidence: 99%
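The excerpt above describes the encoder-decoder design: two recurrent networks, an encoder that folds the singular form into a fixed-size state and a decoder that emits the plural from that state. A minimal structural sketch of that pipeline is below; the vocabulary, hidden size, and all weights are illustrative assumptions (the weights are untrained random values, so the decoded output is arbitrary), not the model from the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy character vocabulary; ^ = start symbol, $ = end symbol.
VOCAB = list("abcdefghijklmnopqrstuvwxyz^$")
IDX = {c: i for i, c in enumerate(VOCAB)}
V, H = len(VOCAB), 16  # vocabulary size, hidden-state size

def one_hot(c):
    v = np.zeros(V)
    v[IDX[c]] = 1.0
    return v

# Encoder: a simple recurrent net that reads the singular one
# character at a time and returns its final hidden state.
W_xh = rng.normal(0, 0.1, (H, V))
W_hh = rng.normal(0, 0.1, (H, H))

def encode(word):
    h = np.zeros(H)
    for c in word:
        h = np.tanh(W_xh @ one_hot(c) + W_hh @ h)
    return h

# Decoder: seeded with the encoder's state, emits one character per
# step (greedy argmax) until it produces the end symbol.
U_xh = rng.normal(0, 0.1, (H, V))
U_hh = rng.normal(0, 0.1, (H, H))
W_out = rng.normal(0, 0.1, (V, H))

def decode(h, max_len=10):
    out, c = [], "^"
    for _ in range(max_len):
        h = np.tanh(U_xh @ one_hot(c) + U_hh @ h)
        c = VOCAB[int(np.argmax(W_out @ h))]
        if c == "$":
            break
        out.append(c)
    return "".join(out)

# Untrained, so this "plural" is meaningless; training would fit the
# weight matrices on singular-plural pairs.
plural_guess = decode(encode("ktieb"))
```

In a trained version, the weight matrices would be learned from singular-plural pairs via backpropagation through both networks; the sketch only shows the information flow the excerpt describes.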
“…To investigate the production of Maltese plurals for any given singular, we used an encoder-decoder neural network (McCoy et al, 2020). We used an encoderdecoder neural network, rather than the existing Minimal Generalization Learner (MGL) proposed by Albright and Hayes (2003), because the MGL cannot model non-concatenative morphology for principled reasons.…”
Section: The Present Study
Confidence: 99%