Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017) 2017
DOI: 10.18653/v1/k17-1003

Exploring the Syntactic Abilities of RNNs with Multi-task Learning

Abstract: Recent work has explored the syntactic abilities of RNNs using the subject-verb agreement task, which diagnoses sensitivity to sentence structure. RNNs performed this task well in common cases, but faltered in complex sentences (Linzen et al., 2016). We test whether these errors are due to inherent limitations of the architecture or to the relatively indirect supervision provided by most agreement dependencies in a corpus. We trained a single RNN to perform both the agreement task and an additional task, eithe…
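
The multi-task setup the abstract describes, one network trained on the agreement task together with an additional task, can be illustrated with a minimal sketch of a combined training objective. The abstract does not say how the two losses are combined, so the weighted sum below is an assumption, and the names `cross_entropy`, `multitask_loss`, and `aux_weight` are hypothetical. In a real model both sets of logits would come from output heads on top of a shared RNN; here they are passed in directly to keep the example self-contained.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(agreement_logits, agreement_target,
                   aux_logits, aux_target, aux_weight=1.0):
    """Weighted sum of the two task losses.

    Assumption: a simple weighted sum. The source abstract does not
    specify how the agreement loss and the auxiliary loss are combined.
    """
    agreement_loss = cross_entropy(agreement_logits, agreement_target)
    aux_loss = cross_entropy(aux_logits, aux_target)
    return agreement_loss + aux_weight * aux_loss


# Example: a binary agreement decision (singular vs. plural verb) and a
# hypothetical four-class auxiliary task, both predicted from one sentence.
loss = multitask_loss([2.0, 0.5], 0, [0.1, 1.2, 0.3, 0.0], 1, aux_weight=0.5)
```

Because both losses would be backpropagated into the same recurrent parameters, the auxiliary task can shape the representations the agreement head relies on; `aux_weight` controls how strongly it does so.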

Cited by 25 publications (29 citation statements)
References 30 publications
“…In this setup the target language impact is less visible and gender accuracy at the LSTM state level is overall much higher than that of the mono-target systems (0.77 vs 0.68 on average) whereas BLEU scores are slightly lower (−0.9% on average). While this is only an initial exploration of multilingual NMT systems, our results suggest that this kind of multi-task objective pushes the model to learn linguistic features in a more consistent way (Bjerva, 2017;Enguehard et al, 2017).…”
Section: Source-target Language Relatedness
Citation type: mentioning
confidence: 85%
“…One of the first techniques to examine a neural network involves the analysis of activation patterns of the hidden layers (Elman, 1991;Giles et al, 1992). Nowadays, given its popularity, recurrent neural networks are the most evaluated networks, mainly investigated on the structures and linguistic properties they are encoding (Linzen et al, 2016;Enguehard et al, 2017;Kuncoro et al, 2017;Gulordava et al, 2018).…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In contrast to these approaches, the DSA-LSTM only models the probability of surface strings, albeit with an auxiliary loss that distills the next-word predictive distribution of a syntactic language model. Earlier work has also explored multi-task learning with syntactic objectives as an auxiliary loss in language modelling and machine translation (Luong et al, 2016;Eriguchi et al, 2016;Nadejde et al, 2017;Enguehard et al, 2017;Aharoni and Goldberg, 2017;Eriguchi et al, 2017). Our approach of injecting syntactic bias through a KD objective is orthogonal to this approach, with the primary difference that here the student DSA-LSTM has no direct access to syntactic annotations; it does, however, have access to the teacher RNNG's softmax distribution over the next word.…”
Section: Related Work
Citation type: mentioning
confidence: 99%