Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.614

On the Role of Supervision in Unsupervised Constituency Parsing

Abstract: We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing F1 score on the Wall Street Journal (WSJ) development set (1,700 sentences). We introduce strong baselines for them by training an existing supervised parsing model (Kitaev and Klein, 2018) on the same labeled examples they access. When training on the 1,700 examples, or even when using only 50 examples for training and 5 for development, such a few-shot parsing approach can outperform all the unsupervised…
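
As a concrete reading of this protocol, the sketch below (Python, using nltk for tree handling) shows the two ingredients the abstract relies on: a labeled-bracket F1 metric and a tiny random train/dev split. The span conventions are simplified relative to the standard evalb scorer, and `few_shot_split` is an illustrative helper, not the authors' released code.

```python
import random
from nltk import Tree

def labeled_spans(tree):
    """Collect (label, start, end) spans from an nltk Tree, skipping
    single-word spans (preterminals), roughly as evalb does."""
    spans = set()
    def walk(t, start):
        if isinstance(t, str):        # leaf token
            return start + 1
        end = start
        for child in t:
            end = walk(child, end)
        if end - start > 1:
            spans.add((t.label(), start, end))
        return end
    walk(tree, 0)
    return spans

def bracket_f1(gold_trees, pred_trees):
    """Corpus-level labeled-bracket F1, the score tuned on the WSJ dev set."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_trees, pred_trees):
        gs, ps = labeled_spans(gold), labeled_spans(pred)
        tp += len(gs & ps)
        fp += len(ps - gs)
        fn += len(gs - ps)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

def few_shot_split(trees, n_train=50, n_dev=5, seed=0):
    """Sample the tiny labeled split used by the few-shot baseline."""
    rng = random.Random(seed)
    sample = rng.sample(list(trees), n_train + n_dev)
    return sample[:n_train], sample[n_train:]

if __name__ == "__main__":
    t = Tree.fromstring("(S (NP (DT the) (NN cat)) (VP (VBD sat)))")
    print(sorted(labeled_spans(t)))   # [('NP', 0, 2), ('S', 0, 3)]
```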

Cited by 16 publications (25 citation statements) · References 36 publications

“…BiAffine: the bi-affine dependency parsing model proposed by Dozat and Manning (2017). Shi et al. (2020) have shown that SUB² can significantly improve few-shot constituency parsing on the Penn Treebank dataset; in this work, we extend the few-shot parsing evaluation to other domains, using the Foreebank (FBANK; Kaljahi et al., 2015) and NXT-Switchboard (SWBD; Calhoun et al., 2010) datasets. Foreebank consists of 1,000 English and 1,000 French sentences; for either language, we randomly select 50 sentences for training, 50 for development, and 250 for testing.…”
Section: Dependency Parsing (citation type: mentioning; confidence: 99%)

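The split described in this excerpt is a plain per-language random sample. A minimal sketch, assuming each treebank is available as a list of parsed sentences; the function name and fixed seed are illustrative:

```python
import random

def foreebank_style_split(sentences, seed=0):
    """Draw 50 train / 50 dev / 250 test sentences at random from a
    1,000-sentence treebank, mirroring the per-language protocol
    described in the excerpt; remaining sentences are left unused."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return {
        "train": shuffled[:50],
        "dev":   shuffled[50:100],
        "test":  shuffled[100:350],
    }
```
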
“…Data augmentation has been found effective for various natural language processing (NLP) tasks, such as machine translation (Fadaee et al., 2017; Gao et al., 2019; Xia et al., 2019, inter alia), text classification (Wei and Zou, 2019; Quteineh et al., 2020), syntactic and semantic parsing (Jia and Liang, 2016; Shi et al., 2020; Dehouck and Gómez-Rodríguez, 2020), semantic role labeling (Fürstenau and Lapata, 2009), and dialogue understanding (Hou et al., 2018; Niu and Bansal, 2019). Such methods enhance the diversity of the training set by generating examples based on existing ones, and can make the learned models more robust against noise (Xie et al., 2020).…”
Section: Introduction (citation type: mentioning; confidence: 99%)

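To make the "generating examples based on existing ones" idea concrete, here is a toy sketch of two of the EDA operations from Wei and Zou (2019), random deletion and random swap; it illustrates the general idea only and is not the SUB² subtree-substitution method of Shi et al. (2020):

```python
import random

def eda_augment(tokens, n_aug=4, p_delete=0.1, seed=0):
    """Produce augmented copies of a sentence using two EDA-style
    operations (Wei and Zou, 2019): random deletion and random swap.
    Synonym replacement and insertion are omitted here because they
    require an external thesaurus."""
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_aug):
        # Random deletion: drop each token with probability p_delete.
        new = [t for t in tokens if rng.random() > p_delete] or list(tokens)
        # Random swap: exchange the positions of two tokens.
        if len(new) > 1:
            i, j = rng.sample(range(len(new)), 2)
            new[i], new[j] = new[j], new[i]
        augmented.append(new)
    return augmented

# Example: eda_augment("the model parses the sentence".split())
# yields shuffled/shortened token-list variants of the input.
```
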
“…We here mention a few limitations of our approach and propose avenues for future work. First, analogous to several unsupervised parsers (Shi et al., 2020), including PCFGs (Zhao and Titov, 2021), the current form of our method relies on a few gold-standard annotations from the validation set to determine the best hyperparameters (i.e., the best choice for attention head selection). This dependence makes it hard to call our approach entirely unsupervised, although it departs from the typical way of learning parsers with supervision.…”
Section: Limitations and Future Work (citation type: mentioning; confidence: 99%)

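The model-selection step described in this excerpt amounts to a tiny grid search scored against a handful of gold trees. A schematic sketch; `parse_with_head` and `f1` are hypothetical placeholders for the cited method's tree induction and evaluation metric:

```python
def select_attention_head(heads, dev_sents, dev_gold, parse_with_head, f1):
    """Pick the attention head whose induced trees best match a small
    set of gold-standard validation trees. `parse_with_head` maps
    (sentence, head) -> tree; `f1` maps (gold, predicted) -> score.
    Both are placeholders, not an actual published API."""
    best_head, best_score = None, float("-inf")
    for head in heads:
        predictions = [parse_with_head(sent, head) for sent in dev_sents]
        score = f1(dev_gold, predictions)
        if score > best_score:
            best_head, best_score = head, score
    return best_head, best_score
```
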
“…Notably, prior work developed an approach similar to ours, but focused on Transformers trained for machine translation rather than language models. Work on neural unsupervised parsing (Shen et al., 2018b, 2019; Kim et al., 2019a; Shi et al., 2020, inter alia) also seeks to generate parse trees without supervision from gold-standard trees. It is worth noting that some work, such as Kann et al. (2019) and Zhao and Titov (2021), attempts to evaluate unsupervised parsers in multilingual settings, akin to our work.…”
Section: Related Work (citation type: mentioning; confidence: 99%)