Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1119
An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction

Abstract: The incorporation of pseudo data in the training of grammatical error correction models has been one of the main factors in improving the performance of such models. However, consensus is lacking on experimental configurations, namely, choosing how the pseudo data should be generated or used. In this study, these choices are investigated through extensive experiments, and state-of-the-art performance is achieved on the CoNLL-2014 test set (F0.5 = 65.0) and the official test set of the BEA-2019 shared task (F0.5 = …).
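The abstract describes incorporating pseudo data into GEC training. A common way such data is generated is by corrupting clean sentences with synthetic errors; the sketch below illustrates the idea with simple word-level drop/swap/substitute noising. This is a minimal, hypothetical example (the probabilities, confusion set, and function names are illustrative assumptions, not the paper's actual procedure):

```python
import random

# Hypothetical rule-based noiser: a minimal sketch of one way pseudo data
# can be generated for GEC, NOT the paper's exact method. Each clean
# sentence is corrupted with word-level drop/swap/substitute operations;
# the resulting (noisy, clean) pair serves as pseudo training data.

CONFUSIONS = {"their": "there", "then": "than", "a": "an"}  # toy confusion set

def noise_sentence(words, rng, p_drop=0.1, p_swap=0.1, p_sub=0.1):
    out, i = [], 0
    while i < len(words):
        w, r = words[i], rng.random()
        if r < p_drop:
            i += 1                              # delete the word
            continue
        if r < p_drop + p_swap and i + 1 < len(words):
            out.extend([words[i + 1], w])       # swap adjacent words
            i += 2
            continue
        if r < p_drop + p_swap + p_sub and w in CONFUSIONS:
            out.append(CONFUSIONS[w])           # substitute a confusable word
            i += 1
            continue
        out.append(w)                           # keep the word unchanged
        i += 1
    return out

def make_pseudo_pairs(sentences, seed=0):
    rng = random.Random(seed)                   # fixed seed for reproducibility
    return [(" ".join(noise_sentence(s.split(), rng)), s) for s in sentences]

pairs = make_pseudo_pairs(["they went to their house", "this is a test"])
for noisy, clean in pairs:
    print(noisy, "->", clean)
```

The clean side of each pair is always the original sentence, so the model trained on these pairs learns to map noisy input back to clean text.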

Cited by 115 publications (174 citation statements); references 27 publications.
“…The ensembles combine the four models from the preceding row.

System                       BEA-2019 (F0.5)   CoNLL-2014 (F0.5)   JFLEG (GLEU)
(Grundkiewicz et al., 2019)       69.5              64.2               61.2
(Kiyono et al., 2019)             70.2              65.0               61.4
(Lichtarge et al., 2019)           -                60.4               63.3
(Xu et al., 2019)                 66.6              63.2               62.6
(Omelianchuk et al., 2020)        73.7              66.5                -
this work - unscored              71.9              65.3               64.7
this work - scored                73.0              66.8               64.9

…ment, as seen in the example-level analysis in Section 7. Other methods for scoring individual examples should be explored.…”
Section: Future Work
confidence: 99%
“…In this work, we proposed a handcrafted procedure for performing data augmentation for model pretraining. In future work, a learning-based method for data augmentation (e.g., back-translation [17]) could further improve the correction performance.…”
Section: Discussion
confidence: 99%
“…Techniques from statistical machine translation [11], and subsequently, neural machine translation (NMT) [11], [14], [15] have been employed with great success compared to the traditional two-stage systems. More specialized architectures [12], [16] and techniques for data augmentation and training [12], [17] emerged later.…”
Section: A Text Correction Systems For Natural Language
confidence: 99%
“…However, the majority of recent studies have considered only the first factor. With the emergence of deep learning, the GEC task is being considered from the viewpoint of Machine Translation (MT) [1,6,7,8,13,16,17,27]. GEC is regarded as a task that translates a sentence with errors into a correct sentence.…”
Section: Introduction
confidence: 99%
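The citation above frames GEC as machine translation: the erroneous sentence is the "source" and its correction the "target", exactly as in a parallel corpus for MT. A minimal sketch of preparing GEC pairs in that parallel format (the file names and example sentences are illustrative assumptions, not from the paper):

```python
# Minimal sketch of the MT framing of GEC described above: each erroneous
# sentence is written as a source line and its correction as the aligned
# target line, the standard parallel-corpus input for seq2seq toolkits.
# File names and sentences are illustrative only.

pairs = [
    ("She go to school yesterday .", "She went to school yesterday ."),
    ("I have saw that movie .", "I have seen that movie ."),
]

def write_parallel(pairs, src_path="train.src", tgt_path="train.tgt"):
    # One sentence per line; line i of src_path aligns with line i of tgt_path.
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for src, tgt in pairs:
            fs.write(src + "\n")
            ft.write(tgt + "\n")

write_parallel(pairs)
```

Once the data is in this format, any standard encoder-decoder NMT pipeline can be trained on it, which is what makes the MT viewpoint attractive for GEC.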