Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1119
An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction

Abstract: The incorporation of pseudo data in the training of grammatical error correction models has been one of the main factors in improving the performance of such models. However, consensus is lacking on experimental configurations, namely, choosing how the pseudo data should be generated or used. In this study, these choices are investigated through extensive experiments, and state-of-the-art performance is achieved on the CoNLL-2014 test set (F0.5 = 65.0) and the official test set of the BEA-2019 shared task (F0.5 = …).
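The abstract describes incorporating pseudo data into GEC training. A common way such data is generated is by corrupting clean sentences with synthetic errors; the sketch below illustrates the idea with simple word-level drop/swap/substitute noising. This is a minimal, hypothetical example (the probabilities, confusion set, and function names are illustrative assumptions, not the paper's actual procedure):

```python
import random

# Hypothetical rule-based noiser: a minimal sketch of one way pseudo data
# can be generated for GEC, NOT the paper's exact method. Each clean
# sentence is corrupted with word-level drop/swap/substitute operations;
# the resulting (noisy, clean) pair serves as pseudo training data.

CONFUSIONS = {"their": "there", "then": "than", "a": "an"}  # toy confusion set

def noise_sentence(words, rng, p_drop=0.1, p_swap=0.1, p_sub=0.1):
    out, i = [], 0
    while i < len(words):
        w, r = words[i], rng.random()
        if r < p_drop:
            i += 1                              # delete the word
            continue
        if r < p_drop + p_swap and i + 1 < len(words):
            out.extend([words[i + 1], w])       # swap adjacent words
            i += 2
            continue
        if r < p_drop + p_swap + p_sub and w in CONFUSIONS:
            out.append(CONFUSIONS[w])           # substitute a confusable word
            i += 1
            continue
        out.append(w)                           # keep the word unchanged
        i += 1
    return out

def make_pseudo_pairs(sentences, seed=0):
    rng = random.Random(seed)                   # fixed seed for reproducibility
    return [(" ".join(noise_sentence(s.split(), rng)), s) for s in sentences]

pairs = make_pseudo_pairs(["they went to their house", "this is a test"])
for noisy, clean in pairs:
    print(noisy, "->", clean)
```

The clean side of each pair is always the original sentence, so the model trained on these pairs learns to map noisy input back to clean text.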

Cited by 115 publications (174 citation statements); references 27 publications.
“…The ensembles combine the four models from the preceding row.

System                       BEA-2019 (F0.5)   CoNLL-2014 (F0.5)   JFLEG (GLEU)
(Grundkiewicz et al., 2019)       69.5              64.2               61.2
(Kiyono et al., 2019)             70.2              65.0               61.4
(Lichtarge et al., 2019)           -                60.4               63.3
(Xu et al., 2019)                 66.6              63.2               62.6
(Omelianchuk et al., 2020)        73.7              66.5                -
this work - unscored              71.9              65.3               64.7
this work - scored                73.0              66.8               64.9

…ment, as seen in the example-level analysis in Section 7. Other methods for scoring individual examples should be explored.…”
Section: Future Work
confidence: 99%
“…In this work, we proposed a handcrafted procedure for performing data augmentation for model pretraining. In future work, a learning-based method for data augmentation (e.g., back-translation [17]) could further improve the correction performance.…”
Section: Discussion
confidence: 99%
“…Techniques from statistical machine translation [11], and subsequently, neural machine translation (NMT) [11], [14], [15] have been employed with great success compared to the traditional two-stage systems. More specialized architectures [12], [16] and techniques for data augmentation and training [12], [17] emerged later.…”
Section: A Text Correction Systems For Natural Language
confidence: 99%
“…However, the majority of recent studies have considered only the first factor. With the emergence of deep learning, the GEC task is being considered from the viewpoint of Machine Translation (MT) [1,6,7,8,13,16,17,27]. GEC is regarded as a task that translates a sentence with errors into a correct sentence.…”
Section: Introduction
confidence: 99%
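The citation above frames GEC as machine translation: the erroneous sentence is the "source" and its correction the "target", exactly as in a parallel corpus for MT. A minimal sketch of preparing GEC pairs in that parallel format (the file names and example sentences are illustrative assumptions, not from the paper):

```python
# Minimal sketch of the MT framing of GEC described above: each erroneous
# sentence is written as a source line and its correction as the aligned
# target line, the standard parallel-corpus input for seq2seq toolkits.
# File names and sentences are illustrative only.

pairs = [
    ("She go to school yesterday .", "She went to school yesterday ."),
    ("I have saw that movie .", "I have seen that movie ."),
]

def write_parallel(pairs, src_path="train.src", tgt_path="train.tgt"):
    # One sentence per line; line i of src_path aligns with line i of tgt_path.
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for src, tgt in pairs:
            fs.write(src + "\n")
            ft.write(tgt + "\n")

write_parallel(pairs)
```

Once the data is in this format, any standard encoder-decoder NMT pipeline can be trained on it, which is what makes the MT viewpoint attractive for GEC.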