Controlling Grammatical Error Correction Using Word Edit Rate

Hotate, Kengo; Kaneko, Masahiro; Katsumata, Satoru; Komachi, Mamoru

doi:10.18653/v1/p19-2020

Cited by 4 publications

(3 citation statements)

References 10 publications


“…The word error rate of error corpora is a useful statistic that can be used to balance precision/recall ratios (Rozovskaya and Roth, 2010; Junczys-Dowmunt et al, 2018b; Hotate et al, 2019). Increasing WER in the synthetic data from 15% to 25% increases recall at the expense of precision, but no overall improvement is observed.…”

Section: Results and Analysis (mentioning)

confidence: 99%

Minimally-Augmented Grammatical Error Correction

Grundkiewicz, Junczys-Dowmunt

2019

Proceedings of the 5th Workshop on Noisy User-Generated Text (W-Nut 2019)


There has been an increased interest in low-resource approaches to automatic grammatical error correction. We introduce Minimally-Augmented Grammatical Error Correction (MAGEC) that does not require any error-labelled data. Our unsupervised approach is based on a simple but effective synthetic error generation method based on confusion sets from inverted spell-checkers. In low-resource settings, we outperform the current state-of-the-art results for German and Russian GEC tasks by a large margin without using any real error-annotated training data. When combined with labelled data, our method can serve as an efficient pre-training technique.
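The idea of generating synthetic errors from confusion sets, and of tuning how many errors are injected, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny `CONFUSION_SETS` dictionary, the function names, and the substitution-only noise model are hypothetical and are not taken from the MAGEC paper. The word error rate is computed as token-level edit distance divided by the length of the clean sentence.

```python
import random


def word_edit_distance(src, tgt):
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(tgt) + 1))
    for i, s in enumerate(src, start=1):
        curr = [i]
        for j, t in enumerate(tgt, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete a source token
                            curr[j - 1] + 1,      # insert a target token
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def word_error_rate(noisy, clean):
    """Edit distance normalised by the length of the clean reference."""
    if not clean:
        return 0.0
    return word_edit_distance(noisy, clean) / len(clean)


def inject_errors(tokens, confusion_sets, error_rate, rng):
    """Swap each confusable token for a confusion-set member with
    probability `error_rate` (substitutions only, for simplicity)."""
    noisy = []
    for tok in tokens:
        options = confusion_sets.get(tok)
        if options and rng.random() < error_rate:
            noisy.append(rng.choice(options))
        else:
            noisy.append(tok)
    return noisy


# Hypothetical confusion sets; a real system would derive them from a spell-checker.
CONFUSION_SETS = {
    "their": ["there", "they're"],
    "than": ["then"],
    "is": ["are", "was"],
}

clean = "their house is bigger than ours".split()
noisy = inject_errors(clean, CONFUSION_SETS, error_rate=0.25, rng=random.Random(0))
print(" ".join(noisy), word_error_rate(noisy, clean))
```

Raising `error_rate` makes the synthetic source sentences noisier, which is the knob the citation statement above describes as trading precision for recall.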



“…Hidey and McKeown (2019) controlled semantic edits in the task of argument generation. Hotate et al (2019) corrected grammatical errors using the word error rate as the control variable.…”

Section: Introduction (mentioning)

confidence: 99%

An Empirical Study of Extrapolation in Text Generation with Scalar Control

Jain, Berg-Kirkpatrick

2021

Preprint


We conduct an empirical evaluation of extrapolation performance when conditioning on scalar control inputs like desired output length, desired edit from an input sentence, and desired sentiment across three text generation tasks. Specifically, we examine a zero-shot setting where models are asked to generalize to ranges of control values not seen during training. We focus on evaluating popular embedding methods for scalar inputs, including both learnable and sinusoidal embeddings, as well as simpler approaches. Surprisingly, our findings indicate that the simplest strategy of using scalar inputs directly, without further encoding, most reliably allows for successful extrapolation.

Related Work: Kikuchi et al. (2016) and Liu et al. (2018) modified the initial state of the decoder in seq2seq models in order to condition on control values. Sennrich et al.
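To make the contrast between encoding strategies for a scalar control value concrete, here is a small sketch. It assumes a Transformer-style sinusoidal formula and a hypothetical "value in the first slot" layout for the direct strategy; neither function is taken from the paper, and the names and dimensions are illustrative only.

```python
import math


def sinusoidal_embedding(value, dim=8, max_period=10000.0):
    """Sinusoidal features of a scalar (assumes `dim` is even)."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (max_period ** (2 * i / dim))
        emb.append(math.sin(value * freq))
        emb.append(math.cos(value * freq))
    return emb


def direct_scalar(value, dim=8):
    """Pass the raw scalar through unchanged: value in slot 0, zeros elsewhere."""
    return [float(value)] + [0.0] * (dim - 1)


# Compare a control value inside a training range with one far outside it.
for control in (10, 500):
    print(control, sinusoidal_embedding(control)[:2], direct_scalar(control)[:2])
```

The sinusoidal features stay bounded in [-1, 1] whatever the input, while the direct feature grows linearly with the control value. That difference is one plausible way to think about why the two strategies could behave differently on unseen ranges, though the sketch itself does not demonstrate extrapolation.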


“…Conversely, considering a local sequence transduction task in GEC, wherein most of the tokens in the source and target sentences overlap, excessive correction of the input sentence is not preferred because unnecessary rewriting damages the grammatically correct parts of the input sentence. Furthermore, encouraging more corrections than necessary decreases the performance of the GEC itself (Hotate et al, 2019). We hypothesize that both plain beam search and diverse global beam search methods may not be suitable for GEC tasks, and a GEC model must correct the grammatical errors of the input sentence in diverse ways while preserving the correct portions of the sentence.…”

Section: Introduction (mentioning)

confidence: 99%

Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction

Hotate, Kaneko, Komachi

2020

Proceedings of the 28th International Conference on Computational Linguistics

Self Cite


In this study, we propose a beam search method to obtain diverse outputs in a local sequence transduction task where most of the tokens in the source and target sentences overlap, such as in grammatical error correction (GEC). In GEC, it is advisable to rewrite only the local sequences that must be rewritten while leaving the correct sequences unchanged. However, existing methods of acquiring various outputs focus on revising all tokens of a sentence. Therefore, existing methods may either generate ungrammatical sentences because they force the entire sentence to be changed or produce non-diversified sentences by weakening the constraints to avoid generating ungrammatical sentences. Considering these issues, we propose a method that does not rewrite all the tokens in a text, but only rewrites those parts that need to be diversely corrected. Our beam search method adjusts the search token in the beam according to the probability that the prediction is copied from the source sentence. The experimental results show that our proposed method generates more diverse corrections than existing methods without losing accuracy in the GEC task.
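The abstract does not spell out the algorithm, so the following is only a toy illustration of the general idea of letting the copy probability decide where the beam diversifies. The function name `step_candidates`, the threshold value, and the hard cutoff rule are all assumptions made for this sketch and should not be read as the authors' method.

```python
def step_candidates(token_probs, source_token, copy_threshold=0.9, beam_width=3):
    """Choose which tokens to expand at one decoding step.

    token_probs  : dict mapping candidate token -> model probability
    source_token : the token at the aligned position in the source sentence
    """
    copy_prob = token_probs.get(source_token, 0.0)
    if copy_prob >= copy_threshold:
        # The model is confident the source token is already correct:
        # keep it and spend no beam capacity on alternatives.
        return [source_token]
    # Otherwise this position likely needs correcting: expand several options.
    ranked = sorted(token_probs, key=token_probs.get, reverse=True)
    return ranked[:beam_width]


confident = {"the": 0.97, "a": 0.02, "an": 0.01}
uncertain = {"go": 0.40, "went": 0.35, "goes": 0.20, "gone": 0.05}

print(step_candidates(confident, "the"))  # only the copied token is kept
print(step_candidates(uncertain, "go"))   # several alternatives are expanded
```

Under these assumptions, diversity is concentrated on the positions the model considers likely to be wrong, while positions it would simply copy are left untouched.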

