Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.418

Seq2Edits: Sequence Transduction Using Span-level Edit Operations

Abstract: We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged. We evaluate our method on five NLP tasks (text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, …
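To make the abstract's representation concrete, here is a minimal Python sketch (not the authors' implementation; the edit format, function name apply_edits, and the example sentence are illustrative assumptions) of how a transduction can be expressed as span-level keep/replace operations and applied to a source sentence:

from typing import List, Optional, Tuple

# Hypothetical edit format: each edit covers a source span [start, end);
# replacement is None for KEEP, otherwise the target tokens that substitute
# the span (REPLACE). This is a simplification of the paper's tagged edits.
Edit = Tuple[int, int, Optional[List[str]]]

def apply_edits(source: List[str], edits: List[Edit]) -> List[str]:
    """Rebuild the target token sequence from source tokens plus span edits."""
    target: List[str] = []
    cursor = 0
    for start, end, replacement in edits:
        assert start == cursor, "edits must cover the source left to right"
        target.extend(source[start:end] if replacement is None else replacement)
        cursor = end
    assert cursor == len(source), "edits must cover the whole source"
    return target

# Hypothetical grammatical-error-correction style example:
src = "He go to school yesterday".split()
edits = [(0, 1, None),        # keep "He"
         (1, 2, ["went"]),    # replace "go" -> "went"
         (2, 5, None)]        # keep "to school yesterday"
print(" ".join(apply_edits(src, edits)))   # -> He went to school yesterday

Because most spans in tasks with high input/output overlap are kept unchanged, such an edit sequence is typically much shorter than the full target sequence, which is what the approach exploits.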

Cited by 47 publications (46 citation statements). References 41 publications.
“…The only model that shows advantages over our 9+3 model is GECToR, which is developed based on the powerful pretrained models (e.g., RoBERTa and XLNet (Yang et al., 2019)) with its multi-stage training strategy. Following GECToR's recipe, we leverage the pretrained model BART to initialize a 12+2 model which proves to work well in NMT (Li et al., 2021) despite more parameters, and apply the multi-stage fine-tuning strategy used in Stahlberg and Kumar (2020). The final single model with aggressive decoding achieves the state-of-the-art result (66.4 F0.5) in the CoNLL-14 test set with a 9.6× speedup over the Transformer-big baseline.…”
Section: Results
confidence: 99%
“…The Transformer (Vaswani et al., 2017) has become the most popular model for Grammatical Error Correction (GEC). In practice, however, the sequence-to-sequence (seq2seq) approach has been blamed recently (Chen et al., 2020; Stahlberg and Kumar, 2020; Omelianchuk et al., 2020) for its poor inference efficiency in modern writing assistance applications (e.g., Microsoft Office Word, Google Docs and Grammarly) where a GEC model usually performs online inference, instead of batch inference, for proactively and incrementally checking a user's latest completed sentence to offer instantaneous feedback.…”
Section: Introduction
confidence: 99%
“…summarization, grammatical error correction, sentence splitting, etc.) as a text editing task (Malmi et al., 2019; Panthaplackel et al., 2020; Stahlberg and Kumar, 2020) where target texts are reconstructed from inputs using several edit operations.…”
Section: Related Work and Discussion
confidence: 99%
“…Our work also relates to recent work on sentence-level transduction tasks, like grammatical error correction (GEC), which allows for directly predicting certain span-level edits (Stahlberg and Kumar, 2020). These edits are different from our insertion operations, requiring token-level operations except when copying from the source sentence, and are obtained, following a long line of work in GEC (Swanson and Yamangil, 2012; Xue and Hwa, 2014; Felice et al., 2016; Bryant et al., 2017), by heuristically merging token-level alignments obtained with a Damerau-Levenshtein-style algorithm (Brill and Moore, 2000).…”
Section: Related Work
confidence: 98%
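For intuition on how span-level edits relate to token-level alignments, as discussed in the last citation above, here is a rough illustrative sketch that derives keep/replace spans from an alignment. Python's difflib.SequenceMatcher stands in for the Damerau-Levenshtein-style alignment and heuristic merging used in the cited GEC work, so this is an approximation of the idea, not the cited method; the function name extract_span_edits and the example are assumptions.

from difflib import SequenceMatcher

def extract_span_edits(source, target):
    """Return (start, end, replacement-or-None) spans covering the source."""
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            edits.append((i1, i2, None))           # keep this source span
        else:
            edits.append((i1, i2, target[j1:j2]))  # replace / delete / insert
    return edits

src = "He go to school yesterday".split()
tgt = "He went to school yesterday".split()
print(extract_span_edits(src, tgt))
# [(0, 1, None), (1, 2, ['went']), (2, 5, None)]

The output format matches the edits consumed by the apply_edits sketch shown after the abstract, so the two snippets round-trip a source/target pair through the span-edit representation.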