Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.467

Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space

Abstract: In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks. We treat the question data augmentation task as a constrained question rewriting problem to generate context-relevant, high-quality, and diverse question data samples. CRQDA utilizes a Transformer autoencoder to map the original discrete question into a continuous embedding space. […]
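The mechanism the title and abstract describe, and which the citing statements below reiterate, is: encode a question into a continuous space, revise its embedding under guidance from a pre-trained MRC model, and decode the result back into a new question. The following is a minimal sketch of that loop under stated assumptions: the toy embedding/linear autoencoder, the `mrc_scorer` stand-in for the pre-trained MRC model, the objective, and the step count and learning rate are all illustrative placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a real system would use the paper's Transformer
# autoencoder and a pre-trained MRC model; these are placeholders.
VOCAB, DIM = 1000, 64
embed = nn.Embedding(VOCAB, DIM)     # "encoder": tokens -> continuous space
decoder = nn.Linear(DIM, VOCAB)      # "decoder": continuous space -> token logits
mrc_scorer = nn.Sequential(          # proxy for an MRC model's answerability score
    nn.Linear(DIM, 32), nn.ReLU(), nn.Linear(32, 1)
)

def rewrite(question_ids: torch.Tensor, steps: int = 10, lr: float = 0.1) -> torch.Tensor:
    """Revise a question's continuous representation by gradient steps on an
    MRC-derived objective, then map it back to discrete tokens (greedy)."""
    z = embed(question_ids).detach().clone().requires_grad_(True)  # (seq_len, DIM)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Illustrative objective: push the question toward higher "answerability".
        loss = -mrc_scorer(z.mean(dim=0)).squeeze()
        loss.backward()
        opt.step()
    return decoder(z).argmax(dim=-1)  # revised (augmented) token ids

original = torch.randint(0, VOCAB, (8,))  # a toy 8-token "question"
augmented = rewrite(original)
print(original.tolist(), "->", augmented.tolist())
```

Flipping the sign of the objective would aim the rewrite at unanswerable rather than answerable questions, mirroring the answerable/unanswerable distinction one of the citing statements below mentions.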

Cited by 26 publications (14 citation statements). References 41 publications.
“…Paraphrasing:
- Thesauruses: Zhang et al. [5], Wei et al. [6], Coulombe et al. [7]
- Semantic Embeddings: Wang et al. [8]
- MLMs: Jiao et al. [9]
- Rules: Coulombe et al. [7], Regina et al. [10], Louvan et al. [11]
- Machine Translation (Back-translation): Xie et al. [12], Zhang et al. [13]
- Machine Translation (Unidirectional Translation): Nishikawa et al. [14], Bornea et al. [15]
- Model Generation: Hou et al. [16], Li et al. [17], Liu et al. [18]

Noising:
- Swapping: Wei et al. [6], Luque et al. [19], Yan et al. [20]
- Deletion: Wei et al. [6], Peng et al. [21], Yu et al. [22]
- Insertion: Wei et al. [6], Peng et al. [21], Yan et al. [20]
- Substitution: Coulombe et al. [7], Xie et al. [23], Louvan et al. [11]
- Mixup: Guo et al. [24], Cheng et al. [25]

Sampling:
- Rules: Min et al. [26], Liu et al. [27]
- Seq2Seq Models: Kang et al. [28], Zhang et al. [13], Raille et al. [29]
- Language Models: …”
Section: DA for NLP
confidence: 99%
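Several of the noising operations in the taxonomy above (swapping, deletion, insertion, attributed to Wei et al. [6] among others) are simple enough to sketch directly. The snippet below is a hedged illustration of that family of token-level edits, not any cited paper's implementation; the probabilities and the duplicate-token insertion strategy are assumptions chosen for brevity.

```python
import random

def random_swap(tokens: list[str], n: int = 1) -> list[str]:
    """Swap n randomly chosen pairs of token positions."""
    out = tokens[:]
    for _ in range(n):
        if len(out) < 2:
            break
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens: list[str], p: float = 0.1) -> list[str]:
    """Drop each token independently with probability p."""
    kept = [t for t in tokens if random.random() > p]
    return kept or [random.choice(tokens)]  # never return an empty sentence

def random_insertion(tokens: list[str], n: int = 1) -> list[str]:
    """Insert n copies of existing tokens at random positions
    (a placeholder for synonym insertion)."""
    out = tokens[:]
    for _ in range(n):
        out.insert(random.randrange(len(out) + 1), random.choice(tokens))
    return out

sentence = "how do transformers encode questions".split()
print(random_swap(sentence))
print(random_deletion(sentence))
print(random_insertion(sentence))
```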
“…Kober et al. [68] use a GAN to generate samples that are very similar to the original data. Liu et al. [18] employ a pre-trained model to provide the question embeddings and guidance for their proposed Transformer-based model, which can then generate both context-relevant answerable questions and unanswerable questions.…”
Section: Model Generation
confidence: 99%
“…Label-conditioned text generation. Recent work has explored generating new examples by training a conditional text generation model (Bergmanis et al., 2017; Liu et al., 2020a; Ding et al., 2020; Liu et al., 2020b, inter alia), or applying post-processing on the examples generated by pretrained models (Wan et al., 2020; Yoo et al., 2020). In the data augmentation stage, given labels in the original dataset as conditions, such models generate associated text accordingly.…”
Section: Related Work
confidence: 99%
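The label-conditioned setup this statement describes is commonly realized by prepending the label to the generator's input during fine-tuning, so that at augmentation time the label alone prompts new text of that class. A minimal sketch of constructing such training pairs follows; the field names and prompt format are illustrative assumptions, not any cited paper's scheme.

```python
# Build (prompt, target) pairs for fine-tuning a seq2seq generator:
# conditioning on "<label>" lets the model produce class-specific text later.
dataset = [
    {"label": "answerable", "question": "Who wrote the paper?"},
    {"label": "unanswerable", "question": "What color is the proof?"},
]

def to_conditional_pairs(examples):
    """Turn (label, text) rows into label-conditioned generation pairs."""
    return [(f"<{ex['label']}>", ex["question"]) for ex in examples]

for prompt, target in to_conditional_pairs(dataset):
    print(prompt, "->", target)
```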
“…We pose the data augmentation problem [9, 11] as a positive example expansion problem. First, we sample which answers should be used for augmentation with probability p.…”
Section: Data Augmentation
confidence: 99%
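The sampling step in this last statement (select answers for augmentation with probability p) amounts to an independent Bernoulli draw per candidate; a minimal sketch, where the answer list and the value of p are assumed placeholders:

```python
import random

def sample_for_augmentation(answers: list[str], p: float = 0.3) -> list[str]:
    """Keep each answer for positive-example expansion with probability p
    (independent Bernoulli draws; p = 0.3 is an assumed placeholder)."""
    return [a for a in answers if random.random() < p]

print(sample_for_augmentation(["ans1", "ans2", "ans3", "ans4"]))
```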