Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI 2021
DOI: 10.18653/v1/2021.nlp4convai-1.4

Multilingual Paraphrase Generation For Bootstrapping New Features in Task-Oriented Dialog Systems

Abstract: The lack of labeled training data for new features is a common problem in rapidly changing real-world dialog systems. As a solution, we propose a multilingual paraphrase generation model that can be used to generate novel utterances for a target feature and target language. The generated utterances can be used to augment existing training data to improve intent classification and slot labeling models. We evaluate the quality of generated utterances using intrinsic evaluation metrics and by conducting downstrea…

Cited by 4 publications
(3 citation statements)
References 20 publications
“…Paraphrasing the data is one of the ways frequently used for augmentation and can produce more diverse synthetic text with different word choices and sentence structures while preserving the meaning of the original text. Paraphrasing methods have been shown to be effective in many natural language processing tasks (Gupta et al., 2018; Edunov et al., 2018; Iyyer et al., 2018; Wei and Zou, 2019; Cai et al., 2020; Okur et al., 2022; Panda et al., 2021; Jolly et al., 2020). However, such methods often fail to generate more challenging and semantically diverse sentences that are important for the robustness of the downstream models.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent studies explore data augmentation via Natural Language Generation (NLG) for few-shot intents (Xia et al., 2020) and paraphrase generation for intents and slots in task-oriented dialogue systems (Jolly et al., 2020). Another relevant recent work (Panda et al., 2021) is an extension of a transformer-based model by Jolly et al. (2020) that works for multilingual paraphrase generation for intents and slots, even in the zero-shot settings. Several other recent works have also been exploring data augmentation with fine-tuning large language models and few-shot learning for intent classification and slot-filling tasks (Kumar et al., 2019; Kumar et al., 2020; Lee et al., 2021).…”
Section: Data Augmentation (mentioning)
confidence: 99%
“…Paraphrase generation (PG) (Madnani and Dorr, 2010) is to rephrase a sentence into an alternative expression with the same semantics, which has been applied to many downstream tasks, such as question answering (Gan and Ng, 2019) and dialogue systems (Jolly et al., 2020; Gao et al., 2020; Panda et al., 2021; Liang et al., 2019, 2021, 2022). On this basis, to improve the syntactic diversity of paraphrases, syntactically-controlled paraphrase…”
Section: Introduction (mentioning)
confidence: 99%