Proceedings of the 3rd Workshop on Neural Generation and Translation 2019
DOI: 10.18653/v1/d19-5609

Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents

Abstract: Data availability is a bottleneck during early stages of development of new capabilities for intelligent artificial agents. We investigate the use of text generation techniques to augment the training data of a popular commercial artificial agent across categories of functionality, with the goal of faster development of new functionality. We explore a variety of encoder-decoder generative models for synthetic training data generation and propose using conditional variational auto-encoders. Our approach requires…
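
The abstract's core idea is to condition generation on the target functionality category. Below is a minimal sketch of such a conditional variational auto-encoder in PyTorch; all module names, layer sizes, and the single-layer GRU architecture are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a conditional VAE (CVAE) for utterance generation, in the spirit
# of the paper's approach. Sizes and architecture are illustrative only.
import torch
import torch.nn as nn

class TextCVAE(nn.Module):
    def __init__(self, vocab_size, n_categories, emb=128, hid=256, z=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.cat_embed = nn.Embedding(n_categories, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.to_mu = nn.Linear(hid + emb, z)
        self.to_logvar = nn.Linear(hid + emb, z)
        self.z_to_hid = nn.Linear(z + emb, hid)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, tokens, category):
        # Encode the utterance together with its functionality category.
        x = self.embed(tokens)                       # (B, T, emb)
        _, h = self.encoder(x)                       # h: (1, B, hid)
        c = self.cat_embed(category)                 # (B, emb)
        hc = torch.cat([h.squeeze(0), c], dim=-1)
        mu, logvar = self.to_mu(hc), self.to_logvar(hc)
        # Reparameterization trick: sample z ~ N(mu, sigma^2).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Decode conditioned on (z, category), teacher-forcing the inputs.
        h0 = torch.tanh(self.z_to_hid(torch.cat([z, c], dim=-1))).unsqueeze(0)
        dec, _ = self.decoder(x, h0)
        return self.out(dec), mu, logvar
```

At generation time one would instead sample z from the standard-normal prior and decode step by step conditioned on the desired category, yielding synthetic utterances for that category.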

Cited by 36 publications (21 citation statements)

References 18 publications
“…In our work, we make no assumption about the availability of labeled data in other languages, user feedback or unlabeled data, as none of them is necessarily available when bootstrapping a new feature. Therefore, the closest existing work to ours is and (Malandrakis et al., 2019), who both deal with the setup where only seed examples are available. Malandrakis et al. (2019) propose using conditional variational auto-encoders to generate paraphrases for the seed data and show that the paraphrases increase intent classification performance in their experiment. In contrast to our work, they do not evaluate on slot labeling and do not suggest a technique to add slot labels to the paraphrases.…”
Section: Related Work
confidence: 99%
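
A plausible way to use such paraphrases for intent classification is sketched below. The `generate_paraphrases` helper is hypothetical, standing in for decoding from a trained CVAE; it is not an API from either paper. Generated utterances inherit the intent label of their seed example.

```python
# Sketch: augmenting a seed set with generated paraphrases before training
# an intent classifier. `generate_paraphrases` is a hypothetical helper.
def augment_seed_data(seed, generate_paraphrases, n_per_example=5):
    """seed: list of (utterance, intent) pairs; returns the augmented list."""
    augmented = list(seed)
    seen = {u for u, _ in seed}
    for utterance, intent in seed:
        for paraphrase in generate_paraphrases(utterance, intent, n_per_example):
            # Keep only novel surface forms; label them with the seed intent.
            if paraphrase not in seen:
                seen.add(paraphrase)
                augmented.append((paraphrase, intent))
    return augmented
```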
“…However, these approaches suffer from high training variance and mode collapse. Controlled text generation techniques were also explored to perform data augmentation (Malandrakis et al., 2019). For instance, the variational auto-encoder (VAE) was applied to text generation (Bowman et al., 2016).…”
Section: Related Work
confidence: 99%
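
Bowman et al. (2016) is best known for training text VAEs with an annealed KL weight to avoid posterior collapse. A minimal sketch of that objective follows, assuming per-token cross-entropy reconstruction and a linear annealing schedule; both choices and all names are illustrative.

```python
# Sketch of the text-VAE training objective (negative ELBO) with KL-weight
# annealing, as popularized by Bowman et al. (2016).
import torch
import torch.nn.functional as F

def vae_loss(logits, targets, mu, logvar, step, anneal_steps=10000, pad_id=0):
    # Reconstruction term: token-level cross-entropy, ignoring padding.
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1),
        ignore_index=pad_id, reduction="sum")
    # KL divergence between q(z|x) = N(mu, sigma^2) and the prior N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Linearly anneal the KL weight from 0 to 1 to mitigate posterior collapse.
    beta = min(1.0, step / anneal_steps)
    return recon + beta * kl
```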
“…For example, Wei and Zou (2019) investigated language transformations like insertion, deletion and swap. Several works (Malandrakis et al., 2019; Yoo et al., 2019; Xia et al., 2020b; Xia et al., 2020a) utilized variational auto-encoders (VAEs) (Kingma and Welling, 2013) to generate more raw inputs. Nevertheless, these methods often rely on some extra knowledge to guarantee the quality of new inputs, and they have to work in a pipeline.…”
Section: Introduction
confidence: 99%
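
The transformations Wei and Zou (2019) describe are simple token-level edits. A minimal sketch follows, with the simplifying assumption that insertion reuses words from the sentence itself rather than WordNet synonyms as in their paper; inputs are assumed to be non-empty token lists.

```python
# Sketch of token-level augmentation operations in the style of Wei and
# Zou (2019): random swap, random deletion, random insertion.
import random

def random_swap(tokens, n=1):
    # Swap two randomly chosen positions, n times (needs >= 2 tokens).
    tokens = tokens.copy()
    for _ in range(n):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1):
    # Drop each token with probability p; never return an empty sentence.
    kept = [t for t in tokens if random.random() > p]
    return kept or [random.choice(tokens)]

def random_insertion(tokens, n=1):
    # Insert a copy of a random in-sentence word at a random position.
    tokens = tokens.copy()
    for _ in range(n):
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(tokens))
    return tokens
```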