“…However, previous work (Rashkin et al., 2018; Wolf et al., 2019) shows that directly fine-tuning on conversational corpora leads to deficient performance. One possible reason is the intrinsic difference in linguistic patterns between human conversations and written text, which results in a large gap between the data distributions (Bao et al., 2019). Therefore, pre-training dialogue language models on chit-chat corpora from social media, such as Twitter or Reddit, has recently been investigated, especially for dialogue response generation (Zhang et al., 2019) and retrieval (Henderson et al., 2019b).…”