Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.9

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

Abstract: Pre-training models have proven effective for a wide range of natural language processing tasks. Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge grounded dialogues, and conversational question answering. In this framework, we adopt flexible attention mechanisms to fully leverage the bi-directional context and the uni-directional characteristic of language generation. We also introduce discrete latent …
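A minimal sketch (not from the paper or its released code) of the flexible attention pattern the abstract describes: the discrete latent token and the context tokens attend bi-directionally to one another, while response tokens attend to that whole block but only to earlier (or same) response positions, as in uni-directional language generation. The function name, token layout, and PyTorch realization below are illustrative assumptions.

import torch

def plato_style_attention_mask(num_context: int, num_response: int) -> torch.Tensor:
    # Assumed layout (hypothetical): [latent] [context ...] [response ...]
    # Returns a (seq_len, seq_len) boolean mask where True = attention allowed.
    seq_len = 1 + num_context + num_response
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)

    ctx_end = 1 + num_context  # latent token + context form the bi-directional block

    # Bi-directional block: latent/context positions see every latent/context position.
    mask[:ctx_end, :ctx_end] = True

    # Response positions see the whole bi-directional block ...
    mask[ctx_end:, :ctx_end] = True
    # ... and only earlier (or same) response positions: causal lower triangle.
    mask[ctx_end:, ctx_end:] = torch.tril(torch.ones(num_response, num_response)).bool()

    return mask

if __name__ == "__main__":
    print(plato_style_attention_mask(num_context=3, num_response=2).int())

For a 3-token context and a 2-token response, the printed mask shows a 4x4 all-ones block for the latent-plus-context positions and a causal lower triangle over the response rows.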

Cited by 181 publications (179 citation statements)
References 22 publications
“…ConveRT (Henderson et al., 2019a) pre-trained a dual transformer encoder for the response selection task on large-scale Reddit (input, response) pairs. PLATO (Bao et al., 2019) uses both Twitter and Reddit data to pre-train a dialogue generation model with discrete latent variables. All of them are designed to cope with the response generation task for open-domain chatbots.…”
Section: Related Work
mentioning
confidence: 99%
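The statement above mentions dual-encoder pre-training for response selection. As a generic illustration (not ConveRT's actual architecture or code), the sketch below ranks candidate responses by the dot product of independently encoded context and response vectors; the class name, bag-of-embeddings encoders, and dimensions are stand-in assumptions.

import torch
import torch.nn as nn

class DualEncoderScorer(nn.Module):
    """Generic dual-encoder response selection sketch (illustrative only):
    context and candidates are encoded separately, then ranked by similarity."""

    def __init__(self, vocab_size: int = 30000, dim: int = 256):
        super().__init__()
        # Stand-ins for the two transformer encoders; mean-pooled embeddings
        # keep the sketch short and runnable.
        self.context_emb = nn.EmbeddingBag(vocab_size, dim, mode="mean")
        self.response_emb = nn.EmbeddingBag(vocab_size, dim, mode="mean")

    def forward(self, context_ids: torch.Tensor, response_ids: torch.Tensor) -> torch.Tensor:
        ctx = self.context_emb(context_ids)      # (1, dim)
        resp = self.response_emb(response_ids)   # (num_candidates, dim)
        return resp @ ctx.squeeze(0)             # (num_candidates,) similarity scores

if __name__ == "__main__":
    scorer = DualEncoderScorer()
    context = torch.randint(0, 30000, (1, 12))      # one tokenized context
    candidates = torch.randint(0, 30000, (4, 8))    # four tokenized candidate responses
    print(scorer(context, candidates))  # higher score = preferred candidate (untrained here)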
“…However, previous work (Rashkin et al., 2018; Wolf et al., 2019) shows that fine-tuning directly on conversational corpora leaves some deficiencies in performance. One possible reason could be the intrinsic difference in linguistic patterns between human conversations and written text, resulting in a large gap between data distributions (Bao et al., 2019). Therefore, pre-training dialogue language models on chit-chat corpora from social media, such as Twitter or Reddit, has recently been investigated, especially for dialogue response generation (Zhang et al., 2019) and retrieval (Henderson et al., 2019b).…”
Section: Introduction
mentioning
confidence: 99%
“…See Appendix E.5 for more extensive evaluation results.¹⁵ Same size as Shen et al. (2017) and Bao et al. (2020).¹⁶ See Appendix F for all experimental results on Japanese.…”
mentioning
confidence: 99%
“…2019; Zhang et al., 2019; Bao et al., 2020; Henderson et al., 2019). However, domain adaptation capabilities of these models remain to be further explored for goal-oriented dialogues.…”
Section: Introduction
mentioning
confidence: 99%