Proceedings of the 3rd Workshop on Neural Generation and Translation 2019
DOI: 10.18653/v1/d19-5602
Hello, It’s GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

Abstract: Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data. In this paper, we demonstrate that recent progress in language modeling pretraining and transfer learning shows promise to overcome this problem. We propose a task-oriented dialogue model tha…

Cited by 228 publications (156 citation statements); References 23 publications.
“…This idea can also be applied to task-oriented dialog systems to transfer general natural language knowledge from large-scale corpora to a specific dialog task. Some early studies have shown the possibility of using pre-training models to model task-oriented dialogs [46,99,100,130,131].…”
Section: Discussion and Future Trends
confidence: 99%
“…Wolf et al [99] followed this approach by first pre-training a transformer model on large-scale dialog data and then fine-tuning the model on a personalized dialog task with multi-task learning. Budzianowski et al [100] further extended this idea to task-oriented dialog without explicit standalone dialogue policy and generation modules. In that work, the belief state and database state are first converted to natural language text and then taken as input to the transformer decoder in addition to the context.…”
Section: Unsupervised Methods
confidence: 99%
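The input serialization described in the excerpt above (belief state and database state rendered as text and concatenated with the dialogue context before being fed to a GPT-2-style decoder) can be illustrated with a minimal sketch using Hugging Face Transformers. The delimiter tokens, slot names, and generation settings below are illustrative assumptions, not the authors' exact format.

```python
# Minimal sketch: serialize dialogue context, belief state, and database state
# into one text sequence for a GPT-2-style decoder, in the spirit of the
# approach described above. Delimiters and field names are hypothetical.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

context = ("User: I need a cheap hotel in the north. "
           "System: Any star rating preference? User: No.")
belief_state = {"hotel-pricerange": "cheap", "hotel-area": "north"}
db_state = {"hotel": "3 matches"}

# Convert the structured states to natural-language-like text.
belief_text = " , ".join(f"{slot} = {value}" for slot, value in belief_state.items())
db_text = " , ".join(f"{domain} : {count}" for domain, count in db_state.items())

# Concatenate everything into a single decoder input; the model is trained
# (or prompted) to continue this sequence with the system response.
prompt = f"{context} <|belief|> {belief_text} <|db|> {db_text} <|response|>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40,
                         pad_token_id=tokenizer.eos_token_id)
# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```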
“…Recent research [38] demonstrates that huge language models may outperform dedicated dialogue systems, at the cost of computation. Even so, research points towards the use of these pre-trained models and transfer learning [39] to achieve strong performance with very little data. This also adds robustness against breakdowns when encountering unanticipated user needs [40].…”
Section: Related Work
confidence: 99%
“…After performing experiments by training directly on math word problem corpora, we perform a different set of experiments by pre-training on a general language corpus. The success of pretrained models such as ELMo [17], GPT-2 [18], and BERT [19] on many natural language tasks gives reason to expect that pre-training will improve our system's learning. We use pre-training so that the system has some foundational knowledge of English before we train it on the domain-specific text of math word problems.…”
Section: Approach
confidence: 99%
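The pre-train-then-fine-tune recipe this excerpt alludes to can be sketched with Hugging Face Transformers: load general-purpose GPT-2 weights, then continue training on domain-specific text. The corpus, hyperparameters, and the TextDataset helper below are illustrative assumptions rather than the cited paper's setup.

```python
# Minimal sketch of the generic pre-train-then-fine-tune recipe: start from
# pretrained GPT-2 weights and continue causal-LM training on domain text.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class TextDataset(Dataset):
    """Wraps raw strings as fixed-length token blocks for causal LM training."""
    def __init__(self, texts, tokenizer, max_len=128):
        self.encodings = [
            tokenizer(t, truncation=True, max_length=max_len,
                      padding="max_length", return_tensors="pt")
            for t in texts
        ]

    def __len__(self):
        return len(self.encodings)

    def __getitem__(self, idx):
        item = {k: v.squeeze(0) for k, v in self.encodings[idx].items()}
        item["labels"] = item["input_ids"].clone()           # predict the input itself
        item["labels"][item["attention_mask"] == 0] = -100   # ignore padding in the loss
        return item

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token           # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")     # weights from general-language pretraining

# Hypothetical domain-specific corpus (e.g. math word problems).
texts = ["John has 3 apples and buys 2 more. How many apples does he have? 5"]
loader = DataLoader(TextDataset(texts, tokenizer), batch_size=1, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(1):
    for batch in loader:
        outputs = model(**batch)   # loss is computed internally from the labels
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```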