Findings of the Association for Computational Linguistics: NAACL 2022
DOI: 10.18653/v1/2022.findings-naacl.54
A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning

Abstract: Training a deep reinforcement learning-based dialogue policy with brute-force random sampling is costly. A new training paradigm was proposed to improve learning performance and efficiency by incorporating curriculum learning. However, attempts in the field of dialogue policy remain very limited, owing to the lack of reliable difficulty scores for dialogue tasks and the high sensitivity to the mode of progression through them. In this paper, we present a novel versatile adaptive curriculum learning…
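The abstract contrasts brute-force random sampling of dialogue tasks with a curriculum that orders tasks by difficulty and adapts the progression to the agent's competence. The paper's own scheduler is not shown here; as a minimal sketch of the general idea, the snippet below samples tasks whose estimated difficulty is close to a scalar "competence" level and nudges that level up or down based on the agent's recent success rate. All names (`curriculum_sample`, `update_competence`, the Gaussian weighting, the 0.8/0.3 thresholds) are illustrative assumptions, not the authors' method.

```python
import math
import random

def curriculum_sample(tasks, difficulties, competence, temperature=0.2):
    """Pick a task whose difficulty is near the current competence level.

    A common curriculum-learning heuristic: weight each task by a
    Gaussian kernel centred on `competence`, so tasks that are neither
    trivial nor far too hard are sampled most often.
    """
    weights = [
        math.exp(-((d - competence) ** 2) / (2 * temperature ** 2))
        for d in difficulties
    ]
    return random.choices(tasks, weights=weights, k=1)[0]

def update_competence(competence, success_rate, step=0.05):
    """Advance the curriculum when the agent succeeds, back off when it fails.

    Thresholds are illustrative; an adaptive scheme would tune the
    progression mode rather than fix it.
    """
    if success_rate > 0.8:
        return min(1.0, competence + step)
    if success_rate < 0.3:
        return max(0.0, competence - step)
    return competence
```

In a training loop, `curriculum_sample` would replace uniform random task selection, and `update_competence` would run after each evaluation window, so the difficulty of sampled dialogue tasks tracks the policy's ability.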

Cited by 5 publications (9 citation statements); references 14 publications.
“…In the past few years, there has been growing interest in dialogue generation tasks with complex objectives, such as negotiation (Lewis et al. 2017; He et al. 2018; Zhou et al. 2019b), persuasion (Wang et al. 2019; Li et al. 2020; Samad et al. 2022), and emotional support (Liu et al. 2021a; Peng et al. 2022; Xu, Meng, and Wang 2022; Zhao et al. 2023b).…”
Section: Related Work
Confidence: 99%
“…In real scenarios, the guidelines for these challenging dialogue tasks usually recommend breaking down the complex goals into multiple aspects and jointly promoting them to work towards the broad objective (Petty et al. 1986; Fershtman 1990; Hill 2009). More recently, several works have applied LLMs to complex goal-oriented dialogues, by directly prompting the LLM to generate utterances (Zhao et al. 2023a; Deng et al. 2023a) or further improving the performance via iterative revision (Fu et al. 2023). Current LLMs exhibit remarkable improvement compared to the previous methods on these tasks, but it is also found that they tend to lack a larger picture of the overall dialogue progression and fail to achieve the dialogue objective strategically through multi-turn interactions (Deng et al. 2023a).…”
Section: Related Work
Confidence: 99%
“…If "it's burning hot, turn on the air conditioner for me" is played on TV, the voice may be parsed into the intention of controlling the air conditioner. However, it's not spoken by a real human, and thus shall be rejected [7]. Therefore, it is rather difficult for non-real-human sounds to be recognized at the level of text semantics.…”
Section: Introductionmentioning
confidence: 99%