A hierarchical approach for efficient multi-intent dialogue policy learning

Saha, Tulika; Gupta, Dhawal; Saha, Sriparna; Bhattacharyya, Pushpak

doi:10.1007/s11042-020-09070-7

Cited by 3 publications

(1 citation statement)

References 25 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…However, the existing dialogue policy model based on the deep reinforcement learning usually gives the agent explicit rewards which only depending on the terminal state of the dialogue task, and introduces a small penalty in each round to encourage the agent to complete the task with as few interactions as possible [3][4][5] , such methods tend to pay more attention to the efficiency of task completion but ignore the consideration of users' emotional state. In fact, the emotions expressed by users during a conversation usually reflect their degree of satisfaction with the agent's response action 6 , so it can be used as an important reference for the policy model to learn and evaluate its actions.…”

Section: Introductionmentioning

confidence: 99%

ESDP: An Emotion-Sensitive Dialogue Policy for Task-Oriented Dialogue System

Zhu,

Wang,

Wang

et al. 2024

Preprint

View full text Add to dashboard Cite

Reinforcement learning (RL) is an effective method in training dialogue policies to steer the conversation towards successful task completion. However, most RL-based methods only rely on semantic inputs that lack empathy as they ignore the user emotional information. Moreover, these methods suffer from delayed rewards caused by the user simulator returning valuable results only at dialogue end. Recently, some methods have been proposed to learn the reward function together with user emotions, but they missing to consider user emotion in each dialogue turn. In this paper, we proposed an emotion-sensitive dialogue policy model (ESDP), it incorporates user emotions information into dialogue policy and selects the optimal action by the combination of top-k actions with the user emotions. The user emotion information in each turn is used as an immediate reward for the current dialogue state to solve sparse rewards and termination dependency. Extensive experiments validate that our method outperforms the baseline approaches when combined with different Q-Learning algorithms, and also surpasses other popular existing dialog policies' performance.

show abstract

Section: Introductionmentioning

confidence: 99%

ESDP: An Emotion-Sensitive Dialogue Policy for Task-Oriented Dialogue System

Zhu,

Wang,

Wang

et al. 2024

Preprint

View full text Add to dashboard Cite

show abstract

Aspect-level sentiment-controlled knowledge grounded multimodal dialog generation using generative models for reviews

Varshney,

Singh,

Ekbal

2023

Multimed Tools Appl

View full text Add to dashboard Cite

A Unified Dialogue Management Strategy for Multi-intent Dialogue Conversations in Multiple Languages

SahaTulika

GuptaDhawal

SahaSriparna

et al. 2021

ACM Trans. Asian Low-Resour. Lang. Inf. Process.

View full text Add to dashboard Cite

Building Virtual Agents capable of carrying out complex queries of the user involving multiple intents of a domain is quite a challenge, because it demands that the agent manages several subtasks simultaneously. This article presents a universal Deep Reinforcement Learning framework that can synthesize dialogue managers capable of working in a task-oriented dialogue system encompassing various intents pertaining to a domain. The conversation between agent and user is broken down into hierarchies, to segregate subtasks pertinent to different intents. The concept of Hierarchical Reinforcement Learning, particularly options , is used to learn policies in different hierarchies that operates in distinct time steps to fulfill the user query successfully. The dialogue manager comprises top-level intent meta-policy to select among subtasks or options and a low-level controller policy to pick primitive actions to communicate with the user to complete the subtask provided to it by the top-level policy in varying intents of a domain. The proposed dialogue management module has been trained in a way such that it can be reused for any language for which it has been developed with little to no supervision. The developed system has been demonstrated for “Air Travel” and “Restaurant” domain in English and Hindi languages. Empirical results determine the robustness and efficacy of the learned dialogue policy as it outperforms several baselines and a state-of-the-art system.

show abstract

A hierarchical approach for efficient multi-intent dialogue policy learning

Cited by 3 publications

References 25 publications

ESDP: An Emotion-Sensitive Dialogue Policy for Task-Oriented Dialogue System

ESDP: An Emotion-Sensitive Dialogue Policy for Task-Oriented Dialogue System

Aspect-level sentiment-controlled knowledge grounded multimodal dialog generation using generative models for reviews

A Unified Dialogue Management Strategy for Multi-intent Dialogue Conversations in Multiple Languages

Contact Info

Product

Resources

About