PyDial: A Multi-domain Statistical Dialogue System Toolkit

Ultes, Stefan; Rojas-Barahona, Lina Maria; Su, Pei-Hao; Vandyke, David; Kim, Dongho; Casanueva, Iñigo; Budzianowski, Paweł; Mrkšić, Nikola; Wen, Tao; Gašić, Milica; Young, Steve

doi:10.18653/v1/p17-4013

Cited by 130 publications

(88 citation statements)

References 23 publications

“…In this section, we evaluate the performance of ACER incorporated in an SDS. We find that ACER delivers the best performance and fastest convergence among the compared NN-based algorithms (eNAC and A2C) implemented in the PyDial dialogue toolkit [37]. We also deploy the algorithm in a more challenging setting without the execution mask aiding action selection.…”

Section: Discussion (mentioning)

confidence: 97%

“…We use the agenda-based user simulator, with the focus belief tracker for all experiments. For details, see [37]. The agenda-based user simulator [39] consists of a goal which is a randomly generated slot-value pairs that the entity that the user seeks must be satisfied and an agenda which is a dynamic stack of dialogue acts that the user elicits in order to satisfy the goal.…”

Section: B User Simulator (mentioning)

confidence: 99%

Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces

Weisz, Budzianowski, et al. 2018

IEEE/ACM Trans. Audio Speech Lang. Process.

Self Cite

In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimisation task, which attempts to find a policy describing how to respond to humans, in the form of a function taking the current state of the dialogue and returning the response of the system. In this paper, we investigate deep reinforcement learning approaches to solve this problem. Particular attention is given to actor-critic methods, off-policy reinforcement learning with experience replay, and various methods aimed at reducing the bias and variance of estimators. When combined, these methods result in the previously proposed ACER algorithm that gave competitive results in gaming environments. These environments however are fully observable and have a relatively small action set so in this paper we examine the application of ACER to dialogue policy optimisation. We show that this method beats the current state-of-the-art in deep learning approaches for spoken dialogue systems. This not only leads to a more sample efficient algorithm that can train faster, but also allows us to apply the algorithm in more difficult environments than before. We thus experiment with learning in a very large action space, which has two orders of magnitude more actions than previously considered. We find that ACER trains significantly faster than the current state-of-the-art.

Index Terms: deep reinforcement learning, spoken dialogue systems, Gaussian processes.

“…As baseline system a more traditional, modular statistical dialogue system (BASE-SDS) was chosen which was based on the PyDial toolkit [14].…”

Section: Baseline Dialogue System (BASE-SDS) (mentioning)

confidence: 99%

Comparison of an End-to-end Trainable Dialogue System with a Modular Statistical Dialogue System

Braunschweiler, Papangelis 2018

Interspeech 2018

This paper presents a comparison of two dialogue systems: one is end-to-end trainable and the other uses a more traditional, modular architecture. End-to-end trainable dialogue systems recently attracted a lot of attention because they offer several advantages over traditional systems. One of them is the avoidance to train each system module independently, by creating a single network architecture which maps an input to the corresponding output without the need for intermediate representations. While the end-to-end system investigated here had been tested in a text-in/out scenario, it remained an open question how the system would perform in a speech-in/out scenario, with noisy input from a speech recognizer and output speech generated by a speech synthesizer. To evaluate this, both dialogue systems were trained on the same corpus, including human-human dialogues in the Cambridge restaurant domain, and then compared in both scenarios by human evaluation. The results show that in both interfaces the end-to-end system receives significantly higher ratings on all metrics than the traditional modular system, an indication that it enables users to reach their goals faster and experience both a more natural system response and a better comprehension by the dialogue system.

“…It is implemented as a main function to drive the DS. A rule-based and probabilistic belief tracking or dialogue state tracking model could be used to maintain the dialogue flow [25]. We used a rule-based model where the dialogue flow module keeps track of the input dialogue acts and DoP and send them to the response manager to fetch responses.…”

Section: Dialogue Flow (mentioning)

confidence: 99%

Towards Dialogue-Based Navigation with Multivariate Adaptation Driven by Intention and Politeness for Social Robots

Bothe, García, Maya, et al. 2018

Social Robotics

Service robots need to show appropriate social behaviour in order to be deployed in social environments such as healthcare, education, retail, etc. Some of the main capabilities that robots should have are navigation and conversational skills. If the person is impatient, the person might want a robot to navigate faster and vice versa. Linguistic features that indicate politeness can provide social cues about a person's patient and impatient behaviour. The novelty presented in this paper is to dynamically incorporate politeness in robotic dialogue systems for navigation. Understanding the politeness in users' speech can be used to modulate the robot behaviour and responses. Therefore, we developed a dialogue system to navigate in an indoor environment, which produces different robot behaviours and responses based on users' intention and degree of politeness. We deploy and test our system with the Pepper robot that adapts to the changes in user's politeness.
