“…Raw features in human-machine conversations such as words with confidence scores can be given as input to a reinforcement learning agent to induce dialogue policies from interaction with the environment, where situations (words) are mapped to actions (dialogue acts) by maximizing a long-term reward signal [2]. An RL agent is typically characterized by: (i) a finite set of states S = {s₁, ..., sₙ}; (ii) a finite set of actions A = {a₁, ..., aₘ}; (iii) a state transition function T(s, a, s′) that specifies the next state s′ given the current state s and action a; (iv) a reward function R(s, a, s′) that specifies the reward given to the agent for choosing action a when the environment makes a transition from state s to state s′; and (v) a policy π : S → A that defines a mapping from states to actions.…”
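To make the (S, A, T, R, π) tuple concrete, the sketch below shows a minimal tabular Q-learning agent in Python. This is an illustrative sketch, not the system of [2]: the class name, the dialogue acts in the usage example, and the hyperparameter values are all assumptions chosen for clarity. The Q-table plays the role of the learned value function from which the policy π is derived, and the epsilon-greedy rule handles exploration.

```python
import random
from collections import defaultdict

class TabularQAgent:
    """Minimal tabular Q-learning agent over a finite state set S and
    action set A, as in the (S, A, T, R, pi) characterization above.
    Illustrative sketch only; not the agent described in [2]."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions        # finite action set A (dialogue acts)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor for long-term reward
        self.epsilon = epsilon        # exploration probability (assumed value)
        self.q = defaultdict(float)   # Q(s, a) table, initialized to 0.0

    def policy(self, state):
        """pi : S -> A, realized here as epsilon-greedy over Q-values."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)   # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update from an observed transition
        (s, a, r, s'), where s' is drawn by the environment's T and
        r by its reward function R."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


# Hypothetical usage: states could be recognized words (with binned
# confidence scores) and actions could be dialogue acts.
agent = TabularQAgent(actions=["greet", "confirm", "request", "close"])
act = agent.policy(state=("hello", "high_conf"))
agent.update(("hello", "high_conf"), act, reward=1.0,
             next_state=("booking", "mid_conf"))
```

In this sketch the transition function T and reward function R are not modeled explicitly; the agent learns model-free from sampled transitions, which is one common way RL is applied when the environment dynamics are unknown.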