2017 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn.2017.7966275

Scaling up deep reinforcement learning for multi-domain dialogue systems

Abstract: Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) face scalability problems when applied to multiple tasks (domains), due to large search spaces. This paper proposes a three-stage method for multi-domain dialogue policy learning, termed NDQN, and applies it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. In this method, the first stage does multi-policy learning via a network of DQN agents; the second makes use of compact state representations by co…

Cited by 40 publications (44 citation statements)
References 15 publications
“…word present or absent), the words derived from user responses can be seen as continuous variables by taking ASR confidence scores into account. Our state representations used delexicalised word-based representations and excluded words from information presentation, for increased scalability, as described in [20].…”
Section: Multi-domain Dialogue System (mentioning)
confidence: 99%
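
The state representation described in this statement can be illustrated with a minimal sketch. It is not the cited authors' implementation: the vocabulary, the slot lexicon, the placeholder tokens such as `<food>`, and the function name `state_vector` are all hypothetical, chosen only to show how a binary word feature becomes a continuous one once ASR confidence scores are used.

```python
# Hypothetical slot lexicon: surface value -> delexicalised placeholder token.
SLOT_VALUES = {
    "italian": "<food>",
    "chinese": "<food>",
    "cheap": "<price>",
    "centre": "<area>",
}

# Hypothetical fixed vocabulary of delexicalised word features.
VOCAB = ["i", "want", "<food>", "<price>", "<area>", "food", "hotel"]
INDEX = {word: i for i, word in enumerate(VOCAB)}


def state_vector(asr_words):
    """Map (word, confidence) pairs to a continuous feature vector.

    Each vocabulary entry holds the highest ASR confidence observed for
    that (delexicalised) word, or 0.0 if the word was not recognised.
    Words outside the vocabulary are ignored.
    """
    vec = [0.0] * len(VOCAB)
    for word, confidence in asr_words:
        token = SLOT_VALUES.get(word.lower(), word.lower())
        if token in INDEX:
            vec[INDEX[token]] = max(vec[INDEX[token]], confidence)
    return vec


hypothesis = [("I", 0.9), ("want", 0.8), ("Italian", 0.6), ("food", 0.7)]
features = state_vector(hypothesis)
```

Here "Italian" is replaced by the placeholder `<food>` before lookup, so the feature vector stays the same size no matter how many slot values the domain contains.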
“…Firstly, actions are selected from the most likely actions, Pr(a|s) > 0.0001, derived from Naive Bayes classifiers (due to scalability purposes) trained from demonstration dialogues. See example demonstration dialogue in Appendix of [20]. Secondly, the most likely actions in the previous stage are extended with legitimate requests, apologies and confirmations.…”
Section: Multi-domain Dialogue System (mentioning)
confidence: 99%
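
The two-step action restriction quoted above can be sketched as follows. This is an illustrative approximation only: the quoted text does not specify the classifier variant, so a Bernoulli Naive Bayes with Laplace smoothing is assumed here, and the feature names, action names, and the functions `train_naive_bayes`, `action_posterior`, and `allowed_actions` are hypothetical. Only the threshold Pr(a|s) > 0.0001 comes from the statement itself.

```python
from collections import Counter, defaultdict


def train_naive_bayes(demos):
    """Count statistics from demonstrations: (set of state features, action)."""
    action_counts = Counter(action for _, action in demos)
    feat_counts = defaultdict(Counter)
    features = set()
    for feats, action in demos:
        features.update(feats)
        for f in feats:
            feat_counts[action][f] += 1
    return action_counts, feat_counts, sorted(features)


def action_posterior(model, state_feats):
    """Return normalised Pr(a|s) under a Bernoulli Naive Bayes model."""
    action_counts, feat_counts, features = model
    total = sum(action_counts.values())
    scores = {}
    for action, n in action_counts.items():
        p = n / total  # prior Pr(a)
        for f in features:
            # Laplace-smoothed Pr(feature present | action).
            p_f = (feat_counts[action][f] + 1) / (n + 2)
            p *= p_f if f in state_feats else (1.0 - p_f)
        scores[action] = p
    z = sum(scores.values())
    return {a: s / z for a, s in scores.items()}


def allowed_actions(model, state_feats, legitimate_extras, threshold=0.0001):
    """Step 1: keep actions with Pr(a|s) > threshold.
    Step 2: extend them with legitimate requests, apologies and
    confirmations (supplied by the caller in this sketch)."""
    posterior = action_posterior(model, state_feats)
    likely = {a for a, p in posterior.items() if p > threshold}
    return likely | set(legitimate_extras)


demos = [
    ({"greeting"}, "hello"),
    ({"greeting"}, "hello"),
    ({"<food>"}, "request_area"),
    ({"<food>", "<area>"}, "inform"),
]
model = train_naive_bayes(demos)
posterior = action_posterior(model, {"greeting"})
actions = allowed_actions(model, {"greeting"}, ["apology", "confirm_food"])
```

A reinforcement learning agent would then choose only among `actions` in that state, which is how this kind of filter shrinks the search space.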