2016 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn.2016.7727695
Advantage based value iteration for Markov decision processes with unknown rewards

Cited by 4 publications (4 citation statements)
References 8 publications
“…We are looking for an optimal QoS-selection strategy satisfying the user's requirements in terms of QoS attributes. Since the user's preferences with respect to QoS attributes are unknown, we use a partially known MDP model, and more particularly, a vector-valued MDP (VMDP) model [2].…”
Section: Services Composition as an MDP
confidence: 99%
“…2) MDPs can be easily integrated into a dynamic environment with uncertainty. In this article, we use part of the results presented in [2] and [41] on how to efficiently learn the users' preferences. These results are based on the combined use of MDP and RL techniques.…”
Section: B. Services Composition as a Discrete-Time VMDP
confidence: 99%
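For readers unfamiliar with the value-iteration backbone that the cited paper builds on, the following is a minimal sketch of *standard* value iteration on a toy two-state MDP. The transition probabilities and rewards below are invented for illustration only; the paper itself addresses the harder setting where rewards are unknown and must be learned from user preferences, which this sketch does not cover.

```python
# Standard value iteration on a toy 2-state MDP (illustrative only;
# all numbers here are made up and do not come from the cited paper).
# States: 0, 1. Actions: 0 ("stay"), 1 ("move").
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 0.9), (0, 0.1)]},
    1: {0: [(1, 1.0)], 1: [(0, 0.9), (1, 0.1)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.0}}
GAMMA, TOL = 0.9, 1e-8

def value_iteration(P, R, gamma=GAMMA, tol=TOL):
    """Iterate the Bellman optimality backup until the largest update
    across states falls below tol; returns the optimal value function."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            q_values = [
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in P[s]
            ]
            new_v = max(q_values)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V

V = value_iteration(P, R)
# Greedy policy extracted from the converged values.
policy = {
    s: max(P[s], key=lambda a: R[s][a]
           + GAMMA * sum(p * V[s2] for s2, p in P[s][a]))
    for s in P
}
```

On this toy instance the optimal behavior is to move from state 0 into state 1 and then stay, since staying in state 1 repeatedly collects the largest reward.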