1982
DOI: 10.1016/0022-247x(82)90122-6

Multi-objective infinite-horizon discounted Markov decision processes

Cited by 69 publications (80 citation statements)
References 3 publications
“…These formalisms have been extended [11,13,6] with parameter uncertainty. Here, we consider the formalism of Givan et al. [6]; other models are more powerful.…”
Section: Related Work (mentioning)
confidence: 99%
“…These ideas are related to earlier work on solving multi-criteria MDPs where weights are used to indicate the importance of different reward components. For example, in the work by White (1982), vector-based generalizations of successive approximation techniques are used to solve the MDP. Feinberg and Schwartz (1995) formulate the problem as optimizing a weighted sum of the total discounted rewards for the different components of the reward function.…”
Section: Related Work (mentioning)
confidence: 99%
“…, $w_k$, the pure memoryless optimal strategy for the single objective $\sum_i w_i r_i$ is Pareto-optimal. This technique is called the weighted factor method [6,12], and used commonly in engineering practice to find subsets of the Pareto set [7]. However, not all Pareto-optimal points are obtained in this fashion, as the following example shows that randomization is necessary.…”
Section: Proposition (mentioning)
confidence: 99%
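
The weighted factor method quoted above can be illustrated with a minimal sketch. It is not taken from any of the cited papers: it assumes the discounted criterion named in the title of the indexed paper (the quoted statement itself concerns long-run average objectives), positive weights, and a tiny hypothetical MDP. The function name `scalarized_value_iteration` and the array layout are assumptions made for illustration only.

```python
import numpy as np


def scalarized_value_iteration(P, R, w, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Weighted-sum scalarization of a multi-objective discounted MDP.

    P: transition probabilities, shape (A, S, S), P[a, s, s'].
    R: reward components, shape (K, S, A), R[k, s, a].
    w: positive weights, shape (K,).
    Returns a deterministic memoryless policy (S,) and the discounted
    value of each reward component under that policy, shape (K, S).
    """
    P = np.asarray(P, dtype=float)
    R = np.asarray(R, dtype=float)
    w = np.asarray(w, dtype=float)
    n_states = P.shape[1]

    # Collapse the K reward components into one reward: r[s, a] = sum_k w[k] R[k, s, a].
    r = np.tensordot(w, R, axes=1)

    # Standard value iteration on the scalarized reward.
    v = np.zeros(n_states)
    for _ in range(max_iter):
        q = r + gamma * (P @ v).T  # q[s, a]
        v_new = q.max(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break

    # Greedy policy with respect to the (approximately) optimal scalar value.
    policy = (r + gamma * (P @ v).T).argmax(axis=1)

    # Evaluate every reward component separately under the fixed policy by
    # solving (I - gamma * P_pi) V_k = R_k for each component k.
    states = np.arange(n_states)
    P_pi = P[policy, states]      # (S, S)
    R_pi = R[:, states, policy]   # (K, S)
    values = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi.T).T
    return policy, values


if __name__ == "__main__":
    # Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
    P_demo = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
    # Component 0 pays 1 for action 0; component 1 pays 1 for action 1.
    R_demo = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
    for weights in ([0.9, 0.1], [0.1, 0.9]):
        policy, values = scalarized_value_iteration(P_demo, R_demo, weights)
        print(weights, policy, values)
```

Sweeping the weight vector recovers only those Pareto-optimal points reachable by pure memoryless strategies; as the quoted statement notes, other points on the Pareto curve may require randomization, which this sketch does not attempt.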
“…We study MDPs with multiple long-run average objectives, an extension of the MDP model where there are several reward functions [6,12]. In MDPs with multiple objectives, we are interested not in a single solution that is simultaneously optimal in all objectives (which may not exist), but in a notion of "tradeoffs" called the Pareto curve.…”
Section: Introduction (mentioning)
confidence: 99%