1999
DOI: 10.1613/jair.575

Decision-Theoretic Planning: Structural Assumptions and Computational Leverage

Abstract: Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision the…

Cited by 625 publications (550 citation statements)
References 81 publications
“…Markov decision processes (MDPs) are used to model sequential decision problems under uncertainty in economics, operations research, computer science, and many other areas (Puterman, 1994; Boutilier et al., 1999). Unlike the problems above, where a single decision is made, in an MDP one determines a sequence of decisions/actions to guide some stochastic system into certain desirable states.…”
Section: Markov Decision Processes
confidence: 99%
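To make the quoted MDP definition concrete, here is a minimal sketch of an MDP as a (states, actions, transition probabilities, rewards) tuple, with one trajectory sampled under a fixed policy. The two-state machine-maintenance domain, the numbers, and all names are illustrative assumptions, not taken from the cited papers.

```python
import random

# Illustrative two-state MDP (all numbers assumed for this sketch).
states = ["working", "broken"]
actions = ["wait", "repair"]

# P[s][a][s2] = probability of moving from s to s2 under action a.
P = {
    "working": {"wait":   {"working": 0.9, "broken": 0.1},
                "repair": {"working": 1.0, "broken": 0.0}},
    "broken":  {"wait":   {"working": 0.0, "broken": 1.0},
                "repair": {"working": 0.8, "broken": 0.2}},
}

# R[s][a] = immediate reward for taking action a in state s.
R = {
    "working": {"wait": 1.0, "repair": -1.0},
    "broken":  {"wait": 0.0, "repair": -1.0},
}

def step(s, a):
    """Sample a successor state and return it with the immediate reward."""
    nxt = random.choices(states, weights=[P[s][a][t] for t in states])[0]
    return nxt, R[s][a]

# A deterministic policy is simply a map from states to actions.
policy = {"working": "wait", "broken": "repair"}

s, total = "working", 0.0
for _ in range(20):  # one short sampled trajectory
    s, r = step(s, policy[s])
    total += r
print("return over 20 steps:", total)
```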
“…Navigating a system in this way requires the appropriate choice of policy π, which dictates which action to take at any system state. An optimal policy is one which maximizes the expected sum of rewards accrued, and can be computed using a variety of dynamic and linear programming methods (Puterman, 1994; Boutilier et al., 1999).…”
Section: Markov Decision Processes
confidence: 99%
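The quote names dynamic and linear programming as the standard solution methods; below is a sketch of one of them, value iteration, computing an optimal policy for the same kind of toy two-state MDP. The model, the discount factor gamma = 0.95, and the convergence threshold are assumptions for illustration, not details from the source.

```python
# Value iteration: repeat the Bellman optimality backup
#   V(s) <- max_a [ R(s, a) + gamma * sum_s2 P(s2 | s, a) * V(s2) ]
# until the value function stops changing, then act greedily.
states = ["working", "broken"]
actions = ["wait", "repair"]
P = {
    "working": {"wait":   {"working": 0.9, "broken": 0.1},
                "repair": {"working": 1.0, "broken": 0.0}},
    "broken":  {"wait":   {"working": 0.0, "broken": 1.0},
                "repair": {"working": 0.8, "broken": 0.2}},
}
R = {
    "working": {"wait": 1.0, "repair": -1.0},
    "broken":  {"wait": 0.0, "repair": -1.0},
}
gamma = 0.95  # discount factor (assumed for this sketch)

def q(s, a, V):
    """Expected discounted return of taking a in s, then following V."""
    return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in states)

V = {s: 0.0 for s in states}
while True:
    V_new = {s: max(q(s, a, V) for a in actions) for s in states}
    if max(abs(V_new[s] - V[s]) for s in states) < 1e-9:
        V = V_new
        break
    V = V_new

# Extract the greedy (optimal) policy from the converged values.
policy = {s: max(actions, key=lambda a: q(s, a, V)) for s in states}
print("V* =", V, " policy =", policy)
```

Since gamma < 1, the backup is a contraction and the loop is guaranteed to converge; linear programming is an alternative route to the same fixed point.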
“…See Chapter 9 (section 10) of Gosavi (2003) for discussions of other model-based RL algorithms. Use of Bayesian networks to represent the transition probabilities (TPs) in a compact manner was pioneered by Boutilier et al. (1999), and was called a factored MDP. The transition probability matrices (TPMs) are learned as functions of a few scalars via a Bayesian network.…”
Section: Semi-Markov Decision Problems
confidence: 99%
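The factored idea the quote describes can be sketched as follows: instead of storing one transition probability matrix over an exponentially large state space, each next-step variable gets a small conditional probability table (CPT) over a few parent variables, as in a dynamic Bayesian network. The three-variable domain, the "paint" action, and every probability below are hypothetical, invented for illustration.

```python
# Factored transition model: with 3 boolean variables, the flat matrix for
# one action has 2^3 * 2^3 = 64 entries, but the CPTs below need only
# 2 + 4 + 2 = 8 scalars, because each variable depends on few parents.

# State variables (all boolean): painted, dry, in_stock.
# CPTs for the hypothetical action "paint":
def p_painted_next(s):   # parent: painted
    return 1.0 if s["painted"] else 0.9      # painting usually succeeds

def p_dry_next(s):       # parents: painted, dry
    return 1.0 if (s["dry"] and s["painted"]) else 0.3  # fresh paint stays wet

def p_in_stock_next(s):  # parent: in_stock (unaffected by this action)
    return 0.95 if s["in_stock"] else 0.0

def transition_prob(s, s_next):
    """P(s_next | s, paint) = product of the per-variable CPT factors."""
    p_true = {
        "painted":  p_painted_next(s),
        "dry":      p_dry_next(s),
        "in_stock": p_in_stock_next(s),
    }
    prob = 1.0
    for var, p in p_true.items():
        prob *= p if s_next[var] else (1.0 - p)
    return prob

s  = {"painted": False, "dry": True,  "in_stock": True}
s2 = {"painted": True,  "dry": False, "in_stock": True}
print(transition_prob(s, s2))  # 0.9 * 0.7 * 0.95 = 0.5985
```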
“…Abstraction and aggregation are techniques [2] that help factored representations avoid this problem. Several authors use these notions to find computationally feasible methods for the construction of (approximately) optimal and satisfying policies.…”
Section: Introduction
confidence: 99%
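In the same spirit, here is a sketch of state aggregation: concrete states that agree on the reward-relevant variable are merged into abstract states, shrinking the model before any planning is done. The four-variable domain, the reward function, and the grouping rule are illustrative assumptions, not a reconstruction of the methods in [2].

```python
from collections import defaultdict
from itertools import product

# 4 boolean variables give 16 concrete states, but reward depends
# only on "goal", so an exact reward-preserving abstraction has 2 states.
variables = ["goal", "x1", "x2", "x3"]
states = [dict(zip(variables, bits)) for bits in product([0, 1], repeat=4)]

def reward(s):
    return 10.0 if s["goal"] else 0.0

def abstract(s):
    """Abstraction function: keep only the reward-relevant variable."""
    return s["goal"]

blocks = defaultdict(list)
for s in states:
    blocks[abstract(s)].append(s)

# Every block is reward-homogeneous, so planning over the 2 abstract
# states is exact with respect to immediate reward.
for label, members in blocks.items():
    assert len({reward(s) for s in members}) == 1

print("concrete states:", len(states), " abstract states:", len(blocks))
```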