2009
DOI: 10.1287/ijoc.1080.0305

Reinforcement Learning: A Tutorial Survey and Recent Advances

Abstract: In the last few years, Reinforcement Learning (RL), also called adaptive (or approximate) dynamic programming (ADP), has emerged as a powerful tool for solving complex sequential decision-making problems in control theory. Although seminal research in this area was performed in the artificial intelligence (AI) community, more recently, it has attracted the attention of optimization theorists because of several noteworthy success stories from operations management. It is on large-scale and complex problems of d…

Cited by 260 publications (110 citation statements)
References 112 publications
“…The solution of our DP formulation searches the stochastic shortest path in a stochastic activity network [50]. This DP can be characterized as a hierarchical DP [51,52]. Therefore classical Reinforcement Learning (RL) is not suitable and hierarchical RL has to be applied [52].…”
Section: Literature and Related Work
Citation type: mentioning
confidence: 99%
“…Most RL approaches are based on environments that do not vary over time. We refer to [51] for a good survey on reinforcement learning techniques.…”
Section: Literature and Related Work
Citation type: mentioning
confidence: 99%
“…The solution of our DP formulation searches the stochastic shortest path in a stochastic activity network [5]. This DP can be characterized as a hierarchical DP [8], [1]. Therefore classical Reinforcement Learning (RL) is not suitable and hierarchical RL has to be applied [1].…”
Section: Literature and Related Work
Citation type: mentioning
confidence: 99%
“…Most RL approaches are based on environments that do not vary over time. We refer to [8] for a good survey on reinforcement learning techniques.…”
Section: Literature and Related Work
Citation type: mentioning
confidence: 99%
“…Instead, a learning mechanism adapting to a highly dynamic environment is more appropriate for the situation (Busoniu et al, 2008) (Xiao et al, 2007). Reinforcement learning is probably the best-known method of learning due to the similarity of empirical learning in mammals (Gosavi, 2009) (Kaelbling et al, 1996). The learning takes place by trial and error, following by performance evaluation but always provided after a sequence of actions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%