2020
DOI: 10.48550/arxiv.2005.00935
Preprint

Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey

Abstract: Latest technological improvements increased the quality of transportation. New data-driven approaches bring out a new research direction for all control-based systems, e.g., in transportation, robotics, IoT and power systems. Combining data-driven applications with transportation systems plays a key role in recent transportation applications. In this paper, the latest deep reinforcement learning (RL) based traffic control applications are surveyed. Specifically, traffic signal control (TSC) applications based …

Cited by 5 publications (11 citation statements)
References 117 publications (165 reference statements)
“…The agent is trained for multiple episodes, where each episode consists of multiple iterations. The cumulative sum of discounted rewards under the policy $\pi_\theta$ at the $t$th iteration of an episode is defined as $G_t$ (Equation 3) [24]:…”
Section: Advanced Methodology For DRL (mentioning)
confidence: 99%
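
The quoted statement cuts off before Equation 3, so the exact form used by the citing paper is not visible here. As a rough illustration only, the sketch below assumes the standard discounted return, G_t = r_t + gamma * G_{t+1}, computed backwards over one episode. The function name `discounted_returns`, the reward list, and the discount factor `gamma` are all hypothetical and not taken from the source.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Compute G_t for every iteration t of one episode.

    Assumes the standard recursion G_t = r_t + gamma * G_{t+1},
    with the return after the final iteration taken as 0.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each G_t reuses G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical three-iteration episode with discount factor 0.5.
episode_returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
print(episode_returns)  # [1.5, 1.0, 2.0]
```

Computing the returns backwards keeps the cost linear in the episode length, instead of re-summing the tail of the episode for every t.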
“…The gradient-ascent update rule for $\theta$ at the $(k+1)$th episode (i.e., $\theta_{k+1}$) is defined in Equation (5) [24]: $\theta_{k+1} = \theta_k + \alpha \nabla J(\theta_k)$.…”
Section: Advanced Methodology For DRL (mentioning)
confidence: 99%
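
The gradient-ascent rule quoted above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the citing paper's method: the objective J(theta) = -(theta - 3)^2, its gradient `toy_grad`, the step size `alpha`, and the helper `gradient_ascent` are all hypothetical. In a policy-gradient setting the gradient of J would instead be estimated from sampled episodes.

```python
from typing import Callable


def gradient_ascent(
    grad_j: Callable[[float], float],
    theta0: float,
    alpha: float,
    episodes: int,
) -> float:
    """Apply theta_{k+1} = theta_k + alpha * grad J(theta_k) once per episode."""
    theta = theta0
    for _ in range(episodes):
        # Ascent: step in the direction of the gradient to increase J.
        theta = theta + alpha * grad_j(theta)
    return theta


def toy_grad(theta: float) -> float:
    """Gradient of the assumed toy objective J(theta) = -(theta - 3)^2."""
    return -2.0 * (theta - 3.0)


# Starting from theta = 0, the iterates approach the maximiser theta = 3.
theta_final = gradient_ascent(toy_grad, theta0=0.0, alpha=0.1, episodes=100)
print(round(theta_final, 6))
```

With this step size, each update shrinks the distance to the maximiser by a factor of 0.8, so the iterates converge. A step size that is too large for the objective would overshoot and diverge.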
“…RL is a method that can assist intelligent decision-making. It takes sequential actions, as in a Markov Decision Process (MDP), with a rewarding criterion (5). It interacts with the environment and tries to learn how to behave to maximize rewards, which can be used in complex systems such as a transportation network.…”
Section: Introduction (mentioning)
confidence: 99%
“…This is in part due to many recent MARL innovations utilising novel information structures, such as centralised training schemes to overcome issues of non-stationarity (Lowe et al., 2017) and learned networked communication protocols for scalable and effective cooperation (Foerster et al., 2016; Sukhbaatar et al., 2016; Chu et al., 2020). Moreover, critical infrastructure responsible for society's wellbeing has been demonstrated to be amenable to multi-agent system control, including the management of electricity, telecommunications and transportation systems (Herrera et al., 2020; Haydari and Yilmaz, 2020; Chu et al., 2020).…”
Section: Introduction (mentioning)
confidence: 99%