2020
DOI: 10.1049/iet-its.2019.0709
Speed harmonisation and merge control using connected automated vehicles on a highway lane closure: a reinforcement learning approach

Cited by 16 publications (11 citation statements) · References 31 publications
“…Reinforcement learning has greatly simplified the verification and validation of algorithms and models for ITS [21][22][23]; however, challenges remain in defining a training environment that reflects real-world parameters, and in defining meaningful reward functions when multiple actions must be taken in a single training step, which is a typical ITS scenario (e.g. slowing the car down while turning to the left).…”
Section: A. Overview of Intelligent Transportation Systems
confidence: 99%
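The multi-action reward difficulty this excerpt points to can be made concrete with a small sketch. The term names, weights, and values below are illustrative assumptions, not taken from the cited works; they show one way a scalar reward might combine speed-tracking and steering objectives (the "slow down while turning left" case).

```python
# Hypothetical composite reward for an agent that must act on speed and
# steering at once (e.g. "slow the car down while turning to the left").
# All weights and penalty terms are illustrative assumptions, not taken
# from the cited papers.
def composite_reward(speed, target_speed, steering_angle, lane_offset,
                     w_speed=1.0, w_steer=0.5, w_lane=2.0):
    speed_term = -w_speed * abs(speed - target_speed)  # track the advised speed
    steer_term = -w_steer * steering_angle ** 2        # penalise harsh steering
    lane_term = -w_lane * lane_offset ** 2             # stay near the lane centre
    return speed_term + steer_term + lane_term

# Example: slightly above the target speed, mid-turn, near the lane centre.
print(composite_reward(speed=28.0, target_speed=25.0,
                       steering_angle=0.2, lane_offset=0.1))
```

Hand-tuning such weights so that no single objective dominates is precisely the reward-design difficulty the excerpt describes.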
“…It uses an online network to compute the online Q-value $q(s_t, a_t; \theta)$ and a target network to compute the target Q-value $q(s_{t+1}, a_{t+1}; \theta')$. The parameter update $\theta_{t+1}$ is defined as

$\theta_{t+1} = \theta_t + \alpha \left[ r_{t+1} + \gamma \max_{a_{t+1}} q(s_{t+1}, a_{t+1}; \theta') - q(s_t, a_t; \theta_t) \right] \nabla_{\theta} q(s_t, a_t; \theta_t) \quad (10)$

According to Equation (10), the maximum of the target network's Q-values is always selected, which leads to over-optimisation of the target value estimate. The DDQN [40] network is designed to solve the target value over-optimisation problem by choosing the action with the maximum Q-value from the online Q-network and using that action's Q-value from the target Q-network as the target Q-value.…”
Section: Algorithm Design
confidence: 99%
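The over-optimisation issue and the DDQN fix described in this excerpt can be illustrated with a short sketch, assuming a standard discrete-action setting. The random Q-values, reward, and discount factor are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative Q-values for 4 discrete actions in the next state s_{t+1}:
# q_online comes from the online network (parameters theta),
# q_target from the target network (parameters theta').
q_online = rng.normal(size=4)   # q(s_{t+1}, ., theta)
q_target = rng.normal(size=4)   # q(s_{t+1}, ., theta')

r, gamma = 1.0, 0.99

# DQN target, as in Equation (10): max over the target network's own
# Q-values, which tends to over-estimate the target value.
y_dqn = r + gamma * q_target.max()

# DDQN target: select the greedy action with the online network,
# then evaluate that action with the target network.
a_star = q_online.argmax()
y_ddqn = r + gamma * q_target[a_star]

print(f"DQN target:  {y_dqn:.3f}")
print(f"DDQN target: {y_ddqn:.3f}")
```

Decoupling action selection (online network) from action evaluation (target network) is the whole of the DDQN change; the training loop is otherwise identical to DQN.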
“…Recently, the deep reinforcement learning (DRL) method has advanced rapidly and shows great promise for solving a variety of problems in the UGV area, including dynamic control [8], path planning [9], and motion harmonisation [10]. Compared to classic autonomous driving systems built from separate environment-perception, path-planning, and dynamics-control modules [11], the DRL method combines deep neural networks with a reinforcement learning framework.…”
Section: Introduction
confidence: 99%
“…In Wu et al. 36 and Zhao et al., 37 DRL was used to achieve autonomous driving in V2X (vehicle-to-everything) scenarios. Speed harmonization and merge control of CAVs (connected automated vehicles) were achieved by DRL in Ko et al. 38 Although their control objects differ, these articles share a common feature: the DRL method is used to determine a specific control policy for each agent. [39][40][41] It is worth noting that because a train is confined to its track, its motion has fewer degrees of freedom and it is easier to control than other types of vehicles.…”
Section: Related Work
confidence: 99%