2019 Chinese Control Conference (CCC)
DOI: 10.23919/chicc.2019.8866005
Deep Reinforcement Learning Based High-level Driving Behavior Decision-making Model in Heterogeneous Traffic

Abstract: High-level driving behavior decision-making is an open, challenging problem for connected vehicle technology, especially in heterogeneous traffic scenarios. In this paper, a deep reinforcement learning based high-level driving behavior decision-making approach is proposed for connected vehicles in heterogeneous traffic situations. The model is composed of three main parts: a data preprocessor that maps hybrid data into a data format called a hyper-grid matrix, a two-stream deep neural network that extracts the hid…
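The abstract names a two-stream deep neural network over the hyper-grid matrix, but the full text is truncated here. As a rough illustration only, the sketch below shows what such a two-stream Q-network might look like, assuming a convolutional stream for the grid-shaped input and a fully connected stream for an ego-state vector; the class name, layer sizes, input shapes, and action count are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class TwoStreamQNet(nn.Module):
    """Hypothetical two-stream Q-network: one stream for the
    hyper-grid matrix, one for the ego-vehicle state vector."""
    def __init__(self, grid_channels=1, ego_dim=4, n_actions=5):
        super().__init__()
        # Stream 1: convolutions over the grid-shaped observation
        self.grid_stream = nn.Sequential(
            nn.Conv2d(grid_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Flatten(),
        )
        # Stream 2: fully connected layers over the ego state
        self.ego_stream = nn.Sequential(nn.Linear(ego_dim, 64), nn.ReLU())
        # Fusion head producing one Q-value per high-level action
        self.head = nn.Sequential(
            nn.Linear(32 * 4 * 4 + 64, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, grid, ego):
        z = torch.cat([self.grid_stream(grid), self.ego_stream(ego)], dim=1)
        return self.head(z)

# Example: batch of 8 observations, 9x5 grid, 4-dim ego state (assumed shapes)
q = TwoStreamQNet()(torch.randn(8, 1, 9, 5), torch.randn(8, 4))
print(q.shape)  # torch.Size([8, 5])
```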

Citation types: 0 supporting, 13 mentioning, 0 contrasting
Years of citing publications: 2020–2024
Cited by 25 publications (13 citation statements); references 8 publications.
“…In the area of motion planning, the end-of-episode rewards are calculated from the fulfillment or failure of the driving task. The overall performance factors are generally: time to finish the task, keeping the desired speed or achieving as high an average speed as possible, yaw or distance from the lane middle or the desired trajectory, overtaking more vehicles, making as few lane changes as possible [57], keeping right [58], [59], etc. Reward systems can also represent passenger comfort, where the smoothness of the vehicle dynamics is enforced.…”
Section: Rewarding · Citation type: mentioning
Confidence: 99%
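The quoted survey enumerates the usual reward terms (task completion time, desired-speed tracking, deviation from the lane middle, lane-change count, keeping right). A minimal per-step reward combining them could look like the sketch below; the weighting constants, argument names, and the collision penalty are illustrative assumptions, not values taken from [57]–[59].

```python
def step_reward(speed, desired_speed, lane_offset, changed_lane,
                in_rightmost_lane, collided):
    """Illustrative weighted-sum driving reward (all weights are assumptions)."""
    if collided:
        return -10.0                          # task-failure penalty ends the episode
    r = 0.0
    r -= abs(speed - desired_speed) / desired_speed  # track the desired speed
    r -= 0.5 * abs(lane_offset)               # distance from the lane middle
    r -= 0.2 if changed_lane else 0.0          # discourage frequent lane changes
    r += 0.1 if in_rightmost_lane else 0.0     # keep-right incentive
    return r

# Example: slightly slow, centered in the rightmost lane, no lane change
print(step_reward(27.0, 30.0, 0.0, False, True, False))
```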
“…surrounding vehicles in a grid need not only occupancy but other information as well; hence the spatial grid's cells need to hold additional information. In [57] the authors used an equidistant grid, where the ego vehicle is placed in the center, and the cells occupied by other vehicles represented the longitudinal velocity of the corresponding car (see Fig. 7).…”
Section: E. Observation Space · Citation type: mentioning
Confidence: 99%
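As a concrete reading of this description, the sketch below builds such an equidistant grid: the ego vehicle sits in the center cell, and each surrounding car's longitudinal velocity is written into the cell it occupies. The grid dimensions, cell sizes, the `(x, y, v_long)` tuple format, and the ahead-is-increasing-row convention are assumptions, not specifics from [57].

```python
import numpy as np

def build_velocity_grid(ego_x, ego_y, others, rows=9, cols=5,
                        cell_len=10.0, cell_wid=3.5):
    """Equidistant grid centered on the ego vehicle; cells occupied by
    other vehicles store that car's longitudinal velocity.
    `others` is an iterable of (x, y, v_long) tuples (assumed format)."""
    grid = np.zeros((rows, cols))
    for x, y, v_long in others:
        # Relative position, mapped to a cell index with the ego at the center;
        # rows grow in the direction of travel (assumed convention).
        r = rows // 2 + int(round((x - ego_x) / cell_len))
        c = cols // 2 + int(round((y - ego_y) / cell_wid))
        if 0 <= r < rows and 0 <= c < cols:
            grid[r, c] = v_long
    return grid

# Example: two cars ahead of an ego vehicle at (0, 0)
print(build_velocity_grid(0.0, 0.0, [(20.0, 0.0, 25.0), (10.0, 3.5, 22.0)]))
```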
“…In the area of motion planning, the end-of-episode rewards are calculated from the fulfillment or failure of the driving task. The overall performance factors are generally: time to finish the task, keeping the desired speed or achieving as high an average speed as possible, yaw or distance from the lane middle or the desired trajectory, overtaking more vehicles, making as few lane changes as possible [44], keeping right [45], [46], etc. Reward systems can also represent passenger comfort, where the smoothness of the vehicle dynamics is enforced.…”
Section: Rewarding · Citation type: mentioning
Confidence: 99%
“…Zhu et al. proposed a human-like autonomous car-following planning framework based on deep reinforcement learning and established a human-like car-following model [22]. In addition, driving decisions based on learning algorithms have been extensively studied in recent research [23, 24].…”
Section: Introduction · Citation type: mentioning
Confidence: 99%