2019 IEEE Intelligent Transportation Systems Conference (ITSC)
DOI: 10.1109/itsc.2019.8917002
Lane-Merging Using Policy-based Reinforcement Learning and Post-Optimization

Abstract: Many current behavior generation methods struggle to handle real-world traffic situations as they do not scale well with complexity. However, behaviors can be learned off-line using data-driven approaches. Reinforcement learning is especially promising, as it implicitly learns how to behave from collected experiences. In this work, we combine policy-based reinforcement learning with local optimization to foster and synthesize the best of the two methodologies. The policy-based reinforcement learning algor…
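The combination the abstract describes, a learned policy whose output is corrected by a local optimization step, can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual formulation: a 1-D point-mass ego vehicle, a single gap constraint, and a hypothetical `post_optimize` safety layer that projects the policy's proposed acceleration onto the safe set (a 1-D quadratic projection, which reduces to clamping).

```python
def policy_action(observation):
    """Stand-in for a learned policy (e.g. an RL network's output)."""
    # Hypothetical: a fixed proposed acceleration in m/s^2.
    return 2.0

def post_optimize(a_rl, v, gap, dt=0.5, a_min=-4.0, a_max=2.5):
    """Project the RL acceleration onto the set of accelerations that
    keep the ego vehicle from closing the gap within one step.
    In 1-D the closest safe action to a_rl is obtained by clamping."""
    # Distance travelled in one step must stay below the gap:
    # v*dt + 0.5*a*dt^2 <= gap  =>  a <= (gap - v*dt) / (0.5*dt^2)
    a_safe_max = (gap - v * dt) / (0.5 * dt * dt)
    upper = min(a_max, a_safe_max)
    return max(a_min, min(a_rl, upper))

a = policy_action(None)
# Large gap: the RL action passes through unchanged.
print(post_optimize(a, v=10.0, gap=20.0))  # 2.0
# Small gap: the safety layer overrides the policy with full braking.
print(post_optimize(a, v=10.0, gap=4.0))   # -4.0
```

In the actual method, the post-optimization is a richer local trajectory optimization over comfort and safety terms; the sketch only shows the structural idea of the RL output feeding a downstream corrector.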

Cited by 15 publications (7 citation statements)
References 8 publications
“…More than 12% of papers studied RL since it has been well investigated for its transportation-related usage. In particular, researchers intended to apply RL to make real-time safe decisions for autonomous driving vehicles [93,142,18].…”
Section: Data Synthesis
confidence: 99%
See 1 more Smart Citation
“…More than 12% of papers studied RL since it has been well investigated for its transportation-related usage. In particular, researchers intended to apply RL to make real-time safe decisions for autonomous driving vehicles [93,142,18].…”
Section: Data Synthesismentioning
confidence: 99%
“…
- Post-optimization: adding an additional safety layer after the RL to exclude unsafe actions, e.g. safe lane merging in autonomous driving [93].
- Uncertainty estimation: estimating what the agent does not know in order to avoid performing certain actions, making the agent's behaviour robust to unseen observations, e.g. collision avoidance for pedestrians [142].
- Stability certification of RL-based controllers: providing theoretical guarantees for RL-based controllers, e.g. [160].…”
Section: Safety in RL
confidence: 99%
“…mixed-integer programming [22], have proven to be computationally fast for motion and strategic planning, and can act as a dual approach compared to learning-based approaches. A combination of classical and learning-based methods [23] is computationally fast and achieves safe and comfortable motions.…”
Section: A. Behavior Modeling
confidence: 99%
“…Both of these approaches achieve state-of-the-art results in continuous control tasks. Hart et al [10] applied the SAC method in a lane-merging scenario. They additionally use a post-optimization to lower the remaining collisions and to improve the ride comfort.…”
Section: Related Work
confidence: 99%
“…This is of course if one leaves aside the concerns and restrictions that come with neural networks in safety-critical applications. Using table-based approaches instead of deep neural networks, or post-processing of the RL solution [10], could mitigate these restrictions.…”
Section: B. Counterfactual Reasoning
confidence: 99%