2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC)
DOI: 10.1109/itsc45102.2020.9294338

Reinforcement Learning with Iterative Reasoning for Merging in Dense Traffic

Abstract: Maneuvering in dense traffic is a challenging task for autonomous vehicles because it requires reasoning about the stochastic behaviors of many other participants. In addition, the agent must achieve the maneuver within a limited time and distance. In this work, we propose a combination of reinforcement learning and game theory to learn merging behaviors. We design a training curriculum for a reinforcement learning agent using the concept of level-k behavior. This approach exposes the agent to a broad variety …
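The level-k training curriculum described in the abstract can be sketched roughly as follows. This is an illustrative stand-in, not the authors' implementation: the function names (`level0_policy`, `train_best_response`, `build_curriculum`) and the toy merging actions are assumptions, and the RL training step is reduced to a stub.

```python
import random

# Hypothetical sketch of a level-k training curriculum (not the paper's code).
# A level-0 policy is non-strategic; a level-k agent is trained against a
# population of opponents drawn from levels 0..k-1.

def level0_policy(state):
    # Non-strategic baseline: keep current speed regardless of others.
    return "keep_speed"

def train_best_response(opponent_policies):
    # Stand-in for an RL training run against opponents sampled from the
    # population of lower-level policies. A real agent would learn from
    # rollouts; this stub just reacts to one sampled opponent.
    def policy(state):
        opponent = random.choice(opponent_policies)
        return "yield" if opponent(state) == "keep_speed" else "merge"
    return policy

def build_curriculum(max_level):
    policies = [level0_policy]
    for k in range(1, max_level + 1):
        # The level-k agent best-responds to a mix of levels 0..k-1,
        # exposing it to a broad variety of behaviors.
        policies.append(train_best_response(list(policies)))
    return policies

policies = build_curriculum(3)
print(len(policies))  # 4 policies: levels 0 through 3
```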

Cited by 35 publications (19 citation statements)
References 21 publications
“…(2). In line with previous works [20], [23], [34], we let ql-0 agents be reflective (nonstrategic) agents who do not explicitly take into account their opponents' possible responses but rather maximize their immediate rewards by treating other agents as stationary objects. The ground-truth reward functions of synthetic agents in each environment are manually tuned to achieve reasonable behaviors.…”
Section: Results and Analysis
confidence: 99%
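The level-0 behavior described in this statement, maximizing immediate reward while treating other agents as stationary, can be illustrated with a small sketch. The environment, reward values, and function names below are hypothetical, chosen only to make the idea concrete.

```python
# Hypothetical illustration of a level-0 (non-strategic) agent: it picks the
# action with the highest immediate reward, treating other agents as
# stationary obstacles rather than predicting their responses.

def immediate_reward(action, my_pos, obstacle_positions):
    # Toy reward: progress bonus minus a collision penalty against the
    # frozen (assumed-stationary) positions of the other agents.
    next_pos = my_pos + (1 if action == "advance" else 0)
    if next_pos in obstacle_positions:
        return -10.0  # collision with an assumed-stationary agent
    return 1.0 if action == "advance" else 0.0

def level0_action(my_pos, obstacle_positions, actions=("advance", "wait")):
    # Greedy one-step choice: no reasoning about opponents' reactions.
    return max(actions, key=lambda a: immediate_reward(a, my_pos, obstacle_positions))

print(level0_action(0, {5}))  # "advance": the next cell is free
print(level0_action(4, {5}))  # "wait": advancing would collide
```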
“…One limitation of our approach is its ability to handle continuous states and actions. Fortunately, the procedure for computing π_{i,k} can be realized in an iterative deep Q-learning fashion [36] and plugged into Algorithm 1 seamlessly. Another limitation relates to the assumption that humans' intelligence levels remain constant during interactions.…”
Section: Discussion
confidence: 99%
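The iterative computation of π_{i,k} mentioned in this statement can be illustrated with a minimal tabular Q-learning stand-in (the cited approach uses deep Q-learning; the toy environment, dynamics, and hyperparameters below are assumptions for illustration only): each level-k policy is obtained by Q-learning against an opponent fixed at level k-1.

```python
import random
from collections import defaultdict

def q_learn_vs_opponent(opponent_policy, n_states=5, actions=(0, 1),
                        episodes=200, alpha=0.5, gamma=0.9, eps=0.2):
    # Q-learning against a *fixed* opponent policy (level k-1).
    Q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        for _ in range(50):                      # step cap keeps episodes finite
            if s == n_states - 1:
                break
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda x: Q[(s, x)]))
            opp_a = opponent_policy(s)           # fixed lower-level opponent
            # Toy dynamics: advancing (a=1) succeeds unless the opponent blocks.
            s2 = s + 1 if (a == 1 and opp_a == 0) else s
            r = 1.0 if s2 == n_states - 1 else -0.01
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return lambda state: max(actions, key=lambda x: Q[(state, x)])

# Level iteration: pi_k is trained as a best response to pi_{k-1}.
pi = lambda s: 0                                  # level 0: non-strategic stub
for k in range(1, 3):
    pi = q_learn_vs_opponent(pi)
```

Swapping the tabular update for a neural Q-function would recover the iterative deep Q-learning variant the statement refers to.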
“…V-H. To further increase safety, we believe it would be beneficial to allow the vehicle to deviate slightly from its route in such dangerous situations. Second, this work assumes perfect measurements from a perception or V2X module, as do many previous works [30]–[32]. However, measurement noise is inevitable in the real world.…”
Section: Discussion
confidence: 99%
“…However, these works only consider static environments without interaction with other vehicles. By contrast, [20], [21], [30] and [31] considered dynamic traffic scenarios, making them more applicable to urban driving. However, current methods have not been well designed for scalable AD in a uniform setup.…”
Section: B. Deep Reinforcement Learning
confidence: 99%