2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros45743.2020.9340636

Planning on the fast lane: Learning to interact using attention mechanisms in path integral inverse reinforcement learning

Abstract: General-purpose trajectory planning algorithms for automated driving utilize complex reward functions to perform a combined optimization of strategic, behavioral, and kinematic features. The specification and tuning of a single reward function is a tedious task and does not generalize over a large set of traffic situations. Deep learning approaches based on path integral inverse reinforcement learning have been successfully applied to predict local situation-dependent reward functions using features of a set o…
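
The abstract is truncated in this capture. As a rough sketch of the path integral maximum-entropy IRL idea it alludes to, the update below reweights a set of sampled driving policies by their exponentiated reward (linear in trajectory features) and nudges the reward weights so that the reweighted feature expectation matches the demonstration. All names (pi_irl_step, phi_samples, theta) and the toy setup are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def pi_irl_step(theta, phi_samples, phi_demo, lr=0.1):
    """One hypothetical path-integral / max-entropy IRL weight update.

    theta       : (d,) current reward weights, R(tau) = theta . phi(tau)
    phi_samples : (n, d) feature vectors of n sampled driving policies
    phi_demo    : (d,) feature vector of the demonstrated trajectory
    """
    # Soft-max (path integral) weighting of sampled policies by reward.
    rewards = phi_samples @ theta
    w = np.exp(rewards - rewards.max())
    w /= w.sum()
    # Expected features under the current reward model.
    phi_expected = w @ phi_samples
    # Max-entropy IRL gradient: match demonstrated and expected features.
    return theta + lr * (phi_demo - phi_expected)

# Toy usage with random features; the first sample stands in for a demonstration.
rng = np.random.default_rng(0)
phi_samples = rng.normal(size=(32, 4))
theta = np.zeros(4)
for _ in range(100):
    theta = pi_irl_step(theta, phi_samples, phi_demo=phi_samples[0])
```

The gradient vanishes exactly when the reward-weighted feature expectation of the samples equals the demonstrated features, which is the stationarity condition of maximum-entropy IRL.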

Cited by 8 publications (2 citation statements). References 14 publications.
“…In contrast to the multi-headed attention mechanism approach based on graph neural networks, some algorithms do not model the agents as graphs. For example, Rosbach et al. [24] applied the attention mechanism to inverse reinforcement learning for predicting the reward function over an extended planning horizon. Shah et al. [25] proposed a novel linguistic instruction attention mechanism, where the attention mechanism scored the tokens of the input visual and instruction information.…”
Section: Attention Mechanisms (mentioning, confidence: 99%)
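
As context for the statement above, the following is a minimal scaled dot-product attention sketch in NumPy; it is not the architecture of Rosbach et al. [24], and the query/key/value naming and the pooled context vector are generic assumptions. The idea is that a query summarizing the planning situation scores a set of feature vectors along the horizon, and the pooled context could feed a network predicting situation-dependent reward weights:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys, values):
    """Scaled dot-product attention.

    query: (d,), keys/values: (n, d) -> context (d,), scores (n,)
    """
    scores = softmax(keys @ query / np.sqrt(len(query)))
    return scores @ values, scores

# 10 feature vectors along the planning horizon (illustrative data).
rng = np.random.default_rng(1)
feats = rng.normal(size=(10, 8))
query = rng.normal(size=8)  # hypothetical ego/situation summary
context, scores = attend(query, feats, feats)
```

The score vector makes the mechanism inspectable: it shows which parts of the horizon dominate the pooled context.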
“…An RL-based policy can account for the vehicle interactions in certain scenarios through training in an environment capable of representing such interactions [14], [15]. In order to obtain, through RL, driving policies that behave like human drivers, several researchers chose to use inverse RL to estimate humans' reward functions for driving [16], [17], [18], [19]. To be able to model different human driver styles and/or interaction intentions, [20] incorporates cooperativeness into the intelligent driver model and [21] formulates different reward functions for different drivers and performs RL based on these models.…”
Section: Introduction (mentioning, confidence: 99%)
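
The intelligent driver model (IDM) referenced in the last statement is a standard car-following law; the sketch below implements only the textbook IDM with typical parameter values and does not reproduce the cooperativeness extension of [20]:

```python
import math

def idm_accel(v, gap, dv, v0=30.0, T=1.5, a_max=1.5, b=2.0, s0=2.0, delta=4):
    """Textbook IDM acceleration (typical parameter defaults, not from [20]).

    v   : ego speed [m/s]
    gap : bumper-to-bumper distance to the leader [m]
    dv  : approach rate v_ego - v_leader [m/s]
    """
    # Desired dynamic gap: jam distance + time-headway gap + braking interaction term.
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    # Free-road acceleration term minus interaction (gap) term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

# Ego at 25 m/s, 40 m behind a leader driving 20 m/s -> strong braking.
print(idm_accel(v=25.0, gap=40.0, dv=5.0))
```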