2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.01494

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

Abstract: End-to-end approaches to autonomous driving commonly rely on expert demonstrations. Although humans are good drivers, they are not good coaches for end-to-end algorithms that demand dense on-policy supervision. On the contrary, automated experts that leverage privileged information can efficiently generate large scale on-policy and off-policy demonstrations. However, existing automated experts for urban driving make heavy use of hand-crafted rules and perform suboptimally even on driving simulators, where grou…

Cited by 103 publications (43 citation statements)
References 31 publications
“…DA-RB+ (Prakash et al 2020) proposed an on-policy data aggregation and sampling techniques in the context of dense urban driving. Recently, Zhang et al (2021) trained an IL agent with the supervisions from an RL coach and BEV image ground-truths. In this work, we use behavior cloning tasks to help the representative feature extraction from raw observations for subsequent DRL-based agent rather than controlling the vehicle directly.…”
Section: Related Work (mentioning)
confidence: 99%
“…This approach is based on simple handcrafted rules. Building the expert with RL is also possible [111], [112] but it is more computationally demanding and less interpretable. Our expert policy consists of an A* planner followed by 2 PID controllers (for lateral and longitudinal control).…”
Section: Expert (mentioning)
confidence: 99%
“…However, others use interpretable intermediate representations [33,34,35]. In particular, BEV semantic occupancy grid representations are widely used in modern driving approaches [36,22,23,37,26]. This representation can be inferred from images [38,39,40,41,42,26,43,44].…”
Section: Related Work (mentioning)
confidence: 99%