2022
DOI: 10.1038/s41586-021-04357-7

Outracing champion Gran Turismo drivers with deep reinforcement learning

Cited by 219 publications (81 citation statements)
References 20 publications
“…Liu et al (2021b) investigate simulated humanoid football, from motor control to team cooperation. Wurman et al (2022) develop an automobile racing agent that beats the world's best e-sports drivers.…”
Section: Games (mentioning)
confidence: 99%
“…Krishnan et al (2021) introduce a simulator for resource-constrained autonomous aerial robots. Wurman et al (2022) develop an automobile racing agent in simulation, the PlayStation game Gran Turismo, that beats the world's best e-sports drivers. Ibarz et al (2021) review how to train robots with deep RL and discuss outstanding challenges and strategies to mitigate them: 1) reliable and stable learning; 2) sample efficiency: 2.1) off-policy algorithms, 2.2) model-based algorithms, 2.3) input remapping for high-dimensional observations, and 2.4) offline training; 3) use of simulation: 3.1) better simulation, 3.2) domain randomization, and 3.3) domain adaptation; 4) side-stepping exploration challenges: 4.1) initialization, 4.2) data aggregation, 4.3) joint training, 4.4) demonstrations in model-based RL, 4.5) scripted policies, and 4.6) reward shaping; 5) generalization: 5.1) data diversity and 5.2) proper evaluation; 6) avoiding model exploitation; 7) robot operation at scale: 7.1) experiment design, 7.2) facilitating continuous operation, and 7.3) non-stationarity owing to environment changes; 8) asynchronous control: thinking and acting at the same time; 9) setting goals and specifying rewards; 10) multi-task learning and meta-learning; 11) safe learning: 11.1) designing safe action spaces, 11.2) smooth actions, 11.3) recognizing unsafe situations, 11.4) constraining learned policies, and 11.5) robustness to unseen observations; and 12) robot persistence: 12.1) self-persistence and 12.2) task persistence.…”
Section: Robotics (mentioning)
confidence: 99%
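Of the simulation strategies this review enumerates, domain randomization (item 3.2) lends itself to a compact illustration: resample simulator parameters each episode so a policy trained in simulation does not overfit one fixed set of dynamics. A minimal Python sketch; the parameter names, ranges, and the make_simulator factory are illustrative assumptions, not taken from the cited work:

import random

def randomized_sim_params(rng):
    # Domain randomization sketch: sample new simulator parameters per
    # episode. Names and ranges are illustrative assumptions.
    return {
        "mass": rng.uniform(0.8, 1.2),           # +/-20% around nominal mass
        "friction": rng.uniform(0.5, 1.5),       # varied contact friction
        "sensor_noise_std": rng.uniform(0.0, 0.05),
    }

rng = random.Random(0)
for episode in range(3):
    params = randomized_sim_params(rng)
    print(f"episode {episode}: {params}")
    # env = make_simulator(**params)  # hypothetical simulator factory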
“…Classical reinforcement learning (RL) [1] has generated excellent results in different domains [2][3][4][5][6][7]. During the past decade, RL has been broadly applied to master Go [2], design chips [7], play StarCraft and Gran Turismo [3,4], control nuclear fusion plasmas [5], and solve the problem of protein folding [6]. Despite these remarkable achievements, most RL techniques fail to balance the tradeoff between exploitation and exploration [8].…”
Section: Introduction (mentioning)
confidence: 99%
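The exploitation-exploration tradeoff mentioned above can be made concrete with an epsilon-greedy bandit, the textbook baseline: with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the best running estimate. A minimal self-contained Python sketch (the arm means, epsilon, and step count are illustrative):

import random

def epsilon_greedy_bandit(arm_means, epsilon=0.1, steps=1000, seed=0):
    # With probability epsilon explore a random arm (gather information);
    # otherwise exploit the current best estimate (earn reward now).
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(arm_means[arm], 1.0)               # noisy payoff
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(estimates, total)

Setting epsilon too low risks locking onto a suboptimal arm; setting it too high wastes pulls on known-bad arms, which is exactly the balance the quoted passage says most RL techniques struggle with.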
“…Environments can be based on simulations. For example, popular RL applications with simulation-based environments include Atari video games (Mnih et al, 2013), robotic tasks (Tunyasuvunakool et al, 2020) and autonomous driving (Sallab et al, 2017; Wurman et al, 2022).…”
Section: Introduction (mentioning)
confidence: 99%
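The simulation-based environments cited above all expose the same agent-environment loop: reset, act, observe, repeat until the episode ends. A minimal sketch of that loop using the Gymnasium API, assuming the gymnasium package is installed; CartPole-v1 and the random policy are illustrative stand-ins for a learned racing or driving policy:

import gymnasium as gym  # assumption: gymnasium is installed

# Standard reset/step loop shared by simulation-based RL environments.
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
episode_return = 0.0
done = False
while not done:
    action = env.action_space.sample()   # placeholder for a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    done = terminated or truncated
env.close()
print(f"episode return: {episode_return}")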