2021
DOI: 10.48550/arxiv.2102.09812
Preprint

Deep Latent Competition: Learning to Race Using Visual Control Policies in Latent Space

Abstract: Learning competitive behaviors in multi-agent settings such as racing requires long-term reasoning about potential adversarial interactions. This paper presents Deep Latent Competition (DLC), a novel reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination. The DLC agent imagines multi-agent interaction sequences in the compact latent space of a learned world model that combines a joint transition function with opponent viewpoint prediction. Imagined sel…
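
The imagination loop the abstract describes, rolling out multi-agent interactions in the compact latent space of a learned world model while a single shared policy drives both cars, can be sketched as follows. This is a minimal sketch under assumed names (`JointTransition`, `imagine_selfplay`, the 256-unit MLP); the paper's actual pixel-based world-model architecture differs in detail:

```python
import torch
import torch.nn as nn

class JointTransition(nn.Module):
    """Predicts the next joint latent state of both agents."""
    def __init__(self, latent_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim + 2 * action_dim, 256), nn.ELU(),
            nn.Linear(256, 2 * latent_dim),
        )

    def forward(self, z_joint, a_ego, a_opp):
        return self.net(torch.cat([z_joint, a_ego, a_opp], dim=-1))

def imagine_selfplay(transition, policy, z0, horizon):
    """Imagined self-play rollout: one shared policy acts for the ego car
    and, from the predicted opponent viewpoint, for the opponent."""
    z, trajectory = z0, []
    latent_dim = z0.shape[-1] // 2
    for _ in range(horizon):
        z_ego, z_opp = z.split(latent_dim, dim=-1)
        a_ego = policy(z_ego)  # ego acts on its own latent viewpoint
        a_opp = policy(z_opp)  # self-play: opponent reuses the same policy
        z = transition(z, a_ego, a_opp)
        trajectory.append(z)
    return torch.stack(trajectory)  # (horizon, batch, 2 * latent_dim)
```

Any policy mapping a per-agent latent code to actions plugs straight in, e.g. `policy = nn.Sequential(nn.Linear(32, 2), nn.Tanh())` with `transition = JointTransition(32, 2)` and `z0 = torch.zeros(1, 64)`.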

Cited by 4 publications (4 citation statements)
References 25 publications

“…Their simulator merges an analytic model with a data-driven dynamic model to be used in the real world with a hierarchical policy structure. Schwarting et al [29] learn a world model in latent space to imagine self-play, which reduces sample generation in a multi-agent DRL training scheme. The work of Kalaria et al [30] proposes curriculum learning to utilize a control barrier function that is gradually removed during training so as not to compromise the final performance.…”
Section: B. Mapless Methods
confidence: 99%
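
The gradually removed control barrier function attributed to Kalaria et al [30] can be illustrated with a simple annealing schedule. The gradient-push correction below is an assumed stand-in for illustration; practical CBF safety filters typically solve a small quadratic program instead:

```python
def barrier_weight(step, total_steps):
    """Curriculum schedule: full safety-filter influence early in training,
    annealed to zero so the final policy is not held back."""
    return max(0.0, 1.0 - step / total_steps)

def filtered_action(a_rl, h, dh_da, weight):
    """Blend the RL action with a safety correction derived from a control
    barrier function h(x) >= 0 (e.g. clearance to the track boundary).
    Illustrative stand-in only; real CBF filters compute the minimal
    safe adjustment via a QP rather than this gradient push."""
    correction = max(0.0, -h) * dh_da  # push only when safety is violated
    return a_rl + weight * correction
```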
“…To better aid reinforcement learning for exploration, a dynamic model of the learning environment is a viable idea [29], [30]. A world model is learned to train agents through imagined self-play to achieve agility in action sequences [31], [32], [33], and the latent strategies of opponents can be predicted to guide the agent toward appropriate directions [34]. Dynamic model learning also plays a very important role in curiosity [35], [36].…”
Section: Forward Model Learning
confidence: 99%
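
A minimal example of curiosity driven by a learned dynamics model, in the spirit of the works cited above, is to pay an intrinsic reward proportional to the forward model's prediction error. The module names and the `scale` hyperparameter are assumptions for illustration:

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the next state embedding from the current one and the action."""
    def __init__(self, feat_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, phi_s, a):
        return self.net(torch.cat([phi_s, a], dim=-1))

def curiosity_bonus(model, phi_s, a, phi_s_next, scale=0.1):
    """Intrinsic reward = scaled prediction error of the forward model,
    ICM-style: states the model predicts poorly are worth exploring."""
    with torch.no_grad():
        err = (model(phi_s, a) - phi_s_next).pow(2).mean(dim=-1)
    return scale * err
```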
“…SAC is also widely used [11], [56]-[58]. [10], [59] first learn a latent representation of the world and then learn through self-play.…”
Section: B. Learning-based Planning
confidence: 99%
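
For context on the SAC baseline this survey mentions, a minimal off-the-shelf training run via Stable-Baselines3 looks like the sketch below; the cited racing works train in their own simulators with tuned configurations:

```python
from stable_baselines3 import SAC

# Train soft actor-critic on a standard continuous-control task;
# "Pendulum-v1" stands in for a racing environment here.
model = SAC("MlpPolicy", "Pendulum-v1")
model.learn(total_timesteps=10_000)
model.save("sac_demo")
```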