2018
DOI: 10.48550/arxiv.1809.07124
Preprint
Pommerman: A Multi-Agent Playground

Cited by 27 publications (47 citation statements)
References 0 publications
“…While multi-agent reinforcement learning (MARL) is a well-established branch of Deep RL, most learning algorithms and environments proposed have targeted a relatively small number of agents 17,41. It is common to see environments with fewer than dozens of agents 1,30,52,67, with 2-agent and 4-agent environments being particularly popular for the study of competitive, self-play settings 2,23,36. Collective intelligence observed in nature, however, relies on a much larger number of individuals than typically studied in MARL, involving population sizes from thousands to millions.…”
Section: Multi-agent Learning
confidence: 99%
“…Pommerman [17] is a variant of the famous game Bomberman and is used as a benchmark for multi-agent learning. Typically, there are 4 agents that can each move and place a bomb on an 11 × 11 board.…”
Section: Pommerman
confidence: 99%
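
To make the environment description above concrete, the following is a minimal rollout sketch in Python. It assumes the open-source pommerman playground package with its make/act/step interface and the built-in SimpleAgent baseline; these names come from that package, not from the quoted citation statement.

import pommerman
from pommerman import agents

# Four agents on the 11 x 11 board; SimpleAgent is the package's scripted baseline bot.
agent_list = [
    agents.SimpleAgent(),
    agents.SimpleAgent(),
    agents.SimpleAgent(),
    agents.SimpleAgent(),
]
env = pommerman.make('PommeFFACompetition-v0', agent_list)

state = env.reset()
done = False
while not done:
    # Each agent returns one discrete action: stop, move in one of four directions, or lay a bomb.
    actions = env.act(state)
    state, reward, done, info = env.step(actions)
env.close()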
“…Then we describe CSP-MARL and explain the design of our code implementation in Section 3. Finally, we discuss several experiments over StarCraft 2, ViZDoom [16] and Pommerman [17] in Section 4 to show the efficiency and effectiveness of TLeague.…”
Section: Introduction
confidence: 99%
“…For example, Ivanovic et al. (2018) propose to create a backwards curriculum for continuous control tasks through learning a dynamics model. Resnick et al. (2018b) and Salimans & Chen (2018) propose to train policies on Pommerman (Resnick et al., 2018a) and the Atari game 'Montezuma's Revenge' by starting each episode from a different point along a demonstration. Recently, Goyal et al. (2018) and propose a learned backtracking model to generate traces that lead to high-value states in order to obtain higher sample efficiency.…”
Section: Related Work
confidence: 99%
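
As a rough illustration of the "start each episode from a point along a demonstration" idea quoted above, here is a hedged Python sketch. The demo_states list and the env.load_state / env.get_observation calls are hypothetical placeholders; real environments restore state in environment-specific ways (e.g. emulator snapshots for Atari), and the schedule below is only one plausible choice.

import random

def reset_from_demo(env, demo_states, progress):
    """Reset near the end of the demonstration early in training (progress ~ 0)
    and allow starts closer to its beginning as training advances (progress ~ 1)."""
    # How far back from the final demonstration state we are allowed to start.
    max_offset = int(progress * (len(demo_states) - 1))
    idx = len(demo_states) - 1 - random.randint(0, max_offset)
    env.load_state(demo_states[idx])   # hypothetical state-restoration call
    return env.get_observation()       # hypothetical observation accessor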