2021
DOI: 10.48550/arxiv.2106.03748
Preprint

Towards robust and domain agnostic reinforcement learning competitions

Abstract: Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of …

Cited by 1 publication (3 citation statements)
References 28 publications (32 reference statements)
“…We evaluate our approach in the first-person Minecraft environments from the MineRL 2020 Competition [4]. The provided imitation learning database consists of data recorded from human players, and we use only images and the recorded sparse reward signals from the database to train our model.…”
Section: Methods (mentioning)
confidence: 99%
“…Such technique can contribute to explainable AI [3], symbolic and causal reasoning in the space of detected objects, robotics, or as auxiliary information [7] for an RL setup. Learning the optimal policy in sparse reward environments [4] is an important challenge in Deep Reinforcement Learning (DRL) [1,15,5]. A masking network for rewarding objects can support an actor-critic RL setup in a way that is intuitive and explainable to a human.…”
Section: Introduction (mentioning)
confidence: 99%