Behaviour Suite for Reinforcement Learning
2019 · Preprint
DOI: 10.48550/arxiv.1908.03568

Abstract: This paper introduces the Behaviour Suite for Reinforcement Learning, or bsuite for short. bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives. First, to collect clear, informative and scalable problems that capture key issues in the design of general and efficient learning algorithms. Second, to study agent behaviour through their performance on these shared benchmarks. To complement this effort, we open source g…

Cited by 26 publications (35 citation statements) · References 26 publications
“…There is a substantial amount of meta-analysis work on online RL algorithms. While some focus on inadequacies in experimental protocols [Henderson et al., 2017, Osband et al., 2019], others study the roles of subtle implementation details in algorithms [Tucker et al., 2018, Engstrom et al., 2020, Andrychowicz et al., 2021, Furuta et al., 2021]. For example, Tucker et al. [2018] and Engstrom et al. [2020] identified that the superior performance of certain algorithms was more dependent on, or even accidentally due to, minor implementation details rather than algorithmic differences.…”
Section: Meta-Analyses of RL Algorithms
confidence: 99%
“…We evaluate our approach on the open-source RL Unplugged Atari dataset, where we show that R-BVE outperforms other offline RL methods. We show that R-BVE performs better on two more datasets: bsuite (Osband et al., 2019) and partially observable DeepMind Lab environments (Beattie et al., 2016). We provide careful ablations and analyses that provide insights into our proposed method and existing offline RL algorithms.…”
Section: Introduction
confidence: 80%
“…bsuite (Osband et al., 2019) is a proposed benchmark designed to highlight key aspects of an agent's scalability, such as exploration, memory, or credit assignment. We generated low-coverage offline RL datasets for catch, mountain_car and cartpole by recording the experiences of an online agent during training, as described by Agarwal et al. (2019a), and then subsampling them (see Appendix D.1 for details).…”
Section: Bsuite Experiments
confidence: 99%
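The dataset-generation recipe in the statement above (record an online agent's transitions during training, then subsample them to obtain a low-coverage offline dataset) can be sketched in plain Python. This is our own illustrative sketch, not code from the cited papers; the names `Transition` and `subsample_dataset` are hypothetical.

```python
import random
from collections import namedtuple

# A minimal transition record, as one might log during online training.
# (Hypothetical structure; the cited works define their own dataset formats.)
Transition = namedtuple("Transition", ["obs", "action", "reward", "next_obs", "done"])

def subsample_dataset(replay_log, fraction, seed=0):
    """Keep a random `fraction` of recorded transitions to make a
    low-coverage offline dataset."""
    rng = random.Random(seed)
    k = max(1, int(len(replay_log) * fraction))
    return rng.sample(replay_log, k)

# Toy example: a log of 1000 recorded transitions, subsampled to 10% coverage.
log = [Transition(obs=i, action=i % 3, reward=0.0, next_obs=i + 1, done=False)
       for i in range(1000)]
offline_dataset = subsample_dataset(log, fraction=0.1)
print(len(offline_dataset))  # 100
```

Subsampling deliberately reduces state-action coverage, which is what makes the resulting offline RL problem harder than learning from the full replay log.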
“…From OpenAI Gym (Brockman et al., 2016), LunarLander is a sparse-reward control environment and MountainCar a sparse-reward exploration environment. From BSuite (Osband et al., 2020), Cartpole-Noise is a dense-reward control environment. The BootstrapDQN baseline follows Osband et al. (2018), as explained in Section 2.3.…”
Section: IV-DQN
confidence: 99%