Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/787
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents (Extended Abstract)

Abstract: The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community. In this paper we take a big picture look at how the ALE is being used by the research community. We focus on how diverse the evaluation methodologies in the ALE have become and we highlight some key concerns when ev…

Cited by 6 publications (4 citation statements). References 6 publications.
“…A large body of work has been built on these algorithms to address different challenges in reinforcement learning, including policy learning (Haarnoja et al. 2017), hierarchical learning (Klissarov et al. 2017), transfer learning (Wulfmeier, Posner & Abbeel 2017), and the emergence of complex behavior (Heess et al. 2017). Deep learning software such as Theano and TensorFlow, as well as the availability of source code for learning algorithms (e.g., Duan et al. 2016) and benchmark simulated environments (e.g., Brockman et al. 2016, Machado et al. 2018, Tassa et al. 2018), contributed to this advancement.…”
Section: Introduction
confidence: 99%
“…To improve the Q-value function we parameterize it by a neural network and update it as the agent collects new experience. Specifically, we evaluate the Q-values on a voxelated grid [35], which allows us to update the Q-value towards the highest final reward observed by the agent for a specific state-action pair (figure 1); for a deterministic problem, this puts a lower bound on the optimal Q-value [44]. Additionally, this discretization allows for easy inference of the optimal atom placement by simply finding the voxel which maximizes the Q-value.…”
Section: Theory
confidence: 99%
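To make the update rule in this excerpt concrete, here is a minimal tabular sketch, assuming a deterministic environment and an already-discretized (voxelized) state-action grid. The class and parameter names (VoxelQTable, n_state_voxels) are hypothetical illustrations, not from the cited paper, which parameterizes the Q-function with a neural network rather than an explicit table.

```python
import numpy as np

class VoxelQTable:
    """Hypothetical sketch: Q-values stored on a voxelized state-action grid."""

    def __init__(self, n_state_voxels: int, n_actions: int):
        # Initialize below any achievable return so the first
        # observed return always replaces the initial value.
        self.q = np.full((n_state_voxels, n_actions), -np.inf)

    def update(self, s: int, a: int, final_return: float) -> None:
        # Monotone update: keep the highest final reward observed so far
        # for this state-action pair. In a deterministic environment this
        # running maximum is a lower bound on the optimal Q-value Q*(s, a).
        self.q[s, a] = max(self.q[s, a], final_return)

    def best_action(self, s: int) -> int:
        # Inference: pick the action voxel that maximizes the Q-value.
        return int(np.argmax(self.q[s]))

# Usage sketch:
table = VoxelQTable(n_state_voxels=100, n_actions=8)
table.update(s=3, a=5, final_return=1.2)
table.update(s=3, a=5, final_return=0.7)  # ignored: 1.2 remains the max
print(table.best_action(3))  # -> 5
```

The -inf initialization reflects the max-based update: because the table only ever moves upward toward the best observed return, no optimistic prior is needed, and in a deterministic problem the stored values never overestimate the optimum.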