2018
DOI: 10.48550/arXiv.1812.02868
Preprint
Measuring and Characterizing Generalization in Deep Reinforcement Learning

Cited by 9 publications (12 citation statements)
References 0 publications
“…This means that any state we visit is actually reachable, at least in the training environment. The closest method to ours appears in the work of Witty et al. (2018), in which the authors try to characterize generalization by exploring different starting configurations, either by directly changing the start state or by using states visited by another agent. In contrast to our work, their insights are applied to characterizing generalization, and not as explanations for humans to understand agent behavior.…”
Section: Counterfactual Style Explanations (mentioning)
confidence: 99%
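The start-state intervention described in this excerpt lends itself to a short sketch: evaluate a trained policy from states it never started in during training, either constructed directly or replayed from another agent's trajectories. The following is a minimal illustration under assumed names, not code from Witty et al. (2018); `env.set_state` and `policy` are hypothetical, and observations are assumed to equal simulator states.

```python
# Hedged sketch: measure return when episodes are forced to begin at
# intervened start states. `env.set_state` and `policy` are hypothetical
# stand-ins for a Gym-style environment with a settable simulator state.
import numpy as np

def evaluate_from_states(env, policy, start_states, episodes_per_state=5):
    """Mean return of `policy` over episodes starting at each intervened state."""
    returns = []
    for s0 in start_states:
        for _ in range(episodes_per_state):
            env.reset()
            env.set_state(s0)                  # intervention: overwrite the start state
            obs, done, total = s0, False, 0.0  # assumes observation == state
            while not done:
                obs, reward, done, _ = env.step(policy(obs))
                total += reward
            returns.append(total)
    return float(np.mean(returns))
```

A large drop in return relative to the default start distribution is evidence that the policy handles reachable but off-distribution states poorly.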
“…This behavior was magnified the more enemies were removed. This implied that the agent had not learned a robust strategy for the game but was simply using the enemies as a "signal" for how to act; a more detailed analysis is available in Witty et al. (2018).…”
Section: Approach (mentioning)
confidence: 99%
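The enemy-removal probe in this excerpt can be phrased as a simple agreement statistic: how often does the agent choose the same action in a counterfactual state with enemies deleted? A minimal sketch, where `policy` and `remove_enemies` are hypothetical helpers rather than the authors' code:

```python
# Hedged sketch: fraction of states where deleting enemies leaves the
# chosen action unchanged. `policy` and `remove_enemies` are hypothetical.
def action_agreement(policy, states, remove_enemies, n_removed=1):
    """Share of states whose chosen action is unchanged after removing enemies."""
    same = sum(policy(s) == policy(remove_enemies(s, n_removed)) for s in states)
    return same / len(states)
```

Agreement falling as `n_removed` grows is consistent with the excerpt's finding that the agent was leaning on enemies as a signal for how to act.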
“…In control networks, where the output could govern the actions of highly impactful systems (e.g., autonomous vehicles), the lack of explainability is worrisome. Even more problematic, deep RL-based systems have been shown to be highly dependent on the exact structure of the input, producing wildly different output under only small perturbations to the input (Witty et al., 2018). Users employing these systems can be left with agents producing erratic behavior with little indication as to why.…”
Section: Introduction (mentioning)
confidence: 99%
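The input-sensitivity claim in this excerpt suggests a direct check: perturb an observation slightly and count how often the policy's action flips. A minimal sketch under assumptions; `policy` is a hypothetical function from NumPy observations to discrete actions:

```python
# Hedged sketch: how often does a small Gaussian perturbation of the
# observation change the policy's chosen action? `policy` is hypothetical.
import numpy as np

def perturbation_sensitivity(policy, observations, eps=0.01, trials=10, seed=0):
    """Fraction of (observation, noise) draws that flip the chosen action."""
    rng = np.random.default_rng(seed)
    flips, total = 0, 0
    for obs in observations:
        base = policy(obs)
        for _ in range(trials):
            noisy = obs + eps * rng.standard_normal(obs.shape)
            flips += int(policy(noisy) != base)
            total += 1
    return flips / total
```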
“…Model-free RL, like model-based RL, has also suffered from both the "train=test" paradigm and a lack of standardization around how to measure generalization. In response, recent papers have discussed what generalization in RL means and how to measure it [7,8,36,49,71], and others have proposed new environments such as Procgen [9] and Meta-World [74] as benchmarks focusing on measuring generalization. While popular in the model-free community [e.g.…”
Section: Introduction (mentioning)
confidence: 99%
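The Procgen benchmark mentioned in this excerpt operationalizes the break from the "train=test" paradigm by splitting procedurally generated levels between training and evaluation. A minimal sketch, assuming the `procgen` package, which registers Gym environments accepting `num_levels` and `start_level`:

```python
# Hedged sketch of Procgen's train/test protocol: train on a fixed set of
# levels, then evaluate on the full (mostly unseen) level distribution.
import gym

train_env = gym.make("procgen:procgen-coinrun-v0", num_levels=200, start_level=0)
test_env = gym.make("procgen:procgen-coinrun-v0", num_levels=0, start_level=0)  # 0 = unrestricted
```

The generalization gap is then the difference between training-level and test-level returns.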