2020
DOI: 10.1007/978-3-030-63710-1_12

Understanding the Behavior of Reinforcement Learning Agents


Cited by 4 publications (15 citation statements)
References 13 publications (20 reference statements)
“…The BNET prototype was not systematically tuned for optimal parameter settings due to the high computational effort. The parameter setup used is based on preliminary tests and on CGP-ANN- and SMB-NE-related publications [28,29]. The baseline algorithms' parameters were also tuned (from the stable-baselines defaults) based on preliminary results for each problem at hand, e.g., the reward discount parameter gamma and the learning rate.…”
Section: Methods
confidence: 99%
“…Here, one challenge is the appropriate definition of the state set S, as it has a considerable influence on the distance. We rely on a selection approach presented in [29], where both policies' stored states are combined for each pairwise distance calculation. If the fitness of a new target network without stored states needs to be predicted, the reference policies' stored states are applied as reference input.…”
Section: Optimization by Behavior Surrogates
confidence: 99%
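The combination of stored states described in the quote above can be sketched as follows. This is a hypothetical illustration, not the cited implementation: the function name `behavior_distance`, the mean-Euclidean comparison of action vectors, and the policies in the usage note are all assumptions.

```python
import numpy as np

def behavior_distance(policy_a, policy_b, states_a, states_b):
    """Pairwise behavior distance between two policies (hypothetical sketch).

    Following the quoted idea, both policies' stored states are combined
    into one reference set; the policies are then compared by the actions
    they produce on that shared set.
    """
    # Combine both policies' stored states into a single reference set.
    reference_states = np.concatenate([states_a, states_b], axis=0)
    # Evaluate each policy on every state of the combined set.
    actions_a = np.array([policy_a(s) for s in reference_states])
    actions_b = np.array([policy_b(s) for s in reference_states])
    # Mean Euclidean distance between the resulting action vectors.
    return float(np.mean(np.linalg.norm(actions_a - actions_b, axis=-1)))
```

Identical policies yield a distance of zero regardless of which stored states each one contributes, which is the property that makes the combined state set usable as a shared reference input.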
“…This notion of behavior, with slight modifications, has appeared in several papers in the Reinforcement Learning literature [23][24][25][26]. At least one existing work uses this notion of behavior in Novelty Search [23].…”
Section: Primitive Behavior
confidence: 99%
“…Another [24] uses it for optimization with an algorithm other than Novelty Search. [23,25,26] weight the constituent distances (i.e., w_s is not constant), and [25] uses primitive behavior to study the relationship between behavior and reward.…”
Section: Primitive Behavior
confidence: 99%
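The weighted constituent distances mentioned in the quote (a non-constant w_s) could look like the following minimal sketch. The function name `weighted_behavior_distance` and the per-state Euclidean distance are assumptions for illustration, not the definitions used in [23,25,26].

```python
import numpy as np

def weighted_behavior_distance(actions_a, actions_b, weights):
    """Behavior distance with per-state weights w_s (hypothetical sketch).

    Each state s contributes the distance between the two policies'
    actions at s, scaled by its weight w_s; the weights need not be
    constant across states.
    """
    actions_a = np.asarray(actions_a, dtype=float)
    actions_b = np.asarray(actions_b, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Per-state Euclidean distance between the action vectors.
    per_state = np.linalg.norm(actions_a - actions_b, axis=-1)
    # Weighted average over states; constant weights recover the plain mean.
    return float(np.sum(weights * per_state) / np.sum(weights))
```

Setting all weights equal reduces this to an unweighted mean, which is the constant-w_s special case the quote contrasts against.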