2021
DOI: 10.48550/arxiv.2102.04999
Preprint

Adaptive Pairwise Weights for Temporal Credit Assignment

Cited by 1 publication
(1 citation statement)
References 0 publications
“…The simplest example is studying how the choice of discount factor γ affects the policy learning (Petrik & Scherrer, 2008; Jiang et al., 2015; Fedus et al., 2019). Several previous work consider to extend the λ-return mechanism (Sutton, 1988) to a more generalized credit assignment framework, such as adaptive λ (Xu et al., 2018) and pairwise weights (Zheng et al., 2021). RUDDER (Arjona-Medina et al., 2019) proposes a return-equivalent formulation for the credit assignment problem and establish theoretical analyses (Holzleitner et al., 2021).…”
Section: Related Work
Citation type: mentioning
confidence: 99%