Sample-based Distributional Policy Gradient
2020 · Preprint
DOI: 10.48550/arxiv.2001.02652

Cited by 3 publications (10 citation statements)
References 11 publications
“…Since the successful implementation of RL problems from a distributional perspective on Atari 2600 games (Bellemare et al., 2017a), there have been a number of follow-ups that try to boost existing deep RL algorithms by directly characterizing the distribution of the random return instead of its expectation (Dabney et al., 2018a,b; Barth-Maron et al., 2018; Singh et al., 2020).…”
Section: Related Work · mentioning (confidence: 99%)
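The works cited above share one idea: learn the full distribution of the random return Z(s, a) rather than only its expectation Q(s, a) = E[Z(s, a)]. Below is a minimal sketch of that idea in the style of C51's categorical projection (Bellemare et al., 2017a); it assumes PyTorch, and the atom count, value bounds, and function name are illustrative, not taken from any of the cited papers.

```python
import torch

# Sketch of the C51-style representation: the random return is a
# categorical distribution over fixed support atoms z_i instead of a
# single expected value. Constants below are illustrative.
N_ATOMS, V_MIN, V_MAX = 51, -10.0, 10.0
atoms = torch.linspace(V_MIN, V_MAX, N_ATOMS)   # fixed atom locations z_i
delta_z = (V_MAX - V_MIN) / (N_ATOMS - 1)

def project_target(next_probs, reward, gamma=0.99):
    """Project the distributional Bellman target r + gamma * z_i
    back onto the fixed support (the C51 projection step)."""
    tz = (reward + gamma * atoms).clamp(V_MIN, V_MAX)  # shifted atoms
    b = (tz - V_MIN) / delta_z                         # fractional index
    lower, upper = b.floor().long(), b.ceil().long()
    proj = torch.zeros(N_ATOMS)
    # Split each shifted atom's probability mass between its two
    # neighbouring support points; atoms landing exactly on a grid
    # point (lower == upper) keep their full mass there.
    same = (lower == upper).float()
    proj.index_add_(0, lower, next_probs * ((upper.float() - b) + same))
    proj.index_add_(0, upper, next_probs * (b - lower.float()))
    return proj

uniform = torch.full((N_ATOMS,), 1.0 / N_ATOMS)
target = project_target(uniform, reward=1.0)  # valid distribution, sums to 1
```

Note that C51 keeps the support atoms fixed and adjusts only their probabilities; this fixed-location constraint is exactly the limitation the next citation statement says SDPG relaxes.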
“…On the policy-gradient-based side, D4PG (Barth-Maron et al., 2018) incorporates the distributional perspective into DDPG (Lillicrap et al., 2015), with the return distribution modeled similarly to C51 (Bellemare et al., 2017a). Building on that, SDPG (Singh et al., 2020) models the quantile function with a generator to overcome the limitation of using variable probabilities at fixed locations; like D4PG, it models the policy as a deterministic transformation of the state representation.…”
Section: Related Work · mentioning (confidence: 99%)
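To make the contrast with fixed-atom methods concrete, here is a hedged sketch of the sample-based idea attributed to SDPG: a generator network transforms noise draws, together with the state representation, into samples of the random return, so the sample locations themselves adapt rather than the probabilities at fixed atoms. It assumes PyTorch; the class name, layer sizes, and dimensions are hypothetical and not from Singh et al. (2020).

```python
import torch
import torch.nn as nn

class ReturnSampleGenerator(nn.Module):
    """Hypothetical generator mapping (state embedding, noise) pairs to
    return samples, implicitly modeling the quantile function."""

    def __init__(self, state_dim=16, noise_dim=8, hidden=64):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # one scalar return sample per noise draw
        )

    def forward(self, state_embed, n_samples=32):
        # Draw n_samples noise vectors and map each, together with the
        # state representation, to a sample of the random return.
        noise = torch.randn(n_samples, self.noise_dim)
        rep = state_embed.unsqueeze(0).expand(n_samples, -1)
        return self.net(torch.cat([rep, noise], dim=-1)).squeeze(-1)

gen = ReturnSampleGenerator()
samples = gen(torch.randn(16))  # 32 samples approximating Z(s, a)
q_value = samples.mean()        # the expectation is recovered if needed
```

In training, such generated samples would be matched against samples of the distributional Bellman target; the sketch above covers only the generator itself, not that loss.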