2021 IEEE International Symposium on Information Theory (ISIT) 2021
DOI: 10.1109/isit45174.2021.9518023
Actor-only Deterministic Policy Gradient via Zeroth-order Gradient Oracles in Action Space

Cited by 2 publications (2 citation statements) · References 16 publications
“…[2,7,22]), or such a computation might be prohibitively expensive or noisy (e.g. see [1,27,32]). Thus, several zeroth-order schemes have been developed for the solution of stochastic optimization problems similar to (P), which require only function evaluations of F(·, ξ), in the absence of gradient information.…”

Section: Introduction
Confidence: 99%
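The scheme the citing paper describes, estimating a gradient from function evaluations alone, can be illustrated with a standard two-point zeroth-order estimator averaged over random Gaussian directions. This is a minimal generic sketch, not the paper's specific algorithm; the function name `zo_gradient` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages directional finite differences along random Gaussian
    directions, using only function evaluations of f (no gradient
    oracle). Illustrative sketch, not the cited paper's method.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_dirs):
        u = rng.standard_normal(d)
        # Symmetric difference along u approximates the directional
        # derivative; scaling by u recovers a gradient estimate in
        # expectation, since E[u u^T] = I.
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs
```

For a smooth objective the estimate is unbiased up to an O(mu^2) smoothing term, which is why such oracles can stand in for exact gradients when differentiating F(·, ξ) is expensive or impossible.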
“…Known gradient-based approaches such as SGD train and generalize effectively in reasonable time [1]. In contrast, emerging applications such as convex bandits [2–4], black-box learning [5], federated learning [6], reinforcement learning [7,8], learning linear quadratic regulators [9,10], and hyper-parameter tuning [11] stand in need of gradient-free learning algorithms [11–14] due to an unknown loss/model or impossible gradient evaluation.…”

Section: Introduction
Confidence: 99%