2022
DOI: 10.48550/arxiv.2202.00817
Preprint

Do Differentiable Simulators Give Better Policy Gradients?

Cited by 3 publications (5 citation statements)
References 0 publications
“…Remark 5.5. Model-based RP PGMs with non-smooth models and policies can suffer from large variance and highly non-smooth loss landscapes, which can lead to slow convergence or failure during training even in simple toy examples (Parmas et al., 2018; Metz et al., 2021; Suh et al., 2022a). Proposition 5.4 suggests that one can add smoothness regularization to avoid exploding gradient variance.…”
Section: Results
confidence: 99%
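To make the remark concrete, here is a minimal sketch (an illustration of the general randomized-smoothing idea, not the cited paper's Proposition 5.4; the stiff sigmoid cost and all parameter values are assumptions): the per-sample reparameterization gradient of a nearly non-smooth cost has variance that grows with the stiffness, which is exactly what a smoothness constraint caps.

```python
import jax
import jax.numpy as jnp

# Illustrative stand-in (assumption, not the paper's model): a steep
# sigmoid approximating a discontinuous contact cost; k is the stiffness.
def stiff_cost(theta, k):
    return jax.nn.sigmoid(k * theta)

def rp_grad_stats(k, sigma=0.1, n=4096):
    # Single-sample RP gradients of E_eps[stiff_cost(theta + sigma*eps, k)]
    # at theta = 0, estimated by Monte Carlo.
    eps = jax.random.normal(jax.random.PRNGKey(0), (n,))
    g = jax.vmap(jax.grad(stiff_cost), in_axes=(0, None))(sigma * eps, k)
    return float(g.mean()), float(g.var())

for k in (1.0, 10.0, 100.0):
    mean_g, var_g = rp_grad_stats(k)
    # The mean gradient stays bounded while the variance grows with k;
    # smoothness regularization amounts to keeping the stiffness in check.
    print(f"k={k:6.1f}: mean grad={mean_g:.3f}, variance={var_g:.3f}")
```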
“…Two main categories have emerged in the realm of stochastic gradient estimation: (1) Likelihood Ratio (LR) estimators, which perform zeroth-order estimation through the sampling of function evaluations (Williams, 1992; Konda and Tsitsiklis, 1999; Kakade, 2001), and (2) ReParameterization (RP) gradient estimators, which harness the differentiability of the function approximation (Figurnov et al., 2018; Ruiz et al., 2016; Clavera et al., 2020; Suh et al., 2022a). Despite the wide adoption of both LR and RP PGMs in practice, the majority of the literature on the theoretical properties of PGMs focuses on LR PGMs.…”
Section: Introduction
confidence: 99%
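For readers unfamiliar with the two families, here is a minimal side-by-side sketch for the gradient of E_{x~N(theta, sigma^2)}[f(x)] with respect to theta, using a toy f; both constructions are standard textbook forms, not code from the paper.

```python
import jax
import jax.numpy as jnp

# Toy objective (assumption, chosen only to be smooth and nonlinear).
f = lambda x: jnp.sin(x) + 0.5 * x**2

def lr_grad(theta, sigma, key, n=4096):
    # (1) Likelihood-ratio / score-function estimator: zeroth order, uses
    # only evaluations of f, weighted by d/dtheta log N(x; theta, sigma^2).
    x = theta + sigma * jax.random.normal(key, (n,))
    score = (x - theta) / sigma**2
    return float(jnp.mean(f(x) * score))

def rp_grad(theta, sigma, key, n=4096):
    # (2) Reparameterization estimator: first order, writes
    # x = theta + sigma*eps and differentiates through f itself.
    eps = jax.random.normal(key, (n,))
    return float(jnp.mean(jax.vmap(jax.grad(f))(theta + sigma * eps)))

key = jax.random.PRNGKey(0)
print(f"LR: {lr_grad(1.0, 0.5, key):+.4f}   RP: {rp_grad(1.0, 0.5, key):+.4f}")
# With a smooth f the RP estimate typically has much lower variance,
# while the LR estimate never needs the gradient of f at all.
```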
“…Our main goal is to study scenarios with deformables, because prior works already explored a number of tasks limited to rigid-body motion (see previous section). That said, very few works considered contact-rich tasks, and one prior work highlighted the potential fundamental limitations of gradients for rigid contacts [6]. Hence, there is a need to further study the fundamentals of how loss landscapes and gradient fields are affected by rigid contacts.…”
Section: BO-Leap: A Method for Global Search on Rugged Landscapes
confidence: 99%
“…Surprisingly, efforts to investigate the differentiability of these simulators are few and far between. One prior work [6] has highlighted a few fundamental limitations of differentiable simulation in the presence of rigid contacts in low-dimensional systems. In this work, we first investigate the quality of gradients by visualizing loss landscapes through differentiable simulators for several robotic manipulation tasks of interest.…”
Section: Introduction
confidence: 99%
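As a hypothetical illustration of such a landscape study (the clamped point mass and every parameter below are assumptions, not the authors' manipulation tasks), sweeping one policy parameter through a hard-contact rollout already exposes the kinks and flat regions that rigid contact induces:

```python
import jax
import jax.numpy as jnp

# Hypothetical 1D system: a point mass that bounces off a wall at x = 0,
# with hard non-penetration enforced by clamping. The loss is the squared
# distance of the terminal position from a target.
def rollout_loss(v0, target=0.3, dt=0.05, steps=40):
    x, v = 1.0, v0
    for _ in range(steps):
        x = x + dt * v
        v = jnp.where(x <= 0.0, -0.5 * v, v)  # lossy bounce at contact
        x = jnp.maximum(x, 0.0)               # hard non-penetration
    return (x - target) ** 2

v0s = jnp.linspace(-3.0, 0.0, 7)
losses = jax.vmap(rollout_loss)(v0s)
grads = jax.vmap(jax.grad(rollout_loss))(v0s)
for v0, l, g in zip(v0s, losses, grads):
    # Kinks in the landscape show up as jumps in dloss/dv0.
    print(f"v0={float(v0):+.2f}  loss={float(l):.4f}  dloss/dv0={float(g):+.3f}")
```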
“…Such an approach is mostly useful for perception-based control systems with a complex mapping h. It is also possible to apply autodifferentiation to obtain the gradient estimate. In general, it is unclear whether the finite-difference technique or the autodifferentiation method gives a better gradient in the context of control, and this topic is still being studied actively [27]. We will investigate this issue in the future.…”
Section: Model-Free ROA Analysis
confidence: 99%
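A small sketch of the open comparison the quote describes, using an illustrative scalar plant and quadratic cost (both assumptions, not the system from [27]): a central finite-difference estimate next to the autodiff gradient of a rollout cost with respect to a feedback gain.

```python
import jax

# Illustrative closed-loop rollout: scalar unstable plant xdot = x + u
# under linear feedback u = -k*x, with a running quadratic cost.
def rollout_cost(k, x0=1.0, dt=0.1, steps=50):
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        x = x + dt * (x + u)
        cost = cost + dt * (x**2 + 0.1 * u**2)
    return cost

k, h = 2.0, 1e-4
fd = (rollout_cost(k + h) - rollout_cost(k - h)) / (2 * h)  # central difference
ad = jax.grad(rollout_cost)(k)                              # autodifferentiation
print(f"finite difference: {fd:.6f}   autodiff: {float(ad):.6f}")
# On this smooth toy the two agree closely; the quote's point is that with
# contact-rich or noisy dynamics the better choice is much less clear.
```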