2019
DOI: 10.1007/978-3-030-28619-4_34

AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems

Abstract: Model-free policy learning has enabled good performance on complex tasks that were previously intractable with traditional control techniques. However, this comes at the cost of requiring a perfectly accurate model for training. This is infeasible due to the very high sample complexity of model-free methods preventing training on the target system. This renders such methods unsuitable for physical systems. Model mismatch due to dynamics parameter differences and unmodeled dynamics error may cause suboptimal or…

Cited by 19 publications (17 citation statements)
References 33 publications
“…We use a five dimensional car that is a standard test environment in motion planning [35] and RL [5]. The car Fig.…”
Section: A Car Model
Type: mentioning
confidence: 99%
“…The multi-task transfer framework, discussed in the next subsection, requires a desired trajectory and the correct input that makes the system track the desired trajectory. To construct this pair of desired trajectory and corresponding correct input, we use an optimization-based ILC [21] to modify the input and improve the tracking performance of the system, which now behaves close to (8), in a small number of iterations 1, . .…”
Section: A Multi-robot Transfer
Type: mentioning
confidence: 99%
“…where $A_{L1}$, $B_{L1}$ and $C_{L1}$ are the discrete-time matrices that describe (8). To capture the dynamics of a complete trial by a static mapping, we compute the lifted representation of (9) using Equations (13)-(14) in [22] to obtain:…”
Section: A Multi-robot Transfer
Type: mentioning
confidence: 99%
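The "lifted representation" mentioned in the statement above can be illustrated with a short sketch. It assumes the common iterative-learning-control convention x[k+1] = A x[k] + B u[k], y[k] = C x[k] with zero initial state and a single input and output. The citing paper's own Equations (13)-(14) in [22] are not shown here and may use a different convention. The function names and the example matrices are hypothetical and chosen only for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def lifted_matrix(A, B, C, N):
    """Build the N x N lower-triangular Toeplitz map F with y = F u.

    Assumed convention: u = [u_0, ..., u_{N-1}], y = [y_1, ..., y_N],
    so F[i][j] = C A^(i-j) B for j <= i and 0 otherwise.
    """
    markov = []          # markov[k] = C A^k B
    AkB = B              # starts as A^0 B
    for _ in range(N):
        markov.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return [[markov[i - j] if j <= i else 0.0 for j in range(N)]
            for i in range(N)]


def simulate(A, B, C, u):
    """Step the same system forward from x_0 = 0 and return [y_1, ..., y_N]."""
    n = len(A)
    x = [[0.0] for _ in range(n)]
    y = []
    for uk in u:
        Ax = matmul(A, x)
        x = [[Ax[i][0] + B[i][0] * uk] for i in range(n)]
        y.append(matmul(C, x)[0][0])
    return y


# Hypothetical double-integrator-like example (not taken from the cited work).
A = [[1.0, 0.1], [0.0, 1.0]]
B = [[0.005], [0.1]]
C = [[1.0, 0.0]]
u = [1.0, -0.5, 0.25, 2.0]

F = lifted_matrix(A, B, C, len(u))
y_lifted = [sum(F[i][j] * u[j] for j in range(len(u))) for i in range(len(u))]
y_stepped = simulate(A, B, C, u)
```

Under these assumptions, `y_lifted` and `y_stepped` agree up to floating-point rounding, which is the sense in which a single static matrix captures a complete trial.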
“…A method developed on a similar idea to the one in this paper is in [16], in which an MPC controller is designed to stabilize the target system around the nominal trajectory generated by consecutively applying the policy in the source system. Theorems on tube-based MPC ensure that the states are bounded under certain modeling error.…”
Section: Related Work
Type: mentioning
confidence: 99%