Transfer RL across Observation Feature Spaces via Model-Based Regularization

Sun, Yanchao; Zheng, Ruijie; Wang, Xiyao; Cohen, Andrew S.; Huang, Furong

doi:10.48550/arxiv.2201.00248

Cited by 2 publications

(4 citation statements)

References 21 publications

“…In transfer RL, most previous work [22,13] focuses on transferring under certain prior knowledge between the two observation spaces. Very recent work from Sun et al [21] is closest to the settings considered in this paper with a different solution than ours. Sun et al [21] tackled drastic changes in observation spaces and proposed an algorithm transferring the policy from source domain via learning a sufficient representation.…”

Section: Related Work (mentioning)

confidence: 86%

“…Very recent work from Sun et al [21] is closest to the settings considered in this paper with a different solution than ours. Sun et al [21] tackled drastic changes in observation spaces and proposed an algorithm transferring the policy from source domain via learning a sufficient representation. It provide[s] an asymptotic guarantee of the policy learning given some representation condition, and empirical validation of the algorithm.…”

Section: Related Work (mentioning)

confidence: 86%

“…It provide[s] an asymptotic guarantee of the policy learning given some representation condition, and empirical validation of the algorithm. However, Sun et al [21] does not [give] finite-sample guarantees of the policy learning and any error bounds of learning the representation out of the deterministic transition case. Van Driessel and Francois-Lavet [24] proposed a deep RL algorithm for transfer learning with very different visual observation spaces by learning the abstract states, while this paper focuses more on a provable sample efficiency guarantee.…”

Section: Related Work (mentioning)

confidence: 99%

Provably Sample-Efficient RL with Side Information about Latent Dynamics

Liu, Misra, Dudík, et al. 2022

Preprint

We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a specific room in a building using observations from its own camera, while having access to the floor plan. We formalize this setting as transfer reinforcement learning from an abstract simulator, which we assume is deterministic (such as a simple model of moving around the floor plan), but which is only required to capture the target domain's latent state dynamics approximately up to unknown (bounded) perturbations (to account for environment stochasticity). Crucially, we assume no prior knowledge about the structure of observations in the target domain except that they can be used to identify the latent states (but the decoding map is unknown). Under these assumptions, we present an algorithm, called TASID, that learns a robust policy in the target domain, with sample complexity that is polynomial in the horizon, and independent of the number of states, which is not possible without access to some prior knowledge. In synthetic experiments, we verify various properties of our algorithm and show that it empirically outperforms transfer RL algorithms that require access to "full simulators" (i.e., those that also simulate observations).

“…Since collective behavior data of simulated boids, simulated robots and real robots, have the same set of features, but with different observation space ranges and distributions (Abpeikar et al, 2022b), the proposed method in this paper focuses on feature-based (observation space) transfer learning. Some feature-based transfer learning methods applied to RL are based on distribution similarity (Zhong et al, 2018), model-based regularization (Sun et al, 2022), and feature-space re-mapping (Feuz and Cook, 2015). The transfer learning on observation space used in this paper is based on using the Kullback-Leibler Divergence (KLD) method described by Zhong et al (2018).…”

Section: Background and Related Work (mentioning)

confidence: 99%

Iterative transfer learning for automatic collective motion tuning on multiple robot platforms

2023

This paper proposes an iterative transfer learning approach to achieve swarming collective motion in groups of mobile robots. By applying transfer learning, a deep learner capable of recognizing swarming collective motion can use its knowledge to tune stable collective motion behaviors across multiple robot platforms. The transfer learner requires only a small set of initial training data from each robot platform, and this data can be collected from random movements. The transfer learner then progressively updates its own knowledge base with an iterative approach. This transfer learning eliminates the cost of extensive training data collection and the risk of trial-and-error learning on robot hardware. We test this approach on two robot platforms: simulated Pioneer 3DX robots and real Sphero BOLT robots. The transfer learning approach enables both platforms to automatically tune stable collective behaviors. Using the knowledge-base library, the tuning procedure is fast and accurate. We demonstrate that these tuned behaviors can be used for typical multi-robot tasks such as coverage, even though they are not specifically designed for coverage tasks.
