Abstract: Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied approach is to utilize motion capture (MoCap) data to teach the humanoid agent low-level skills (e.g., standing, walking, and running) that can then be re-used to synthesize high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains …
“…Distillation has been applied in previous efforts to scale up reinforcement learning to train multi-task control policies [Merel et al. 2019, 2020]. Wagener et al. [2023] used a single-stage distillation approach to train a general motion tracking controller capable of imitating approximately 3.5 hours of motion data. Our multi-stage progress distillation approach enables our system to train versatile controllers on 8.5 hours of text-labeled motion clips, leading to a unified end-to-end controller that can be directed to perform a large variety of skills with simple text commands.…”
Section: Language-directed Controllers (mentioning)
confidence: 99%
“…Our approach is inspired by the first stage of MoCapAct [Wagener et al. 2023], leveraging DeepMimic as our tracking method [Peng et al. 2018]. We train a DeepMimic expert policy $\pi^{e}_{i}(a_t \mid o_t, \phi)$ on every motion capture sequence in our dataset, conditioned on the current state of the character $o_t$ as well as a phase variable $\phi \in [0, 1]$ that synchronizes the policy to the reference motion.…”
Section: Training Per-motion Expert Tracking Policies (mentioning)
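The phase conditioning described in the passage above can be illustrated with a minimal sketch. It assumes the phase is the elapsed time divided by the clip duration, clamped (or wrapped, for cyclic clips) to [0, 1], and that it is simply appended to the observation vector. The function names `phase` and `policy_input`, the `loop` flag, and the plain-list observation are hypothetical choices for illustration; the quoted paper does not specify these details.

```python
from typing import List, Sequence


def phase(t: float, clip_duration: float, loop: bool = False) -> float:
    """Map elapsed time t (seconds) to a phase value in [0, 1].

    Assumption: phase grows linearly with time over the reference clip.
    """
    if clip_duration <= 0:
        raise ValueError("clip_duration must be positive")
    frac = t / clip_duration
    if loop:
        # Cyclic clips (e.g. walking): wrap around instead of saturating.
        return frac % 1.0
    # Non-cyclic clips: hold the phase at the clip boundaries.
    return min(max(frac, 0.0), 1.0)


def policy_input(obs: Sequence[float], phi: float) -> List[float]:
    """Append the phase to the observation to form the policy input.

    Assumption: the expert policy consumes a flat vector [o_t, phi].
    """
    return list(obs) + [phi]


if __name__ == "__main__":
    o_t = [0.1, -0.3, 0.7]           # placeholder character state
    phi = phase(t=1.0, clip_duration=4.0)
    print(policy_input(o_t, phi))    # state followed by the phase value
```

Because the phase tells the policy where it is in the reference motion, one expert per clip can track that clip without receiving the full reference trajectory as input.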
Figure 1: A physically simulated character performing motions specified by language commands. Our framework is able to train a versatile language-directed controller on a large dataset containing thousands of motions.