2018
DOI: 10.1007/978-3-319-94042-7_7

Learning to Run Challenge Solutions: Adapting Reinforcement Learning Methods for Neuromusculoskeletal Environments

Abstract: In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course. Top participants were invited to describe their algorithms. In this work, we present eight solutions that used deep reinforcement learning approaches, based on algorithms such as Deep Deterministic Policy Gradient, Proximal Policy Optimization, and Trust Region Policy Optimization. Many solutions use similar relaxations and he…

Cited by 61 publications (52 citation statements)
References 24 publications
“…Various RL techniques have been effectively used since the first competition [124,125], including frame skipping, discretization of the action space, and reward shaping. These are practical techniques that constrain the problem in certain ways to encourage an agent to search successful regions faster in the initial stages of training.…”
Section: Top Solutions and Results (mentioning)
Confidence: 99%
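The techniques named in this statement are commonly expressed as thin environment wrappers. The sketch below is a minimal Python illustration, not the code of any particular participant: it assumes a classic gym-style step()/reset() interface such as the one exposed by osim-rl's RunEnv, and the binary discretization levels and forward-velocity shaping bonus are illustrative choices.

```python
import numpy as np


class FrameSkip:
    """Repeat each action for `skip` simulator steps and accumulate the reward.

    Assumes a gym-style env with reset() and step(action) -> (obs, reward, done, info).
    """

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward, obs, done, info = 0.0, None, False, {}
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info


def discretize_action(action, levels=(0.0, 1.0)):
    """Snap each muscle excitation to the nearest allowed level (binary on/off by default)."""
    action = np.asarray(action, dtype=float)
    levels = np.asarray(levels, dtype=float)
    return levels[np.abs(action[:, None] - levels[None, :]).argmin(axis=1)]


def shaped_reward(env_reward, pelvis_vx, bonus=0.1):
    """Illustrative reward shaping: add a small bonus proportional to forward pelvis velocity."""
    return env_reward + bonus * pelvis_vx
```

All three tricks restrict or bias the search space, which is why they tend to help most in the early stages of training, as the quoted statement notes.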
“…In the early stage of training, participants reduced accuracy to speed up simulations to train their models more quickly. Later, they fine-tuned the model by switching the accuracy to the same one used for the competition [22].…”
Section: Solutions (mentioning)
Confidence: 99%
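The accuracy-switching schedule described here fits in a few lines. The sketch below is an assumption-laden illustration rather than any participant's code: the RunEnv import and the integrator_accuracy attribute are modelled on osim-rl releases and may differ from the exact 2017 interface, the numeric accuracy values are illustrative, and the agent object is hypothetical.

```python
from osim.env import RunEnv  # assumption: osim-rl style import; exact module path may differ

COARSE_ACCURACY = 1e-3       # illustrative: fast, low-precision integration for early training
COMPETITION_ACCURACY = 5e-5  # illustrative: the stricter setting used for official evaluation


def make_env(accuracy, visualize=False):
    """Build a Learning to Run environment with a given integrator accuracy."""
    env = RunEnv(visualize=visualize)
    env.integrator_accuracy = accuracy  # assumption: attribute name as in later osim-rl versions
    return env


def train(agent, total_iterations, switch_at):
    """Train with coarse accuracy first, then fine-tune at the competition setting."""
    env = make_env(COARSE_ACCURACY)
    for it in range(total_iterations):
        if it == switch_at:
            env = make_env(COMPETITION_ACCURACY)  # fine-tuning phase at full accuracy
        agent.run_iteration(env)  # hypothetical agent interface: collect rollouts and update
```

The trade-off is that policies trained at coarse accuracy see slightly different dynamics, which is why the quoted solution switches back to the competition setting before final fine-tuning.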
“…Deep neural networks have pushed the envelope of reinforcement learning further in a wide variety of domains, such as Atari games [1], continuous systems control [2], and musculoskeletal model control for medical applications [3]. Deep reinforcement learning (Deep-RL) methods perform trial-and-error training through frequent interactions with the environment.…”
Section: Introduction (mentioning)
Confidence: 99%
“…We evaluate our proposed method on a realistic physiologically-based model control task, namely Learning to Run [3]. Experimental results show that AE-DDPG outperforms not only the vanilla DDPG but also other popular RL methods in training efficiency and the resulting final policies.…”
Section: Introduction (mentioning)
Confidence: 99%
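For context, "vanilla DDPG" in this statement refers to the standard deterministic actor-critic update sketched below. This is a generic PyTorch illustration of that baseline, not the AE-DDPG variant or the cited authors' code; the network objects, optimizers, and replay batch are assumed to exist.

```python
import torch
import torch.nn.functional as F


def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    """One vanilla DDPG update step on a replay batch (obs, act, rew, next_obs, done)."""
    obs, act, rew, next_obs, done = batch

    # Critic: regress Q(s, a) toward the bootstrapped target built from the target networks.
    with torch.no_grad():
        target_q = rew + gamma * (1 - done) * target_critic(next_obs, target_actor(next_obs))
    critic_loss = F.mse_loss(critic(obs, act), target_q)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient, i.e. maximize Q(s, actor(s)).
    actor_loss = -critic(obs, actor(obs)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Soft-update the target networks toward the online networks.
    for net, target in ((actor, target_actor), (critic, target_critic)):
        for p, tp in zip(net.parameters(), target.parameters()):
            tp.data.mul_(1 - tau).add_(tau * p.data)
```

AE-DDPG, as described in the citing work, builds on this baseline; the comparison in the quoted statement is against this unmodified update rule.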