A Hierarchical Framework for Quadruped Robots Gait Planning Based on DDPG

Li, Yanbiao; Chen, Zhao; Wu, Chentao; Mao, Haoyu; Sun, Peng

doi:10.3390/biomimetics8050382

Cited by 9 publications

(4 citation statements)

References 21 publications

“…Experimental results using the BHR7P bipedal robot validate the effectiveness of these proposed methods. The paper by Yanbiao [4] presents a hierarchical reinforcement learning framework based on the Deep Deterministic Policy Gradient (DDPG) algorithm. This framework involves a high-level planner that generates ideal motion parameters, a low-level controller using model predictive control (MPC), and a trajectory generator.…”

Section: Discussion Of the Papers (mentioning)

confidence: 99%

Special Issue: Design and Control of a Bio-Inspired Robot

Zhao (2024), Biomimetics

“…It can be adept at dealing with the high-dimensional action spaces commonly encountered in real-world applications like robotics, autonomous vehicles, and complex control systems. The soft target updates employed by DDPG also contribute to more robust and reliable learning, including controlling a car's throttle [25] or a robot's joint angles [26]. Furthermore, the improved version of DDPG has been proposed to enhance learning efficiency.…”

Section: Introduction (mentioning)

confidence: 99%

An improved DDPG algorithm based on evolution-guided transfer in reinforcement learning

Bai, Wang (2024), J. Phys.: Conf. Ser.

Deep Reinforcement Learning (DRL) algorithms help agents take actions automatically in sophisticated control tasks. However, it is challenged by sparse reward and long training time for exploration in the application of Deep Neural Network (DNN). Evolutionary Algorithms (EAs), a set of black box optimization techniques, are well applied to single agent real-world problems, not troubled by temporal credit assignment. However, both suffer from large sets of sampled data. To facilitate the research on DRL for a pursuit-evasion game, this paper contributes an innovative policy optimization algorithm, which is named as Evolutionary Algorithm Transfer - Deep Deterministic Policy Gradient (EAT-DDPG). The proposed EAT-DDPG takes parameters transfer into consideration, initializing the DNN of DDPG with the parameters driven by EA. Meanwhile, a diverse set of experiences produced by EA are stored into the replay buffer of DDPG before the EA process is ceased. EAT-DDPG is an improved version of DDPG, aiming at maximizing the reward value of the agent trained by DDPG as much as possible within finite episodes. The experimental environment includes a pursuit-evasion scenario where the evader moves with the fixed policy, and the results show that the agent can explore policy more efficiently with the proposed EAT-DDPG during the learning process.

“…To enhance the adaptability of these learning strategies while reducing their computational demands, current research trends towards integrating reinforcement learning with conventional control methods [17]. For example, Li and colleagues developed a hierarchical framework for quadruped robots’ gait planning that combines the DDPG algorithm with Model Predictive Control (MPC), achieving optimal action control [18]. Moreover, since the introduction of the soft actor-critic (SAC) algorithm, reinforcement learning-based gait control methods have made significant strides in the field of robotics [19].…”

Section: Introduction (mentioning)

confidence: 99%

Biped Robots Control in Gusty Environments with Adaptive Exploration Based DDPG

Zhang, Sun, Sun et al. (2024), Biomimetics

As technology rapidly evolves, the application of bipedal robots in various environments has widely expanded. These robots, compared to their wheeled counterparts, exhibit a greater degree of freedom and a higher complexity in control, making the challenge of maintaining balance and stability under changing wind speeds particularly intricate. Overcoming this challenge is critical as it enables bipedal robots to sustain more stable gaits during outdoor tasks, thereby increasing safety and enhancing operational efficiency in outdoor settings. To transcend the constraints of existing methodologies, this research introduces an adaptive bio-inspired exploration framework for bipedal robots facing wind disturbances, which is based on the Deep Deterministic Policy Gradient (DDPG) approach. This framework allows the robots to perceive their bodily states through wind force inputs and adaptively modify their exploration coefficients. Additionally, to address the convergence challenges posed by sparse rewards, this study incorporates Hindsight Experience Replay (HER) and a reward-reshaping strategy to provide safer and more effective training guidance for the agents. Simulation outcomes reveal that robots utilizing this advanced method can more swiftly explore behaviors that contribute to stability in complex conditions, and demonstrate improvements in training speed and walking distance over traditional DDPG algorithms.
