This paper proposes a new framework for the computational design of robots that are robust to disturbances. The framework combines trajectory optimization (TO) and feedback control design, co-optimizing a nominal trajectory, a feedback policy, and the system morphology to improve performance under perturbations. Stochastic programming (SP) methods address these perturbations through uncertainty models in the problem specification, yielding motions that are easier to stabilize via feedback. The method is demonstrated on two robotic systems: a planar manipulator and a jumping monopod robot. The co-optimized robots outperform state-of-the-art solutions in which the feedback controller is designed separately from the physical system: compared to LQR applied to a design optimized for nominal conditions, the co-designed controllers achieve higher tracking accuracy and improved energy efficiency (e.g., a 91% decrease in tracking error and an approximately 5% decrease in energy consumption for the manipulator).
This project has received funding from the Italian Ministry of Education, University and Research (MIUR) through the "Departments of Excellence" programme.
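As a concrete, heavily simplified illustration of the co-design idea above, the following Python sketch jointly optimizes a design parameter and feedback gains against sampled disturbances via a sample-average (stochastic-programming) objective. The scalar double-integrator system, the mass `m` as a stand-in for morphology, the PD gains `(kp, kd)` as a stand-in for the feedback policy, and all numerical values are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
from scipy.optimize import minimize

# Toy scalar double integrator: the mass m stands in for the morphology and a
# PD gain pair (kp, kd) for the feedback policy. All names and values here
# are illustrative assumptions, not the paper's model.
dt, T = 0.05, 80
rng = np.random.default_rng(0)
scenarios = rng.normal(0.0, 0.5, size=(8, T))  # sampled force disturbances (SP scenarios)
x_ref = np.ones(T)                             # step reference to track

def expected_cost(params):
    """Sample-average cost over disturbance scenarios (stochastic program)."""
    m, kp, kd = params
    m = max(m, 0.1)  # keep the design physically meaningful
    total = 0.0
    for w in scenarios:
        x, v = 0.0, 0.0
        for t in range(T):
            u = kp * (x_ref[t] - x) - kd * v   # feedback stabilizes the motion
            a = (u + w[t]) / m                 # disturbance enters the dynamics
            v += a * dt
            x += v * dt
            total += (x - x_ref[t]) ** 2 + 1e-3 * u ** 2  # tracking error + effort
    return total / len(scenarios)

# Co-optimize the design (m) and the feedback (kp, kd) against the sampled
# disturbances, rather than tuning the controller for a fixed design.
res = minimize(expected_cost, x0=[1.0, 5.0, 1.0], method="Nelder-Mead")
print("co-optimized [m, kp, kd]:", res.x, " expected cost:", res.fun)
```

Minimizing the average cost over the disturbance samples is what pushes the optimizer toward designs and gains that remain cheap to stabilize away from nominal conditions, mirroring the role of the uncertainty models in the framework.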
This paper presents CACTO (Continuous Actor-Critic with Trajectory Optimization), a novel algorithm for the continuous control of dynamical systems that combines Trajectory Optimization (TO) and Reinforcement Learning (RL) in a single framework. The algorithm is motivated by the two main limitations of TO and RL when applied to continuous nonlinear systems with non-convex cost functions. TO can get stuck in poor local minima when the search is not initialized close to a "good" minimum, while for continuous state and control spaces the RL training process can be excessively long and strongly dependent on the exploration strategy. Our algorithm therefore learns a "good" control policy via TO-guided RL policy search; used as an initial-guess provider for TO, this policy makes the trajectory optimization process less prone to converging to poor local optima. The method is validated on several reaching problems featuring non-convex obstacle avoidance, with different dynamical systems including a car model with a 6D state and a 3-joint planar manipulator. The results show that CACTO is effective at escaping local minima while being more computationally efficient than the Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) RL algorithms.
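The warm-starting mechanism can be illustrated with a deliberately reduced sketch. This is not CACTO's actor-critic: a random-restart loop of exploratory TO episodes stands in for the RL policy-search phase, and its best solution is reused as the initial guess for a later TO call. The 1D cost function, the gradient-descent "TO" solver, and all constants are hypothetical.

```python
import numpy as np

# Non-convex stand-in for a trajectory cost, with several local minima.
def cost(u):
    return 0.1 * u**2 + np.sin(2 * u) + 0.5 * np.cos(5 * u)

def trajectory_optimization(u0, steps=300, lr=0.01):
    """Stand-in for TO: gradient descent from initial guess u0, which
    converges to whichever local minimum the guess lies near."""
    u = u0
    for _ in range(steps):
        g = (cost(u + 1e-5) - cost(u - 1e-5)) / 2e-5  # finite-difference gradient
        u -= lr * g
    return u, cost(u)

# Exploratory TO episodes stand in for the policy-search phase: remember the
# best solution found across random initial guesses.
rng = np.random.default_rng(0)
best_u, best_c = 0.0, np.inf
for _ in range(10):
    u, c = trajectory_optimization(rng.uniform(-5, 5))
    if c < best_c:
        best_u, best_c = u, c

# Warm-starting TO from the "learned" guess avoids the poor local minimum
# that a naive initialization can converge to.
u_naive, c_naive = trajectory_optimization(-4.0)
u_warm, c_warm = trajectory_optimization(best_u)
print(f"naive init: u = {u_naive:.2f}, cost = {c_naive:.3f}")
print(f"warm start: u = {u_warm:.2f}, cost = {c_warm:.3f}")
```

In CACTO the initial guess comes from a learned actor network trained on TO rollouts rather than from a best-of-restarts memory, but the division of labor is the same: the policy-search phase explores broadly, and TO refines locally from the guess it provides.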