2023
DOI: 10.1371/journal.pone.0280071

Reinforcement learning approach to control an inverted pendulum: A general framework for educational purposes

Abstract: Machine learning is often cited as a new paradigm in control theory, but it is also often viewed as empirical and less intuitive for students than classical model-based methods. This is particularly the case for reinforcement learning, an approach that does not require any mathematical model to drive a system inside an unknown environment. This lack of intuition can be an obstacle to designing experiments and implementing this approach. Conversely, there is a need to gain experience and intuition from experiments. In thi…

Cited by 14 publications (5 citation statements) · References: 19 publications

“…To encourage exploration and prevent the agent from settling into suboptimal policies, DQN integrates an ε-greedy exploration strategy. This means that the agent chooses a random action with probability ε, and chooses the action with the highest Q-value with probability 1 − ε. Israilov et al. [30] applied the DQN algorithm to control the inverted pendulum on a cart, both in an experimental setup and in simulation, for swing-up and stabilization of the pendulum in its unstable upward equilibrium without dependence on the initial conditions.…”
Section: Deep Q-network (DQN) · mentioning
confidence: 99%
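The ε-greedy rule quoted above is easy to make concrete. The following is a minimal Python sketch, not the cited authors' implementation; the function name and the Q-value numbers are illustrative assumptions, and the Q-values stand in for the output of a trained deep Q-network.

```python
import numpy as np

def epsilon_greedy_action(q_values, epsilon, rng):
    """Return a random action index with probability epsilon (explore),
    otherwise the index of the highest Q-value (exploit)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: uniform random action
    return int(np.argmax(q_values))              # exploit: greedy action

# Hypothetical Q-value estimates for three actions, e.g. as a DQN might produce.
rng = np.random.default_rng(seed=0)
q_values = np.array([0.12, 0.87, -0.34])
action = epsilon_greedy_action(q_values, epsilon=0.1, rng=rng)
print("chosen action:", action)
```

In a full DQN training loop, ε is typically annealed from a value near 1 toward a small constant, so the agent explores broadly early in training and increasingly exploits its learned Q-values later.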
“…Due to the complexity of controlling a system with an unstable equilibrium, the inverted pendulum (IP) is one of the most commonly used benchmark problems for testing different control algorithms. Many studies have been conducted in the context of controlling an IP system [6], especially using RL agents [7, 8]. Most of this work focuses on algorithm development and implementation in simulation.…”
Section: Introduction · mentioning
confidence: 99%
“…Bates [28] harnessed GPUs to quickly train a simulated inverted pendulum to balance itself. Israilov et al. [29] used two model-free RL algorithms to control targets and proposed a general framework for reproducing successful experiments and simulations based on the inverted pendulum. In addition, there are many other studies of this kind, for example [30][31][32].…”
Section: Introduction · mentioning
confidence: 99%