2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
DOI: 10.1109/icacci.2017.8125811
Comparison of reinforcement learning algorithms applied to the cart-pole problem

Abstract: Designing optimal controllers continues to be challenging as systems are becoming complex and are inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from the interaction with the environment and provide optimal control strategy. In this paper, RL is explored in the context of control of the benchmark cartpole dynamical system with no prior knowledge of the dynamics. RL algorithms such as temporal-difference, policy gradient actor-critic, and value function appr…

Cited by 29 publications (16 citation statements) · References 11 publications
“…The pole is free to spin on a pivot on the vertical axis of the cart and the track. The controller applies a force F to the right or left of the cart; this allows the cart to move and keep the pole balanced [58]. The force is bounded by the interval (−F max , F max ), where F max is a system parameter.…”
Section: A Cart-pole Balancing
confidence: 99%
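The bounded-force cart-pole dynamics described in the excerpt above can be sketched in a few lines. This is a minimal illustration of the classic cart-pole equations of motion with the applied force clipped to (−F_max, F_max); all parameter values and the function name `step` are illustrative assumptions, not taken from the cited paper.

```python
import math

# Illustrative parameters (assumptions, not values from the cited paper).
GRAVITY = 9.8
CART_MASS = 1.0
POLE_MASS = 0.1
POLE_HALF_LENGTH = 0.5
F_MAX = 10.0
DT = 0.02

def step(x, x_dot, theta, theta_dot, force):
    """One Euler step of the classic cart-pole dynamics.

    The applied force is clipped to the interval (-F_MAX, F_MAX)
    mentioned in the excerpt above.
    """
    force = max(-F_MAX, min(F_MAX, force))
    total_mass = CART_MASS + POLE_MASS
    pole_mass_length = POLE_MASS * POLE_HALF_LENGTH
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Intermediate term shared by both accelerations.
    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)
```

Clipping the input before integrating is what enforces the (−F_max, F_max) bound on the controller's force.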
“…The applied force F produces a linear movement on the cart and an angular movement on the pole. Figure adapted from Nagendra et al. [58].…”
confidence: 99%
“…The cart moves back and forth along a frictionless track with the task of balancing a pole attached by an un-actuated joint, as shown in Figure 3. The goal is to learn how to swing up and balance the pole just by moving the cart around the track (Barto et al., 1983; Nagendra et al., 2017). The observation variable, as in the inverted pendulum swing-up environment, considers derived information of the angle; Table 4 gives more details of the training and variable characteristics.…”
Section: Cart Pole (CP)
confidence: 99%
“…The eligibility trace gives more weight to the most recently visited states. Mathematically, this function is defined, for each state s ∈ S, by E_0(s) = 0 and:

E_t(s) = γλ E_{t−1}(s) + 1(S_t = s)   (14)

γ and λ are parameters that make the eligibility trace decrease over the time steps. When a state is visited, 1(S_t = s) = 1 and the function increases.…”
Section: Engg7282
confidence: 99%
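The accumulating eligibility-trace update quoted above can be sketched as a dictionary update. The function name `update_traces` and the default γ and λ values are illustrative assumptions:

```python
def update_traces(traces, visited_state, gamma=0.99, lam=0.9):
    """Accumulating eligibility trace, as in the excerpt above:
    E_t(s) = gamma * lam * E_{t-1}(s) + 1(S_t = s).
    Every state's trace decays by gamma*lam; the visited state gains 1.
    """
    new = {s: gamma * lam * e for s, e in traces.items()}
    new[visited_state] = new.get(visited_state, 0.0) + 1.0
    return new

# Visiting the same state twice: its trace decays once, then gains 1 again.
traces = update_traces({}, "s0")      # {'s0': 1.0}
traces = update_traces(traces, "s0")  # {'s0': 0.99 * 0.9 * 1.0 + 1.0}
```

Because traces decay geometrically, recently visited states carry the most credit when a TD error is later distributed over them.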
“…Robotics problems are usually represented by continuous state spaces with discrete or continuous actions. The following papers explain how to deal with such complex representations using algorithms like Q-Learning, actor-critic policy gradient, and SARSA for problems like Cart-Pole [14] and ball-collecting tasks [15] [17]. Proximal Policy Optimization [18] or Monotonic Policy Optimization [4] is used to control the motion of 3D robots [18] [19].…”
Section: Robotics
confidence: 99%
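As a concrete (and deliberately tiny) instance of the tabular methods named above, here is a minimal Q-Learning sketch on a toy chain MDP. The environment, hyperparameters, and function name are illustrative stand-ins, not the setups used in the cited papers.

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=200, alpha=0.5,
               gamma=0.95, eps=0.3, seed=0):
    """Tabular Q-Learning on a toy chain MDP: action 1 moves right,
    action 0 moves left; entering the last state yields reward 1 and
    ends the episode. Exploration is epsilon-greedy.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-Learning update: bootstrap from the greedy value of s2.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

SARSA would differ only in the target, bootstrapping from the action actually taken in s2 instead of max(Q[s2]); actor-critic methods instead maintain a separate parameterized policy updated along the gradient signaled by the TD error.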