2017
DOI: 10.1016/j.ifacol.2017.08.805

Optimal Control via Reinforcement Learning with Symbolic Policy Approximation

Abstract: Model-based reinforcement learning (RL) algorithms can be used to derive optimal control laws for nonlinear dynamic systems. With continuous-valued state and input variables, RL algorithms have to rely on function approximators to represent the value function and policy mappings. This paper addresses the problem of finding a smooth policy based on the value function represented by means of a basis-function approximator. We first show that policies derived directly from the value function or represented explici…

Cited by 9 publications (7 citation statements)
References 20 publications
“…Based on the 5-SAP validation results, 21 best models (10x SNGP, 10x MGGP, and 1x RFWR) were chosen to be used for RL control of the magnetic manipulator. Based on these models, we calculated the approximations of the optimal V-functions using the fuzzy value iteration [10,21]. Furthermore, we used equation (19) to calculate the corresponding policies.…”
Section: Results (mentioning)
confidence: 99%
“…Genetic programming was already applied to nonlinear systems like an inverted pendulum or a collaborative robot [2,10,21]. To further investigate the approximation capabilities of these methods, we use a different system: a magnetic manipulator (Magman).…”
Section: Magnetic Manipulator (mentioning)
confidence: 99%