2018
DOI: 10.1016/j.ifacol.2018.11.115
Deep reinforcement learning based finite-horizon optimal tracking control for nonlinear system

Cited by 11 publications (6 citation statements). References 13 publications.
“…Besides, there are many more numerical methods to solve PDEs, as in [49]-[52], but with high processing requirements. Therefore, new methods based on machine learning (ML), e.g., (deep) reinforcement learning and neural-network-based online methods, are important for obtaining the solution of PDEs with more accuracy or speed [10], [53]-[55]. The (deep) reinforcement learning methods are developed to learn the solution of the HJB equation and the control rule in [53]-[55].…”
Section: Introduction (mentioning)
confidence: 99%
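For context, the HJB equation that these learning methods approximate can be stated, for a standard infinite-horizon problem with dynamics ẋ = f(x) + g(x)u and stage cost Q(x) + uᵀRu, as follows (a textbook formulation, not reproduced from the cited papers):

    % HJB equation for \dot{x} = f(x) + g(x)u with stage cost
    % Q(x) + u^\top R u (standard form; not taken from the cited works)
    0 = \min_{u}\Big[\, Q(x) + u^{\top} R\, u
          + \nabla V^{*}(x)^{\top}\big(f(x) + g(x)\,u\big) \Big],
    \qquad
    u^{*}(x) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x).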
“…Therefore, new methods based on machine learning (ML), e.g., (deep) reinforcement learning and neural-network-based online methods, are important for obtaining the solution of PDEs with more accuracy or speed [10], [53]-[55]. The (deep) reinforcement learning methods are developed to learn the solution of the HJB equation and the control rule in [53]-[55]. In [10], a neural-network-based online solution of the HJB equation is used to explore the infinite-horizon optimal robust guaranteed cost control of uncertain nonlinear systems.…”
Section: Introduction (mentioning)
confidence: 99%
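As a rough sketch of the neural-network-based solution idea mentioned in [10] and [53]-[55], one can train a value network to drive the HJB residual to zero; everything below (the toy dynamics f and g, the costs, the architecture) is an illustrative assumption, not the method of any cited reference:

    # Sketch: learn V(x) by minimizing the HJB residual for
    # dx/dt = f(x) + g(x)u with cost Q(x) + R u^2 (toy scalar system).
    # All names and choices here are illustrative assumptions.
    import torch

    f = lambda x: -x + 0.5 * x**3        # assumed drift
    g = lambda x: torch.ones_like(x)     # assumed input gain
    Q = lambda x: x**2                   # state cost
    R = 1.0                              # control weight

    V = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
    opt = torch.optim.Adam(V.parameters(), lr=1e-3)

    for step in range(2000):
        x = torch.empty(256, 1).uniform_(-2, 2).requires_grad_(True)
        dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
        u = -0.5 / R * g(x) * dVdx                    # HJB-minimizing control
        residual = Q(x) + R * u**2 + dVdx * (f(x) + g(x) * u)
        loss = (residual**2).mean() + V(torch.zeros(1, 1)).pow(2).sum()
        opt.zero_grad(); loss.backward(); opt.step()  # also penalizes V(0) != 0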
“…Human-behavior learning [1] has become a popular research topic in different communities, especially in video game applications [2]. These kinds of applications extract special features of human performance while playing a video game using a deep reinforcement learning architecture known as the Deep Q Network (DQN) [3]. The main components of this algorithm are: a deep neural network (designed with convolutional and fully connected layers [4]), a reinforcement learning update rule (the Q-learning rule [5]-[7]), an ε-greedy strategy for exploration-exploitation [8], [9] of the state-action space, and an experience replay memory [10].…”
Section: Introduction (mentioning)
confidence: 99%
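To make the listed components concrete, here is a minimal DQN skeleton showing the Q-network, the Q-learning target, ε-greedy exploration, and the replay memory; the state/action sizes and hyperparameters are placeholders, and the convolutional stack used for pixel inputs in [3], [4] is omitted:

    # Minimal DQN skeleton: Q-network, Q-learning update, epsilon-greedy
    # exploration, and an experience replay memory. Sizes and
    # hyperparameters are illustrative placeholders.
    import copy, random, collections
    import torch

    n_states, n_actions, gamma, eps = 4, 2, 0.99, 0.1

    q_net = torch.nn.Sequential(               # fully connected Q-network
        torch.nn.Linear(n_states, 64), torch.nn.ReLU(),
        torch.nn.Linear(64, n_actions))
    target_net = copy.deepcopy(q_net)           # periodically re-synced copy
    opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    replay = collections.deque(maxlen=10_000)   # experience replay memory

    def act(state):
        # epsilon-greedy choice over the action space
        if random.random() < eps:
            return random.randrange(n_actions)
        with torch.no_grad():
            return int(q_net(torch.as_tensor(state, dtype=torch.float32)).argmax())

    def train_step(batch_size=32):
        # one Q-learning update on a minibatch drawn from the replay memory
        if len(replay) < batch_size:
            return
        s, a, r, s2, done = (torch.as_tensor(v, dtype=torch.float32)
                             for v in zip(*random.sample(replay, batch_size)))
        q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + gamma * target_net(s2).max(1).values * (1 - done)
        loss = torch.nn.functional.mse_loss(q, target)
        opt.zero_grad(); loss.backward(); opt.step()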
“…The main difference between these two approaches lies in the cognitions, where different models and functions are used to extract experiences [12] and any previous knowledge [13], [14] that facilitates obtaining the solution of the desired task in an optimal way. Some examples of cognitive models are knowledge of the system and environment dynamics [9], [15] or any intelligent model/expert system [3], [16] such as neural networks [17]-[19], function approximators [20]-[24], fuzzy systems [25], and deep models [26], among others. Emotions define a complex set and are a topic for future work.…”
Section: Introduction (mentioning)
confidence: 99%
“…When the system dynamics are unknown, the works 23‐26 establish neural networks as system identifiers to reconstruct the system dynamics from the collected data. Based on these system identifiers, reference 23 uses ADP to solve the HJB equation, where neural networks with time‐invariant weights are adopted to approximate the value function.…”
Section: Introduction (mentioning)
confidence: 99%
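As an illustration of the identifier idea in this last excerpt, a neural network can be fit to collected transitions so that x_{k+1} ≈ N(x_k, u_k); the placeholder data, dimensions, and architecture below are assumptions, not details from references 23-26:

    # Sketch: neural-network system identifier trained on collected
    # (x_k, u_k, x_{k+1}) data. Dataset and architecture are
    # illustrative assumptions.
    import torch

    n_x, n_u = 3, 1
    ident = torch.nn.Sequential(
        torch.nn.Linear(n_x + n_u, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, n_x))
    opt = torch.optim.Adam(ident.parameters(), lr=1e-3)

    # placeholder transitions; in practice these come from plant logs
    X = torch.randn(1024, n_x)
    U = torch.randn(1024, n_u)
    Xn = X + 0.1 * torch.tanh(X) + 0.05 * U   # stand-in for the true plant

    for epoch in range(500):
        pred = ident(torch.cat([X, U], dim=1))  # predicted next state
        loss = torch.nn.functional.mse_loss(pred, Xn)
        opt.zero_grad(); loss.backward(); opt.step()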