1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence
DOI: 10.1109/ijcnn.1998.685910

Application considerations for the DHP methodology

Cited by 19 publications (24 citation statements)
References 4 publications
“…10-12, dual heuristic programming (DHP) has been shown to learn quickly and to alleviate persistence of excitation problems by computing the correlation between the cost and the individual state elements. 4,10,13 In this paper, a DHP architecture is trained online to control the nonlinear simulation of a business jet aircraft over its full operating envelope, improving performance during unexpected conditions such as unmodeled dynamics, parameter variations, and control failures.…”
Section: Introduction (mentioning, confidence: 99%)
“…An algorithm for updating the DHP functionals can be obtained from the policy-improvement routine and the value-determination operation, as explained in Section 3.3.3. In applications [23,24], the DHP algorithm has been shown to find the optimal solution more rapidly (with fewer iteration cycles) than HDP. However, due to the use of derivative information, the relationships for updating the control and value-derivative functionals are more involved.…”
Section: Action-Dependent (AD) Designs (mentioning, confidence: 99%)
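The contrast this excerpt draws between HDP and DHP can be made concrete with a short sketch. The code below assumes a generic discrete-time plant R(t+1) = f(R(t), u(t)) with stage cost U and discount factor gamma; all function and variable names are illustrative, not taken from the cited papers. It shows the HDP value-determination target (a single scalar) next to the quantity a DHP critic must represent (one derivative per state element), which is why the DHP update needs derivative information.

```python
# Minimal sketch, assuming a generic plant R(t+1) = f(R(t), u(t)), stage cost U,
# and discount factor gamma. Names are illustrative, not from the cited papers.
import numpy as np

gamma = 0.95

def hdp_value_determination(U_t, J_hat_next):
    # HDP critic target, straight from Bellman's recursion: J(t) = U(t) + gamma * J(t+1).
    return U_t + gamma * J_hat_next              # a single scalar

def dhp_critic_output(J_hat, R, eps=1e-5):
    # A DHP critic instead approximates lambda_i(t) = dJ(t)/dR_i(t), one entry per
    # state element; illustrated here by central finite differences of a cost-to-go
    # estimate J_hat (a real DHP critic would be a trained network, not this loop).
    R = np.asarray(R, dtype=float)
    lam = np.zeros_like(R)
    for i in range(len(R)):
        step = np.zeros_like(R)
        step[i] = eps
        lam[i] = (J_hat(R + step) - J_hat(R - step)) / (2 * eps)
    return lam
```

As a quick check, with a quadratic cost-to-go estimate J_hat(R) = R @ Q @ R for a fixed symmetric Q, dhp_critic_output recovers 2 * Q @ R up to finite-difference error.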
“…Under the assumption that the critic is accurately estimating the gradient of the cost to go of the policy specified by the controller's parameter values, the critic function can be used to adjust these parameters so as to arrive at a local optimum in the parameterized policy space. This process has been successfully operationalized using artificial neural networks for both the control and critic functions [2], [12], [19], [20], [23], [24].…”
Section: A. Approximate Dynamic Programming (mentioning, confidence: 99%)
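The adjustment described here, using the critic's gradient estimate to move the controller's parameters toward a local optimum in the parameterized policy space, can be sketched as a single chain-rule step. The block below is a rough illustration under assumed interfaces (f, controller, dU_du, df_du, du_dtheta, and critic_dhp are hypothetical names), not the neural-network implementation used in the cited works.

```python
# Rough sketch of one controller (actor) update driven by a DHP critic.
# All interfaces below are assumptions for illustration, not the cited method.
import numpy as np

gamma = 0.95

def actor_update(theta, x, f, controller, dU_du, df_du, du_dtheta, critic_dhp, lr=1e-3):
    """One gradient-descent step on the controller parameters theta.

    Assumed pieces:
      f(x, u)              -- one-step plant/model prediction of x(t+1)
      controller(x, theta) -- parameterized policy u = pi(x; theta)
      dU_du(x, u)          -- gradient of the stage cost w.r.t. the control, shape (m,)
      df_du(x, u)          -- plant Jacobian dx(t+1)/du(t), shape (n, m)
      du_dtheta(x, theta)  -- policy Jacobian du/dtheta, shape (m, p)
      critic_dhp(x)        -- DHP critic output lambda = dJ/dx, shape (n,)
    """
    u = controller(x, theta)
    lam_next = critic_dhp(f(x, u))                          # lambda(t+1) at the predicted state
    dJ_du = dU_du(x, u) + gamma * df_du(x, u).T @ lam_next  # dJ(t)/du(t) via the chain rule
    dJ_dtheta = du_dtheta(x, theta).T @ dJ_du               # pull back into parameter space
    return theta - lr * dJ_dtheta                           # step toward a local optimum
```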
“…For the DHP method, where the critic estimates the derivatives of J(t) with respect to the system states (stock levels, transport available, etc.) R, the critic's output is defined as λ(t) = ∂J(t)/∂R(t) (10). We differentiate both sides of Bellman's Recursion, J(t) = U(t) + γJ(t+1) (11), to get the identity used for critic training, λ_i(t) = ∂U(t)/∂R_i(t) + Σ_k [∂U(t)/∂u_k(t)] ∂u_k(t)/∂R_i(t) + γ Σ_j λ_j(t+1) [∂R_j(t+1)/∂R_i(t) + Σ_k ∂R_j(t+1)/∂u_k(t) ∂u_k(t)/∂R_i(t)] (12). To evaluate the right-hand side of this equation we need a model of the system dynamics that includes all the terms from the Jacobian matrix of the coupled plant-controller system, e.g., all the ∂R_j(t+1)/∂R_i(t) and ∂R_j(t+1)/∂u_k(t). In terms of our specific problem, we would be estimating such things as the corresponding partial derivatives of the stock and transport states.…”
Section: A. Approximate Dynamic Programming (mentioning, confidence: 99%)
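Read this way, critic training is a supervised step toward the differentiated Bellman recursion, with the model Jacobians ∂R(t+1)/∂R(t) and ∂R(t+1)/∂u(t) supplying the right-hand side. The sketch below uses a plain linear critic and hypothetical names for the Jacobian inputs; it is an assumed, simplified rendering of that identity, not the formulation of the cited work's specific problem.

```python
# Simplified sketch of one DHP critic-training step built from the identity
# obtained by differentiating J(t) = U(t) + gamma * J(t+1) w.r.t. the state R(t).
# The linear critic and every input name are assumptions for illustration only.
import numpy as np

gamma = 0.95
n, m = 4, 2                                  # placeholder state/control dimensions
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((n, n))        # linear critic: lambda_hat(R) = W @ R

def critic(R):
    return W @ R                             # estimate of dJ/dR at state R

def dhp_critic_step(R, u, R_next, dU_dR, dU_du, du_dR, dRnext_dR, dRnext_du, lr=1e-2):
    """Move the critic weights toward the differentiated-Bellman target.

    dRnext_dR (n x n) and dRnext_du (n x m) are the plant Jacobians the excerpt
    refers to; du_dR (m x n) is the controller's sensitivity to the state.
    """
    global W
    lam_next = critic(R_next)                                        # lambda(t+1)
    target = (dU_dR + du_dR.T @ dU_du
              + gamma * (dRnext_dR + dRnext_du @ du_dR).T @ lam_next)
    error = critic(R) - target                                       # per-state-element residual
    W -= lr * np.outer(error, R)                                     # squared-error gradient step
```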