1973
DOI: 10.1109/tac.1973.1100238

Wide-sense adaptive dual control for nonlinear stochastic systems

Cited by 163 publications (33 citation statements)
References 17 publications
“…Open-loop feedback control and the present algorithm also share the unfortunate characteristic that the agent does not take account of the possibility of future use of the knowledge acquired. This possibility, which is at the heart of the tradeoff between exploration and exploitation, is incorporated into a more sophisticated and complicated method called wide-sense dual adaptive control (Tse, Bar-Shalom & Meier, 1973), which also has a counterpart in the sort of problems we are considering here. We are currently testing an algorithm in which the agent knows that it will perform repeated trials in the same maze, and knows that if it finds a transition to be open on one trial then it can expect to use that transition in succeeding trials, which gives the agent an incentive to learn.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, consider the quadratic regulator problem of Tse, Bar-Shalom and Meier (1973) which has imperfect state information and a nonlinear transition function. Their system used a second order model that maintained only the mean and variance, which were updated using an extended Kalman filter in the light of information from the world.…”
Section: Discussion (mentioning)
confidence: 99%
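
To illustrate what "maintaining only the mean and variance" with an extended Kalman filter can look like, here is a minimal scalar sketch. It is not the estimator from Tse, Bar-Shalom and Meier (1973): the function name `ekf_step`, the models `f` and `h`, their derivatives `df` and `dh`, and the noise variances `q` and `r` are all assumptions chosen for illustration.

```python
import math


def ekf_step(mean, var, u, y, f, df, h, dh, q, r):
    """One predict/update cycle of a scalar extended Kalman filter.

    All arguments are illustrative assumptions:
      f(x, u), df(x, u): transition model and its derivative with respect to x
      h(x), dh(x):       measurement model and its derivative
      q, r:              process and measurement noise variances
    """
    # Predict: push the mean through f, and the variance through the linearization of f.
    mean_pred = f(mean, u)
    F = df(mean, u)
    var_pred = F * var * F + q

    # Update: correct the prediction with the measurement y.
    H = dh(mean_pred)
    innovation_var = H * var_pred * H + r
    gain = var_pred * H / innovation_var
    mean_new = mean_pred + gain * (y - h(mean_pred))
    var_new = (1.0 - gain * H) * var_pred
    return mean_new, var_new


if __name__ == "__main__":
    # Hypothetical nonlinear transition and direct state measurement.
    f = lambda x, u: x + 0.1 * math.sin(x) + u
    df = lambda x, u: 1.0 + 0.1 * math.cos(x)
    h = lambda x: x
    dh = lambda x: 1.0
    print(ekf_step(0.0, 1.0, 0.5, 0.7, f, df, h, dh, q=0.01, r=0.1))
```

The filter's belief about the state is summarized by the pair `(mean, var)`, which is the only information passed from one step to the next.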
“…The control problem is then reduced to finding the control sequence $u(k), \ldots, u(k+N-d)$ that minimizes the performance index (7) subject to equation (5). Theoretically, the optimal control could be obtained by solving a dynamic programming problem.…”
Section: Formulation (mentioning)
confidence: 99%
“…The first ones try to solve the Bellman equation by using some approximated solutions of dynamic programming [5] [6]. In the second class, the optimization problem is reformulated by modifying cost functions [7] [8] [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…The use of standard certainty equivalence control techniques and various other approximation techniques is limited due to the inherent complexity of the problem, including the trade-off between efficient control and reliable estimation. [17-21] In this paper, a dual control algorithm considering such a trade-off is derived. To improve estimation accuracy, it incorporates the cost incurred by the system uncertainty into the performance index.…”
Section: Stochastic Feedback Dual Control Algorithm (mentioning)
confidence: 99%
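
As a rough illustration of how parameter uncertainty can enter a performance index, the sketch below compares a certainty-equivalent control with a one-step "cautious" control for a hypothetical scalar system x(k+1) = x(k) + b u(k) + w(k), where the gain b is unknown with mean `b_mean` and variance `b_var`. This is a simpler relative of dual control, not the algorithm derived in the citing paper: it only penalizes uncertainty and does not reward probing. The system and all names are assumptions.

```python
def certainty_equivalent_control(x, b_mean):
    """Treat the estimate b_mean as exact and drive the predicted state to zero."""
    return -x / b_mean


def cautious_control(x, b_mean, b_var):
    """Minimize the expected one-step cost E[x(k+1)^2] over u.

    With b ~ (b_mean, b_var) independent of the zero-mean noise w,
      E[x(k+1)^2] = (x + b_mean * u)^2 + b_var * u^2 + Var(w),
    so the gain uncertainty b_var appears as an extra penalty on u.
    Setting the derivative with respect to u to zero gives the formula below.
    """
    return -b_mean * x / (b_mean ** 2 + b_var)


if __name__ == "__main__":
    x = 1.0
    for b_var in (0.0, 1.0, 4.0):
        u_ce = certainty_equivalent_control(x, b_mean=2.0)
        u_c = cautious_control(x, b_mean=2.0, b_var=b_var)
        print(f"b_var={b_var}: CE u={u_ce}, cautious u={u_c}")
```

When `b_var` is zero the two controls coincide; as `b_var` grows, the cautious control shrinks toward zero, which shows the tension between controlling well and learning the unknown gain.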