2020
DOI: 10.1002/acs.3097
Extremum seeking for optimal control problems with unknown time‐varying systems and unknown objective functions

Abstract: We consider the problem of optimal feedback control of an unknown, noisy, time‐varying, dynamic system that is initialized repeatedly. Examples include a robotic manipulator which must repeatedly perform the same motion, such as assisting a human, and accelerating cavities in particle accelerators which are turned on for a fraction of a second with given initial conditions and which vary slowly due to temperature fluctuations. We present an approach that applies to systems of practical interest. The method p…

Cited by 13 publications (11 citation statements)
References 53 publications (96 reference statements)
“…In our case we prefer to use a recently developed robust extremum seeking (ES) feedback control algorithm, which was originally designed for the stabilization of unknown, open-loop unstable, time-varying nonlinear systems and was then extended to the optimization of many-parameter, time-varying, unknown systems with noise-corrupted measurement functions [67,68]. This method can quickly tune many coupled parameters and has been applied to time-varying problems such as adaptively learning optimal feedback control policies for unknown time-varying systems directly from measurement data [69].…”
Section: Overview Of Main Results
confidence: 99%
“…The adaptive feedback dynamics (13) are chosen based on the results in [67][68][69]. The hyperparameters in (13) can intuitively be understood as follows: α represents a dithering amplitude that controls the size of the dynamic perturbations, k is a feedback gain, and the product kα can be thought of as an overall learning rate.…”
Section: Adaptive Latent Space Tuning
confidence: 99%
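The roles of α, k, and the learning rate kα described above can be illustrated with a minimal sketch of bounded extremum seeking. This is an illustrative stand-in, not the cited papers' implementation: the quadratic cost with a slowly drifting optimum, the dithering frequencies, and all numerical values are assumptions chosen for the example.

```python
import numpy as np

def cost(p, t, rng):
    """Hypothetical noisy, time-varying objective C(p, t): a quadratic
    bowl whose minimizer drifts slowly, plus measurement noise."""
    drift = 0.5 * np.sin(0.001 * t)
    return np.sum((p - drift) ** 2) + 0.01 * rng.standard_normal()

def extremum_seeking(n_params=3, n_steps=40000, dt=0.005,
                     alpha=0.5, k=2.0, seed=0):
    """Bounded ES update: each parameter is dithered at a distinct
    frequency; alpha sets the perturbation size, k the feedback gain,
    and on average the parameters descend the cost at rate ~ k*alpha."""
    rng = np.random.default_rng(seed)
    # Distinct dithering frequencies so the parameter updates decouple.
    omegas = 30.0 * (1.0 + np.arange(n_params) / n_params)
    p = np.ones(n_params)  # arbitrary initial guess, far from the optimum
    for i in range(n_steps):
        t = i * dt
        c = cost(p, t, rng)
        # Update velocity is bounded by sqrt(alpha*omega) no matter how
        # large the measured cost is; the cost only shifts the phase.
        p = p + dt * np.sqrt(alpha * omegas) * np.cos(omegas * t + k * c)
    return p

p_final = extremum_seeking()
```

After the run, `p_final` oscillates within the dither amplitude (roughly sqrt(alpha/omega)) of the drifting optimum, even though the update never uses gradients of the unknown cost.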
“…Reinforcement learning has recently grown in popularity, with ML methods being used to learn models for Dynamic Programming problems in order to satisfy the Bellman optimality condition [14]. In [15], a reinforcement learning approach is utilized in which optimal feedback control laws (or "agent policies") are learned online directly from system data for unknown and time-varying systems. This approach was studied for the optimal control of radio frequency accelerating cavities with characteristics that drift with time, such as cable length changes, resonance frequency shifts, and analog component fluctuations, due to temperature variations.…”
Section: ML-enhanced Control
confidence: 99%
“…where f and g are both nonlinear, time-varying, and analytically unknown. The method has now been generalized further with analytical proofs of stability for non-differentiable systems as well as systems not affine in control [41,42], of the form ẋ = f(x, u, t), and has been utilized in various particle accelerator applications [26,[29][30][31][32]43].…”
Section: Unknown Time-varying Systems and Adaptive Feedback Control
confidence: 99%
“…Recently, methods such as reinforcement learning have been developed for this optimal control problem for unknown systems with unknown dynamics f and a measurable but analytically unknown cost function C, for which it is impossible to calculate the optimal control policy with the standard Dynamic Programming approach [4]. One example is to use adaptive methods for online RL, in which optimal feedback control policies, parametrized by a chosen set of basis functions whose coefficients are adaptively tuned online, are learned directly from data [43]. Other RL approaches learn the unknown dynamics f, the cost function C, or both directly from measured data and represent them as trained neural networks, as described above.…”
Section: Machine Learning For Time-varying Systems
confidence: 99%
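The parametrization described in the last excerpt, a feedback policy expressed in a chosen set of basis functions whose coefficients are the quantities tuned online, can be sketched as follows. This is a generic illustration under assumed choices (Gaussian radial basis functions on a 1-D state, five centers, zero initial coefficients), not the specific basis used in [43].

```python
import numpy as np

def rbf_features(x, centers, width=0.5):
    """Gaussian radial basis functions phi_i(x) = exp(-|x - c_i|^2 / 2w^2),
    evaluated at state x for every center c_i."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))

def policy(x, theta, centers):
    """Feedback control law u(x) = theta^T phi(x): nonlinear in the state,
    but linear in the tunable coefficients theta, which is what makes
    online adaptive tuning of theta tractable."""
    return theta @ rbf_features(x, centers)

# Five basis-function centers spread over a 1-D state interval [-1, 1].
centers = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
theta = np.zeros(5)            # coefficients to be adapted online
u = policy(np.array([0.2]), theta, centers)
```

With this structure, an adaptive scheme such as the extremum seeking method quoted earlier only has to tune the finite coefficient vector theta, rather than an arbitrary function of the state.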