Necessary optimality conditions for average cost minimization problems

Bettiol, Piernicola; Khalil, Nathalie T.

doi:10.3934/dcdsb.2019086

Cited by 5 publications

(10 citation statements)

References 15 publications

“…Then, in Sect. 4, we also derive a Pontryagin's maximum principle for problem (2), refining some results in [2]. In Sect.…”

Section: Introduction (mentioning)

confidence: 56%

“…In Sect. 5, we state and prove the main results of the paper, providing positive answers to question (1) and question (2). In Sect.…”

Section: Introduction (mentioning)

confidence: 72%

“…We recall the following result due to Bettiol and Khalil, which is a special case of Theorem 3.3 in [2]:…”

Section: Optimality Conditions (mentioning)

confidence: 99%

“…However, scrutiny to the proof given there reveals that the result still holds true under the relaxed condition (TH). Furthermore, Theorem 3.3 in [2] is derived for a Mayer optimal control problem, i.e., with only a final cost, but an analogous theorem for Bolza optimal control problems can be easily obtained by using a standard state augmentation argument.…”

Section: Remark 43 (mentioning)

confidence: 99%

“…A general, rigorous framework capturing PILCO as well as other Bayesian model-based RL approaches (see, e.g., [3,4,[10][11][12]29]) has been developed in [18,19]. In particular, it is important to mention that the framework developed in [18] is closely related to the averaging control framework and Riemann-Stieltjes optimal control [2,16,21,24,30].…”

Section: Introduction (mentioning)

confidence: 99%

Convergence results for an averaged LQR problem with applications to reinforcement learning

Andrea; Palladino; Falcone. 2021. Math. Control Signals Syst.

In this paper, we deal with a linear quadratic optimal control problem with unknown dynamics. As a modeling assumption, we suppose that the agent's knowledge of the current system is represented by a probability distribution $\pi$ on the space of matrices. Furthermore, we assume that this probability measure is suitably updated to take into account the increased experience that the agent obtains while exploring the environment, approximating the underlying dynamics with increasing accuracy. Under these assumptions, we show that the optimal control obtained by solving the “average” linear quadratic optimal control problem with respect to a certain $\pi$ converges to the optimal control of the linear quadratic optimal control problem governed by the actual, underlying dynamics. This approach is closely related to model-based reinforcement learning algorithms where prior and posterior probability distributions describing the knowledge of the uncertain system are recursively updated. In the last section, we present a numerical test that confirms the theoretical results.

A model for system uncertainty in reinforcement learning

Murray; Palladino. 2018. Systems & Control Letters

This work provides a rigorous framework for studying continuous-time control problems in uncertain environments. The framework models uncertainty in the state dynamics as a measure on the space of functions. This measure is allowed to change over time as agents learn their environment. The model can be seen as a variant of either Bayesian reinforcement learning or adaptive control. We study necessary conditions for locally optimal trajectories within this model, in particular deriving an appropriate dynamic programming principle and Hamilton-Jacobi equations. This model provides one possible framework for studying the tradeoff between exploration and exploitation in reinforcement learning.

Optimal control of ensembles of dynamical systems

Scagliotti. 2023. ESAIM: COCV

In this paper we consider the problem of the optimal control of an ensemble of affine-control systems. After proving the well-posedness of the minimization problem under examination, we establish a $\Gamma$-convergence result that allows us to substitute the original (and usually infinite) ensemble with a sequence of finite increasing-in-size sub-ensembles. The solutions of the optimal control problems involving these sub-ensembles provide approximations in the $L^2$-strong topology of the minimizers of the original problem. Using again a $\Gamma$-convergence argument, we manage to derive a Maximum Principle for ensemble optimal control problems with end-point cost. Moreover, in the case of finite sub-ensembles, we can address the minimization of the related cost through numerical schemes. In particular, we propose an algorithm that consists of a subspace projection of the gradient field induced on the space of admissible controls by the approximating cost functional. In addition, we consider an iterative method based on the Pontryagin Maximum Principle. Finally, we test the algorithms on an ensemble of linear systems in $\mathbb{R}^2$.
