2009 American Control Conference
DOI: 10.1109/acc.2009.5160344

Approximate dynamic programming using Bellman residual elimination and Gaussian process regression

Abstract: The overarching goal of the thesis is to devise new strategies for multi-agent planning and control problems, especially in the case where the agents are subject to random failures, maintenance needs, or other health management concerns, or in cases where the system model is not perfectly known. We argue that dynamic programming techniques, in particular Markov Decision Processes (MDPs), are a natural framework for addressing these planning problems, and present an MDP problem formulation for a persistent surv…

Cited by 11 publications (9 citation statements), 2010–2019
References 62 publications
“…The core of Theorem 3 is that the optimization objectives differ on the left-hand side and the right-hand side of (28). Theorem 3 indicates that better generalization ability is obtained when a separate value function is optimized for each local subspace of the state space and the results are then combined into a global value function.…”
Section: Assumption
confidence: 99%
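To make the contrast concrete (a schematic only; equation (28) of the citing paper is not reproduced here, and the symbols S_i, V_i and the Bellman operator T are assumed notation), the two objectives can be written as a single fit over the whole state space S versus separate fits over local subspaces S_1, ..., S_m whose union is S:

    \min_{V} \sum_{s \in S} \big( T V(s) - V(s) \big)^2                      (global objective)
    \min_{V_i} \sum_{s \in S_i} \big( T V_i(s) - V_i(s) \big)^2,  i = 1,...,m  (per-subspace objectives)

with the locally optimized V_i then combined into a global value function.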
“…Replacing-kernel reinforcement learning (RKRL) is an online model-selection method for GPTD that uses a sequential Monte Carlo method [26]. An approach based on Bellman residual elimination (BRE), rather than Bellman residual minimization [27], is introduced to KBRL; it emphasizes that the Bellman error is explicitly forced to zero, and BRE(GP) is proposed based on Gaussian process regression [28]. A unifying view of the different approaches to kernelized value function approximation for RL is proposed, demonstrating that several model-free kernelized value function approximators can be viewed as special cases of a novel, model-based value function approximator [29].…”
confidence: 99%
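For reference, the distinction drawn there between minimizing and eliminating the Bellman residual can be summarized as follows (assumed notation, not quoted from the cited papers): with an approximate cost-to-go \tilde{J}, policy \pi, stage cost g, discount factor \alpha and transition model P, the Bellman residual at a state s is

    BR(s) = \tilde{J}(s) - \Big( g\big(s,\pi(s)\big) + \alpha \sum_{s'} P\big(s' \mid s,\pi(s)\big)\, \tilde{J}(s') \Big).

Bellman residual minimization fits \tilde{J} by minimizing \sum_{s \in \tilde{S}} BR(s)^2 over a set of sample states \tilde{S}; Bellman residual elimination instead chooses \tilde{J} (for example as a kernel expansion or a Gaussian process posterior mean) so that BR(s) = 0 holds exactly for every s \in \tilde{S}.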
“…The contribution of this paper is a novel approach to estimating the ROA of nonlinear systems in a flexible and parallelizable way, exploiting Gaussian process (GP) regression to learn the infinite-horizon cost function, which can be used as a Lyapunov function for stable systems [11]. The infinite-horizon cost is learned efficiently with a Gaussian process by exploiting the Bellman equation [12]. Since the learned cost might violate the Lyapunov conditions around the origin due to regression errors, we derive a theorem that allows extending known regions of attraction through a Lyapunov-like function.…”
Section: Introduction
confidence: 99%
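The Bellman fixed-point relation exploited there can be stated, in assumed notation for a deterministic system x_{k+1} = f(x_k) with stage cost l under a fixed stabilizing policy, as

    V(x) = l(x) + V\big(f(x)\big),   where   V(x) = \sum_{k=0}^{\infty} l(x_k),  x_0 = x,

which immediately yields the Lyapunov-like decrease condition V(f(x)) - V(x) = -l(x) \le 0 used for stable systems.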
“…This work has led to algorithms such as GPTD [8], an approach which uses temporal differences to learn a Gaussian process representation of the cost-to-go function, and GPDP [9], an approximate value iteration scheme based on a similar Gaussian process cost-to-go representation. Another recently developed approach, known as Bellman Residual Elimination (BRE) [1], [2], uses kernel-based regression to solve a system of Bellman equations over a small set of sample states.…”
Section: Introduction
confidence: 99%
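As an illustration of that last point (a minimal sketch only, not the BRE(GP) algorithm of [1], [2]; the dynamics f, stage cost g, discount factor alpha, and Gaussian kernel below are all placeholder assumptions), kernel-based regression can enforce the Bellman equation exactly at a small set of sample states:

import numpy as np

def gaussian_kernel(x, y, length_scale=1.0):
    # Gaussian (RBF) kernel between two state vectors.
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-0.5 * float(np.dot(d, d)) / length_scale**2)

def bre_policy_evaluation(states, f, g, alpha=0.95, length_scale=1.0):
    # Represent the cost-to-go as J(s) = sum_j w_j k(s, s_j) and solve the
    # linear system J(s_i) - alpha * J(f(s_i)) = g(s_i), i.e. force the
    # Bellman residual to zero at every sample state s_i (fixed policy,
    # deterministic dynamics assumed for simplicity).
    n = len(states)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for i, s in enumerate(states):
        s_next = f(s)                      # deterministic successor state
        for j, s_j in enumerate(states):
            A[i, j] = (gaussian_kernel(s, s_j, length_scale)
                       - alpha * gaussian_kernel(s_next, s_j, length_scale))
        b[i] = g(s)                        # stage cost under the fixed policy
    w, *_ = np.linalg.lstsq(A, b, rcond=None)

    def J(s):
        # Kernel expansion of the cost-to-go at an arbitrary query state.
        return float(sum(w_j * gaussian_kernel(s, s_j, length_scale)
                         for w_j, s_j in zip(w, states)))
    return J

# Toy usage: scalar linear system x' = 0.8 x with quadratic stage cost.
samples = [np.array([x]) for x in np.linspace(-2.0, 2.0, 15)]
J = bre_policy_evaluation(samples, f=lambda s: 0.8 * s, g=lambda s: float(s @ s))
print(J(np.array([1.0])))  # approximate discounted cost-to-go at x = 1

Here a plain linear solve plays the role of the regression step; the cited BRE(GP) approach instead builds on full Gaussian process regression, as noted in the passages above.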