Proceedings of the 22nd International Conference on Machine Learning (ICML '05), 2005
DOI: 10.1145/1102351.1102377

Reinforcement learning with Gaussian processes

Cited by 260 publications (268 citation statements). References 5 publications.

“…Engel et al. [2] approached the problem from the viewpoint of temporal difference learning (GPTD) and later extended this scheme to deal with stochastic state transitions, to improve action selection, and to learn Q-values without an explicit transition model (GPSARSA) [3]. Their approach was successfully applied to the problem of learning complex manipulation policies [4].…”
Section: Related Work
confidence: 99%
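
As a concrete illustration of the GPTD idea summarized in the excerpt above, the following sketch conditions a Gaussian process prior on the value function on the temporal differences observed along a single trajectory. It is a minimal batch version: the helper names (rbf_kernel, gptd_posterior), the white observation noise, and the toy chain data are illustrative assumptions, and the online sparsification and correlated-noise model of the actual GPTD/GPSARSA algorithms are omitted.

import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel between two sets of row-vector inputs.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gptd_posterior(states, rewards, gamma=0.95, noise=0.1, lengthscale=1.0):
    # Model: r_t = V(x_t) - gamma * V(x_{t+1}) + eps_t, with a GP prior on V.
    # Returns a function mapping query states to posterior mean values.
    T = len(rewards)                          # T transitions over T+1 visited states
    K = rbf_kernel(states, states, lengthscale)
    H = np.zeros((T, T + 1))
    H[np.arange(T), np.arange(T)] = 1.0
    H[np.arange(T), np.arange(T) + 1] = -gamma
    G = H @ K @ H.T + noise ** 2 * np.eye(T)  # covariance of the observed rewards
    alpha = np.linalg.solve(G, rewards)

    def value(query):
        k_star = rbf_kernel(np.atleast_2d(query), states, lengthscale)
        return (k_star @ H.T @ alpha).squeeze()

    return value

# Toy usage: a 1-D chain where only the final transition is rewarded.
states = np.linspace(0.0, 1.0, 6).reshape(-1, 1)
rewards = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
V = gptd_posterior(states, rewards)
print(V(np.array([[0.8]])))                   # posterior mean value near the rewarding end
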
“…This, however, can lead to discretization errors or, when fine-grained grids are used, requires a huge amount of memory and a time-consuming exploration process. Therefore, function approximation techniques that directly operate on the continuous space, such as neural networks [1], [15], kernel methods [9], or Gaussian processes [13], [3], have been proposed as powerful alternatives to the discrete approximations of the continuous Q-function. From a regression perspective, these techniques seek to model the dependency…”
Section: B. Learning the Q-function
confidence: 99%
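
To make the regression perspective concrete, the sketch below fits a Q-function over joint (state, action) inputs with kernel ridge regression inside a fitted-Q loop. It is not any one of the cited methods specifically; the function names, the squared-exponential kernel, the ridge term, and the finite action set are illustrative assumptions.

import numpy as np

def rbf(A, B, ls=0.5):
    # Squared-exponential kernel between row-vector inputs.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def fitted_q_kernel(s, a, r, s_next, actions, gamma=0.9, reg=1e-3, iters=30):
    # Kernel-ridge fitted-Q iteration: continuous states, finite action set.
    # s, s_next: (N, d) state arrays; a: (N, 1) actions taken; r: (N,) rewards.
    X = np.hstack([s, a])                     # joint (state, action) inputs
    K = rbf(X, X)
    alpha = np.zeros(len(X))
    for _ in range(iters):
        # Bootstrapped targets r + gamma * max_a' Q(s', a') under the current fit.
        q_next = np.stack([
            rbf(np.hstack([s_next, np.full((len(s_next), 1), act)]), X) @ alpha
            for act in actions
        ], axis=1)
        y = r + gamma * q_next.max(axis=1)
        alpha = np.linalg.solve(K + reg * np.eye(len(X)), y)

    def q(state, action):
        x = np.hstack([np.atleast_2d(state), [[action]]])
        return (rbf(x, X) @ alpha).item()

    return q

# Usage sketch: q = fitted_q_kernel(s, a, r, s_next, actions=[0.0, 1.0]); q(some_state, 1.0)
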
“…The parameters of the value function are usually learned from data, as in the case of incremental TD and the Least-Squares TD (LSTD) methods [4,5]. Also, kernelized reinforcement learning methods have received a lot of attention, employing kernel techniques in standard RL methods [6] and Gaussian processes for approximating the value function [7][8][9].…”
Section: Introduction
confidence: 99%
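
The batch LSTD solve referenced in the excerpt can be written in a few lines, assuming a linear value function V(s) ≈ φ(s)ᵀw; the small ridge term and the function name are illustrative assumptions added for numerical stability.

import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.95, reg=1e-6):
    # Batch LSTD: weights w such that V(s) is approximated by phi(s) @ w.
    # phi, phi_next: (N, d) features of s_t and s_{t+1}; rewards: (N,).
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
    b = phi.T @ rewards
    return np.linalg.solve(A, b)
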
“…Furthermore, Gaussian kernels have 'centers', which alleviates the difficulty of basis subset choice, e.g., uniform allocation (Lagoudakis and Parr 2003) or sample-dependent allocation (Engel et al. 2005). In this paper, we therefore define Gaussian kernels on graphs (which we call geodesic Gaussian kernels) and propose using them for value function approximation (see Fig.…”
Section: Introduction
confidence: 99%
“…Our definition of Gaussian kernels on graphs employs the shortest paths between states rather than the Euclidean distance, which can be computed efficiently using the Dijkstra algorithm (Dijkstra 1959; Fredman and Tarjan 1987). Moreover, an effective use of Gaussian kernels opens up the possibility to exploit the recent advances in using Gaussian processes for temporal-difference learning (Engel et al. 2005).…”
Section: Introduction
confidence: 99%
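
The geodesic Gaussian kernel construction described in the last two excerpts can be sketched as follows: shortest-path distances from a set of center states are computed with Dijkstra's algorithm (here via SciPy's csgraph routine) and plugged into a Gaussian kernel to form basis features for value function approximation. The function name, the SciPy dependency, and the toy chain graph are illustrative assumptions.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def geodesic_gaussian_features(adjacency, centers, sigma=1.0):
    # adjacency: (n, n) matrix of edge costs between states (0 = no edge);
    # centers:   indices of the states used as kernel centers.
    # Returns an (n, len(centers)) feature matrix whose (s, i) entry is
    # exp(-SP(s, c_i)^2 / (2 sigma^2)), with SP the shortest-path distance.
    graph = csr_matrix(adjacency)
    sp = dijkstra(graph, directed=False, indices=centers)   # (len(centers), n)
    return np.exp(-0.5 * (sp.T / sigma) ** 2)

# Toy usage on a 4-state chain 0-1-2-3 with unit edge costs.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = geodesic_gaussian_features(adj, centers=[0, 3], sigma=1.0)
print(features)
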