Proceedings of the 24th International Conference on Machine Learning 2007
DOI: 10.1145/1273496.1273545
Constructing basis functions from directed graphs for value function approximation

Abstract: Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success of this technique is attributed to the smoothness of the basis functions with respect to the state space geometry. This paper explores the properties of bases created from directed graphs which are a more natural fit for expressing state connectivity. Digraphs capture the effect of non-reversible MDPs whose value functions may not be s…
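The undirected-graph construction the abstract builds on can be sketched as follows. This is a minimal NumPy sketch, not the paper's actual algorithm: it assumes a symmetrized k-nearest-neighbor graph over sampled states and uses the eigenvectors of the combinatorial Laplacian with the smallest eigenvalues as basis functions (the paper's contribution concerns the directed variant).

```python
import numpy as np

def laplacian_basis(states, k=3, num_basis=4):
    """Build an undirected k-nearest-neighbor graph over sampled states
    and return the eigenvectors of its combinatorial graph Laplacian
    L = D - W with the smallest eigenvalues.  These eigenvectors are
    smooth over the graph and serve as basis functions for value
    function approximation."""
    n = len(states)
    # Pairwise Euclidean distances between sampled states.
    d = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        # Connect each state to its k nearest neighbors (symmetrized).
        for j in np.argsort(d[i])[1:k + 1]:
            W[i, j] = W[j, i] = 1.0
    L = np.diag(W.sum(axis=1)) - W          # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues ascending
    return eigvecs[:, :num_basis]           # smoothest basis functions

# One-dimensional chain of sampled states: the resulting basis
# resembles low-frequency Fourier modes on the chain.
states = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
Phi = laplacian_basis(states, k=2, num_basis=4)

# Fit a value function by least squares in the learned basis.
v = np.sin(np.pi * states[:, 0])
w, *_ = np.linalg.lstsq(Phi, v, rcond=None)
v_hat = Phi @ w
```

In an actual MDP the graph would be built from observed state transitions rather than from distances, which is precisely where the directed case arises: transitions need not be reversible, so the induced graph need not be symmetric.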

Cited by 31 publications (21 citation statements); references 10 publications.
“…Let us consider the asymptotic form of the optimization problem solved by ℓ1-PBR, depicted in Equation (10). Using the Pythagorean theorem, it can be rewritten as follows:…”
Section: Discussion
confidence: 99%
“…This encompasses many different methods, such as feature construction [10,15] or kernel-based approaches [7,22]. Another approach consists of defining beforehand a (very) large number of features and then automatically choosing those which are relevant for the problem at hand.…”
Section: Introduction
confidence: 99%
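The automatic-selection idea in the statement above (define many candidate features, keep only the relevant ones) is commonly realized with ℓ1-regularized regression. A minimal iterative soft-thresholding (ISTA) sketch, assuming a generic feature matrix `Phi` rather than the cited papers' actual setups:

```python
import numpy as np

def lasso_ista(Phi, y, lam=0.1, iters=500):
    """Iterative soft-thresholding for
        min_w 0.5 * ||Phi w - y||^2 + lam * ||w||_1.
    The l1 penalty drives the weights of irrelevant features exactly
    to zero, performing feature selection."""
    w = np.zeros(Phi.shape[1])
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # 1 / Lipschitz constant
    for _ in range(iters):
        g = Phi.T @ (Phi @ w - y)              # gradient of the smooth part
        z = w - step * g
        # Soft-thresholding: the proximal operator of the l1 norm.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

# 50 samples, 10 candidate features, only the first two relevant.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 10))
y = 3.0 * Phi[:, 0] - 2.0 * Phi[:, 1]
w = lasso_ista(Phi, y, lam=0.5)
```

After fitting, the weights of the eight irrelevant features are (near) zero while the two relevant ones survive, which is the selection behavior the quoted passage describes.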
“…In contrast to these approaches that make use of the approximation of the Bellman error, including ours, the work by Mahadevan et al. aims to find policy- and reward-function-independent basis functions that capture the intrinsic domain structure and can be used to represent any value function [18][19][20]. Their approach originates from the idea of using manifolds to model the topology of the state space; a state space connectivity graph is built from samples of state transitions, and then the eigenvectors of the (directed) graph Laplacian with the smallest eigenvalues are used as basis functions.…”
Section: Methods
confidence: 99%
“…These eigenvectors form Φ and can be used with any of the approximation methods described in Section 4. A comparison of the directed and undirected Laplacian for solving MDPs can be found in [57]. The directed Laplacian requires a strongly connected graph; however, graphs created from an agent's experience may not have this property.…”
Section: Definition 6.1 The Combinatorial and Normalized Graph Laplacian
confidence: 99%
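The strong-connectivity precondition noted in the statement above is straightforward to verify before attempting to build a directed Laplacian. A minimal sketch using SciPy's strongly-connected-components routine (the adjacency matrices here are illustrative, not from the paper):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def is_strongly_connected(adj):
    """Return True if the directed graph given by adjacency matrix
    `adj` is strongly connected, i.e. every state can reach every
    other state.  The directed graph Laplacian (defined via the
    stationary distribution of the random walk) requires this."""
    n_comp, _ = connected_components(csr_matrix(adj),
                                     directed=True, connection='strong')
    return n_comp == 1

# A 3-state cycle is strongly connected ...
cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])

# ... but a 3-state chain with no path back is not.
chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])
```

When the experience graph fails this check, a common workaround is to restrict attention to the largest strongly connected component or to add a small teleportation term to every transition, which makes the walk irreducible.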