2008 IEEE International Conference on Fuzzy Systems (IEEE World Congress on Computational Intelligence) 2008
DOI: 10.1109/fuzzy.2008.4630449

Fuzzy Q-Learning with an adaptive representation

Cited by 12 publications (8 citation statements)
References 9 publications
“…Such methods have been proposed, e.g., for Q-learning (Reynolds, 2000; Ratitch and Precup, 2004; Waldock and Carse, 2008), V-iteration (Munos and Moore, 2002), and Q-iteration (Munos, 1997; Uther and Veloso, 1998).…”
Section: Basis Function Refinement
confidence: 99%
“…• when the value function is not (approximately) constant in that region (Munos and Moore, 2002; Waldock and Carse, 2008);…”
Section: Basis Function Refinement
confidence: 99%
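The refinement criterion quoted above can be illustrated with a minimal sketch: a basis-function region is split when the value estimates it covers are not approximately constant. The function name, threshold, and sampling scheme below are assumptions for illustration, not the method of the cited works.

```python
import numpy as np

def should_refine(q_values_in_region, threshold=0.1):
    """Hypothetical refinement test: split a region when the value
    function varies too much over it, i.e. it is not approximately
    constant there (cf. Munos and Moore, 2002; Waldock and Carse, 2008)."""
    spread = np.max(q_values_in_region) - np.min(q_values_in_region)
    return spread > threshold

# Usage sketch: value estimates sampled at points inside one region.
samples = np.array([0.42, 0.47, 0.95, 0.50])
refine = should_refine(samples)  # True: the spread exceeds the threshold
```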
“…Each input variable has 3 MFs with a total of 6 MFs. Each MF, as defined by (12), has 2 parameters to be tuned. These parameters are the standard deviation, σ, and the mean, m. The total number of input parameters to be tuned is 12 parameters.…”
Section: Fuzzy Logic Controller
confidence: 99%
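A minimal sketch of the parameterisation described in this statement, assuming Gaussian membership functions with a mean m and standard deviation σ per MF; the specific means, widths, and input layout below are illustrative, not taken from the cited paper.

```python
import numpy as np

def gaussian_mf(x, mean, sigma):
    """Gaussian membership function with the two tunable parameters
    mentioned in the quote: the mean m and the standard deviation sigma."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Assumed layout: two input variables, three MFs each -> 6 MFs,
# i.e. 12 tunable input parameters in total.
means  = np.array([[-1.0, 0.0, 1.0],    # MFs for input 1
                   [-1.0, 0.0, 1.0]])   # MFs for input 2
sigmas = np.full((2, 3), 0.5)

x = np.array([0.3, -0.7])               # one sample of the two inputs
memberships = gaussian_mf(x[:, None], means, sigmas)  # shape (2, 3)
```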
“…In the Q-learning algorithm, the state and action spaces are discrete and their corresponding value function is stored in what is known as a Q-table. To use Q-learning with continuous systems (continuous state and action spaces), one can discretize the state and action spaces [11] or use some type of function approximation such as fuzzy systems [12], neural networks [13], or use some type of optimization technique such as GAs [14]. A one-step update rule for Q-learning is defined in (6), where α is the learning rate (0 < α ≤ 1) and Δ_t is the temporal-difference error (TD-error), defined as Δ_t = r_{t+1} + γ max_a Q_t(s_{t+1}, a) − Q_t(s_t, a_t).…”
Section: A Reinforcement Learning
confidence: 99%
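A minimal tabular sketch of the one-step update this statement describes, using the TD-error quoted above; the table size, learning rate, and discount factor are assumed values for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One-step tabular Q-learning update matching the quoted rule:
    delta_t = r_{t+1} + gamma * max_a' Q(s_{t+1}, a') - Q(s_t, a_t),
    and Q(s_t, a_t) is nudged by alpha times that TD-error."""
    td_error = r + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * td_error
    return Q

# Usage on a toy 4-state, 2-action Q-table (values are illustrative).
Q = np.zeros((4, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
```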
“…The state and action spaces are discrete and their corresponding value function is stored in what is known as a Q-table. To use Q-learning with continuous systems (continuous state and action spaces), one can discretize the state and action spaces [21, 31–36] or use some type of function approximation such as FISs [26–28], NNs [12, 19, 37], or use some type of optimization technique such as genetic algorithms [38, 39].…”
Section: Reinforcement Learning
confidence: 99%
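To make the discretization option concrete, here is a minimal sketch that maps a continuous state onto Q-table indices; the state bounds, bin counts, and action count are assumptions for illustration, not taken from the cited works.

```python
import numpy as np

def discretize(state, low, high, bins):
    """Map a continuous state vector to a tuple of bin indices so that
    a plain Q-table can be used for a continuous state space."""
    ratios = (np.asarray(state) - low) / (high - low)
    idx = (ratios * bins).astype(int)
    return tuple(np.clip(idx, 0, np.asarray(bins) - 1))

# Assumed two-dimensional state space with 10 bins per dimension
# and 3 discrete actions.
low, high, bins = np.array([-1.0, -2.0]), np.array([1.0, 2.0]), np.array([10, 10])
Q = np.zeros((10, 10, 3))

cell = discretize([0.25, -1.3], low, high, bins)  # e.g. (6, 1)
q_values_for_cell = Q[cell]                       # the 3 action values
```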