2011
DOI: 10.1016/j.neucom.2010.11.012
Continuous state/action reinforcement learning: A growing self-organizing map approach

Cited by 22 publications (8 citation statements)
References 16 publications
“…However, these works assume that an effective policy for a particular target task is already accessible to the teacher, which is not the case in this work. SOM-based approaches have previously been used in RL for a number of applications such as improving learning speed (Tateyama, Kawata, & Oguchi, 2004) and representation in continuous state-action domains (Montazeri, Moradi, & Safabakhsh, 2011; Smith, 2002). In the context of scaling task knowledge for continual learning (Ring, 1994), Ring, Schaul, and Schmidhuber (2011) described a modular approach to assimilate the knowledge of complex tasks using a training process that closely resembles SOM.…”
Section: Related Work
confidence: 99%
“…In this task the agent has to solve a maze by reaching the goal state fixed at position (19,2). At each time step, the agent receives a reward of -1 until it reaches the goal state where it receives a reward of 0.…”
Section: Testbed Environments
confidence: 99%
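The reward structure quoted above is simple enough to sketch directly. The snippet below is a minimal illustration assuming a grid coordinate convention in which the goal is the cell (19, 2); the maze layout and state encoding are not given in the excerpt and are left out here.

# Minimal sketch of the reward structure described in the excerpt above.
# The goal coordinate comes from the quoted text; everything else about the
# maze (walls, transitions) is an assumption, not part of the cited description.
GOAL = (19, 2)

def reward(next_state):
    """-1 on every time step until the goal state is reached, 0 at the goal."""
    return 0 if next_state == GOAL else -1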
“…Mountain car [12], robot navigation [13]-[15], puddle world [16], cart centering [17] and arm-control [18], [19] are examples of that nature. Most of these models are built from two Self-Organizing Map (SOM) networks; one is used to map the state space and the other to search the action space [12], [18], [19].…”
Section: Introduction
confidence: 99%
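A rough sketch of the two-SOM arrangement described in that excerpt is given below: one map quantizes the continuous state space, and a second map holds prototype actions that are searched when acting. The map sizes, state and action dimensions, and the epsilon-greedy selection rule are illustrative assumptions, not the cited papers' exact algorithms.

import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 25 state units over a 4-D state space, 10 action units over
# a 1-D action space. A tabular Q-function is kept over the two sets of units.
state_som = rng.uniform(-1.0, 1.0, size=(25, 4))
action_som = rng.uniform(-1.0, 1.0, size=(10, 1))
Q = np.zeros((state_som.shape[0], action_som.shape[0]))

def best_matching_unit(som, x):
    """Index of the prototype closest to the input vector x."""
    return int(np.argmin(np.linalg.norm(som - x, axis=1)))

def select_action(state, epsilon=0.1):
    """Quantize the state with one SOM, then search the action SOM's prototypes."""
    s = best_matching_unit(state_som, state)
    if rng.random() < epsilon:
        a = int(rng.integers(action_som.shape[0]))
    else:
        a = int(np.argmax(Q[s]))
    return action_som[a], s, a

Learning would then update Q[s, a] with an ordinary temporal-difference rule, while the SOM prototypes themselves are adapted toward visited states and useful actions.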
“…Unfortunately, this method cannot be adapted to non-stationary data sets. Hesam et al. [18] suggested an algorithm that applies GSOM neural networks to reinforcement learning to achieve an optimal state-space representation through two growing Self-Organizing Maps. The significant advantage of this algorithm is its ability to process online data in real time using adaptive mechanisms.…”
Section: Introduction
confidence: 99%
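The growth mechanism alluded to in that excerpt can be sketched in a few lines: when an input falls too far from its best-matching unit, a new unit is inserted at the input; otherwise the winner is adapted online. The distance threshold and insertion rule here are assumptions for illustration, not the cited algorithm's exact growth criterion.

import numpy as np

def gsom_update(som, x, learning_rate=0.1, grow_threshold=0.5):
    """Adapt the winning unit toward x, or grow a new unit if x is poorly covered."""
    distances = np.linalg.norm(som - x, axis=1)
    winner = int(np.argmin(distances))
    if distances[winner] > grow_threshold:
        return np.vstack([som, x])  # insert a new prototype at the input location
    som[winner] += learning_rate * (x - som[winner])  # move the winner toward the input
    return som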