Proceedings of the 1994 IEEE International Conference on Neural Networks (ICNN'94)
DOI: 10.1109/icnn.1994.374202

Reinforcement learning using a recurrent neural network

Cited by 6 publications (2 citation statements); citing publications span 2007–2023. References 6 publications.

“…A number of researchers use an RNN to predict Q values to solve POMDPs (Lin 1993; Bakker 2002; Bakker et al. 2003; Schmidhuber 1991; Ho and Kamel 1994; Onat, Kita, and Nishikawa 1998; Ballini et al. 2001; Gomez et al. 2006). The number of input units is equal to the dimension of the sensory inputs from the environment.…”
Section: RL with RNN (mentioning)
confidence: 99%
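
Both citation statements describe the same construction: a recurrent network whose input layer matches the dimension of the sensory observations, with Q values read from the output so the hidden state can summarize the unobserved history. A minimal sketch of that idea in PyTorch follows; the class name, layer choices, and all sizes are illustrative assumptions, not details taken from the 1994 paper:

```python
import torch
import torch.nn as nn

class RecurrentQNetwork(nn.Module):
    """Sketch of an RNN mapping an observation history to Q values.

    As the citing papers note, the input size equals the dimension of
    the sensory input; obs_dim, hidden_dim, and n_actions are
    illustrative parameters only.
    """
    def __init__(self, obs_dim: int, hidden_dim: int, n_actions: int):
        super().__init__()
        self.rnn = nn.RNN(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) sequence of raw observations.
        # The recurrent state summarizes the history, standing in for
        # the unobservable environment state in a POMDP.
        out, hidden = self.rnn(obs_seq, hidden)
        q_values = self.q_head(out)  # (batch, time, n_actions)
        return q_values, hidden

# Usage: Q values for one 10-step trajectory of 4-dim observations.
net = RecurrentQNetwork(obs_dim=4, hidden_dim=32, n_actions=2)
q, h = net(torch.randn(1, 10, 4))
```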
“…The second category, RL with RNN, uses an RNN as a function approximator. The RNN learns Q values or advantage values (Lin 1993; Bakker 2002; Bakker, Zhumatiy, Gruener, and Schmidhuber 2003; Schmidhuber 1991; Ho and Kamel 1994; Onat, Kita, and Nishikawa 1998; Ballini, Soares, and Gomide 2001; Gomez, Schmidhuber, and Miikkulainen 2006). Although these methods can find a good policy in POMDPs, they have the significant disadvantage of requiring a long learning time.…”
Section: Introduction (mentioning)
confidence: 99%
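
The second statement also mentions learning advantage values in place of Q values. As an illustrative aside (not taken from the cited papers), an advantage rescales each Q value against a state-value baseline:

```python
import torch

# Illustrative only: advantage A(s, a) = Q(s, a) - V(s), taking V(s)
# here as the greedy value max_a Q(s, a) over the action dimension.
q = torch.randn(1, 10, 2)  # (batch, time, n_actions) Q-value estimates
advantages = q - q.max(dim=-1, keepdim=True).values
```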