2016
DOI: 10.1016/j.neucom.2015.04.125

A new prospective for Learning Automata: A machine learning approach

Abstract: In the field of Learning Automata (LA), how to design faster learning algorithms has always been a key issue. Among the solutions reported in the literature, the stochastic estimator reward-inaction learning automaton (SE_RI), which belongs to the Maximum Likelihood estimator based LAs, has been recognized as the fastest-optimal LA. In this paper, we first point out the limitations of the traditional Maximum Likelihood estimator (MLE) based LAs and then introduce a Bayesian estimator based approach, which is demonst…
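To make the contrast concrete, here is a minimal Python sketch, assuming a binary reward/penalty environment, of the two estimator styles the abstract opposes: a Maximum Likelihood estimate of one action's reward probability versus a Bayesian alternative that maintains a Beta posterior. The Beta(1,1) prior, the posterior-sampling step, and the class name are illustrative assumptions, not the paper's algorithm.

```python
# A minimal sketch, not the paper's algorithm: MLE vs. Bayesian (Beta
# posterior) estimates of one action's reward probability, assuming a
# binary reward/penalty environment and a Beta(1,1) prior.
import random

class ActionEstimate:
    def __init__(self):
        self.rewards = 0  # times this action was rewarded
        self.trials = 0   # times this action was selected

    def update(self, rewarded: bool):
        self.trials += 1
        self.rewards += int(rewarded)

    def mle(self) -> float:
        # Classical MLE: empirical reward rate (0.5 as a neutral
        # placeholder before the action has ever been tried).
        return self.rewards / self.trials if self.trials else 0.5

    def bayes_sample(self) -> float:
        # Bayesian alternative: draw from the Beta(1 + rewards,
        # 1 + penalties) posterior; its spread shrinks as trials grow.
        return random.betavariate(1 + self.rewards,
                                  1 + self.trials - self.rewards)

est = ActionEstimate()
for _ in range(50):
    est.update(random.random() < 0.7)  # environment rewards w.p. 0.7
print(est.mle(), est.bayes_sample())
```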

Cited by 11 publications (11 citation statements) | References 21 publications
“…Due to the superiority of estimator algorithms, many novel estimators [3][4][5] have been proposed in recent years. In 2005, Hao Ge [3] proposed a deterministic estimator based LA (the Discretized Generalized Confidence Pursuit Algorithm, DGCPA), in which the estimate of each action is the upper bound of a confidence interval, and extended the algorithm to stochastic estimator schemes.…”
Section: Random Environment
confidence: 99%
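To illustrate the confidence-interval idea attributed to DGCPA in this statement, the sketch below scores an action by the upper bound of a confidence interval around its empirical reward rate, so sparsely sampled actions keep optimistic estimates and continue to be explored. The Hoeffding-style bound, the delta value, and the function name are assumptions for illustration, not the published DGCPA formulas.

```python
# Sketch of the confidence-interval idea described above: score each
# action by the upper bound of a confidence interval on its empirical
# reward rate. The Hoeffding-style radius and delta = 0.05 are
# illustrative assumptions, not the published DGCPA formulas.
import math

def upper_confidence_estimate(rewards: int, trials: int,
                              delta: float = 0.05) -> float:
    """Empirical mean plus a Hoeffding-style confidence radius."""
    if trials == 0:
        return 1.0  # untried actions are maximally optimistic
    p_hat = rewards / trials
    radius = math.sqrt(math.log(1.0 / delta) / (2.0 * trials))
    return min(1.0, p_hat + radius)

# Two actions with the same empirical mean (0.6): the better-sampled
# one gets the tighter (smaller) upper bound.
print(upper_confidence_estimate(6, 10))    # ~0.987
print(upper_confidence_estimate(60, 100))  # ~0.722
```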
“…As emphasized in Section I, parameter tuning is intended to balance the trade-off between speed and accuracy. The standard parameter tuning procedure was pioneered in [20] and has become common practice in follow-up research [4], [5], [7]-[9], [21].…”
Section: Appendix: The Standard Parameter Tuning Procedures of Learning Automata
confidence: 99%
“…This process is repeated 20 times, and the average over the 20 resulting values is taken as the best resolution parameter. The number of successive no-error experiments is set to NE = 750, the same value as in [5], [7]-[9]. For tuning the “best” γ in our simulation settings: for the four two-action environments, the search range of γ is from 1 to 10; for the five ten-action environments except E7, the search range is from 1 to 20; and for E7, the most difficult one, the range is a little wider, from 1 to 30.…”
Section: Appendix: The Standard Parameter Tuning Procedures of Learning Automata
confidence: 99%
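The tuning procedure described in these two statements can be sketched as a search loop: a candidate parameter qualifies only if NE = 750 successive experiments all converge to the correct action, the fastest qualifying candidate is kept, and the whole search is repeated 20 times and averaged. The helper run_experiment and both function names below are hypothetical; only NE = 750, the 20 repetitions, and the γ search range come from the text.

```python
# Sketch of the tuning procedure described above, under assumptions:
# run_experiment(param) is a hypothetical routine returning
# (converged_correctly, iterations_used) for one LA run. The same loop
# applies to the resolution parameter and to gamma.
import random

NE = 750       # successive no-error experiments required (from the text)
REPEATS = 20   # repetitions to average over (from the text)

def best_parameter(candidates, run_experiment):
    """Return the fastest candidate that survives NE error-free runs."""
    best, best_cost = None, float("inf")
    for param in candidates:
        total_iters = 0
        for _ in range(NE):
            ok, iters = run_experiment(param)
            if not ok:        # a single error disqualifies this candidate
                break
            total_iters += iters
        else:                 # all NE experiments converged correctly
            avg = total_iters / NE
            if avg < best_cost:
                best, best_cost = param, avg
    return best

def tuned_parameter(candidates, run_experiment):
    """Average the selected value over REPEATS independent searches."""
    picks = [best_parameter(candidates, run_experiment)
             for _ in range(REPEATS)]
    picks = [p for p in picks if p is not None]
    return sum(picks) / len(picks) if picks else None

# Hypothetical stand-in environment, for demonstration only; the search
# range 1..10 for two-action environments comes from the text.
def run_experiment(gamma):
    return random.random() < 0.999, random.randint(100, 500) // gamma + 10

print(tuned_parameter(range(1, 11), run_experiment))
```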