2009
DOI: 10.1007/978-3-642-11169-3_13

Dynamic Multi-Armed Bandits and Extreme Value-Based Rewards for Adaptive Operator Selection in Evolutionary Algorithms

Abstract: The performance of many efficient algorithms critically depends on the tuning of their parameters, which in turn depends on the problem at hand. For example, the performance of Evolutionary Algorithms critically depends on the judicious setting of the operator rates. The Adaptive Operator Selection (AOS) heuristic that is proposed here rewards each operator based on the extreme value of the fitness improvement lately incurred by this operator, and uses a Multi-Armed Bandit (MAB) selection process bas…
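The abstract pairs an extreme-value credit assignment with a bandit-based operator selector. The Python sketch below shows one way such an AOS component could look; the class name, window size, and UCB-style exploration term are illustrative assumptions, not the exact scheme of the paper.

```python
import math
from collections import deque

class ExtremeValueAOS:
    """Sketch of Adaptive Operator Selection: extreme-value credit
    assignment combined with a UCB-style multi-armed bandit."""

    def __init__(self, n_ops, window=50, scaling=2.0):
        self.n_ops = n_ops
        self.windows = [deque(maxlen=window) for _ in range(n_ops)]  # recent fitness improvements
        self.counts = [0] * n_ops      # number of times each operator was applied
        self.rewards = [0.0] * n_ops   # current credit = best improvement in the window
        self.scaling = scaling         # exploration constant of the UCB term

    def select(self):
        # Try every operator once before trusting the bandit scores.
        for op in range(self.n_ops):
            if self.counts[op] == 0:
                return op
        total = sum(self.counts)
        # UCB1-style score: empirical reward plus an exploration bonus.
        scores = [
            self.rewards[op] + self.scaling * math.sqrt(2 * math.log(total) / self.counts[op])
            for op in range(self.n_ops)
        ]
        return max(range(self.n_ops), key=lambda op: scores[op])

    def update(self, op, fitness_improvement):
        # Extreme-value credit: reward is the largest improvement recently caused by this operator.
        self.counts[op] += 1
        self.windows[op].append(max(0.0, fitness_improvement))
        self.rewards[op] = max(self.windows[op])
```

In an EA loop, `select()` would be called before each variation step and `update()` afterwards with the resulting fitness gain.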

Citing publications span 2010–2024.
Cited by 43 publications (62 citation statements). References 24 publications.
“…On the one hand, the average reward of every operator tends to decrease as evolution goes on (diminishing returns). In the One-Max problem, for instance, the best mutation operator is the 5-bit mutation when the population is far away from the optimum; but the reward of the 5-bit mutation gracefully decreases as the population goes to more fit regions, and at some point the 3-bit mutation operator catches up (more details on this can be found in [15]). This suggests that when a good operator has been identified, there is no need for exploration as long as this operator remains sufficiently good.…”
Section: Discussion (mentioning, confidence: 99%)
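The crossover between the 5-bit and 3-bit mutation rewards on One-Max can be checked with a small standalone estimate. The snippet below is illustrative only: it ignores the selection scheme used in [15] and simply estimates the expected positive One-Max gain of flipping exactly k random bits at different distances from the optimum.

```python
import random

def expected_positive_gain(n, ones, k, trials=100_000):
    """Monte Carlo estimate of E[max(0, fitness gain)] when flipping exactly k
    distinct random bits of an n-bit string that currently has `ones` one-bits
    (One-Max fitness = number of ones). W.l.o.g. the one-bits occupy positions 0..ones-1."""
    gain_sum = 0
    for _ in range(trials):
        positions = random.sample(range(n), k)
        flipped_ones = sum(1 for p in positions if p < ones)  # 1 -> 0 flips
        flipped_zeros = k - flipped_ones                      # 0 -> 1 flips
        gain_sum += max(0, flipped_zeros - flipped_ones)
    return gain_sum / trials

n = 1000
for frac in (0.5, 0.7, 0.9, 0.97):
    ones = int(n * frac)
    print(f"{frac:.2f} ones: 3-bit gain ~ {expected_positive_gain(n, ones, 3):.4f}, "
          f"5-bit gain ~ {expected_positive_gain(n, ones, 5):.4f}")
```

Far from the optimum the 5-bit mutation yields the larger expected positive gain; close to the optimum the ordering reverses, which is the diminishing-returns effect described in the quote.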
“…Fed by the Extreme value of fitness improvements, it was later assessed on some EA binary benchmark problems [14][15][16], and also on some SAT instances [31].…”
Section: Multi-Armed Bandit (mentioning, confidence: 99%)
“…MAENS is a memetic algorithm which makes use of a crossover operator, a local search combining three local move operators and a novel long move operator called MergeSplit, and a ranking selection procedure called stochastic ranking (SR) (Runarsson and Yao 2000). The major differences between MAENS and MAENS* are: (a) MAENS uses a single crossover operator, whereas MAENS* uses a set of crossover operators, (b) a dynamic MAB mechanism (dMAB) (Fialho et al 2009) is adopted as an AOS rule, (c) a novel CA mechanism assigns a reward to the operators which is proportional to the number of solutions generated by each operator that "survived" the ranking phase, named proportional reward, (d) the stochastic ranking is improved considering also the diversity of the solutions (dSR) using a (e) novel diversity measure for the CARP search space.…”
Section: MAENS* (mentioning, confidence: 99%)
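As a rough illustration of the "proportional reward" credit assignment described in this quote, an operator could be credited with the fraction of surviving offspring it produced. The function name and normalization below are assumptions for the sake of the example, not taken from the MAENS* implementation.

```python
def proportional_rewards(applied_ops, survivor_indices, n_ops):
    """Credit each operator in proportion to how many of the offspring it
    generated survived the ranking phase.

    applied_ops[i]   -- index of the operator that produced offspring i
    survivor_indices -- indices of the offspring kept after (stochastic) ranking
    """
    survived = [0] * n_ops
    for i in survivor_indices:
        survived[applied_ops[i]] += 1
    total = sum(survived)
    return [s / total if total else 0.0 for s in survived]
```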
“…The dMAB (Fialho et al 2009) approach, adopted in this work, combines the UCB1 algorithm (Auer et al 2002) with the Page-Hinckley (PH) statistical test (Hinkley 1971) to detect changes in the environment. When the PH test is triggered, the MAB system is restarted and the information gathered in the previous generations is discarded.…”
Section: MAENS* (mentioning, confidence: 99%)
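One common formulation of the Page-Hinkley test for detecting a drop in the mean of a reward stream is sketched below; the tolerance `delta` and threshold `lam` values are illustrative, not the ones tuned by Fialho et al.

```python
class PageHinkley:
    """Page-Hinkley change-detection test (decrease-detection variant)."""

    def __init__(self, delta=0.15, lam=10.0):
        self.delta = delta  # tolerated per-step deviation
        self.lam = lam      # detection threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.m = 0.0      # cumulative (signed) deviation from the running mean
        self.m_max = 0.0  # running maximum of that deviation

    def update(self, reward):
        """Feed one observed reward; return True when a change is detected."""
        self.n += 1
        self.mean += (reward - self.mean) / self.n
        self.m += reward - self.mean + self.delta
        self.m_max = max(self.m_max, self.m)
        return (self.m_max - self.m) > self.lam
```

Per the quoted description, when the test fires the dMAB simply resets the bandit (operator counts and empirical rewards), so that exploration restarts from scratch.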