Differential Evolution algorithm applied to non-stationary bandit problem

St-Pierre, David Lupien; Liu, Jialin

doi:10.1109/cec.2014.6900505

Cited by 7 publications

(5 citation statements)

References 21 publications

“…Mellor and Shapiro (2013) analyze an NS-MAB where the probabilities according to which the expected value of the arms change are a priori fixed and propose the CTS algorithm that combines Thompson Sampling with a change point detection mechanism. St-Pierre and Jialin (2014) present an evolutionary algorithm to deal with generic non-stationary environments which empirically outperforms classical solutions.…”

Section: Related Work (mentioning)

confidence: 99%

Sliding-Window Thompson Sampling for Non-Stationary Settings

Trovò, Paladino, Restelli, et al. 2020. JAIR

Multi-Armed Bandit (MAB) techniques have been successfully applied to many classes of sequential decision problems in the past decades. However, non-stationary settings -- very common in real-world applications -- received little attention so far, and theoretical guarantees on the regret are known only for some frequentist algorithms. In this paper, we propose an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), for nonstationary stochastic MAB settings. Our algorithm is based on Thompson Sampling and exploits a sliding-window approach to tackle, in a unified fashion, two different forms of non-stationarity studied separately so far: abruptly changing and smoothly changing. In the former, the reward distributions are constant during sequences of rounds, and their change may be arbitrary and happen at unknown rounds, while, in the latter, the reward distributions smoothly evolve over rounds according to unknown dynamics. Under mild assumptions, we provide regret upper bounds on the dynamic pseudo-regret of SW-TS for the abruptly changing environment, for the smoothly changing one, and for the setting in which both the non-stationarity forms are present. Furthermore, we empirically show that SW-TS dramatically outperforms state-of-the-art algorithms even when the forms of non-stationarity are taken separately, as previously studied in the literature.
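
The sliding-window idea in this abstract can be illustrated with a minimal sketch. It assumes Bernoulli rewards and a uniform Beta(1, 1) prior, neither of which is stated in the abstract. The function names, the `window` parameter, and the `mean_at` environment callback are hypothetical and are not taken from the paper. The posterior for each arm is built only from the most recent `window` rounds, so older observations are forgotten.

```python
import random
from collections import deque


def sw_ts_choose(history, n_arms, rng):
    """Thompson Sampling step that uses only the rewards kept in `history`."""
    successes = [0] * n_arms
    failures = [0] * n_arms
    for arm, reward in history:
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    # One Beta posterior sample per arm (Beta(1, 1) prior is an assumption).
    samples = [
        rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(n_arms)
    ]
    return max(range(n_arms), key=samples.__getitem__)


def run_sw_ts(mean_at, n_arms, horizon, window, seed=0):
    """Run the sketch; `mean_at(t, arm)` gives the success probability at round t."""
    rng = random.Random(seed)
    history = deque(maxlen=window)  # rounds older than `window` drop out
    pulls = []
    for t in range(horizon):
        arm = sw_ts_choose(history, n_arms, rng)
        reward = 1 if rng.random() < mean_at(t, arm) else 0
        history.append((arm, reward))
        pulls.append(arm)
    return pulls
```

In an abruptly changing environment, for example one where the best arm switches halfway through the run, the window lets the sampler shift to the new best arm once the pre-change rounds have left the history.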

“…The results show that the UCB algorithm is efficient in algorithm selection problems. Also in [18], authors regarded the algorithm selection problem as a nonstationary bandit problem and applied UCB algorithm to be the decision policy.…”

Section: Multi-armed Bandit Problem (mentioning)

confidence: 99%

Algorithm portfolio for individual-based surrogate-assisted evolutionary algorithms

Hao, Liu, Yao. 2019. Proceedings of the Genetic and Evolutionary Computation Conference

Self Cite

Surrogate-assisted evolutionary algorithms (SAEAs) are powerful optimisation tools for computationally expensive problems (CEPs). However, a randomly selected algorithm may fail in solving unknown problems due to no free lunch theorems, and it will cost more computational resources if we re-run the algorithm or try other algorithms to get a much better solution, which is more serious in CEPs. In this paper, we consider an algorithm portfolio for SAEAs to reduce the risk of choosing an inappropriate algorithm for CEPs. We propose two portfolio frameworks for very expensive problems in which the maximal number of fitness evaluations is only 5 times the problem's dimension. One framework named Par-IBSAEA runs all algorithm candidates in parallel and a more sophisticated framework named UCB-IBSAEA employs the Upper Confidence Bound (UCB) policy from reinforcement learning to help select the most appropriate algorithm at each iteration. An effective reward definition is proposed for the UCB policy. We consider three state-of-the-art individual-based SAEAs on different problems and compare them to the portfolios built from their instances on several benchmark problems given limited computation budgets. Our experimental studies demonstrate that our proposed portfolio frameworks significantly outperform any single algorithm on the set of benchmark problems.
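
As a rough illustration of how a UCB policy can pick one algorithm per iteration, the following sketch uses the standard UCB1 rule. It is not the UCB-IBSAEA framework itself: the abstract does not give the reward definition, so the `reward` value passed to `update` is a hypothetical placeholder, and the function names and the exploration constant `c` are assumptions.

```python
import math


def ucb_select(counts, mean_rewards, t, c=2.0):
    """Return the index of the algorithm to run at iteration t (UCB1 rule)."""
    # Try every algorithm once before comparing confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [
        m + math.sqrt(c * math.log(t) / n) for m, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def update(counts, mean_rewards, i, reward):
    """Fold a new reward (hypothetical definition) into algorithm i's running mean."""
    counts[i] += 1
    mean_rewards[i] += (reward - mean_rewards[i]) / counts[i]
```

An algorithm that has been run rarely receives a large exploration bonus, so it is still selected occasionally even when its average reward so far is lower.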

“…In spite of some adaptations to other contexts (time varying as in [26] or adversarial [21,7]), and maybe due to strong differences such as the very non-stationary nature of bandit problems involved in optimization portfolios, these methods did not, for the moment, really find their way to AS. Another approach consists in writing this bandit algorithm as a meta-optimization problem; [38] applies the differential evolution algorithm [39] to some non-stationary bandit problem, which outperforms the classical bandit algorithm on an AS task.…”

Section: Static Portfolios and Parameter Tuning (mentioning)

confidence: 99%

Algorithm portfolios for noisy optimization

Cauwet, Liu, Rozière, et al. 2015. Ann Math Artif Intell

Self Cite

Noisy optimization is the optimization of objective functions corrupted by noise. A portfolio of solvers is a set of solvers equipped with an algorithm selection tool for distributing the computational power among them. Portfolios are widely and successfully used in combinatorial optimization. In this work, we study portfolios of noisy optimization solvers. We obtain mathematically proved performance (in the sense that the portfolio performs nearly as well as the best of its solvers) by an ad hoc portfolio algorithm dedicated to noisy optimization. A somewhat surprising result is that it is better to compare solvers with some lag, i.e., to propose the current recommendation of the best solver based on their performance earlier in the run. An additional finding is a principled method for distributing the computational power among solvers in the portfolio.
