We introduce a new benchmark problem called Deceptive Leading Blocks (DLB) to rigorously study the runtime of the Univariate Marginal Distribution Algorithm (UMDA) in the presence of epistasis and deception. We show that simple Evolutionary Algorithms (EAs) outperform the UMDA unless the selective pressure µ/λ is extremely high, where µ and λ are the parent and offspring population sizes, respectively. More precisely, we show that the UMDA with a parent population size of µ = Ω(log n) has an expected runtime of e^Ω(µ) on the DLB problem for any selective pressure µ/λ ≥ 14/1000, as opposed to the expected runtime of O(nλ log λ + n³) for the non-elitist (µ, λ) EA with µ/λ ≤ 1/e. These results illustrate inherent limitations of univariate EDAs against deception and epistasis, which are common characteristics of real-world problems. In contrast, empirical evidence reveals the efficiency of the bivariate MIMIC algorithm on the DLB problem. Our results suggest that one should consider EDAs with more complex probabilistic models when optimising problems with some degree of epistasis and deception.

* A preliminary version of this work will appear in the Proceedings of the 15th ACM/SIGEVO Workshop on Foundations of Genetic Algorithms (FOGA).
Estimation of distribution algorithms (EDAs) [42, 43, 33] are a class of randomised search heuristics with many real-world applications (see [27] and the references therein). Unlike traditional EAs, which define implicit models of promising solutions via genetic operators such as crossover and mutation, EDAs optimise objective functions by constructing and sampling explicit probabilistic models to generate offspring for the next iteration. The workflow of EDAs is an iterative process in which the initial model is the uniform distribution over the search space. The starting population consists of λ individuals sampled from this uniform distribution. A fitness function then scores each individual, and the algorithm selects the µ fittest individuals to update the model (where µ < λ). The procedure is repeated until some termination condition is fulfilled, usually a threshold on the number of iterations or on the quality of the fittest offspring [27, 19] (a sketch of this loop is given below).

Many variants of EDAs have been proposed over the last decades. They differ in how their models are represented, updated and sampled over iterations. In general, EDAs are categorised into two main classes: univariate and multivariate. Univariate EDAs take advantage of first-order statistics (i.e. the mean) to build a probability-vector-based model and assume independence between decision variables. The probabilistic model is represented as an n-vector, where each component is called a marginal (also a frequency) and n is the problem instance size. Typical univariate EDAs are the compact Genetic Algorithm (cGA) [25], the Univariate Marginal Distribution Algorithm (UMDA) [42] and Population-Based Incremental Learning (PBIL) [3]. In contrast, multivariate EDAs apply higher-order statistics to model the correlations between the decision variables of the problem addressed.
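To make the workflow above concrete, here is a minimal Python sketch of the UMDA applied to DLB. The DLB implementation reflects our reading of the problem (pairwise blocks, leading 11-blocks rewarded, and 00 scoring above 01/10 in the first non-11 block); the function names, the parameter choices and the clamping of the marginals to [1/n, 1 − 1/n] are illustrative assumptions rather than the exact formulation analysed in the paper.

```python
import numpy as np

def dlb(x):
    """Deceptive Leading Blocks (our reading of the definition): the bit
    string is split into n/2 blocks of two bits; fitness rewards the
    leading 11-blocks, and the first non-11 block is deceptive in that
    00 scores higher than 01 or 10. Assumes n is even."""
    n = len(x)
    m = 0                                   # number of leading 11-blocks
    while 2 * m < n and x[2 * m] == 1 and x[2 * m + 1] == 1:
        m += 1
    if 2 * m == n:                          # x = 1^n, the global optimum
        return n
    first_bad = (x[2 * m], x[2 * m + 1])
    return 2 * m + 1 if first_bad == (0, 0) else 2 * m   # deceptive reward

def umda(fitness, n, lam, mu, opt, max_iters=10_000, seed=None):
    """Generic UMDA loop as described above: start from the uniform model,
    sample lam offspring, select the mu fittest, and set each marginal to
    the empirical frequency of ones among the selected individuals. The
    marginals are clamped to [1/n, 1 - 1/n], a common border restriction."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                     # initial model: uniform distribution
    for t in range(max_iters):
        pop = (rng.random((lam, n)) < p).astype(int)      # sample lam offspring
        scores = np.array([fitness(ind) for ind in pop])  # evaluate fitness
        parents = pop[np.argsort(-scores)[:mu]]           # select mu fittest
        p = np.clip(parents.mean(axis=0), 1 / n, 1 - 1 / n)  # update marginals
        if scores.max() >= opt:             # optimum sampled
            return t
    return None

# Example run (illustrative parameters, not the settings used in the paper):
# umda(dlb, n=20, lam=200, mu=20, opt=20)
```

In this sketch, the deceptive 00 reward tends to pull the marginals of the first non-optimised block towards zero, which gives some intuition for why a univariate model struggles on DLB.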