Multi-level Parallelization for Hybrid ACO

Abdelkafi, Omar; Lepagnot, Julien; Idoumghar, Lhassane

doi:10.1007/978-3-319-12970-9_7

Cited by 4 publications

(3 citation statements)

References 15 publications

“…To adapt single solution based metaheuristics for GPU, iteration level is implemented to speedup the time-consuming generation and evaluation of the neighbors without affecting the behavior of the algorithm. Besides, algorithmic level is implemented for two reasons, which are to achieve a high occupancy of GPU, and to enhance the exploration ability of the algorithms by launching multiple instances [8,18,30]. We noticed that communication strategies are implemented between processes in order to guide the search towards promising search regions.…”

Section: Discussion (mentioning)

confidence: 99%

“…We noticed that communication strategies are implemented between processes in order to guide the search towards promising search regions. For instance, in [8], processes form a ring topology and exchange solutions based on a diversification strategy mentioned in [19]. In [18], a working set F is used to exchange solutions.…”

Section: Discussion (mentioning)

confidence: 99%

“…In the same context, QAP has been addressed in [8] by proposing another version of iterative parallel tabu search (ITTSD), where different instances of TS are run in parallel (algorithmic level). Afterwards, a diversification strategy proposed in [38] is applied.…”

Section: Single Solution Based Metaheuristics (mentioning)

confidence: 99%

GPU parallelization strategies for metaheuristics: a survey

Essaid, Idoumghar, Lepagnot, et al., 2018

International Journal of Parallel, Emergent and Distributed Sys

Self Cite

Metaheuristics have been showing interesting results in solving hard optimization problems. However, they become limited in terms of effectiveness and runtime for high-dimensional problems. Thanks to the independence of metaheuristic components, parallel computing appears as an attractive choice to reduce the execution time and to improve solution quality. By exploiting the increasing performance and programmability of graphics processing units (GPUs) to this aim, GPU-based parallel metaheuristics have been implemented using different designs. Recent results in this area show that GPUs tend to be effective co-processors for leveraging complex optimization problems. In this survey, mechanisms involved in GPU programming for implementing parallel metaheuristics are presented and discussed through a study of relevant research papers.

Re-engineering the ant colony optimization for CMP architectures

Cecilia, García, 2019

J Supercomput

Ant Colony Optimization (ACO) is inspired by the behavior of real ants and, as a bioinspired method, its underlying computation is massively parallel by definition. This paper shows re-engineering strategies to migrate the ACO algorithm applied to the Traveling Salesman Problem (TSP) to modern Intel-based multi- and many-core architectures in a step-by-step methodology. The paper provides detailed guidelines on how to optimize the algorithm for intra-node (thread and vector) parallelization, showing the performance scalability along with the number of cores on different Intel architectures, and reporting up to a 5.5x speed-up factor between the Intel Xeon Phi Knights Landing (KNL) and Intel Xeon v2. Moreover, parallel efficiency is provided for all targeted architectures, finding that core load imbalance, memory bandwidth limitations, and NUMA effects on data placement are some of the key factors limiting performance. Finally, a distributed implementation is also presented, reaching up to a 2.96x speed-up factor when running the code on 3 nodes over the single-node counterpart version. In the latter case, the parallel efficiency is affected by the synchronization frequency, which also affects the quality of the solution found by the distributed implementation.

ACOTSP-MF: A memory-friendly and highly scalable ACOTSP approach

Martínez, García, 2021

Engineering Applications of Artificial Intelligence
