A GPU-based Branch-and-Bound algorithm using Integer–Vector–Matrix data structure

Gmys, Jan; Mezmaz, Mohand; Melab, Nouredine; Tuyttens, Daniel

doi:10.1016/j.parco.2016.01.008

Cited by 19 publications

(18 citation statements)

References 19 publications


“…The PFSP has been frequently used as a test case for parallel B&B algorithms, as the huge number of generated nodes and the highly irregular structure of the search tree raise multiple challenges in terms of design and implementation on increasingly complex parallel architectures, e.g. grid computing (Mezmaz et al., 2007; Drozdowski et al., 2011; Bendjoudi et al., 2012), multicore CPUs (Mezmaz et al., 2014a; Gmys et al., 2016a), GPUs and many-core devices (Chakroun et al., 2013; Gmys et al., 2016b; Melab et al., 2018), clusters of GPUs (Vu and Derbel, 2016) or FPGAs (Daouri et al., 2015).…”

Section: Parallelism (mentioning)

confidence: 99%

A computationally efficient Branch-and-Bound algorithm for the permutation flow-shop scheduling problem

Gmys, Mezmaz, Melab et al. 2020

European Journal of Operational Research


In this work we propose an efficient branch-and-bound (B&B) algorithm for the permutation flowshop problem (PFSP) with makespan objective. We present a new node decomposition scheme that combines dynamic branching and lower bound refinement strategies in a computationally efficient way. To alleviate the computational burden of the two-machine bound used in the refinement stage, we propose an online learning-inspired mechanism to predict promising couples of bottleneck machines. The algorithm offers multiple choices for branching and bounding operators and can explore the search tree either sequentially or in parallel on multi-core CPUs. In order to empirically determine the most efficient combination of these components, a series of computational experiments with 600 benchmark instances is performed. A main insight is that the problem size, as well as interactions between branching and bounding operators, substantially modify the trade-off between the computational requirements of a lower bound and the achieved tree size reduction. Moreover, we demonstrate that parallel tree search is a key ingredient for the resolution of large problem instances, as strong super-linear speedups can be observed. An overall evaluation using two well-known benchmarks indicates that the proposed approach is superior to previously published B&B algorithms. For the first benchmark we report the exact resolution, within less than 20 minutes, of two instances defined by 500 jobs and 20 machines that remained open for more than 25 years, and for the second a total of 89 improved best-known upper bounds, including proofs of optimality for 74 of them.

… In contrast, exact methods can find optimal solutions with a proof of optimality, but their execution time is unpredictable and exponential in the worst case. Branch-and-Bound (B&B) is the most frequently used exact method to solve combinatorial optimization problems like the PFSP.

The algorithm recursively decomposes the initial problem by dynamically constructing and exploring a search tree, whose root node represents the initial problem, whose leaf nodes are possible solutions and whose internal nodes are subproblems of the initial problem. This is done using four operators: branching, bounding, selection and pruning. The branching operator divides the initial problem into smaller disjoint subproblems, and a bounding function computes lower bounds on the optimal cost of a subproblem. The pruning operator eliminates subproblems whose lower bound exceeds the cost of the best solution found so far (an upper bound on the optimal makespan). The tree traversal is guided by the selection operator, which returns the next subproblem to be processed according to a search strategy (e.g. depth-first search).

In this paper the focus is put on three performance-critical components of the algorithm: the lower bound (LB), the branching rule and the use of parallel tree exploration. Although they can be separated on a conceptual level, the main objective of this article is to reveal interactions between these compone...
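The four operators described above can be illustrated with a minimal sequential sketch for the PFSP with makespan objective. This is not the authors' algorithm: the function names (`advance`, `lower_bound`, `branch_and_bound`, `brute_force`) are hypothetical, the bound is a simple one-machine bound chosen for brevity rather than the two-machine bound the abstract mentions, and subproblems are pruned as soon as their bound reaches the incumbent.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Times = Sequence[Sequence[int]]  # p[j][k]: processing time of job j on machine k


def advance(front: Sequence[int], job_times: Sequence[int]) -> List[int]:
    """Append one job to a partial schedule; front[k] is when machine k becomes free."""
    new = list(front)
    new[0] += job_times[0]
    for k in range(1, len(new)):
        new[k] = max(new[k], new[k - 1]) + job_times[k]
    return new


def lower_bound(p: Times, front: Sequence[int], remaining: Sequence[int]) -> int:
    """Simple one-machine bound (an assumption of this sketch, not the paper's LB)."""
    if not remaining:
        return front[-1]
    best = 0
    for k in range(len(front)):
        load = sum(p[j][k] for j in remaining)            # work left on machine k
        tail = min(sum(p[j][k + 1:]) for j in remaining)  # last job must still finish
        best = max(best, front[k] + load + tail)
    return best


def branch_and_bound(p: Times) -> Tuple[int, Tuple[int, ...]]:
    """Depth-first B&B; returns (optimal makespan, one optimal permutation)."""
    n, m = len(p), len(p[0])
    best_cost, best_perm = float("inf"), ()
    stack = [((), [0] * m)]                    # root node: empty partial schedule
    while stack:
        prefix, front = stack.pop()            # selection: LIFO gives depth-first search
        remaining = [j for j in range(n) if j not in prefix]
        if not remaining:                      # leaf: complete schedule
            if front[-1] < best_cost:
                best_cost, best_perm = front[-1], prefix
            continue
        for j in remaining:                    # branching: one child per unscheduled job
            child_front = advance(front, p[j])
            rest = [r for r in remaining if r != j]
            if lower_bound(p, child_front, rest) < best_cost:  # bounding, then pruning
                stack.append((prefix + (j,), child_front))
    return best_cost, best_perm


def brute_force(p: Times) -> int:
    """Exhaustive enumeration, only useful to sanity-check tiny instances."""
    best = float("inf")
    for perm in permutations(range(len(p))):
        front = [0] * len(p[0])
        for j in perm:
            front = advance(front, p[j])
        best = min(best, front[-1])
    return best
```

On a tiny instance, `branch_and_bound(p)[0]` should agree with `brute_force(p)`; the two only differ in how many nodes they visit, which is the effect the pruning operator is meant to have.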


“…The first version corresponds to the one presented in our earlier work [10,11] and uses a combined PTE + PEB model. We present two versions of the B&B algorithm, both using the same load balancing mechanism.…”

Section: Graphics Processing Unit-accelerated Branch-and-bound (mentioning)

confidence: 99%

“…In order to enable a B&B-process to explore any arbitrary interval [B, E[, an initialization procedure, as described in [10], is necessary. However, it is possible to do the interval splitting between … [Footnote §: In the rest of this paper the term IVM designates the data structure as well as, by extension, the exploration process associated with a part of the B&B-tree.]…”

Section: Algorithm 1 Select-and-branch (mentioning)

confidence: 99%

“…Our algorithm is based on the Integer-Vector-Matrix (IVM) data structure [9], which allows all four B&B operators to be implemented on the GPU. For the IVM-based GPU-B&B algorithm presented in [10], our previous work [11] proposes four work stealing (WS) [12] strategies for balancing the irregular workload inside a single GPU. The major contributions of this paper are the following:…”

Section: Introduction (mentioning)

confidence: 99%

“…Revisiting the algorithm presented in [10] for fine-grained permutation problems, that is, permutation problems where the cost of evaluating the tree nodes is low compared with the tree search itself. The proposed variant of the algorithm uses the same WS mechanism for load balancing as the original algorithm which uses a second level of parallelism to accelerate the bounding operator.…”

Section: Introduction (mentioning)

confidence: 99%


IVM‐based parallel branch‐and‐bound using hierarchical work stealing on multi‐GPU systems

Gmys, Mezmaz, Melab et al. 2016

Concurrency and Computation

Self Cite


Tree-based exploratory methods, like Branch-and-Bound (B&B) algorithms, are highly irregular applications, which makes their design and implementation on graphics processing units (GPUs) challenging. In this paper, we present a multi-GPU B&B algorithm for solving large permutation-based combinatorial optimization problems. To tackle the problem of the irregular workload, we propose a hierarchical work stealing (WS) strategy that balances the workload inside the GPU and between different GPUs and CPU cores. Our B&B is based on an Integer-Vector-Matrix data structure instead of a pool of permutations, and the work units exchanged are intervals of factoradics instead of sets of nodes. Two variants of the algorithm, using the same hierarchical WS strategy, are proposed: one for combinatorial optimization problems where the evaluation of nodes is costly and one for fine-grained problems. The latter variant uses a new hypercube-based WS strategy and a trigger mechanism to balance the workload inside the GPU. The proposed approach has been extensively evaluated using the flowshop scheduling, the n-queens and the asymmetric travelling salesman problems as test cases. The reported results show that the proposed hierarchical WS mechanism is capable of handling fine and coarse-grained types of workloads efficiently, reaching near-linear speed-up on up to four GPUs for a set of ten flowshop instances and large instances of fine-grained problem
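To make "intervals of factoradics" concrete, the sketch below numbers the permutations of n elements from 0 to n! - 1 in lexicographic order using the factorial number system, so that a contiguous block of permutations can be described by two integers. It is only an illustration under stated assumptions: the helper names are hypothetical, intervals are treated as half-open, and the even split in `steal_half` is one simple policy, not necessarily one of the work stealing strategies evaluated in the paper.

```python
from math import factorial
from typing import List, Tuple


def to_factoradic(x: int, n: int) -> List[int]:
    """Digits d[0..n-1] such that x = sum(d[i] * (n-1-i)!) and 0 <= d[i] <= n-1-i."""
    digits = []
    for i in range(n):
        base = factorial(n - 1 - i)
        digits.append(x // base)
        x %= base
    return digits


def from_factoradic(digits: List[int]) -> int:
    """Inverse of to_factoradic: rank of the permutation in lexicographic order."""
    n = len(digits)
    return sum(d * factorial(n - 1 - i) for i, d in enumerate(digits))


def to_permutation(digits: List[int]) -> List[int]:
    """Digit d[i] selects the d[i]-th still-unused element for position i."""
    pool = list(range(len(digits)))
    return [pool.pop(d) for d in digits]


def steal_half(begin: int, end: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Split the half-open interval [begin, end): victim keeps the left part,
    thief receives the right part (hypothetical policy for this sketch)."""
    mid = begin + (end - begin) // 2
    return (begin, mid), (mid, end)
```

For example, with n = 3 the rank 3 has digits `[1, 1, 0]`, which `to_permutation` maps to `[1, 2, 0]`, the fourth permutation in lexicographic order. Exchanging a pair of such ranks is much cheaper than exchanging an explicit set of tree nodes, which is the motivation the abstract gives for this representation.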


Parallel K-Prototypes Clustering with High Efficiency and Accuracy

Jridi, HajKacem, Essoussi 2020

Big Data Analytics and Knowledge Discovery

