Proceedings of Programming Models and Applications on Multicores and Manycores 2014
DOI: 10.1145/2578948.2560688

A Novel CPU-GPU Cooperative Implementation of A Parallel Two-List Algorithm for the Subset-Sum Problem

Abstract: The subset-sum problem is a well-known non-deterministic polynomial-time complete (NP-complete) decision problem. This paper proposes a novel and efficient implementation of a parallel two-list algorithm for solving the problem on a graphics processing unit (GPU) using Compute Unified Device Architecture (CUDA). The algorithm is composed of a generation stage, a pruning stage, and a search stage. It is not easy to effectively implement the three stages of the algorithm on a GPU. Ways to achieve better performance…
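The generation/pruning/search pipeline named in the abstract follows the classic two-list (Horowitz–Sahni, meet-in-the-middle) scheme. The sketch below is a minimal sequential illustration of that scheme, not the paper's CUDA implementation; the function name and structure are assumptions for illustration.

```python
from bisect import bisect_left

def subset_sum_two_list(weights, target):
    """Sequential sketch of the two-list method: split the input,
    enumerate subset sums of each half (generation stage), sort one
    list (ordering used by the search), then look for complements
    via binary search (search stage)."""
    half = len(weights) // 2
    left, right = weights[:half], weights[half:]

    def all_sums(items):
        # Enumerate all 2^len(items) subset sums of one half.
        sums = [0]
        for w in items:
            sums += [s + w for s in sums]
        return sums

    a = all_sums(left)
    b = sorted(all_sums(right))      # sorted so complements are searchable
    for s in a:
        need = target - s            # complement that would reach the target
        i = bisect_left(b, need)
        if i < len(b) and b[i] == need:
            return True
    return False
```

Splitting the input halves the exponent: each list has 2^(n/2) entries instead of 2^n subsets, which is what makes the per-stage work regular enough to parallelize on a GPU.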

Cited by 9 publications (17 citation statements). References 22 publications.
“…There are many algorithms developed to solve the classic subset-sum problem; some of them are branch-and-bound, the parallel two-list algorithm, and genetic algorithms [4].…”
Section: Introduction
confidence: 99%
“…Their CUDA implementation will not show good speedup if the table does not fit within the device memory. Another approach [4] exploits the CPU-GPU cooperation in order to achieve a speedup factor of 9.2 over the best sequential implementation. Our implementation is solely on GPU and does not use any table for dynamic programming.…”
Section: Introduction
confidence: 99%
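The "table" this citation refers to is the standard dynamic-programming formulation of subset sum, whose memory footprint grows with the target value — the limitation the citing authors sidestep. A minimal sequential sketch of that formulation (illustrative; not the code of either paper):

```python
def subset_sum_dp(weights, target):
    """Tabular dynamic programming: reachable[s] is True when some
    subset of the weights sums to s. The table holds target + 1
    entries, so memory scales with the target value - the footprint
    that can exceed GPU device memory for large instances."""
    reachable = [False] * (target + 1)
    reachable[0] = True                      # the empty subset sums to 0
    for w in weights:
        # Iterate downwards so each weight is used at most once.
        for s in range(target, w - 1, -1):
            if reachable[s - w]:
                reachable[s] = True
    return reachable[target]
```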
“…The basic unit of execution in CUDA is the so-called kernel. When a CUDA program invokes a kernel on the host side, the thread blocks within the grid are enumerated and distributed to multiprocessors with available execution capacity; all threads within a grid can be executed in parallel …”
Section: The Proposed GPU Implementation and Optimization
confidence: 99%
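The enumeration described in this quote can be mimicked in plain Python: in a 1-D CUDA launch, each thread's global index is `blockIdx.x * blockDim.x + threadIdx.x` (those names are CUDA's built-ins; the helper below is purely illustrative).

```python
def global_thread_ids(grid_dim, block_dim):
    """Flatten a 1-D grid of thread blocks into global thread indices,
    mirroring how CUDA enumerates the blocks of a grid and runs their
    threads in parallel: id = blockIdx * blockDim + threadIdx."""
    return [block * block_dim + thread
            for block in range(grid_dim)       # blocks in the grid
            for thread in range(block_dim)]    # threads in each block
```

In a real kernel each of these indices would identify the element of the problem (e.g. one entry of a subset-sum list) that a single thread processes.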
“…GPGPU (General-Purpose computing on Graphics Processing Units) is a typical instance. Example implementations, such as the two-list algorithm for the subset-sum problem [9] or a protein structure similarity search engine [10], illustrate the approach: executing parallel algorithms on a GPU requires adjusting them to the specific architecture.…”
Section: A Parallel Algorithms Testing
confidence: 99%