2010 Ninth International Symposium on Parallel and Distributed Computing 2010
DOI: 10.1109/ispdc.2010.22

NQueens on CUDA: Optimization Issues

Abstract: Today's commercial off-the-shelf computer systems are multicore computing systems that combine CPUs, graphics processors (GPUs), and custom devices. In comparison with CPU cores, graphics cards are capable of executing hundreds up to thousands of compute units in parallel. To benefit from these GPU computing resources, applications have to be parallelized and adapted to the target architecture. In this paper we show our experience in applying the NQueens puzzle solution on GPUs using Nvidia's CUDA (Compute Unified …

Citations: cited by 18 publications (38 citation statements)
References: references 1 publication
“…22,317,699,616,364,044 valid solutions were determined for N = 26. While the world record is held by the aforementioned FPGA-based approach, graphics hardware based implementations have also been well researched [8], [5,23]. However, no publication is known to the authors that tries to solve the unbalanced workload distribution of N-Queens using Dynamic Parallelism.…”
Section: Parallel Implementations (mentioning)
confidence: 99%
“…However, no publication is known to the authors that tries to solve the unbalanced workload distribution of N-Queens using Dynamic Parallelism. The fastest implementation for GPU compute devices known to the authors has been published by Feinbube et al. [8], which is based on Somers' [21] serial implementation. In this approach, the main CPU creates initial board configurations for each thread and then hands over the tasks to the GPU.…”
Section: Parallel Implementations (mentioning)
confidence: 99%
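
The work split described in the statement above (the CPU pre-places the first few rows, then each partial board becomes an independent task) can be sketched as follows. This is a minimal CPU-only illustration in C++, not the code of Feinbube et al. or Somers: the names `Task`, `make_tasks`, `solve_task`, `count_solutions` and the `depth` parameter are hypothetical, bitmask backtracking is an assumed choice of solver, and the loop over tasks stands in for what would be one GPU thread per task in a CUDA version.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical task record: occupied columns plus both diagonal masks,
// already shifted to the first row the worker still has to fill.
struct Task {
    uint32_t cols, diag_l, diag_r;
};

// "CPU side": enumerate every conflict-free placement of the first `depth` rows.
static void make_tasks(int n, int depth, int row, Task t, std::vector<Task>& out) {
    if (row == depth) { out.push_back(t); return; }
    const uint32_t full = (1u << n) - 1u;
    uint32_t free_bits = full & ~(t.cols | t.diag_l | t.diag_r);
    while (free_bits) {
        const uint32_t bit = free_bits & (0u - free_bits);  // lowest free column
        free_bits ^= bit;
        const Task next{t.cols | bit, (t.diag_l | bit) << 1, (t.diag_r | bit) >> 1};
        make_tasks(n, depth, row + 1, next, out);
    }
}

// "Worker side": count completions of one partial board by backtracking.
// In a GPU version each thread would run this on its own task.
static uint64_t solve_task(int n, int rows_left, Task t) {
    if (rows_left == 0) return 1;
    const uint32_t full = (1u << n) - 1u;
    uint32_t free_bits = full & ~(t.cols | t.diag_l | t.diag_r);
    uint64_t count = 0;
    while (free_bits) {
        const uint32_t bit = free_bits & (0u - free_bits);
        free_bits ^= bit;
        const Task next{t.cols | bit, (t.diag_l | bit) << 1, (t.diag_r | bit) >> 1};
        count += solve_task(n, rows_left - 1, next);
    }
    return count;
}

// Generate the tasks, run every task, and sum the per-task counts.
uint64_t count_solutions(int n, int depth) {
    assert(n >= 1 && n <= 16 && depth >= 0 && depth <= n);
    std::vector<Task> tasks;
    make_tasks(n, depth, 0, Task{0, 0, 0}, tasks);
    uint64_t total = 0;
    for (const Task& t : tasks) total += solve_task(n, n - depth, t);
    return total;
}
```

Calling `count_solutions(8, 2)` pre-places two rows and then solves each resulting board independently. Because subtrees under different partial boards differ in size, per-task run times are uneven, which is the unbalanced workload distribution the citing authors refer to.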