Frank Feinbube scite author profile

Frank Feinbube

4Publications

75Citation Statements Received

22Citation Statements Given

How they've been cited

How they cite others

Affiliations

University of Potsdam, Hasso Plattner Institute

Publications

Order By: Most citations

NQueens on CUDA: Optimization Issues

Feinbube

Rabe

Löwis

et al. 2010

View full text Add to dashboard Cite

Todays commercial off-the-shelf computer systems are multicore computing systems as a combination of CPU, graphic processor (GPU) and custom devices. In comparison with CPU cores, graphic cards are capable to execute hundreds up to thousands compute units in parallel. To benefit from these GPU computing resources, applications have to be parallelized and adapted to the target architecture. In this paper we show our experience in applying the NQueens puzzle solution on GPUs using Nvidia's CUDA (Compute Unified Device Architecture) technology. Using the example of memory usage and memory access, we demonstrate that optimizations of CUDA programs may have contrary results on different CUDA architectures. Evaluation results will point out, that it is not sufficient to use new programming languages or compilers to achieve best results with emerging graphic card computing.

show abstract

A Performance Evaluation of Dynamic Parallelism for Fine-Grained, Irregular Workloads

Plauth

Feinbube

Schlegel

et al. 2016

IJNC

View full text Add to dashboard Cite

GPU compute devices have become very popular for general purpose computations. However, the SIMD-like hardware of graphics processors is currently not well suited for irregular workloads, like searching unbalanced trees. In order to mitigate this drawback, NVIDIA introduced an extension to GPU programming models called Dynamic Parallelism. This extension enables GPU programs to spawn new units of work directly on the GPU, allowing the refinement of subsequent work items based on intermediate results without any involvement of the main CPU.This work investigates methods for employing Dynamic Parallelism with the goal of improved workload distribution for tree search algorithms on modern GPU hardware. For the evaluation of the proposed approaches, a case study is conducted on the N-Queens problem. Extensive benchmarks indicate that the benefits of improved resource utilization fail to outweigh high management overhead and runtime limitations due to the very fine level of granularity of the investigated problem. However, novel memory management concepts for passing parameters to child grids are presented. These general concepts are applicable to other, more coarse-grained problems that benefit from the use of Dynamic Parallelism.

show abstract

Joint Forces: From Multithreaded Programming to GPU Computing

2011

View full text Add to dashboard Cite

Using Dynamic Parallelism for Fine-Grained, Irregular Workloads: A Case Study of the N-Queens Problem

Plauth

Feinbube

Schlegel

et al. 2015

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Frank Feinbube

NQueens on CUDA: Optimization Issues

A Performance Evaluation of Dynamic Parallelism for Fine-Grained, Irregular Workloads

Joint Forces: From Multithreaded Programming to GPU Computing

Using Dynamic Parallelism for Fine-Grained, Irregular Workloads: A Case Study of the N-Queens Problem

Contact Info

Product

Resources

About