2011 IEEE 17th International Conference on Parallel and Distributed Systems
DOI: 10.1109/icpads.2011.92
Optimizing Dynamic Programming on Graphics Processing Units via Adaptive Thread-Level Parallelism

Cited by 17 publications (6 citation statements)
References 16 publications
“…(2) and selecting the appropriate computing power for each phase lead to better mapping for the entire NPDP computation and maximum utilization of SMs in all the phases. Compared with the earlier state of the art by Wu et al. [19], whose maximum achieved speedup was 13.40×, the speedup using the GMM approach is more than 30×.…”
Section: Results
Confidence: 62%
“…Along the same lines, the inherent non-uniformity of NPDP algorithms is targeted using a thread-block analogy by Wu et al. [19]: a single block is employed for each subproblem, and the number of threads in a block equals the number of comparisons required to compute that subproblem. In addition, a two-stage adaptive thread model for efficient mapping is illustrated by employing a different number of threads in different phases.…”
Section: MCM
Confidence: 99%
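The phase-wise mapping described in the statement above can be illustrated with matrix-chain multiplication (MCM), a standard NPDP example. The sketch below is a hypothetical CPU-side illustration, not code from the cited paper: on a GPU, each subproblem in a phase would map to one thread block and each split point to one thread; here the same independence structure is shown with plain loops.

```python
# Hypothetical sketch of phase-wise NPDP structure, using matrix-chain
# multiplication. Subproblems within one phase depend only on results
# from earlier phases, so they are mutually independent -- this is what
# lets a GPU assign one thread block per subproblem, with one thread per
# split point k (the number of comparisons grows with the phase index).

def mcm_cost(p):
    """p: dimension list; matrix i has shape p[i] x p[i+1]."""
    n = len(p) - 1                       # number of matrices in the chain
    m = [[0] * n for _ in range(n)]      # m[i][j]: min cost for chain i..j
    for length in range(1, n):           # phase: chain length - 1
        for i in range(n - length):      # one "thread block" per subproblem
            j = i + length
            # Each split point k is one "thread"; phase `length` needs
            # `length` comparisons per subproblem.
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + p[i] * p[k + 1] * p[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]
```

For example, `mcm_cost([10, 20, 30, 40])` compares the two parenthesizations (AB)C and A(BC) in its final phase and returns the cheaper one.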
“…There are several published works on the implementation of dynamic programming [5], [6], [20], [21], [22]. Their implementations have been optimized mainly based on the developers' experience.…”
Section: Introduction
Confidence: 99%
“…We additionally retrieve the backtrace on the GPU and transfer it to the CPU. The Nussinov and matrix multiplication problems have also been studied as pure GPU implementations [3,38].…”
Section: Metaprogramming and Compiler Technology
Confidence: 99%