19th International Conference on High Performance Computing (HiPC 2012)
DOI: 10.1109/hipc.2012.6507490
Design and implementation of a parallel priority queue on many-core architectures

Cited by 14 publications (18 citation statements). References 20 publications.
“…GPU parallel priority queues [24] improve on the serial heap update by allowing multiple concurrent updates, but they potentially require a number of small sorts for each insert, along with data-dependent memory movement. Moreover, the approach uses multiple synchronization barriers through kernel launches in different streams, and incurs the additional latency of successive kernel launches and coordination with the CPU host.…”
Section: K-selection on CPU versus GPU
confidence: 99%
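The cost pattern described above (a small sort on every insert, plus data-dependent merges) can be illustrated with a simplified, serial model of a node-blocked priority queue, where each heap node stores a sorted block of `r` keys and an inserted batch is sorted and then merged along a root-to-leaf path. This is only a sketch of the idea, not the authors' GPU implementation; the class name `BlockHeap` and the block size `r` are illustrative.

```python
# Serial sketch of a node-blocked priority queue: each node holds a sorted
# block of r keys, and every batch insert costs one small sort plus
# data-dependent merges down a root-to-leaf path (delete-min is omitted).
class BlockHeap:
    def __init__(self, r):
        self.r = r          # keys per heap node
        self.nodes = []     # nodes[i]: sorted block; binary-heap layout

    def _path_to(self, t):
        # Indices on the root-to-node-t path in a binary heap layout.
        path = []
        while t > 0:
            path.append(t)
            t = (t - 1) // 2
        path.append(0)
        return list(reversed(path))

    def insert_batch(self, batch):
        assert len(batch) == self.r
        batch = sorted(batch)              # the per-insert "small sort"
        t = len(self.nodes)                # next free node position
        self.nodes.append([])
        for i in self._path_to(t):
            # Merge the incoming block with node i, keep the r smallest
            # keys here, push the overflow down: data-dependent movement.
            merged = sorted(self.nodes[i] + batch)
            self.nodes[i], batch = merged[:self.r], merged[self.r:]

    def peek_min(self):
        return self.nodes[0][0] if self.nodes and self.nodes[0] else None
```

Because every insert touches a whole root-to-leaf path, a GPU version needs synchronization between the per-node merge steps, which is where the kernel-launch barriers mentioned above come from.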
“…In the heap version, however, sorting elements by their values ensures that propagations carried out by a thread are more likely to be final and, as a consequence, increases parallel efficiency. In an earlier effort, we implemented the IWPP using a state-of-the-art priority queue proposed by He et al [15], but it did not improve on the regular queue-based GPU implementation because of the data management costs.…”
Section: Results
confidence: 99%
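The claim that value-ordered processing makes propagations "more likely to be final" can be seen in a small serial sketch contrasting a priority queue with a FIFO queue for a Dijkstra-like relaxation: with value ordering, a vertex's first settled value is final, so fewer re-propagations occur. This is an illustrative model, not the paper's IWPP code; the graph and the function name `relax` are made up for the example.

```python
import heapq
from collections import deque

def relax(graph, source, use_heap):
    # Relax edge weights from `source`; count how many queue pops occur.
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    frontier = [(0, source)] if use_heap else deque([source])
    pops = 0
    while frontier:
        if use_heap:
            d, u = heapq.heappop(frontier)
            if d > dist[u]:
                continue          # stale entry: value already improved
        else:
            u = frontier.popleft()
        pops += 1
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w   # propagate the improved value
                if use_heap:
                    heapq.heappush(frontier, (dist[v], v))
                else:
                    frontier.append(v)
    return dist, pops
```

On a graph where a short path is discovered after a longer one, the FIFO version re-processes vertices whose values later improve, while the heap version settles each vertex once; this mirrors the parallel-efficiency argument in the quote, although on a GPU the win must outweigh the data management costs of the heap itself.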
“…An extensive body of work has embarked on the redesign of data structures for construction and general computation on the GPU [88]. Within the context of searching, these acceleration structures include sorted arrays [3], [4], [8], [51], [66], [67], [98] and linked lists [116], hash tables (see section III), spatial-partitioning trees (e.g., k-d trees [57], [115], [120], octrees [57], [119], bounding volume hierarchies (BVH) [57], [64], R-trees [71], and binary indexing trees [59], [99]), spatial-partitioning grids (e.g., uniform [36], [53], [62] and two-level [52]), skiplists [81], and queues (e.g., binary heap priority [43] and FIFO [17], [101]). Due to significant architectural differences between the CPU and GPU, search structures cannot simply be "ported" from the CPU to the GPU and maintain optimal performance.…”
Section: GPU Searching
confidence: 99%