2015 IEEE 22nd International Conference on High Performance Computing (HiPC) 2015
DOI: 10.1109/hipc.2015.40

Efficient Batched Predecessor Search in Shared Memory on GPUs

Cited by 5 publications (2 citation statements)
References 13 publications
“…Over the past decade, a lot of research has focused on designing efficient algorithms to solve a range of classical problems on GPUs [9,12,17,27,36,37,39]. These works have introduced several optimization techniques, such as coalesced memory accesses [10,11,35], branch divergence elimination [19,23], and bank conflict avoidance [1,6,9,19]. Several empirical models for specific GPUs have been proposed that use micro-benchmarking [5,41], and several fast GPU algorithms have been produced [10,17,39] via the use of benchmarks [40] and application of hardware-specific optimization techniques to existing algorithms.…”
Section: Related Work (mentioning)
confidence: 99%
“…Over the past decade, many works have focused on designing efficient algorithms to solve a range of classical problems on the GPU [8,26,37,19,36,39,12]. These works have introduced several optimization techniques, such as coalesced memory accesses [11,34,9], branch reduction [23,20], and bank conflict avoidance [20,6]. Several empirical models for specific GPUs have been proposed that use micro-benchmarking [41,17,5], and several fast GPU algorithms have been produced [9,19,39] via the use of empirical benchmarks [40] and the application of hardware-specific optimization techniques to existing algorithms.…”
Section: Related Work (mentioning)
confidence: 99%