IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 2011
DOI: 10.1109/ispass.2011.5762730

Where is the data? Why you cannot debate CPU vs. GPU performance without the answer

Cited by 209 publications (146 citation statements)
References 18 publications
“…He and others observed that joins are 2-7 times faster on the GPU, whereas selections are 2-4 times slower, due to the required data transfers [30]. The same conclusion was made by Gregg and others, who showed that a GPU algorithm is not necessarily faster than its CPU counterpart, due to the expensive data transfers [27]. One major point for achieving good performance in a GDBMS is therefore to avoid data transfers where possible.…”
Section: Non-functional Properties (mentioning)
confidence: 69%
“…One obvious benefit is avoiding data transfer through connection interfaces (e.g., PCIe link), which is one of the most well known bottlenecks of co-processor computing [34]. Additionally, GPU cores can access more memory by paging memory to and from disk.…”
Section: Heterogeneous Processors (mentioning)
confidence: 99%
“…Two work-items process one row of the block. The work-item with the even ID reads In[0] to In[4] to produce an eight-pixel row from Out[0] to Out[7], and the work-item with the odd ID reads In[4] to In[7] to produce the successive eight-pixel row Out[8] to Out[15].…”
Section: Upsampling (mentioning)
confidence: 99%