Proceedings of the 26th ACM International Conference on Supercomputing 2012
DOI: 10.1145/2304576.2304620

An efficient work-distribution strategy for gridding radio-telescope data on GPUs

Abstract: This paper presents a novel work-distribution strategy for GPUs that efficiently convolves radio-telescope data onto a grid, one of the most time-consuming processing steps in creating a sky image. Unlike existing work-distribution strategies, this strategy keeps the number of device-memory accesses low, without incurring the overhead of sorting or searching within telescope data. Performance measurements show that the strategy is an order of magnitude faster than existing accelerator-based gridders. We compa…
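
For readers unfamiliar with the term, the sketch below illustrates what "convolving data onto a grid" means in its simplest form. It is a naive CPU baseline, not the work-distribution strategy the paper proposes. The function name `grid_visibilities`, the integer sample coordinates, and the separable kernel (the 2-D kernel taken as the outer product of one 1-D list) are all assumptions made for illustration. Note that every sample updates every grid cell under its kernel, which is the kind of memory traffic the abstract says the strategy keeps low.

```python
def grid_visibilities(samples, kernel_1d, grid_size):
    """Naively convolve samples onto a square grid (illustrative only).

    samples   -- list of (u, v, value); u, v are integer grid coordinates
                 of the kernel centre, value is a complex number
    kernel_1d -- odd-length list of weights; the 2-D kernel is assumed
                 to be the outer product of this list with itself
    grid_size -- number of rows and columns of the output grid
    """
    half = len(kernel_1d) // 2
    grid = [[0j] * grid_size for _ in range(grid_size)]
    for u, v, value in samples:
        for dv, wv in enumerate(kernel_1d):
            row = v + dv - half
            if not 0 <= row < grid_size:
                continue  # kernel row falls outside the grid
            for du, wu in enumerate(kernel_1d):
                col = u + du - half
                if 0 <= col < grid_size:
                    # One read-modify-write of the grid per kernel cell.
                    grid[row][col] += value * wu * wv
    return grid


# Two hypothetical samples whose kernels overlap on the grid.
samples = [(2, 2, 1 + 0j), (3, 2, 0 + 2j)]
grid = grid_visibilities(samples, [0.25, 0.5, 0.25], grid_size=6)
```

Because the example kernel weights sum to one and both kernels lie fully inside the grid, the grid total equals the sum of the sample values.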

Cited by 28 publications (25 citation statements)
References 5 publications (5 reference statements)
“…As a workaround, we added the statement asm("") as a compiler-level memory barrier. Romein (2012) found that the majority of memory traffic was due to cache misses in reading the convolution GCF values. To avoid this problem, we have used a separable approximation to the GCF (Merry, 2016).…”
Section: Implementation Details (mentioning)
confidence: 99%
“…However, the irregular data access patterns make this a non-trivial task. One of the first really practical algorithms for GPU-accelerated gridding is due to Romein (2012). Despite being state of the art, it typically spends only about 25% of a GPU's compute power on the actual convolution operations.…”
Section: Introduction (mentioning)
confidence: 99%
“…We can investigate computational efficiency of the most costly algorithmic components, an estimate based on our current understanding of the required processing, on current day best-of-breed hardware. This shows very poor efficiency of at most 20% of R peak [1], [10].…”
Section: A Defining the Required Sdp Capacity (mentioning)
confidence: 99%
“…While this may be sufficient for some applications, in the general case other elements in the data path pose unsolved problems of scale owing to dependence on at least N^2. The computation challenges associated with gridding irregularly spaced visibilities in preparation for FFT imaging (Romein, 2012), and subtraction of sky models from correlator output in the visibility domain (Mitchell et al., 2008), for example, will also need to be addressed.…”
Section: Scalability (mentioning)
confidence: 99%