2022
DOI: 10.1016/j.parco.2021.102856

OpenMP application experiences: Porting to accelerated nodes

Cited by 26 publications (8 citation statements)
References 22 publications
“…Compared to the traditional Poisson algorithm which mostly uses principal component analysis for normal estimation, this paper proposes an improved method. Firstly, an octree is used instead of a KDtree to search the nearest neighborhood; then the normal of the point cloud is estimated by moving least squares and accelerated by OpenMP [14], and then the normal direction is adjusted consistently by a least-cost spanning tree. The traditional Poisson reconstruction algorithm is prone to generate pseudo-surfaces.…”
Section: Whole Methods
Type: mentioning
confidence: 99%
“…The training and outreach activity is a cross-cutting effort which is supported by resources from SOLLVE and ECP Broader Engagement, with contributions by external collaborators, notably Lawrence Berkeley National Laboratory. A number of articles have also been published as part of the SOLLVE effort [87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,100,105,71].…”
Section: Validation and Verification (V&V)
Type: mentioning
confidence: 99%
“…Ravikumar et al [34] performed spectral simulation of turbulent flows using their own asynchronous batched GPU-FFT. In addition, Bak et al [9] performed weak scaling of their synchronous non-batched GPU-FFT on Summit. They observed that a FFT of 3072^3 grid using 96 V100 GPUs (16 nodes) of Summit took 550 milliseconds [1].…”
Section: Comparison of GPU-FFT with Other FFTs
Type: mentioning
confidence: 99%
“…They observed a GPU to CPU speedup of 4.7 for 12288^3 grid and a speedup of 2.9 for 18432^3 grid. Recently, Bak et al [9] measured the performance of a synchronous non-batched version of this GPU-FFT on 1024 nodes of Summit and obtained a maximum GPU to CPU speedup of 2.57 for 12228^3 grid.…”
Section: Introduction
Type: mentioning
confidence: 99%