2021
DOI: 10.48550/arxiv.2106.14332
Preprint

A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation

Abstract: This paper presents a methodology for using LLVM-based tools to tune the DCA++ (dynamical cluster approximation) application that targets the new ARM A64FX processor. The goal is to describe the changes required for the new architecture and generate efficient single instruction/multiple data (SIMD) instructions that target the new Scalable Vector Extension instruction set. During manual tuning, the authors used the LLVM tools to improve code parallelization by using OpenMP SIMD, refactored the code and applied…

Citations: cited by 1 publication (1 citation statement)
References: 10 publications (10 reference statements)
“…Various publications and reports of A64FX evaluations have been released recently. For example, [14] tested LLVM and its SVE code generation capabilities for DCA++, and [15] compared a limited set of applications with LLVM, GNU, ARM, and Cray compilers and focused on SVE and multi-node scaling. Similarly, [16] investigated OpenMP scaling of 3 proxy apps on A64FX while comparing 5 compilers, and [17] measured nearly a dozen proxy apps (different from ours) on ARM and x86 for multiple compilers, but lacked LLVM.…”
Section: Related Work (mentioning)
confidence: 99%