2022 IEEE International Conference on Cluster Computing (CLUSTER), 2022
DOI: 10.1109/cluster51413.2022.00070

On Using Linux Kernel Huge Pages with FLASH, an Astrophysical Simulation Code

Cited by 3 publications
(3 citation statements)
“…Therefore, these counters represent the behavior in that specific module, rather than the software as a whole, while the timers show the full runtime. As expected and seen in our last study [6], in both cases the hardware cycles, MMB, and overall runtime are about the same when using hp, thp, or no hp. However, using hp drastically decreases the DTLB miss rate, while using thp does not have as much of an effect.…”
Section: Results (supporting)
confidence: 88%
“…This work extends our initial study of using hugepages with just the Fujitsu compiler, which demonstrated that hugepages did not provide a significant speedup [6]. Our speculation was that TLB misses might not make much of a difference because the A64FX has hardware to ameliorate the cost of TLB misses by avoiding OS calls, or because the FLASH data access patterns do not trigger a performance penalty.…”
Section: Previous Work With Hugepages (mentioning)
confidence: 56%
“…But because we also found that memory access is only 20-40% of the runtime (not illustrated in the plots), we conclude that the increased bandwidth can't completely account for the speedup. Complete results of our exploration of hp may be found in [27,28].…”
Section: Huge Pages and Compiler Comparison (mentioning)
confidence: 99%