2013
DOI: 10.1016/j.jcp.2012.11.031

Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster

Cited by 58 publications (34 citation statements)
References 38 publications
“…They obtain 97 GFLOPS when running HPL on 96 nodes. Göddecke et al [30] use this cluster on a wide variety of scientific applications and find that the energy use compares favorably with an x86 cluster.…”
Section: Arm Cluster Building (mentioning)
confidence: 99%
“…Hence, as opposed to x86 and other commodity designs (with a focus on chipset compatibility and performance), the resulting energy efficiency advantage can be made accessible to the HPC community. In our earlier work [10] we demonstrated reductions in the energy-to-solution of simulations by using ARM-based processors. Those findings were obtained on a cluster prototype built with NVIDIA Tegra 2 and continued later with Tegra 3 micro-architecture [21].…”
Section: Introduction (mentioning)
confidence: 98%
“…The nodes interconnection and its limitations were studied in [9]. One of the first attempts of porting CFD applications to mobile architectures was presented in [10], where scalability tests were presented using up to 96 nodes, while in [11] it is presented a study of the energy efficiency of a CFD code on embedded platforms. In the current version the Mont-Blanc prototype is composed of 930 nodes with the SoC Samsung Exynos 5 that combines an ARM Cortex-A15 CPU and an OpenCL capable Mali T604 GPU.…”
Section: Introduction (mentioning)
confidence: 99%
“…This constitutes the baseline of the present paper, which is focused on optimizing the basic kernels composing the time integration to efficiently run on the Mont-Blanc prototype. In contrast to previous works [10,13], where applications were only ported to run on one of the two devices composing the Samsung Exynos 5 chip, here we present a multilevel heterogeneous approach that allows the concurrent execution in both the CPU and GPU devices.…”
Section: Introduction (mentioning)
confidence: 99%