2014 IEEE 28th International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2014.54
Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks

Abstract: We conducted a microbenchmarking study of the time, energy, and power of computation and memory access on several existing platforms. These platforms represent candidate compute-node building blocks of future high-performance computing systems. Our analysis uses the "energy roofline" model, developed in prior work, which we extend in two ways. First, we improve the model's accuracy by accounting for power caps, basic memory hierarchy access costs, and measurement of random memory access patterns. Second, we …
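The energy roofline model referenced in the abstract pairs the classic time roofline with an analogous energy bound. A minimal sketch of that idea follows; the per-operation energy costs and platform parameters here are illustrative assumptions, not values from the paper.

```python
# Sketch of an energy-roofline-style cost model. All constants below are
# hypothetical, chosen only to illustrate the structure of the model.

def time_roofline(W, Q, peak_flops, peak_bw):
    """Classic roofline: time is bounded by compute or memory, whichever
    dominates. W = flops performed, Q = bytes moved."""
    return max(W / peak_flops, Q / peak_bw)

def energy_roofline(W, Q, eps_flop, eps_mem, pi0, T):
    """Energy = per-flop cost + per-byte cost + constant (idle) power
    integrated over the execution time T."""
    return W * eps_flop + Q * eps_mem + pi0 * T

# Hypothetical platform parameters (illustration only):
peak_flops = 100e9   # 100 GFLOP/s peak compute throughput
peak_bw    = 50e9    # 50 GB/s peak memory bandwidth
eps_flop   = 50e-12  # 50 pJ per flop
eps_mem    = 500e-12 # 500 pJ per byte moved
pi0        = 10.0    # 10 W constant (idle) power

W, Q = 1e9, 4e8      # workload: 1 GFLOP, 0.4 GB of traffic
T = time_roofline(W, Q, peak_flops, peak_bw)
E = energy_roofline(W, Q, eps_flop, eps_mem, pi0, T)
print(f"time = {T:.3f} s, energy = {E:.3f} J")
```

With these assumed numbers the workload is compute-bound in time (W/peak_flops exceeds Q/peak_bw), while memory traffic still dominates the energy bill — exactly the kind of time/energy divergence the model is designed to expose.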

Cited by 60 publications (48 citation statements); References 22 publications.
“…5, right) and than other affinity modes up to 180 threads. Since CG is the most memory-intensive application among those considered here, as was shown in [23], this situation may be explained by the findings in [13] that the Intel Xeon Phi uses less energy for memory accesses. Under the compact affinity mode, the neighboring threads, which are more likely to access memory simultaneously, are located in the same core.…”
Section: Energy
confidence: 87%
“…An instruction-level energy model has been used by Shao and Brooks [12] with the Linpack benchmark suite to observe increases in energy efficiency as high as 10%. Choi et al. [13] conducted a microbenchmarking study and found that the Intel Xeon Phi offers energy benefits for highly irregular data processing workloads. The Xeon Phi requires an order of magnitude less energy per access during random memory access operations.…”
Section: A. Related Work
confidence: 99%
“…In [17], the impact of data movement on the total energy consumption is characterized for the NAS parallel benchmarks and several scientific applications. In [6], the authors extend their energy roofline model to capture arithmetic and basic cache memory energy access costs as well as more elaborate random access patterns.…”
Section: Related Work
confidence: 99%
“…There are also some approaches that model the energy consumption of individual algorithms by considering the operations performed [59]; however, these approaches are difficult to transfer to other algorithms, and they require a significant effort for analysis at the algorithmic level. Another attempt at finding a relation between properties of algorithms and the resulting energy consumption and execution time is described in [25], but the results are only presented at the level of micro-benchmarks. So far, there is no broad investigation that determines which algorithmic properties have which effect on the energy consumption for a specific architecture.…”
Section: Algorithmic Techniques Towards Energy Awareness
confidence: 99%