Auto-tuning full applications: A case study

Understanding the behaviour of High Performance Computing (HPC) systems is a challenging task due to the large number of processes they involve as well as the complex interactions among these processes. In this paper, we present a novel approach that aims to simplify the analysis of large execution traces generated from HPC applications. We achieve this through a technique that allows semi-automatic extraction of execution phases from large traces. These phases, which characterize the main computations of the traced scenario, can be used by software engineers to browse the content of a trace at different levels of abstraction. Our approach is based on the application of information theory principles to the analysis of sequences of communication patterns found in HPC traces. The results of the proposed approach when applied to traces of a large HPC industrial system demonstrate its effectiveness in identifying the main program phases and their corresponding sub-phases.

“…For example, Process 7 will send to and receive from processes 2, 3, 4, 6, 8, 10, 11, 12, 18, 19, 20, 21, 22, 23, 24, 26, 27, and 28.…”

Section: B Pattern Detection

“…At a high-level, SMG2000 performs three distinct phases to solve the problem as reported in [24]. These phases are Initialization, Setup and Solve.…”

Section: Case Study

Identifying computational phases from inter-process communication traces of HPC applications

Alawneh

Hamou-Lhadj

2012

2012 20th IEEE International Conference on Program Comprehension (ICPC)

“…Compiler-based approaches are mostly independent from specific applications. In [18], whole applications are tuned by the compiler by extracting the most expensive loops from the source code and applying transformations such as tiling, unrolling, or permutation. The approach presented in [19] uses compiled binaries of applications to derive models of their execution times.…”

Section: Related Work

Automatic Tuning of the Fast Multipole Method Based on Integrated Performance Prediction

Dachsel

Hofmann

Lang

et al. 2012

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Confe

The Fast Multipole Method (FMM) is an efficient, widely used method for the solution of N-body problems. One of the main data structures is a hierarchical tree data structure describing the separation into near-field and far-field particle interactions. This article presents a method for automatic tuning of the FMM by selecting the optimal FMM tree depth based on an integrated performance prediction of the FMM computations. The prediction method exploits benchmarking of significant parts of the FMM implementation to adapt the tuning to the specific hardware system being used. Furthermore, a separate analysis phase at runtime is used to predict the computational load caused by the specific particle system to be computed. The tuning method was integrated into an FMM implementation. Performance results show that a reliable determination of the tree depth is achieved, thus leading to minimal execution times of the FMM algorithm.

“…In contrast to common compiler optimizations, autotuning approaches often take into account details about the specific application being optimized and the environment where it will execute. Such approaches have been applied to specific types of software (e.g., computer algebra libraries [51] and high performance computing [39,49]) as well as for general purpose languages and platforms (e.g., [45]). …”

Section: Related Work

SEEDS: a software engineer's energy-optimization decision support framework

Manotas

Pollock

Clause

2014

Proceedings of the 36th International Conference on Software Engineering

Reducing the energy usage of software is becoming more important in many environments, in particular, battery-powered mobile devices, embedded systems and data centers. Recent empirical studies indicate that software engineers can support the goal of reducing energy usage by making design and implementation decisions in ways that take into consideration how such decisions impact the energy usage of an application. However, the large number of possible choices and the lack of feedback and information available to software engineers necessitates some form of automated decision-making support. This paper describes the first known automated support for systematically optimizing the energy usage of applications by making code-level changes. It is effective at reducing energy usage while freeing developers from needing to deal with the low-level, tedious tasks of applying changes and monitoring the resulting impacts to the energy usage of their application. We present a general framework, SEEDS, as well as an instantiation of the framework that automatically optimizes Java applications by selecting the most energy-efficient library implementations for Java's Collections API. Our empirical evaluation of the framework and instantiation shows that it is possible to improve the energy usage of an application in a fully automated manner for a reasonable cost.

Cited by 25 publications

References 25 publications
