Energy efficiency and the power consumption of applications have become major topics in High Performance Computing in recent years. This poster presents the work of the last two years in the eeClust project. The aim of the project is to reduce the energy consumption of applications on commodity HPC clusters with as little performance impact as possible, using an integrated approach of application analysis, efficient management of hardware power-states, and monitoring of the cluster's power consumption. We outline the overall project plan and present in detail the generation and analysis of traces on the application side as well as the hardware management and monitoring on the system side. We further introduce eeMark, a benchmark for computational performance and energy efficiency tailored specifically to HPC systems.
Project Plan

The approach of the eeClust project ([3]) is to switch hardware components to a lower power-state during phases in which a component is not fully utilized. To identify these phases we extend well-known performance analysis tools that many application developers are already familiar with, which keeps the learning curve relatively flat. We have developed an Application Programming Interface (API) to instrument the application; it communicates the application's future hardware requirements to a daemon process (called eeDaemon), which then manages the hardware power-states accordingly. To test and validate our approach we procured a power-manageable cluster with 10 nodes (5 Intel Nehalem nodes and 5 AMD Opteron nodes), attached to 3 ZES LMG450 power meters, which allow very accurate measurements at a high sampling frequency.
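The core idea of the instrumentation API can be illustrated with a minimal sketch: the application announces which components an upcoming phase will use, and the daemon may then lower the power-state of everything left idle. All names here (PhaseRegistry, require, release, idle_components) are hypothetical illustrations, not the actual eeClust/eeDaemon API.

```python
class PhaseRegistry:
    """Tracks which hardware components an application phase will use,
    so a management daemon could switch unused ones to a low power-state."""

    COMPONENTS = {"cpu", "network", "disk"}

    def __init__(self):
        # Components currently required at full performance.
        self.required = set()

    def require(self, *components):
        """Announce that the upcoming phase needs these components."""
        self.required.update(set(components) & self.COMPONENTS)

    def release(self, *components):
        """Announce that a phase no longer needs these components."""
        self.required.difference_update(components)

    def idle_components(self):
        """Components the daemon may switch to a lower power-state."""
        return self.COMPONENTS - self.required


reg = PhaseRegistry()
reg.require("cpu", "network")   # e.g. an MPI communication phase
reg.release("network")          # a compute-only phase follows
print(sorted(reg.idle_components()))  # → ['disk', 'network']
```

The key design point mirrored here is that the application only declares *requirements* for the near future; the decision of when and how to change power-states stays with the daemon, which has a global view of all processes on a node.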
Tracing & Analysis

We extended two well-known and widely deployed performance analysis toolsets for trace file analysis. The first is VampirTrace and Vampir from the Center for Information Services and High Performance Computing of TU Dresden, which allow a manual analysis through timeline visualization of program behaviour, e.g. function calls or messages sent, and hardware counter values, together with statistical details of the program execution. The other toolset is Scalasca, developed at the Juelich Supercomputing Centre and the German Research School for Simulation Sciences in Aachen, which performs an automatic trace file analysis to detect patterns that indicate performance problems, especially wait-states, i.e. situations in which one process has to wait for one or more other processes because of workload imbalances. We developed a VampirTrace plugin ([5]) to display the power consumption of the cluster nodes as a counter timeline in Vampir. This enables us to correlate the power consumption with program activity and hardware counter values.
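The correlation the plugin enables can be sketched in a few lines: power-meter samples and traced program regions share a common timebase, so each region can be matched to the readings that fall inside it. The data and helper below are illustrative only and do not reproduce the plugin's actual interface.

```python
# (timestamp_s, watts) samples as a power meter might deliver them
power_samples = [(0.0, 180), (0.5, 240), (1.0, 245), (1.5, 185)]

# (enter_s, exit_s, function) regions as recorded in an application trace;
# the function names are made up for this sketch
trace_regions = [(0.4, 1.2, "solver_kernel"), (1.2, 1.6, "mpi_waitall")]

def mean_power(region, samples):
    """Average the power samples whose timestamps fall inside a region."""
    enter, exit_, _name = region
    inside = [w for t, w in samples if enter <= t <= exit_]
    return sum(inside) / len(inside) if inside else None

for region in trace_regions:
    print(region[2], mean_power(region, power_samples))
# → solver_kernel 242.5
# → mpi_waitall 185.0
```

A low mean power during a long wait-state region (as in the `mpi_waitall` row) is exactly the kind of pattern that suggests the component could have been put into a lower power-state for that phase.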