In this paper we present the ESPRESO FEM library, which includes a FEM toolbox with interfaces to professional and open-source simulation tools, and a massively parallel Hybrid Total FETI (HTFETI) solver that can fully utilize the OLCF Titan supercomputer and achieves super-linear scaling. We present several new techniques for FETI solvers designed for efficient utilization of supercomputers, with a focus on: (i) performance: we present a fivefold reduction of solver runtime for the Laplace equation, obtained by redesigning the FETI solver and offloading the key workload to the accelerator. We compare Intel Xeon Phi 7120p and Tesla K80 and P100 accelerators to Intel Xeon E5-2680v3 and Xeon Phi 7210 CPUs; and (ii) memory efficiency: we present two techniques which increase the efficiency of the HTFETI solver 1.8 times and push the limit of the largest problem ESPRESO can solve from 124 to 223 billion unknowns for problems with unstructured meshes. Finally, we show that by dynamically tuning hardware parameters we can reduce energy consumption by up to 33 %.
Part 5: Industrial Management and Other Applications
In the future, silicon technology will continue to shrink following Moore's law. Device variability is going to increase due to a loss of controllability during silicon chip fabrication. As a result, the mean time between failures is also going to decrease. The current methodologies based on error detection and thread re-execution (rollback) may not be sufficient once the number of errors rises and reaches a certain threshold. This dynamic scenario can be very harmful when executing programs on HPC systems, where a correct, accurate and time-constrained solution is expected. The objective of this paper is to describe and analyse the needs and constraints of different applications studied in disaster management processes. These applications fall mainly within the domain of High Performance Computing (HPC). Even if this domain can show differences in terms of computation needs, system form factor and power consumption, it nevertheless shares some commonalities.
An increasing number of High-Performance Applications demand some form of time predictability, in particular in scenarios where correctness depends on both performance and timing requirements, and the failure to meet either of them is critical. Consequently, a more predictable HPC system is required, particularly for an emerging class of adaptive real-time HPC applications. Here we present our runtime approach, which produces results in a predictable time with a minimal allocation of hardware resources. The paper describes the advantages in terms of execution time reliability and the trade-offs regarding power/energy consumption and temperature of the system compared with the current GNU/Linux governors.