This paper describes a massively parallel code for a state-of-the-art thermal Lattice Boltzmann method. Our code has been carefully optimized for performance on one GPU and for good scaling behavior extending to a large number of GPUs. Versions of this code have already been used for large-scale studies of convective turbulence. GPUs are becoming increasingly popular in HPC applications, as they are able to deliver higher performance than traditional processors. Writing efficient programs for large clusters is not an easy task, as codes must adapt to increasingly parallel architectures and the overheads of node-to-node communications must be properly handled. We describe the structure of our code, discussing several key design choices that were guided by theoretical models of performance and experimental benchmarks. We present an extensive set of performance measurements and identify the corresponding main bottlenecks; finally, we compare the results of our GPU code with those measured on other currently available high-performance processors. Our results are a production-grade code able to deliver a sustained performance of several tens of Tflops, as well as a design and optimization methodology that can be used for the development of other high-performance applications for computational physics.
Despite a long record of intense effort, the basic mechanisms by which dissipation emerges from the microscopic dynamics of a relativistic fluid still elude complete understanding. In particular, several details must still be finalized in the pathway from kinetic theory to hydrodynamics, mainly in the derivation of the values of the transport coefficients. In this paper, we approach the problem by matching data from lattice-kinetic simulations with analytical predictions. Our numerical results provide neat evidence in favor of the Chapman-Enskog [The Mathematical Theory of Non-Uniform Gases, 3rd ed. (Cambridge University Press, Cambridge, U.K., 1970)] procedure, as suggested by recent theoretical analyses, along with qualitative hints at the basic reasons why the Chapman-Enskog expansion might be better suited than Grad's method [Commun. Pure Appl. Math. 2, 331 (1949), doi:10.1002/cpa.3160020403] to capture the emergence of dissipative effects in relativistic fluids.
Energy efficiency is becoming increasingly important for computing systems, in particular for large-scale HPC facilities. In this work we evaluate, from a user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS) techniques, assisted by the power and energy monitoring capabilities of modern processors, in order to tune applications for energy efficiency. We run selected kernels and a full HPC application on two high-end processors widely used in the HPC context, namely an NVIDIA K80 GPU and an Intel Haswell CPU. We evaluate the available trade-offs between energy-to-solution and time-to-solution, attempting a function-by-function frequency tuning. We finally estimate the benefits obtainable by running the full code on an HPC multi-GPU node, with respect to default clock frequency governors. We instrument our code to accurately monitor power consumption and execution time without the need for any additional hardware, and we enable it to change CPU and GPU clock frequencies while running. We analyze our results on the different architectures using a simple energy-performance model, and derive a number of energy-saving strategies which can be easily adopted on recent high-end HPC systems for generic applications.
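As a rough illustration of the trade-off between energy-to-solution and time-to-solution mentioned above, the minimal sketch below assumes the simplest possible model, in which energy is average power multiplied by run time, and picks the clock frequency with the lowest energy. The function names and all numbers are hypothetical and chosen only for illustration; they are not measurements or code from the paper, and the paper's own energy-performance model may differ.

```python
def energy_to_solution(avg_power_watts, time_seconds):
    """Energy-to-solution in joules, assuming energy = average power x run time."""
    return avg_power_watts * time_seconds


def best_frequency(measurements):
    """Return the clock frequency (MHz) with the lowest energy-to-solution.

    `measurements` maps frequency -> (average power in W, time-to-solution in s).
    """
    return min(measurements, key=lambda f: energy_to_solution(*measurements[f]))


# Hypothetical, made-up values for illustration only (not data from the paper).
measurements = {
    600: (80.0, 10.0),   # lowest clock: least power, longest run
    700: (90.0, 8.0),
    800: (110.0, 7.5),   # highest clock: shortest run, most power
}

for freq, (power, time) in sorted(measurements.items()):
    print(freq, "MHz:", energy_to_solution(power, time), "J in", time, "s")

print("lowest energy-to-solution at", best_frequency(measurements), "MHz")
```

In this made-up data set the fastest setting is not the most energy-efficient one, which is the kind of situation in which frequency tuning can pay off.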
We present a systematic account of recent developments of the relativistic Lattice Boltzmann method (RLBM) for dissipative hydrodynamics. We describe in full detail a unified, compact and dimension-independent procedure to design relativistic LB schemes capable of bridging the gap between the ultra-relativistic regime, $k_B T \gg mc^2$, and the non-relativistic one, $k_B T \ll mc^2$. We further develop a systematic derivation of the transport coefficients as a function of the kinetic relaxation time in $d = 1, 2, 3$ spatial dimensions. The latter step makes it possible to establish a quantitative bridge between the parameters of the kinetic model and the macroscopic transport coefficients. This leads to accurate calibrations of simulation parameters and is also relevant at the theoretical level, as it provides neat numerical evidence of the correctness of the Chapman-Enskog procedure. We present an extended set of validation tests, in which simulation results based on the RLBMs are compared with existing analytic or semi-analytic results in the mildly relativistic ($k_B T \sim mc^2$) regime for the case of shock propagation in quark-gluon plasmas and laminar electronic flows in ultra-clean graphene samples. It is hoped and expected that the material collected in this paper may allow interested readers to reproduce the present results and generate new applications of the RLBM scheme.