Abstract. With semiconductor technology gradually approaching its physical and thermal limits, recent supercomputers have adopted major
architectural changes to continue increasing performance through more
power-efficient heterogeneous many-core systems. Examples include Sunway
TaihuLight, which has four management processing elements (MPEs) and 256
computing processing elements (CPEs) in each processor, and Summit, which has
two central processing units (CPUs) and six graphics processing units (GPUs)
in each node. Meanwhile, current high-resolution Earth system models that
desperately require more computing power generally consist of millions of
lines of legacy code developed for traditional homogeneous multicore
processors and cannot automatically benefit from the advancement of
supercomputer hardware. As a result, refactoring and optimizing the legacy
models for new architectures have become key challenges on the road to taking
advantage of greener and faster supercomputers, providing better support for
the global climate research community, and contributing to the long-lasting
societal task of addressing long-term climate change. This article reports
the efforts of a large group at the International Laboratory for
High-Resolution Earth System Prediction (iHESP), which was established through
the cooperation of the Qingdao Pilot National Laboratory for Marine Science and
Technology (QNLM), Texas A&M University (TAMU), and the National Center for
Atmospheric Research (NCAR), with the goal of enabling highly efficient
simulations of the high-resolution (25 km atmosphere and 10 km ocean)
Community Earth System Model (CESM-HR) on Sunway TaihuLight. The refactoring
and optimizing efforts have improved the simulation speed of CESM-HR from 1 SYPD (simulation years per day) to 3.4 SYPD (with output disabled) and
supported several hundred years of pre-industrial control simulations. With
deeper refactoring and optimization of the remaining computing hotspots, as
well as the redesign of architecture-oriented algorithms, we expect the new
platform to match or exceed the efficiency of traditional homogeneous CPU
platforms. The refactoring and
optimizing processes detailed in this paper on the Sunway system should have
implications for similar efforts on other heterogeneous many-core systems
such as GPU-based high-performance computing (HPC) systems.