2017
DOI: 10.1175/bams-d-15-00278.1
Parallelization and Performance of the NIM Weather Model on CPU, GPU, and MIC Processors

Abstract: Next-generation supercomputers containing millions of processors will require weather prediction models to be designed and developed by scientists and software experts to ensure portability and efficiency on increasingly diverse HPC systems. … Thanks to Intel, Cray, PGI, and NVIDIA, who were responsible for fixing bugs and providing access to the latest hardware and compilers. Thanks also to the staff at ORNL and TACC for providing system resources and helping to resolve system issues. This work was also supported in part …

Cited by 27 publications (18 citation statements); references 20 publications.
“…Overall, the COSMO model with the rewritten GridTools dynamical core, and with the other components ported with OpenACC directives, runs about 3-4 times faster on GPUs than the original code on CPUs when comparing hardware of the same generation (Leutwyler et al. 2016). Similar speedups have been reported by other studies (e.g., Govett et al. 2017).…”
Section: Use of OpenACC (supporting)
confidence: 91%
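The directive-based porting this statement describes keeps a single source: existing loops are annotated and the compiler generates GPU code from them. A minimal sketch in C++ (the cited models are largely Fortran; the field names, 5-point stencil, and data clauses here are illustrative assumptions, not code from COSMO or NIM):

```cpp
#include <iostream>
#include <vector>

// Illustrative smoothing loop over a 2-D field; names and sizes are assumed.
void smooth(double* out, const double* in, int nx, int ny) {
    // With an OpenACC compiler (e.g., nvc++ -acc) this loop nest is offloaded
    // to the GPU; otherwise the pragma is ignored and the same source runs on
    // the CPU, which is the single-source portability directives target.
    #pragma acc parallel loop collapse(2) copyin(in[0:nx*ny]) copyout(out[0:nx*ny])
    for (int j = 1; j < ny - 1; ++j)
        for (int i = 1; i < nx - 1; ++i)
            out[j*nx + i] = 0.25 * (in[j*nx + i - 1] + in[j*nx + i + 1]
                                  + in[(j-1)*nx + i] + in[(j+1)*nx + i]);
}

int main() {
    const int nx = 256, ny = 256;
    std::vector<double> in(nx * ny, 1.0), out(nx * ny, 0.0);
    smooth(out.data(), in.data(), nx, ny);
    std::cout << out[nx + 1] << '\n';   // prints 1 (interior of a uniform field)
}
```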
“…Tiling of data. Certain operations, such as stencil computations (Gan et al., 2017), have complicated data-access patterns; their computation and data access can be organised in a tiling pattern (Bandishti et al., 2012) so that the work on different lines, planes, or cubes can be pipelined and overlapped.…”
Section: Other Tuning Techniques at a Glimpse (mentioning)
confidence: 99%
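A minimal sketch of that tiling idea in C++ (the stencil, grid layout, and tile sizes are assumptions for illustration, not taken from Gan et al. or Bandishti et al.):

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// Tile sizes are illustrative; in practice they are tuned so a tile plus its
// halo fits in cache (or GPU shared memory).
constexpr int TI = 64, TJ = 64;

// 5-point stencil evaluated tile by tile over the interior of the grid.
void stencil_tiled(double* out, const double* in, int nx, int ny) {
    for (int jj = 1; jj < ny - 1; jj += TJ)                          // tile loops
        for (int ii = 1; ii < nx - 1; ii += TI)
            for (int j = jj; j < std::min(jj + TJ, ny - 1); ++j)     // intra-tile loops:
                for (int i = ii; i < std::min(ii + TI, nx - 1); ++i) // data stays cache-resident
                    out[j*nx + i] = 0.25 * (in[j*nx + i - 1] + in[j*nx + i + 1]
                                          + in[(j-1)*nx + i] + in[(j+1)*nx + i]);
}

int main() {
    const int nx = 512, ny = 512;
    std::vector<double> in(nx * ny, 1.0), out(nx * ny, 0.0);
    stencil_tiled(out.data(), in.data(), nx, ny);
    std::cout << out[nx + 1] << '\n';   // prints 1 for a uniform field
}
```

Because distinct tiles touch disjoint output regions, they can also be handed to different threads or overlapped with data movement, which is the pipelining the statement refers to.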
“…For the porting of a model at such a level, the three challenges mentioned above (the heavy burden of legacy code, hundreds of hotspots distributed through the code, and the mismatch between the existing code and the emerging hardware) combine to produce still more challenges. Faced with tens of thousands of lines of code, researchers and developers have to either perform an extensive rewrite of the code (Xu et al., 2014) or invest years of effort in redesigned methodologies and tools (Gysi et al., 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, the notion of DSLs as a solution has a tried and tested heritage: examples include the Kokkos array library (Edwards et al., 2012), which like GridTools uses C++ templates to provide an interface to distributed data that can support multiple hardware back ends, and, from computational chemistry, sophisticated codes (Valiev et al., 2010) built on top of a toolkit (Nieplocha et al., 2006) that facilitates shared-memory programming. Arguably DSLs are starting to be more prevalent because of the advent of better tooling for their development and because the code they generate can be better optimised by autotuning (Gropp and Snir, 2013). However, in our case, we still believe that human ex…”
Section: Related Work (mentioning)
confidence: 94%
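A self-contained sketch of the C++-template pattern this statement describes: one kernel body, with the hardware back end selected by a compile-time tag. The `Serial`/`Threads` tags and this `parallel_for` are assumptions for illustration and far simpler than the real Kokkos or GridTools interfaces:

```cpp
#include <cstddef>
#include <iostream>
#include <thread>
#include <type_traits>
#include <vector>

struct Serial  {};   // execute on the calling thread
struct Threads {};   // execute across std::thread workers

// One generic entry point; the Backend tag picks the implementation at
// compile time. This mirrors the template-based pattern the cited libraries
// build on; it is not their actual API.
template <typename Backend, typename F>
void parallel_for(std::size_t n, F body) {
    if constexpr (std::is_same_v<Backend, Serial>) {
        for (std::size_t i = 0; i < n; ++i) body(i);
    } else {
        std::size_t nt = std::thread::hardware_concurrency();
        if (nt == 0) nt = 4;                       // fallback if unknown
        std::vector<std::thread> pool;
        for (std::size_t t = 0; t < nt; ++t)
            pool.emplace_back([&body, t, nt, n] {  // each worker takes a strided
                for (std::size_t i = t; i < n; i += nt) body(i);  // slice of indices
            });
        for (auto& th : pool) th.join();
    }
}

int main() {
    std::vector<double> a(1000);
    // The same kernel body compiles against either back end unchanged.
    parallel_for<Threads>(a.size(), [&](std::size_t i) { a[i] = 2.0 * i; });
    parallel_for<Serial>(a.size(),  [&](std::size_t i) { a[i] += 1.0; });
    std::cout << a[10] << '\n';                    // prints 21
}
```

The design point is that the kernel author writes the body once; retargeting to a new architecture means adding a back end, not rewriting the science code.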
“…for memory optimisations). With the advent of exascale systems, entirely new programming models are likely to be necessary (Gropp and Snir, 2013), potentially deploying new tools, or even the same tools (MPI, OpenMP), to deliver entirely new algorithmic constructs such as thread pools and task-based parallelism (e.g. Perez et al, 2008).…”
Section: Where Is the Concurrency? (mentioning)
confidence: 99%
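A minimal sketch of the task-based style this statement alludes to, using standard C++ futures as a stand-in for the richer task runtimes it cites (e.g., Perez et al., 2008); the chunked reduction is an illustrative assumption:

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <iostream>
#include <numeric>
#include <vector>

// Each chunk of the reduction becomes an independent task; std::async hands
// the tasks to worker threads and the futures collect their partial results.
double task_sum(const std::vector<double>& v, std::size_t chunk) {
    std::vector<std::future<double>> tasks;
    for (std::size_t lo = 0; lo < v.size(); lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, v.size());
        tasks.push_back(std::async(std::launch::async, [&v, lo, hi] {
            return std::accumulate(v.begin() + lo, v.begin() + hi, 0.0);
        }));
    }
    double total = 0.0;
    for (auto& t : tasks) total += t.get();   // wait for and combine the tasks
    return total;
}

int main() {
    std::vector<double> v(1'000'000, 1.0);
    std::cout << task_sum(v, 100'000) << '\n';   // prints 1e+06
}
```

Unlike a bulk-synchronous loop, the tasks here have no fixed mapping to threads, which is the scheduling flexibility the quoted passage expects new programming models to exploit.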