As the push towards exascale hardware has increased the diversity of system architectures, performance portability has become a critical aspect for scientific software. We describe the Kokkos Performance Portable Programming Model that allows developers to write single source applications for diverse high-performance computing architectures. Kokkos provides key abstractions for both the compute and memory hierarchy of modern hardware. We describe the novel abstractions that have been added to Kokkos version 3 such as hierarchical parallelism, containers, task graphs, and arbitrary-sized atomic operations to prepare for exascale era architectures. We demonstrate the performance of these new features with reproducible benchmarks on CPUs and GPUs.
The Parallel Unstructured Mesh Infrastructure (PUMI) is designed to support the representation of, and operations on, unstructured meshes as needed for the execution of mesh-based simulations on massively parallel computers. In PUMI, the mesh representation is
complete
in the sense of being able to provide any adjacency of mesh entities of multiple topologies in O(1) time, and
fully distributed
to support relationships of mesh entities across multiple memory spaces in a manner consistent with supporting massively parallel simulation workflows. PUMI's mesh maintains links to the high-level model definition in terms of a model topology as produced by CAD systems, and is specifically designed to efficiently support evolving meshes as required for mesh generation and adaptation. To support the needs of parallel unstructured mesh simulations, PUMI also supports a specific set of services such as the migration of mesh entities between parts while maintaining the mesh adjacencies, maintaining read-only mesh entity copies from neighboring parts (ghosting), repartitioning parts as the mesh evolves, and dynamic mesh load balancing.
Here we present the overall design, software structures, example programs, and performance results. The effectiveness of PUMI is demonstrated by its applications to massively parallel adaptive simulation workflows.
Unstructured grid adaptation is a tool to control Computational Fluid Dynamics (CFD) discretization error. However, adaptive grid techniques have made limited impact on production analysis workflows where the control of discretization error is critical to obtaining reliable simulation results. Issues that prevent the use of adaptive grid methods are identified by applying unstructured grid adaptation methods to a series of benchmark cases. Once identified, these challenges to existing adaptive workflows can be addressed. Unstructured grid adaptation is evaluated for test cases described on the Turbulence Modeling Resource (TMR) web site, which documents uniform grid refinement of multiple schemes. The cases are turbulent flow over a Hemisphere Cylinder and an ONERA M6 Wing. Adaptive grid force and moment trajectories are shown for three integrated grid adaptation processes with Mach interpolation control and output error based metrics. The integrated grid adaptation process with a finite element (FE) discretization produced results consistent with uniform grid refinement of fixed grids. The integrated grid adaptation processes with finite volume schemes were slower to converge to the reference solution than the FE method. Metric conformity is documented on grid/metric snapshots for five grid adaptation mechanics implementations. These tools produce anisotropic boundary conforming grids requested by the adaptation process.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.