Reliable mesh-based simulations are needed to solve complex engineering problems. Mesh adaptivity can increase reliability by reducing discretization errors, but it requires multiple software components to exchange information. Often, components exchange information by reading and writing a common file format. This file-based approach becomes a problem on massively parallel computers, where filesystem bandwidth is a critical performance bottleneck. Our approach, based on data streams and component interfaces, avoids the filesystem bottleneck. In this paper, we present these techniques and their use in coupling mesh adaptivity to the PHASTA computational fluid dynamics solver, the Albany multi-physics framework, and the Omega3P linear accelerator frequency analysis application. Performance results are reported on up to 16,384 cores of an Intel Knights Landing-based system.
KEYWORDS: in-memory, mesh adaptation, parallel, unstructured mesh, workflow
INTRODUCTION

Simulations on massively parallel systems are most effective when data movement is minimized. Data movement costs increase with the depth of the memory hierarchy, a design trade-off made to gain capacity. For example, the lowest level of on-node storage in the IBM Blue Gene/Q A2 processor,1 the per-core 16 KiB L1 cache (excluding registers), has a peak bandwidth of 819 GiB/s. The highest level of on-node storage, 16 GiB of DDR3 main memory, provides a million times more capacity but at a greatly reduced bandwidth of 43 GiB/s, roughly 1/19th that of the L1 cache.2 One level further up the hierarchy is the parallel filesystem.* At this level, the bandwidth-to-capacity relationship is again less favorable and is further compromised by the fact that the filesystem is a shared resource. Table 1 lists the per-node peak main memory and filesystem bandwidths across five generations of Argonne National Laboratory leadership-class systems, i.e., Blue Gene/L,5,6 Intrepid Blue Gene/P,7,8 Mira Blue Gene/Q,1,9 Theta,10,11 and 2018's Aurora.12 Based on these peak values, the bandwidth gap between main memory and the filesystem is at least three orders of magnitude. To maximize performance, software must exploit the bandwidth advantage of cache and main memory during as many workflow operations as possible.

This paper presents a set of in-memory component coupling techniques that avoid filesystem use. We demonstrate these techniques for three different unstructured mesh-based adaptive analysis workflows. These demonstrations highlight the need for in-memory coupling techniques that are compatible with the design and execution of the analysis software involved. Key to this compatibility is supporting two interaction modes, i.e., bulk and atomic information transfers.

Section 3 defines the information transfer modes and reviews methods for coupling workflow components with them. The core interfaces supporting adaptive unstructured mesh workflows are described in Section 3.1, together with examples of their use in bulk and atomic information transfers. Section 3.2 details the d...
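To make the distinction between the two interaction modes concrete before the detailed discussion in Section 3, the following C++ sketch contrasts a bulk in-memory handoff (a memory-resident stream holding the producer's serialized state) with an atomic handoff (per-entity queries through a small accessor interface). The names, data layout, and functions are hypothetical illustrations only; they are not the interfaces of PHASTA, Albany, Omega3P, or the mesh adaptation components described in this paper.

```cpp
// Illustrative sketch of bulk vs. atomic in-memory transfers.
// All names here (serialize_mesh_bulk, SolutionAccessor, ...) are hypothetical.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Bulk transfer: the producer serializes its entire data set into a
// memory-resident stream that the consumer parses in one shot, replacing a
// write-then-read of the same bytes through the parallel filesystem.
void serialize_mesh_bulk(const std::vector<double>& coords,
                         std::stringstream& stream) {
  stream << coords.size() << '\n';
  for (double c : coords) stream << c << ' ';
}

std::vector<double> deserialize_mesh_bulk(std::stringstream& stream) {
  std::size_t n = 0;
  stream >> n;
  std::vector<double> coords(n);
  for (double& c : coords) stream >> c;
  return coords;
}

// Atomic transfer: the consumer pulls individual values on demand through a
// small query interface, so no intermediate copy of the full data set is made.
class SolutionAccessor {
 public:
  explicit SolutionAccessor(const std::vector<double>& field) : field_(field) {}
  double value_at(std::size_t vertex) const { return field_.at(vertex); }
 private:
  const std::vector<double>& field_;
};

int main() {
  const std::vector<double> coords = {0.0, 1.0, 0.5, 2.0};

  // Bulk mode: one in-memory handoff of the whole data set.
  std::stringstream stream;
  serialize_mesh_bulk(coords, stream);
  const std::vector<double> received = deserialize_mesh_bulk(stream);

  // Atomic mode: per-entity queries against the producer's data.
  const SolutionAccessor accessor(received);
  std::cout << "vertex 2 -> " << accessor.value_at(2) << '\n';
  return 0;
}
```

In both modes the data stays in node-local memory, which is the property the bandwidth comparison above motivates; the choice between them is driven by how the existing analysis components are designed to produce and consume information.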