Rui Chen scite author profile

Algorithmic and architecture-oriented optimizations are essential for achieving performance worthy of anticipated energy-austere exascale systems. In this paper, we present an extreme scale FMM-accelerated boundary integral equation solver for wave scattering, which uses FMM as a matrix-vector multiplication inside the GMRES iterative method. Our FMM Helmholtz kernels are capable of treating nontrivial singular and near-field integration points. We implement highly optimized kernels for both shared and distributed memory, targeting emerging Intel extreme performance HPC architectures. We extract the potential thread- and data-level parallelism of the key Helmholtz kernels of FMM. Our application code is well optimized to exploit the AVX-512 SIMD units of Intel Skylake and Knights Landing architectures. We provide different performance models for tuning the task-based tree traversal implementation of FMM, and develop optimal architecture-specific and algorithm-aware partitioning, load balancing, and communication-reducing mechanisms to scale up to 6,144 compute nodes of a Cray XC40 with 196,608 hardware cores. With shared memory optimizations, we achieve roughly 77% of peak single precision floating point performance of a 56-core Skylake processor, and on average 60% of peak single precision floating point performance of a 72-core KNL. These numbers represent nearly 5.4x and 10x speedup on Skylake and KNL, respectively, compared to the baseline scalar code. With distributed memory optimizations, on the other hand, we report near-optimal efficiency in the weak scalability study with respect to both the O(log P) communication complexity as well as the theoretical scaling complexity of FMM. In addition, we exhibit up to 85% efficiency in strong scaling. We compute in excess of 2 billion DoF on the full scale of the Cray XC40 supercomputer. The numerical results match the analytical solution with convergence at 1.0e-4 relative 2-norm residual accuracy.
To the best of our knowledge, this work presents the fastest and the most scalable FMM-accelerated linear solver for oscillatory kernels.
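
The abstract's central idea, using a fast matrix-vector product inside GMRES, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `gmres` and `direct_matvec` are hypothetical, the matvec is a plain O(N^2) summation standing in for the FMM, and the kernel is an assumed smooth real kernel rather than the Helmholtz kernel. The point is only that GMRES needs the operator solely through a `matvec` callable, so a fast summation can be swapped in without touching the solver.

```python
import math

def gmres(matvec, b, tol=1e-4, max_iter=50):
    """Unrestarted GMRES for A x = b; A is only accessed through matvec(v)."""
    n = len(b)
    beta = math.sqrt(sum(v * v for v in b))
    if beta == 0.0:
        return [0.0] * n, 0.0
    Q = [[v / beta for v in b]]   # orthonormal Krylov basis vectors
    H = []                        # columns of the rotated (upper triangular) Hessenberg matrix
    cs, sn = [], []               # Givens rotation coefficients
    g = [beta]                    # rotated right-hand side of the least-squares problem
    rel_res = 1.0
    for k in range(max_iter):
        w = matvec(Q[k])          # the only place the operator is used
        h = []
        for q in Q:               # modified Gram-Schmidt orthogonalization
            hij = sum(wi * qi for wi, qi in zip(w, q))
            h.append(hij)
            w = [wi - hij * qi for wi, qi in zip(w, q)]
        h_next = math.sqrt(sum(wi * wi for wi in w))
        for i in range(k):        # apply earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        r = math.hypot(h[k], h_next)   # new rotation eliminates the subdiagonal entry
        c, s = h[k] / r, h_next / r
        cs.append(c)
        sn.append(s)
        h[k] = r
        H.append(h)
        g.append(-s * g[k])
        g[k] = c * g[k]
        rel_res = abs(g[k + 1]) / beta  # relative 2-norm residual
        if rel_res < tol or h_next == 0.0:
            break
        Q.append([wi / h_next for wi in w])
    m = len(H)
    y = [0.0] * m
    for i in range(m - 1, -1, -1):      # back substitution on the triangular system
        acc = g[i] - sum(H[j][i] * y[j] for j in range(i + 1, m))
        y[i] = acc / H[i][i]
    x = [sum(y[j] * Q[j][i] for j in range(m)) for i in range(n)]
    return x, rel_res

def direct_matvec(points, v):
    """O(N^2) stand-in for the FMM: applies (I + K) with K_ij = exp(-|x_i - x_j|) / N."""
    n = len(points)
    out = []
    for i in range(n):
        acc = v[i]
        for j in range(n):
            acc += math.exp(-abs(points[i] - points[j])) / n * v[j]
        out.append(acc)
    return out
```

Calling `gmres(lambda v: direct_matvec(points, v), b, tol=1e-4)` returns the approximate solution and the final relative residual; the default tolerance mirrors the 1.0e-4 relative 2-norm residual quoted in the abstract.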

An explicit marching-on-in-time scheme for solving the time domain Kirchhoff integral equation

Chen, Sayed, Al-Harthi, et al. 2019

A fully explicit marching-on-in-time (MOT) scheme for solving the time domain Kirchhoff (surface) integral equation to analyze transient acoustic scattering from rigid objects is presented. A higher-order Nyström method and a PE(CE)^m-type ordinary differential equation integrator are used for spatial discretization and time marching, respectively. The resulting MOT scheme uses the same time step size as its implicit counterpart (which also uses the Nyström method in space) without sacrificing the accuracy or stability of the solution. Numerical results demonstrate the accuracy, efficiency, and applicability of the proposed explicit MOT solver.
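
A predictor-corrector integrator of the predict, evaluate, then (correct, evaluate) m times pattern can be sketched on a scalar test problem. This is an illustration under stated assumptions, not the scheme from the paper: the forward-Euler predictor and trapezoidal corrector are chosen here for brevity, the names `pece_m_step` and `integrate` are hypothetical, and a real MOT right-hand side would also depend on the retarded-time history of the solution.

```python
import math

def pece_m_step(f, t, y, dt, m=2):
    """One step: Predict, Evaluate, then (Correct, Evaluate) repeated m times."""
    f0 = f(t, y)                  # derivative at the current time level
    y_new = y + dt * f0           # P: forward-Euler prediction (assumed predictor)
    f_new = f(t + dt, y_new)      # E: evaluate at the predicted value
    for _ in range(m):
        y_new = y + 0.5 * dt * (f0 + f_new)  # C: trapezoidal correction (assumed corrector)
        f_new = f(t + dt, y_new)             # E: re-evaluate at the corrected value
    return y_new

def integrate(f, y0, t_end, n_steps, m=2):
    """March from t = 0 to t_end with n_steps uniform steps."""
    dt = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = pece_m_step(f, t, y, dt, m)
        t += dt
    return y
```

For example, `integrate(lambda t, y: -y, 1.0, 1.0, 100)` approximates exp(-1). Every step uses only function evaluations, so no linear system is solved at any time level, which is what makes the scheme explicit.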

An Explicit Time Marching Scheme for Efficient Solution of the Magnetic Field Integral Equation at Low Frequencies

Chen, Sayed, Ülkü, et al. 2021. IEEE Trans. Antennas Propagat.

An explicit marching-on-in-time (MOT) scheme to efficiently solve the time domain magnetic field integral equation (TD-MFIE) with a large time step size (under a low-frequency excitation) is developed. The proposed scheme spatially expands the current using high-order nodal functions defined on curvilinear triangles discretizing the scatterer surface. Applying Nyström discretization, which uses this expansion, to the TD-MFIE, which is written as an ordinary differential equation (ODE) by separating the self-term contribution, yields a system of ODEs in unknown time-dependent expansion coefficients. A predictor-corrector method is used to integrate this system for samples of these coefficients. Since the Gram matrix arising from the Nyström discretization is block-diagonal, the resulting MOT scheme replaces the matrix "inversion" required at each time step by a product of the inverse block-diagonal Gram matrix and the right-hand side vector. It is shown that, upon convergence of the corrector updates, this explicit MOT scheme produces the same solution as its implicit counterpart, and is faster for large time step sizes. Index Terms: marching-on-in-time (MOT), magnetic field integral equation (MFIE), Nyström method, predictor-corrector scheme.
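
The benefit of a block-diagonal Gram matrix can be shown with a small sketch: applying its inverse to a right-hand side reduces to independent small solves, one per block. The function names and the example blocks are hypothetical, and the sketch solves each block on the fly rather than storing precomputed inverse blocks as the abstract's wording suggests.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for one small dense block."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix [A | b]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]                # move the largest pivot into place
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        acc = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / M[i][i]
    return x

def apply_block_diagonal_inverse(blocks, rhs):
    """Compute G^{-1} rhs where G is block-diagonal with the given dense blocks."""
    out, offset = [], 0
    for block in blocks:
        size = len(block)
        out.extend(solve_dense(block, rhs[offset:offset + size]))
        offset += size
    return out
```

With block size fixed by the number of nodes per surface patch, the total cost grows linearly with the number of blocks, in contrast to a solve with a full system matrix at every time step.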

Microgrinding force predictive modelling based on microscale single grain interaction analysis

Park, Liang, Chen. 2007. IJMTM

In this paper, a new model of the interaction between the workpiece and a single grit, considering both cutting and ploughing effects, is proposed to predict the material deformation and microgrinding forces. The model predictions are compared with experimental data from Single Crystal Diamond (SCD) cutting for validation. An extension of the single grit model by stochastic distribution analysis to predict the overall microgrinding forces is also presented.

Solving Acoustic Boundary Integral Equations Using High Performance Tile Low-Rank LU Factorization

Al-Harthi, Alomairy, Akbudak, et al. 2020

We design and develop a new high performance implementation of a fast direct LU-based solver using low-rank approximations on massively parallel systems. The LU factorization is the most time-consuming step in solving systems of linear equations in the context of analyzing acoustic scattering from large 3D objects. The matrix equation is obtained by discretizing the boundary integral of the exterior Helmholtz problem using a higher-order Nyström scheme. The main idea is to exploit the inherent data sparsity of the matrix operator by performing local tile-centric approximations while still capturing the most significant information. In particular, the proposed LU-based solver leverages the Tile Low-Rank (TLR) data compression format as implemented in the Hierarchical Computations on Manycore Architectures (HiCMA) library to decrease the complexity of "classical" dense direct solvers from cubic to quadratic order. We taskify the underlying boundary integral kernels to expose fine-grained computations. We then employ the dynamic runtime system StarPU to orchestrate the scheduling of computational tasks on shared and distributed-memory systems. The resulting asynchronous execution makes it possible to compensate for the load imbalance due to the heterogeneous ranks, while mitigating the overhead of data motion. We assess the robustness of our TLR LU-based solver and study the qualitative impact of using different numerical accuracies. The new TLR LU factorization outperforms the state-of-the-art dense factorizations by up to an order of magnitude on various parallel systems, for analysis of scattering from large-scale 3D synthetic and real geometries.

Extreme Scale FMM-Accelerated Boundary Integral Equation Solver for Wave Scattering
