Algorithmic and architecture-oriented optimizations are essential for achieving performance worthy of anticipated energy-austere exascale systems. In this paper, we present an extreme-scale FMM-accelerated boundary integral equation solver for wave scattering, which uses FMM as a matrix-vector multiplication inside the GMRES iterative method. Our FMM Helmholtz kernels are capable of treating nontrivial singular and near-field integration points. We implement highly optimized kernels for both shared and distributed memory, targeting emerging Intel extreme performance HPC architectures. We extract the potential thread- and data-level parallelism of the key Helmholtz kernels of FMM. Our application code is well optimized to exploit the AVX-512 SIMD units of Intel Skylake and Knights Landing architectures. We provide different performance models for tuning the task-based tree traversal implementation of FMM, and develop optimal architecture-specific and algorithm-aware partitioning, load-balancing, and communication-reducing mechanisms to scale up to 6,144 compute nodes of a Cray XC40 with 196,608 hardware cores. With shared memory optimizations, we achieve roughly 77% of peak single-precision floating-point performance of a 56-core Skylake processor, and on average 60% of peak single-precision floating-point performance of a 72-core KNL. These numbers represent nearly 5.4x and 10x speedup on Skylake and KNL, respectively, compared to the baseline scalar code. With distributed memory optimizations, on the other hand, we report near-optimal efficiency in the weak scalability study with respect to both the O(log P) communication complexity and the theoretical scaling complexity of FMM. In addition, we exhibit up to 85% efficiency in strong scaling. We compute in excess of 2 billion DoF on the full scale of the Cray XC40 supercomputer. The numerical results match the analytical solution with convergence at 1.0e-4 relative 2-norm residual accuracy.
To the best of our knowledge, this work presents the fastest and the most scalable FMM-accelerated linear solver for oscillatory kernels.
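The solver structure described in this abstract, where GMRES only ever touches the system matrix through a matrix-vector product, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names, the random point cloud, the scaled Helmholtz-like test matrix, and the dense product standing in for the FMM. It is not the authors' implementation; only the 1.0e-4 relative residual tolerance is taken from the abstract.

```python
import numpy as np

def gmres(matvec, b, tol=1e-4, max_iter=50):
    """Unrestarted GMRES with a zero initial guess.

    The operator is accessed only through matvec(v), which is where an
    FMM evaluation would be plugged in.
    """
    n = b.shape[0]
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros(n, dtype=complex), 0.0
    Q = np.zeros((n, max_iter + 1), dtype=complex)         # Krylov basis
    H = np.zeros((max_iter + 1, max_iter), dtype=complex)  # Hessenberg matrix
    Q[:, 0] = b / beta
    for k in range(max_iter):
        w = matvec(Q[:, k])
        # Arnoldi step with modified Gram-Schmidt orthogonalization
        for j in range(k + 1):
            H[j, k] = np.vdot(Q[:, j], w)
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        # Small least-squares problem: minimize ||beta * e1 - H y||
        rhs = np.zeros(k + 2, dtype=complex)
        rhs[0] = beta
        Hk = H[:k + 2, :k + 1]
        y = np.linalg.lstsq(Hk, rhs, rcond=None)[0]
        rel_res = np.linalg.norm(Hk @ y - rhs) / beta
        if rel_res < tol or abs(H[k + 1, k]) < 1e-14:
            break
        Q[:, k + 1] = w / H[k + 1, k]
    return Q[:, :k + 1] @ y, rel_res

# Hypothetical test problem: identity plus a scaled Helmholtz-like kernel
rng = np.random.default_rng(0)
n, wavenumber = 200, 5.0
pts = rng.random((n, 3))
r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
np.fill_diagonal(r, 1.0)   # avoid division by zero; diagonal is zeroed below
K = np.exp(1j * wavenumber * r) / (4.0 * np.pi * r)
np.fill_diagonal(K, 0.0)
A = np.eye(n) + K / n
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def dense_matvec(v):
    # Stand-in for the FMM: an O(n^2) dense product instead of a fast one
    return A @ v

x, rel_res = gmres(dense_matvec, b)
print("relative residual:", rel_res)
```

Replacing `dense_matvec` with a fast summation routine leaves the iteration unchanged, which is the design point of a matrix-free Krylov solver.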
A fully explicit marching-on-in-time (MOT) scheme for solving the time domain Kirchhoff (surface) integral equation to analyze transient acoustic scattering from rigid objects is presented. A higher-order Nyström method and a PE(CE)m-type ordinary differential equation integrator are used for spatial discretization and time marching, respectively. The resulting MOT scheme uses the same time step size as its implicit counterpart (which also uses the Nyström method in space) without sacrificing the accuracy and stability of the solution. Numerical results demonstrate the accuracy, efficiency, and applicability of the proposed explicit MOT solver.
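The PE(CE)m pattern named in this abstract (predict, evaluate, then correct and evaluate m times) can be sketched on a scalar ODE. The abstract does not state which predictor and corrector formulas are used, so the two-step Adams-Bashforth predictor and trapezoidal corrector below are assumptions chosen for brevity, as are the function name `pece_m` and the test problem y' = -y.

```python
import math

def pece_m(f, t0, y0, h, n_steps, m=2):
    """Integrate y' = f(t, y) with a PE(CE)^m predictor-corrector scheme.

    Assumed formulas: two-step Adams-Bashforth predictor (forward Euler
    on the first step) and trapezoidal corrector applied m times.
    """
    t, y = t0, y0
    f_prev = None
    f_curr = f(t, y)
    for _ in range(n_steps):
        # P: explicit prediction of y at t + h
        if f_prev is None:
            y_new = y + h * f_curr
        else:
            y_new = y + h * (1.5 * f_curr - 0.5 * f_prev)
        # E: evaluate the right-hand side at the predicted value
        f_new = f(t + h, y_new)
        # (CE)^m: correct with the trapezoidal rule, then re-evaluate
        for _ in range(m):
            y_new = y + 0.5 * h * (f_curr + f_new)
            f_new = f(t + h, y_new)
        t, y = t + h, y_new
        f_prev, f_curr = f_curr, f_new
    return y

# Hypothetical test problem: y' = -y, y(0) = 1, exact solution exp(-t)
y_end = pece_m(lambda t, y: -y, 0.0, 1.0, 0.01, 100, m=2)
error = abs(y_end - math.exp(-1.0))
print("y(1) =", y_end, "error =", error)
```

Each step uses only already-known values and explicit function evaluations, so no linear system is solved during time marching; that is what makes the scheme explicit.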
We design and develop a new high performance implementation of a fast direct LU-based solver using low-rank approximations on massively parallel systems. The LU factorization is the most time-consuming step in solving systems of linear equations in the context of analyzing acoustic scattering from large 3D objects. The matrix equation is obtained by discretizing the boundary integral of the exterior Helmholtz problem using a higher-order Nyström scheme. The main idea is to exploit the inherent data sparsity of the matrix operator by performing local tile-centric approximations while still capturing the most significant information. In particular, the proposed LU-based solver leverages the Tile Low-Rank (TLR) data compression format as implemented in the Hierarchical Computations on Manycore Architectures (HiCMA) library to decrease the complexity of "classical" dense direct solvers from cubic to quadratic order. We taskify the underlying boundary integral kernels to expose fine-grained computations. We then employ the dynamic runtime system StarPU to orchestrate the scheduling of computational tasks on shared and distributed-memory systems. The resulting asynchronous execution compensates for the load imbalance due to the heterogeneous ranks while mitigating the overhead of data motion. We assess the robustness of our TLR LU-based solver and study the qualitative impact of using different numerical accuracies. The new TLR LU factorization outperforms the state-of-the-art dense factorizations by up to an order of magnitude on various parallel systems, for analysis of scattering from large-scale 3D synthetic and real geometries.
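The tile-centric compression idea in this abstract can be illustrated on a single off-diagonal tile. The sketch below uses a plain truncated SVD with a relative accuracy threshold; it does not use the HiCMA or StarPU APIs, and the point layout, wavenumber, tile size, threshold, and the name `compress_tile` are all assumptions made for illustration.

```python
import numpy as np

def compress_tile(tile, eps):
    """Return factors (U, V) with tile ~= U @ V, via a truncated SVD.

    Singular values no larger than eps times the largest one are dropped,
    so the relative spectral-norm error is at most eps.
    """
    U, s, Vh = np.linalg.svd(tile, full_matrices=False)
    rank = max(1, int(np.count_nonzero(s > eps * s[0])))
    return U[:, :rank] * s[:rank], Vh[:rank, :]

# Hypothetical setup: Helmholtz-like kernel on equispaced points of a line
n, nb, wavenumber, eps = 256, 64, 10.0, 1e-6
x = np.arange(n) / n
r = np.abs(x[:, None] - x[None, :])
np.fill_diagonal(r, 1.0)    # avoid division by zero on the diagonal
K = np.exp(1j * wavenumber * r) / (4.0 * np.pi * r)
np.fill_diagonal(K, 1.0)    # placeholder for the singular self-interaction

# Compress one tile that couples two well-separated groups of points
tile = K[0:nb, 3 * nb:4 * nb]
U, V = compress_tile(tile, eps)
rank = U.shape[1]
err = np.linalg.norm(tile - U @ V, 2) / np.linalg.norm(tile, 2)
dense_entries = tile.size
tlr_entries = U.size + V.size
print("rank:", rank, "relative error:", err)
print("stored entries:", tlr_entries, "of", dense_entries)
```

Tiles closer to the diagonal generally need higher ranks for the same threshold. That rank variation is the kind of heterogeneity the abstract refers to when it mentions load imbalance.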
This exploratory paper investigated the English learning needs of female gifted secondary school students in Saudi Arabia. It addressed a gap in the literature regarding English learning needs analysis of gifted secondary school students in Saudi Arabia, and sought to answer the question: What are the needs of those students? The professional needs for English learning were collected through a needs analysis questionnaire addressed to gifted secondary school students in Riyadh, Jeddah, and Al-Baha in Saudi Arabia. The data analysis techniques for the descriptive statistics were frequency counts, percentages, means, and standard deviations. The results revealed that gifted students need to learn English primarily to access the vast body of international scientific knowledge and research, to get international certificates in English, and to deal with the media, technology, and the Internet. In addition, this study specified students’ perceptions about the characteristics that should be included in English class, the difficulties they faced while studying the language, and their suggestions for better English learning strategies. Furthermore, gifted students preferred to learn English from activities that resemble daily life situations and to learn at their own pace. The research results indicated that gifted students face difficulties in class, such as making mistakes and feeling bored because they already know the information. They also demonstrated a need for virtual learning environments to be included in the curriculum. The study recommended that the Ministry of Education enable gifted students to study a tailored gifted curriculum in English and activate enrichment in and out of activities.