Finite element method has been successfully implemented on the graphics processing units to achieve a significant reduction in simulation time. In this paper, new strategies for the finite element matrix generation including numerical integration and assembly are proposed by using a warp per element for a given mesh. These strategies are developed using the well-known coloring method. The proposed strategies use a specialized algorithm to realize fine-grain parallelism and efficient use of on-chip memory resources. The warp shuffle feature of Compute Unified Device Architecture (CUDA) is used to accelerate numerical integration. The evaluation of elemental stiffness matrix is further optimized by adopting a partial parallel implementation of numerical integration. Performance evaluations of the proposed strategies are done for three-dimensional elasticity problem using the 8-noded hexahedral elements with three degrees of freedom per node. We obtain a speedup of up to 8.2× over the coloring based assembly by element strategy (using a single thread per element) on NVIDIA Tesla K40 GPU. Also, the proposed strategies achieve better arithmetic throughput and bandwidth. Highlights CUDA Warp based strategies for FE matrix generation and assembly. Performed using coloring method and on linear hexahedral element meshing in 3D. Obtained speedup of 5.17×− 8.2× over single thread per element strategy on GPU. Strategies showed better arithmetic throughput and bandwidth through code profiling.
PurposeStructural topology optimization is computationally expensive due to the involvement of high-resolution mesh and repetitive use of finite element analysis (FEA) for computing the structural response. Since FEA consumes most of the computational time in each optimization iteration, a novel GPU-based parallel strategy for FEA is presented and applied to the large-scale structural topology optimization of 3D continuum structures.Design/methodology/approachA matrix-free solver based on preconditioned conjugate gradient (PCG) method is proposed to minimize the computational time associated with solution of linear system of equations in FEA. The proposed solver uses an innovative strategy to utilize only symmetric half of elemental stiffness matrices for implementation of the element-by-element matrix-free solver on GPU.FindingsUsing solid isotropic material with penalization (SIMP) method, the proposed matrix-free solver is tested over three 3D structural optimization problems that are discretized using all hexahedral structured and unstructured meshes. Results show that the proposed strategy demonstrates 3.1× –3.3× speedup for the FEA solver stage and overall speedup of 2.9× –3.3× over the standard element-by-element strategy on the GPU. Moreover, the proposed strategy requires almost 1.8× less GPU memory than the standard element-by-element strategy.Originality/valueThe proposed GPU-based matrix-free element-by-element solver takes a more general approach to the symmetry concept than previous works. It stores only symmetric half of the elemental matrices in memory and performs matrix-free sparse matrix-vector multiplication (SpMV) without any inter-thread communication. A customized data storage format is also proposed to store and access only symmetric half of elemental stiffness matrices for coalesced read and write operations on GPU over the unstructured mesh.
A multiple transmit antenna, single receive antenna (per receiver) downlink channel with limited channel feedback is considered. Given a constraint on the total system-wide channel feedback, the following question is considered: is it preferable to get low-rate feedback from a large number of receivers or to receive high-rate/high-quality feedback from a smaller number of (randomly selected) receivers. Acquiring feedback from many users allows multi-user diversity to be exploited, while highrate feedback allows for very precise selection of beamforming directions. It is shown that systems in which a limited number of users feedback high-rate channel information significantly outperform low-rate/many user systems. The marginal benefit of channel feedback is very significant up to the point where the CSI is essentially perfect.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.