CUDA, OpenCL, and OpenMP are popular programming models for the multi-core architectures of CPUs and the many-core architectures of GPUs or Xeon Phis. At the same time, computational scientists face the question of which programming model to use to obtain their scientific results. We present the linear algebra library ViennaCL, which is built on top of all three programming models, thus enabling computational scientists to interface with a single library, yet obtain high performance on all three hardware types. Since the respective compute backend can be selected at runtime, one can seamlessly switch between different hardware types without the need for error-prone and time-consuming recompilation steps. We present new benchmark results for sparse linear algebra operations in ViennaCL, complementing results for the dense linear algebra operations in ViennaCL reported in earlier work. Comparisons with vendor libraries show that ViennaCL provides better overall performance for sparse matrix-vector and sparse matrix-matrix products. Additional benchmark results for pipelined iterative solvers with kernel fusion and preconditioners identify the respective sweet spots for CPUs, Xeon Phis, and GPUs.
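To make the sparse matrix-vector product mentioned above concrete, the following is a minimal reference sketch in plain Python. It is not ViennaCL code and does not use the ViennaCL API. The abstract does not state which sparse storage format is benchmarked, so the compressed sparse row (CSR) layout used here is an assumption, and the names `csr_matvec`, `row_ptr`, `col_idx`, and `values` are illustrative only.

```python
from typing import List, Sequence


def csr_matvec(row_ptr: Sequence[int],
               col_idx: Sequence[int],
               values: Sequence[float],
               x: Sequence[float]) -> List[float]:
    """Compute y = A x for a matrix A stored in CSR format (assumed layout).

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Accumulate only over the stored (nonzero) entries of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix:
#   [4 0 1]
#   [0 3 0]
#   [2 0 5]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_matvec(row_ptr, col_idx, values, x)
print(y)  # [7.0, 6.0, 17.0]
```

Each row reads its nonzeros once and gathers entries of `x` by column index, so the operation performs little arithmetic per byte loaded. This is the usual reason such kernels are tuned separately for CPUs, Xeon Phis, and GPUs.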
We propose two different approaches to describe carrier transport in n-laterally diffused MOS (nLDMOS) transistors and use the calculated carrier energy distribution as an input for our physical hot-carrier degradation (HCD) model. The first version relies on the solution of the Boltzmann transport equation using the spherical harmonics expansion method, while the second uses the simpler drift-diffusion (DD) scheme. We compare these two versions of our model and show that both approaches can capture HCD. We therefore conclude that, in the case of nLDMOS devices, the DD-based variant of the model provides good accuracy and at the same time is computationally less expensive. This makes the DD-based version attractive for predictive HCD simulations of LDMOS transistors.
Index Terms: drift-diffusion (DD) scheme, hot-carrier degradation (HCD), n-laterally diffused MOS (nLDMOS), spherical harmonics expansion (SHE).
The performance portability of OpenCL kernel implementations for common memory-bandwidth-limited linear algebra operations is studied across different hardware generations of the same vendor as well as across vendors. Certain combinations of kernel implementations and work sizes are found to exhibit good performance across compute kernels, hardware generations, and, to a lesser degree, vendors. As a consequence, it is demonstrated that the optimization of a single kernel is often sufficient to obtain good performance for a large class of more complicated operations.
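For memory-bandwidth-limited operations such as those studied above, a common way to compare kernels is the effective memory bandwidth: bytes moved divided by run time. The sketch below shows that calculation for a vector addition z = x + y. The abstract does not state which metric the authors used, so this is an assumption. The function name `effective_bandwidth_gbs` and all input numbers are hypothetical and are not measurements from the paper.

```python
def effective_bandwidth_gbs(n_elements: int,
                            bytes_per_element: int,
                            arrays_touched: int,
                            seconds: float) -> float:
    """Return effective bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes every array involved is read or written exactly once,
    which is the ideal case for a streaming kernel.
    """
    if seconds <= 0.0:
        raise ValueError("seconds must be positive")
    bytes_moved = n_elements * bytes_per_element * arrays_touched
    return bytes_moved / seconds / 1e9


# Hypothetical example: z = x + y with 10 million doubles.
# Two arrays are read and one is written, so three arrays are touched.
bandwidth = effective_bandwidth_gbs(
    n_elements=10_000_000,
    bytes_per_element=8,
    arrays_touched=3,
    seconds=0.002,  # made-up timing, for illustration only
)
print(f"{bandwidth:.1f} GB/s")
```

Comparing this figure against the peak memory bandwidth of a device indicates how close a given kernel and work-size combination comes to the hardware limit.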