2018
DOI: 10.1007/978-3-319-96983-1_53

Efficient Strict-Binning Particle-in-Cell Algorithm for Multi-core SIMD Processors

Abstract: Particle-in-Cell (PIC) codes are widely used for plasma simulations. On recent multi-core hardware, performance of these codes is often limited by memory bandwidth. We describe a multi-core PIC algorithm that achieves close-to-minimal number of memory transfers with the main memory, while at the same time exploiting SIMD instructions for numerical computations and exhibiting a high degree of OpenMP-level parallelism. Our algorithm keeps particles sorted by cell at every time step, and represents particles from …
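
The abstract states that particles are kept sorted by cell at every time step, but it is truncated before describing how. As an illustration only, the sketch below shows a generic counting sort that bins particles by cell index; it is not the paper's strict-binning method, and the names `sort_particles_by_cell`, `cell_ids`, and `num_cells` are hypothetical.

```python
def sort_particles_by_cell(cell_ids, num_cells):
    """Generic counting sort (an assumption, not the paper's algorithm).

    cell_ids[p] is the cell index of particle p.
    Returns (order, starts): order lists particle indices grouped by cell,
    and cell c owns the slice order[starts[c]:starts[c + 1]].
    """
    # Pass 1: count how many particles fall in each cell.
    counts = [0] * num_cells
    for c in cell_ids:
        counts[c] += 1

    # Exclusive prefix sum gives the first slot of each cell.
    starts = [0] * (num_cells + 1)
    for c in range(num_cells):
        starts[c + 1] = starts[c] + counts[c]

    # Pass 2: place each particle in the next free slot of its cell.
    next_slot = starts[:num_cells]  # slicing copies, so starts is preserved
    order = [0] * len(cell_ids)
    for p, c in enumerate(cell_ids):
        order[next_slot[c]] = p
        next_slot[c] += 1
    return order, starts


cell_ids = [2, 0, 1, 2, 0]  # hypothetical cell index of each particle
order, starts = sort_particles_by_cell(cell_ids, num_cells=3)
print(order)   # particle indices grouped by cell
print(starts)  # per-cell offsets into order
```

With particles grouped this way, all particles of one cell are contiguous in memory, which is the kind of data locality the citing papers below refer to when they discuss reducing requests to main memory.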

Cited by 4 publications (4 citation statements)
References 22 publications
“…Thus the multiplication of the cores without any significant increase of the memory bandwidth brings a poor speed-up. This issue is analysed in [43] and received a lot of attention for years [4,5,7]. Different workarounds are proposed to favor the locality of the data, increasing the cache reuse, therefore mitigating the number of requests to the main memory.…”
Section: Efficient Parallelization For 3d-3v Sparse Grid Particle-in-... (mentioning)
confidence: 99%
“…A non exhaustive overview of optimizations and parallelizations of PIC methods on shared memory architectures. In PIC simulations, the implementations are usually memory-bounded rather than compute-bounded [4,43].…”
Section: 1 (mentioning)
confidence: 99%
“…This approach improves SIMD efficiency on many-core architectures such as the Intel Xeon Phi provided that the particles arrays have enough elements. A similar approach has been extended in [16] where the authors use additional strategies such as the division of a cell's particle set into chunks to improve cache coherence and reduce memory transfers. They report acceleration when using a few hundreds particles per cell.…”
Section: Introduction (mentioning)
confidence: 99%