2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2017
DOI: 10.1109/ipdpsw.2017.126

Comparative Performance and Optimization of Chapel in Modern Manycore Architectures

Cited by 8 publications (8 citation statements)
References 21 publications
“…It supports a similar array of languages and single-node parallel models for nstream, but also supports distributed-memory parallelism (e.g. MPI and PGAS [27]-[29]).…”
Section: B Related Work (mentioning)
confidence: 99%
“…Even if some memory kinds can be configured as a cache level to enable automatic hardware-driven management (e.g. MCDRAM in Intel KNL [18,21,9]), fine-grain data allocation can lead to better performance. Thus, some research papers set up this MCDRAM as flat mode meaning that a specific action is required to put data into this target memory.…”
Section: Related Work (mentioning)
confidence: 99%
“…where R is the rank of the decomposition and typically small (35 in our case), and the computation is fairly light. As described in a Chapel GitHub issue 4 , array slicing can be expensive due to computing and creating the domain of the resulting array view and creating and setting up the array descriptor for the view. Our first approach was to eliminate slicing by using direct 2D indexing for matrices, even though it deviated from the reference implementation of SPLATT.…”
Section: Initial (mentioning)
confidence: 99%
“…Each slice only consists of R elements, where R is the rank of the decomposition and typically small (35 in our case), and the computation is fairly light. As described in a Chapel GitHub issue 4 , array slicing can be expensive due to computing and creating the domain of the resulting array view and creating and setting up the array descriptor for the view.…”
Section: MTTKRP Optimizations (mentioning)
confidence: 99%