Proceedings of the 2018 International Conference on Supercomputing 2018
DOI: 10.1145/3205289.3205313

Towards Efficient SpMV on Sunway Manycore Architectures

Cited by 43 publications (22 citation statements)
References 26 publications
“…CSTF (Blanco et al 2018) proposes a novel queuing strategy to exploit the data reuse between the computation procedures in CP decomposition that reduces the communication cost significantly. In the meanwhile, there are surging research works (Zhong et al 2018;Liu et al 2018;Hu et al 2019;Liu et al 2019;Han et al 2019;Chen et al 2018;Li et al 2018a, b;Duan et al 2018) based on Sunway architecture in the past few years, which provide valuable experience to our work. The achievable performance by leveraging the architecture features of Sunway such as memory architecture, CPEs and register communication, is quantitatively measured by Xu et al (2017) with both memory-bound and computing-bound benchmarks.…”
Section: Related Work (mentioning)
confidence: 99%
“…swMR (Zhong et al 2018), a MapReduce programming framework based on Sunway architecture, leverages the computing resources of Sunway processor to automatically parallelize the map/reduce processing and optimize the performance using the unique architectural features such as CPEs and register communication. A sparse matrix vector multiplication algorithm optimized for Sunway architecture, is proposed by Liu et al (2018). The proposed technique optimizes the sparse matrix vector multiplication by tiling resource and data into three levels, and then leverage register communication and local device memory to implement effective data transfer and better usage of CPEs.…”
Section: Related Work (mentioning)
confidence: 99%
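
For readers unfamiliar with the kernel this statement refers to, the following is a minimal sketch of sparse matrix-vector multiplication in CSR format, with rows grouped into fixed-size tiles. It is not the algorithm of Liu et al (2018): the single level of row tiling, the function name `spmv_csr_tiled`, and the parameter `rows_per_tile` are assumptions made for illustration, and the inner loop runs sequentially where a Sunway implementation would hand each tile to a CPE.

```python
def spmv_csr_tiled(row_ptr, col_idx, values, x, rows_per_tile=2):
    """Compute y = A @ x for a CSR matrix, visiting rows one tile at a time.

    row_ptr, col_idx, values: the usual CSR arrays.
    rows_per_tile: hypothetical tile height; each tile stands in for the
    unit of work that would be assigned to one processing element.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for start in range(0, n_rows, rows_per_tile):
        stop = min(start + rows_per_tile, n_rows)
        # All rows in [start, stop) form one tile.
        for i in range(start, stop):
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += values[k] * x[col_idx[k]]
            y[i] = acc
    return y


# 3x3 example matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr_tiled(row_ptr, col_idx, values, x)
```

Row tiling leaves the result unchanged; it only determines how the rows are grouped into independent units of work.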
“…For each CDP, it enumerates the sample-NMO velocity pairs (line 2), and then finds the intersection of the traveltime curve and traces. At each intersection, it first obtains the halfpoint of the current trace (line 9-11), then accesses the data with size of w (line 12-13), and finally retrieves the data computed in a window of width w (line 14-19). Each trace has its own corresponding halfpoints, therefore the accesses to halfpoints are continuous when walking through the traces sequentially.…”
Section: Improving Parallelism Within a CG (mentioning)
confidence: 99%
“…For the current trace, the memory addresses of the data accesses are calculated for each sample-NMO velocity pair and kept in the k1 array (line 13-15). Then, the maximum and minimum memory address in k1 array is identified (line 16-18) and used to determine the memory range (length) of data accesses (line 19). The data within the memory range is copied to LDM at one time through DMA operation (line 20).…”
Section: 32 (mentioning)
confidence: 99%
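
The range-copy idea in this statement can be sketched as follows. This is an illustration under stated assumptions, not the citing paper's code: `main_memory` is a plain Python list standing in for main memory, the slice stands in for the single DMA transfer into LDM, and the function name `gather_via_range_copy` is hypothetical. Only the `k1` array of addresses is taken from the statement.

```python
def gather_via_range_copy(main_memory, k1):
    """Serve scattered reads from one contiguous copy.

    k1: list of addresses (indices into main_memory) to be read.
    Returns the gathered values, the base address, and the range length.
    """
    lo = min(k1)                 # minimum address in k1
    hi = max(k1)                 # maximum address in k1
    length = hi - lo + 1         # memory range covering every access
    # One bulk copy of the whole range (stand-in for a single DMA into LDM).
    local_buffer = main_memory[lo:lo + length]
    # Each access is now served from the local buffer at an offset.
    gathered = [local_buffer[addr - lo] for addr in k1]
    return gathered, lo, length


main_memory = list(range(100, 120))   # value at address a is 100 + a
k1 = [7, 3, 5]
gathered, base, length = gather_via_range_copy(main_memory, k1)
```

The trade-off is that the copy also moves the unused elements that lie between the minimum and maximum address, so the approach pays off when the addresses in `k1` are clustered.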