2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
DOI: 10.1109/pdcat46702.2019.00033
Accelerating Conjugate Gradient using OmpSs

Cited by 8 publications (3 citation statements)
References 16 publications
“…Listing 2 shows the modifications in the algorithm. These modifications consist of swapping the order of execution of some of the kernels, mainly AXPY operations and DOT products [12], exposing a higher parallelism.…”
Section: Optimized Conjugate Gradient Methods
confidence: 99%
See 1 more Smart Citation
“…Listing 2 shows the modifications in the algorithm. These modifications consist of swapping the order of execution of some of the kernels, mainly AXPY operations and DOT products [12], exposing a higher parallelism.…”
Section: Optimized Conjugate Gradient Methodsmentioning
confidence: 99%
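The excerpt above refers to reordering AXPY and DOT kernels in the Conjugate Gradient method to expose more parallelism. For context, the sketch below shows the kernel structure of a *standard* CG iteration in NumPy, with each SpMV, DOT, and AXPY kernel labeled; it is a minimal dense-matrix illustration, not the cited paper's task-based or reordered version, and the `cg` function name and its parameters are assumptions made here.

```python
import numpy as np

def cg(A, b, tol=1e-10, max_iter=1000):
    """Standard (unoptimized) Conjugate Gradient; kernels labeled."""
    x = np.zeros_like(b)
    r = b - A @ x                        # SpMV + AXPY
    p = r.copy()
    rs_old = r @ r                       # DOT
    for _ in range(max_iter):
        Ap = A @ p                       # SpMV (one per iteration)
        alpha = rs_old / (p @ Ap)        # DOT
        x = x + alpha * p                # AXPY
        r = r - alpha * Ap               # AXPY
        rs_new = r @ r                   # DOT
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p    # AXPY-like update
        rs_old = rs_new
    return x
```

The DOT products above force synchronization points between the AXPY kernels; optimizations like the one the excerpt describes rearrange these kernels so that independent operations can overlap.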
“…Those HPC applications that are composed of multiple memory-bound kernels which have to perform the operations repeatedly or have an iterative nature, such as those leveraged in this work, but many others as well, such as CFD simulations [4][5][6][7], image processing [8,9], AI kernels [10], or Linear Algebra kernels [11][12][13][14], just to mention a few, can benefit from the use of Static Graphs by reducing the CPU-GPU communication overhead and achieving higher GPU occupancy. To the best of our knowledge, this is the first time that CUDA Graph has been integrated with OpenACC and effectively adapted to the two different algorithms used as test cases in this work: the Conjugate Gradient Method and Particle Swarm Optimization.…”
Section: Introduction
confidence: 99%
“…Indeed, a myriad of numerical simulation applications, commercial and ad hoc solutions, use non-stationary iterative methods because of their high effectiveness and robustness when solving linear systems of equations [8]. The most popular solvers included in this category are: the Conjugate Gradient (CG), which requires one SpMV product per iteration [9]; the Generalized Minimum Residual Method (GMRES), which also uses one SpMV product per iteration [4]; the BiConjugate Gradient (BiCG), which needs two SpMV products per iteration [10]; and the BiConjugate Gradient Stabilised (BiCGS), which also needs two SpMV products per iteration [11]. Optimising the SpMV product on modern multi- and many-core processors for general sparse matrices is not a trivial task because, in order to harness the strong parallel-processing capabilities of these devices, the computations must have regular execution paths and memory access patterns, which are hardly ever present in the sparse matrices generated by real-life numerical applications.…”
Section: Introduction and Related Work
confidence: 99%
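The excerpt above attributes the difficulty of optimizing SpMV to irregular memory access patterns. A minimal CSR (compressed sparse row) SpMV sketch makes this concrete: the gather from `x` through `col_idx` is the indirect, data-dependent access the excerpt refers to. The `spmv_csr` helper is a hypothetical name introduced here for illustration.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for A stored in CSR format."""
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # col_idx[k] drives an irregular gather from x: the
            # access pattern depends on the sparsity structure.
            y[i] += values[k] * x[col_idx[k]]
    return y
```

Because `col_idx` can point anywhere in `x`, cache behavior and vectorization depend entirely on the matrix's sparsity pattern, which is why a single SpMV implementation rarely performs well across all matrices.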