2007
DOI: 10.1007/s00200-007-0036-y

Towards an accurate performance modeling of parallel sparse factorization

Abstract: We present a performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. Our model characterizes the algorithmic behavior by taking into account the underlying processor speed, memory system performance, and interconnect speed. The model is validated using the SuperLU_DIST linear system solver, sparse matrices from real applications, and an IBM POWER3 parallel machine. Our modeling methodology can be easily adapted to study perfor…

Cited by 3 publications
(2 citation statements)
References 11 publications
“…In a subsequent paper, Li (2008) considers the performance of these methods when each node of the computer consists of a tightly coupled multicore processor. Grigori and Li (2007) present an accurate simulation-based performance model for SuperLU_DIST, which includes the speed of the processors, memory systems, and the latency and bandwidth of the interconnect.…”
Section: Supernodal LU Factorization (citation type: mentioning)
confidence: 99%
“…Given these difficulties, many authors have explored instead empirical performance modeling approaches. These approaches consist in automatically discovering salient application characteristics using one or a combination of several techniques (e.g., instrumented application execution to obtain trace data, analysis of application trace data and of known benchmark trace data to infer application performance, analysis of application simulation driven by trace data, and simulation driven by static code analysis). These automatically derived empirical models explicitly capture important characteristics of the implementation of the target application (e.g., memory accesses, branch behavior, and floating point unit usage).…”
Section: Related Work (citation type: mentioning)
confidence: 99%