2004
DOI: 10.1109/tpds.2004.44

Optimizing graph algorithms for improved cache performance

Cited by 74 publications (52 citation statements)
References 27 publications
“…Challenges in parallel graph processing are discussed in [3]. Our computations involve matrices, and therefore some fast matrix multiplication algorithms such as [19][6] are also our area of concern. As CPU implementations have several performance limitations, some cache optimization techniques and cache-friendly implementations are given in [2] and [5] using recursion for dense graphs. In [10], blocked data layout and Morton layout are given for FW to reduce TLB misses.…”
Section: Problem Time Complexity (mentioning)
confidence: 99%
“…Actual algorithms based on this proof are given by various researchers, with minor differences. Our decision to use the DC algorithm as our starting point is inspired by its demonstrated better cache reuse on CPUs [33], and its impressive performance attained on the many-core graphical processor units [11].…”
Section: Previous Work (mentioning)
confidence: 99%
“…The workflow of the DC-APSP algorithm is also pictured in Figure 2. The correctness of this algorithm has been proved by many researchers [4], [11], [33] using various methods. Edge weights can be arbitrary, including negative numbers, but we assume that the graph is free of negative cycles.…”
Section: Divide-and-conquer APSP (mentioning)
confidence: 99%
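
The statement above refers to a divide-and-conquer APSP algorithm without showing its steps. As a rough illustration only, the sketch below follows one common recursive formulation over the min-plus semiring. It is not taken from the cited paper or from the citing work, and the names `dc_apsp`, `min_plus`, and `ewise_min` are hypothetical. It assumes a dense weight matrix with `math.inf` for missing edges and no negative cycles, as the statement requires.

```python
import math

INF = math.inf  # marks a missing edge


def min_plus(X, Y):
    """Min-plus product: result[i][j] = min over k of X[i][k] + Y[k][j]."""
    inner = len(Y)
    cols = len(Y[0])
    return [[min(row[k] + Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]


def ewise_min(X, Y):
    """Element-wise minimum of two equally sized matrices."""
    return [[min(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def dc_apsp(A):
    """Return all-pairs shortest path distances for weight matrix A.

    Assumption: the graph has no negative cycles, so every vertex
    has distance 0 to itself.
    """
    n = len(A)
    if n == 1:
        return [[min(A[0][0], 0)]]
    h = n // 2
    # Split into four quadrants (copies, for clarity rather than speed).
    A11 = [row[:h] for row in A[:h]]
    A12 = [row[h:] for row in A[:h]]
    A21 = [row[:h] for row in A[h:]]
    A22 = [row[h:] for row in A[h:]]

    A11 = dc_apsp(A11)                        # paths inside the first half
    A12 = min_plus(A11, A12)                  # first half -> second half
    A21 = min_plus(A21, A11)                  # second half -> first half
    A22 = ewise_min(A22, min_plus(A21, A12))  # detours through first half
    A22 = dc_apsp(A22)                        # final distances, second half
    A21 = min_plus(A22, A21)                  # final second -> first
    A12 = min_plus(A12, A22)                  # final first -> second
    A11 = ewise_min(A11, min_plus(A12, A21))  # detours through second half

    top = [r1 + r2 for r1, r2 in zip(A11, A12)]
    bottom = [r1 + r2 for r1, r2 in zip(A21, A22)]
    return top + bottom
```

This version copies sub-blocks and therefore shows only the recursive structure. Any cache benefit would depend on an in-place implementation, which this sketch does not attempt.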
“…For the former, the optimization is performed at compile time through code transformation [10,8,9], array padding [1,14], or both [4], while for the latter, optimization is done directly within the source code by manually rewriting the program code using the same techniques [5,12]. In contrast to the compiler approach, the user-level optimization is more common due to its straightforward manner.…”
Section: Introduction (mentioning)
confidence: 99%