2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
DOI: 10.1109/fccm.2006.45

GraphStep: A System Architecture for Sparse-Graph Algorithms

Abstract: Many important applications are organized around long-lived, irregular sparse graphs (e.g., data and knowledge bases, CAD optimization, numerical problems, simulations). The graph structures are large, and the applications need regular access to a large, data-dependent portion of the graph for each operation (e.g., the algorithm may need to walk the graph, visiting all nodes, or propagate changes through many nodes in the graph). On conventional microprocessors, the graph structures exceed on-chip cac…

Cited by 71 publications (57 citation statements)
References: 37 publications
“…A good amount of literature deals with the design of BFS solutions, either based on commodity processors [11], [12] or special purpose hardware [13], [14], [15], [16]. Some recent publications describe successful parallelization strategies of list ranking [17] and phylogenetic trees on the Cell BE [18].…”
Section: Introduction (mentioning)
confidence: 99%
“…The FPGA implementation scales well to, at least, tens of leaf processing FPGAs. See [deLorimier06] for further details on the Concept Net implementation. [Bellman58] is a single-source shortest path algorithm which robustly handles negative edge weights.…”
Section: Results (mentioning)
confidence: 99%
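
The statement above characterizes Bellman-Ford as a single-source shortest-path algorithm that tolerates negative edge weights. As a rough illustration of that property only, here is a minimal sequential sketch in Python. It is not the FPGA implementation the citing paper describes; the function name `bellman_ford` and the edge-list representation are assumptions made for this example.

```python
from math import inf


def bellman_ford(num_nodes, edges, source):
    """Return shortest distances from `source` to every node.

    `edges` is a list of (u, v, weight) tuples; weights may be negative.
    Raises ValueError if a negative-weight cycle is reachable from `source`.
    """
    dist = [inf] * num_nodes
    dist[source] = 0

    # Relax every edge up to num_nodes - 1 times; stop early once stable.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break

    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle reachable from source")
    return dist


if __name__ == "__main__":
    # Small hypothetical graph with one negative edge (2 -> 1).
    example_edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
    print(bellman_ford(4, example_edges, 0))
```

Each relaxation pass touches every edge, which is the kind of whole-graph, data-dependent access pattern the abstract describes.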
“…Parallel versions could, potentially, reduce the per processor working set; however, communication often ends up dominating computation due to high end-to-end network latency and high network contention. Our FPGA-based Graph Machine implementation is able to perform better because of the high memory bandwidth [deLorimier06] and low PE-to-PE latency. On a Virtex4-LX160-12, we are able to place 16 double-precision floating-point PEs which operate at 285 MHz each.…”
Section: Bellman-Ford (mentioning)
confidence: 99%
“…We schedule communication between the nodes using a greedy time-multiplexed router that uses A* routing. We developed this scheduler and router as part of the Graph Machine project [19], [28], [32].…”
Section: B Tool Flow (mentioning)
confidence: 99%