2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum 2013
DOI: 10.1109/ipdpsw.2013.197

Using MIC to Accelerate a Typical Data-Intensive Application: The Breadth-first Search

Abstract: Data-intensive applications have drawn more and more attention in the last few years. The breadth-first search (BFS), a typical data-intensive application, is so widely used that the Graph 500 benchmark uses it to rank supercomputers' performance. The Intel MIC (Many Integrated Core), which is designed for highly parallel computing, has not yet been fully evaluated for data-intensive applications. In this paper, we discuss how to use MIC to accelerate the BFS. Optimizations both for native mode and for offload mode are …
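
For readers unfamiliar with the kernel the abstract refers to, the following is a minimal sequential sketch of a level-synchronous BFS. It is not the paper's MIC implementation: the function name `bfs_levels`, the adjacency-dictionary graph representation, and the sample graph are all assumptions made for illustration. It only shows the frontier-by-frontier structure that parallel versions typically distribute across threads.

```python
from typing import Dict, List


def bfs_levels(adj: Dict[int, List[int]], source: int) -> Dict[int, int]:
    """Return the BFS level (hop count from `source`) of each reachable vertex.

    `adj` maps a vertex to the list of its neighbours (hypothetical layout,
    chosen for readability rather than performance).
    """
    level = {source: 0}      # visited set and result in one structure
    frontier = [source]      # vertices discovered in the previous level
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # In a parallel version, this loop over the frontier is the part
        # that would be split across threads or vector lanes.
        for u in frontier:
            for v in adj.get(u, []):
                if v not in level:       # first visit wins
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level


# Small illustrative graph (assumed data, not from the paper).
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: [], 5: [0]}
levels = bfs_levels(graph, 0)
```

In this sketch, vertex 5 is never assigned a level because it is not reachable from the source. In a multithreaded version, the "first visit wins" check is the step that usually requires atomic operations or some other form of synchronization.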

Cited by 20 publications (17 citation statements)
References 16 publications
“…The IWPP resembles graph scan algorithms with multiple sources, which have been the target of a number of recent research projects that implemented, for instance, Breadth-First Search (BFS). 22,23 Hong et al 22 presented approaches to minimize the load imbalance that occurs when processing graphs in which the number of edges may vary from vertices. Tao et al 23 Roman, 25 while other works used devices such as Field-Programmable Gate Arrays (FPGAs) and GPUs to implement this operation.…”
Section: Related Work (mentioning)
confidence: 99%
“…22,23 Hong et al 22 presented approaches to minimize the load imbalance that occurs when processing graphs in which the number of edges may vary from vertices. Tao et al 23 Roman, 25 while other works used devices such as Field-Programmable Gate Arrays (FPGAs) and GPUs to implement this operation. [26][27][28] A common limitation with these solutions is that they were not built on top of the must efficient sequential algorithm that uses queues.…”
Section: Related Work (mentioning)
confidence: 99%
“…As such, recent efforts on efficient implementations of Breadth-First Search (BFS) [16] [17] are interesting for the sake of comparison with IWPP execution schemes and optimizations. The work of Hong et al [16], for instance, provides techniques and optimizations to deal with load imbalance from irregular number of edges in vertices from real-world graphs for their GPU-based BFS algorithms.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although these techniques have shown to be effective to their work, it would have no impact in IWPP that has a regular and constant number of edges per vertex, represented by the fixed neighborhood. Tao et al [17] is a closer related work that describes approaches to accelerate BFS using the Intel Phi. It develops reading and expansion operations using SIMD instructions, but it still uses atomic (non-vectorized) instructions to perform expansion of vertices.…”
Section: Related Work (mentioning)
confidence: 99%
“…For instance, Hong et al in [6] presented a hybrid method which dynamically decides the best execution method for each BFS-level iteration, shifting between sequential execution, multi-core CPU-only execution, and GPUs. Tao, Yutong and Guang [7] developed two different approaches to improve the performance of BFS algorithm on an Intel Xeon Phi coprocessor.…”
Section: Related Work (mentioning)
confidence: 99%