Guang Suo scite author profile

Guang Suo

5 Publications

83 Citation Statements Received

63 Citation Statements Given

Affiliations

National University of Defense Technology

Publications

Using the Intel Many Integrated Core to accelerate graph traversal

Gao, Zhang, et al. 2014

The International Journal of High Performance Computing Applica

Data-intensive applications have drawn increasing attention in the last few years. Breadth-first search (BFS), the basic graph traversal algorithm and a typical data-intensive application, is widely used, and the Graph 500 benchmark uses it to rank the performance of supercomputers. The Intel Many Integrated Core (MIC) architecture, which is designed for highly parallel computing, has not been fully evaluated for graph traversal. In this paper, we discuss how to use the MIC to accelerate BFS. We present optimizations for the native BFS algorithm and develop a heterogeneous BFS algorithm. For the native BFS algorithm, we mainly discuss how to exploit the many cores and the wide vector processing units. The performance of our optimized native BFS implementation is 5.3 times the highest published performance for graphics processing units (GPUs). For the heterogeneous BFS algorithm, cooperative computing between the central processing unit (CPU) and the MIC achieves a speedup of approximately 1.4 times relative to a CPU for graphs with 2M vertices. This work is valuable for using a MIC to accelerate BFS. It also offers general guidance for using a MIC in data-intensive applications.
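As a point of reference for the abstract above, the following is a minimal sketch of a level-synchronous BFS, the textbook form of the traversal that Graph 500-style implementations parallelize. It is not the authors' MIC implementation; the function name `bfs_levels` and the adjacency-list input format are assumptions made for illustration.

```python
def bfs_levels(adj, source):
    """Level-synchronous BFS over an adjacency list.

    adj[u] is a list of neighbours of vertex u (hypothetical input format).
    Returns a list of hop distances from source, with -1 for unreachable vertices.
    """
    dist = [-1] * len(adj)
    dist[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        # Each frontier vertex can be expanded independently, which is the
        # loop that parallel implementations distribute across cores.
        for u in frontier:
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return dist


if __name__ == "__main__":
    graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3], []]
    print(bfs_levels(graph, 0))
```

The outer loop advances one level at a time, so all work within a level can in principle proceed concurrently, with a synchronization point between levels.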

Using MIC to Accelerate a Typical Data-Intensive Application: The Breadth-first Search

Gao, Suo 2013

Data-intensive applications have drawn increasing attention in the last few years. The breadth-first search (BFS), a typical data-intensive application, is so widely used that the Graph 500 benchmark uses it to rank supercomputers' performance. The Intel MIC (Many Integrated Core), which is designed for highly parallel computing, has not been fully evaluated for data-intensive applications. In this paper, we discuss how to use the MIC to accelerate the BFS. Optimizations for both native mode and offload mode are discussed. For native mode, we propose optimizations for thread-level and data-level parallelism. We exploit thread-level parallelism by relaxing inter-thread dependence, and the optimized algorithm is shown to be more scalable. Data-level parallelism is exploited with 512-bit single instruction, multiple data (SIMD) instructions, which yield a further speedup of up to 3.4 times. For offload mode, we present an offload algorithm. Through careful task partitioning and communication optimizations, it achieves a speedup for large graphs that cannot run natively on the MIC because of its limited memory size. We believe this work is valuable for using the MIC to accelerate the BFS. It also serves as a general evaluation of the MIC for data-intensive applications.
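The abstract does not describe how the 512-bit SIMD instructions are applied. As a loose analogue only, the sketch below keeps the frontier and the visited set as bitmaps, so that a single bitwise operation updates the state of many vertices at once. Python integers stand in for vector registers here; the names `to_bitmaps` and `bitmap_bfs` and the bitmap adjacency format are assumptions, not the paper's data structures.

```python
def to_bitmaps(adj):
    """Convert an adjacency list into one integer bitmap per vertex.

    Bit v of the result for vertex u is set if and only if edge u -> v exists.
    """
    bitmaps = []
    for neighbours in adj:
        bits = 0
        for v in neighbours:
            bits |= 1 << v
        bitmaps.append(bits)
    return bitmaps


def bitmap_bfs(adj_bits, source):
    """Bit-parallel BFS; returns one frontier bitmap per level."""
    visited = 1 << source
    frontier = 1 << source
    levels = [frontier]
    while True:
        reach = 0
        pending = frontier
        while pending:
            lowest = pending & -pending      # isolate the lowest set bit
            u = lowest.bit_length() - 1      # index of that frontier vertex
            reach |= adj_bits[u]             # merge all of u's neighbours at once
            pending ^= lowest
        # One AND-NOT filters every already-visited vertex in a single step.
        frontier = reach & ~visited
        if not frontier:
            break
        visited |= frontier
        levels.append(frontier)
    return levels


if __name__ == "__main__":
    graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3], []]
    for depth, bits in enumerate(bitmap_bfs(to_bitmaps(graph), 0)):
        print(depth, bin(bits))
```

On real vector hardware the same idea is applied to fixed-width registers rather than arbitrary-precision integers, so the bitmap would be processed in 512-bit chunks.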

High Performance Interconnect Network for Tianhe System

Liao, Pang, Wang, et al. 2015

J. Comput. Sci. Technol.

The Fault Tolerant Parallel Algorithm: the Parallel Recomputing Based Failure Recovery

Yang, Wang, et al. 2007

NR-MPI: A Non-stop and Fault Resilient MPI

Suo, Liao, et al. 2013
