Shusen Wang scite author profile

Shusen Wang

2 Publications

31 Citation Statements Received

38 Citation Statements Given

Affiliations

University of California, Berkeley; Zhejiang University

Publications

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

Gittens, Rothauge, Wang, et al. 2018

Apache Spark is a popular system aimed at the analysis of large data sets, but recent studies have shown that certain computations (in particular, many linear algebra computations that are the basis for solving common machine learning problems) are significantly slower in Spark than when done using libraries written in a high-performance computing framework such as the Message-Passing Interface (MPI). To remedy this, we introduce Alchemist, a system designed to call MPI-based libraries from Apache Spark. Using Alchemist with Spark helps accelerate linear algebra, machine learning, and related computations, while still retaining the benefits of working within the Spark environment. We discuss the motivation behind the development of Alchemist, and we provide a brief overview of its design and implementation. We also compare the performance of pure Spark implementations with that of Spark implementations that leverage MPI-based codes via Alchemist. To do so, we use data science case studies: a large-scale application of the conjugate gradient method to solve very large linear systems arising in a speech classification problem, where we see an improvement of an order of magnitude; and the truncated singular value decomposition (SVD) of a 400GB three-dimensional ocean temperature data set, where we see a speedup of up to 7.9x. We also illustrate that the truncated SVD computation is easily scalable to terabyte-sized data by applying it to data sets of sizes up to 17.6TB.

Improving the modified Nyström method using spectral shifting

Wang, Zhang, Qian, et al. 2014

The Nyström method is an efficient approach to enabling large-scale kernel methods. The Nyström method generates a fast approximation to any large-scale symmetric positive semidefinite (SPSD) matrix using only a few columns of the SPSD matrix. However, since the Nyström approximation is low-rank, when the spectrum of the SPSD matrix decays slowly, the Nyström approximation is of low accuracy. In this paper, we propose a variant of the Nyström method called the modified Nyström by spectral shifting (SS-Nyström). The SS-Nyström method works well whether the spectrum of the SPSD matrix decays quickly or slowly. We prove that our SS-Nyström has a much stronger error bound than the standard and modified Nyström methods, and that SS-Nyström can be even more accurate than the truncated SVD of the same scale in some cases. We also devise an algorithm such that the SS-Nyström approximation can be computed nearly as efficiently as the modified Nyström approximation. Finally, our SS-Nyström method demonstrates significant improvements over the standard and modified Nyström methods on several real-world datasets.
