Proceedings of the 47th International Conference on Parallel Processing 2018
DOI: 10.1145/3225058.3225096
Matrix Factorization on GPUs with Memory Optimization and Approximate Computing

Abstract: Matrix factorization (MF) discovers latent features from observations and has shown great promise in collaborative filtering, data compression, feature extraction, word embedding, and other fields. While many problem-specific optimization techniques have been proposed, alternating least squares (ALS) remains popular due to its general applicability (e.g., it easily handles positive-unlabeled inputs), fast convergence, and parallelizability. Current MF implementations are either optimized for a single m…
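For context, the ALS procedure the abstract refers to alternates two closed-form least-squares solves: fix the item factors and solve for the user factors, then swap. A minimal dense NumPy sketch (illustrative only; the paper's cuMF implementation is a memory-optimized multi-GPU CUDA system, not this helper):

```python
import numpy as np

def als(R, rank=8, reg=0.1, iters=10, seed=0):
    """Factor R ~ U @ V.T by alternating ridge-regression solves.

    Hypothetical helper for illustration; not the cuMF API.
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.standard_normal((m, rank)) * 0.1
    V = rng.standard_normal((n, rank)) * 0.1
    I = reg * np.eye(rank)
    for _ in range(iters):
        # Fix V: each row of U has a closed-form solution of
        # (V^T V + reg*I) u_i = V^T r_i; solve all rows at once.
        U = np.linalg.solve(V.T @ V + I, V.T @ R.T).T
        # Fix U: the symmetric update for V.
        V = np.linalg.solve(U.T @ U + I, U.T @ R).T
    return U, V
```

Each half-step is an independent least-squares problem per row, which is what makes ALS embarrassingly parallel across users (or items) and a good fit for GPUs.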

Cited by 6 publications (1 citation statement) · References 25 publications
“…To take advantage of hardware acceleration, [17] propose cuMF, a single-machine, memory-optimized multi-GPU implementation that scales to relatively large problems (up to 10^11 model parameters), further extended in [18] to allow approximate computation via a conjugate gradient solver. [17, 18] exploit the GPU memory hierarchy and model parallelism across GPUs to produce a highly performant implementation of ALS. Just as cuMF exploits unique properties of the GPU hardware, ALX overcomes various challenges and exploits unique properties of TPUs.…”
Section: Related Work
confidence: 99%
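The "approximate computation via a conjugate gradient solver" mentioned above replaces ALS's exact normal-equation solve with a few CG iterations on the same symmetric positive-definite system. A sketch of that inner solver (illustrative names, not the cuMF code):

```python
import numpy as np

def cg_solve(A, b, iters=3):
    """Approximately solve A x = b for SPD A with a fixed CG iteration budget.

    Capping `iters` at a small number is what trades exactness for speed
    in approximate ALS; with iters equal to A's dimension, CG is exact
    in exact arithmetic.
    """
    x = np.zeros_like(b)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs = r @ r
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < 1e-12:  # converged early
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

In the ALS setting, `A` would be the per-row normal matrix (e.g. `V.T @ V + reg*I`) and `b` the corresponding right-hand side, so CG avoids a full Cholesky factorization per row.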