2021
DOI: 10.48550/arxiv.2111.05897
Preprint
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

Abstract: Deep learning-based models have dominated the current landscape of production recommender systems. Furthermore, recent years have witnessed an exponential growth of the model scale, from Google's 2016 model with 1 billion parameters to Facebook's latest model with 12 trillion parameters. A significant quality boost has come with each jump in model capacity, which makes us believe the era of 100 trillion parameters is around the corner. However, the training of such models is challenging even within indust…

Cited by 3 publications (3 citation statements)
References 54 publications (101 reference statements)
“…Embedding tables are commonly used to deal with sparse features in recommendation models [1,2,3,4,5,29,30,31]. However, the extremely large embedding tables are often the storage and efficiency bottlenecks [6,7,8,3,6,9,10,11,32]. To our knowledge, the only two studies that target the embedding table placement problem are RecShard [27] and our previous work AutoShard [33].…”
Section: Related Work
confidence: 99%
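
The quoted passage turns on embedding tables being the storage and efficiency bottleneck of recommender models. A minimal NumPy sketch of a sparse-feature lookup, with hypothetical names and sizes (not Persia's implementation), shows why the tables, rather than the dense network, dominate memory:

```python
import numpy as np

# One embedding table for a sparse categorical feature; production models
# use many such tables. Sizes here are illustrative, not from the paper.
vocab_size, emb_dim = 1_000_000, 64
table = np.random.randn(vocab_size, emb_dim).astype(np.float32)

def lookup(sparse_ids):
    # Gather the rows for the active (sparse) feature IDs and sum-pool them.
    return table[sparse_ids].sum(axis=0)

clicked_item_ids = np.array([42, 1337, 999_999])  # hypothetical sparse IDs
pooled = lookup(clicked_item_ids)                 # shape: (64,)

# This single table already holds 1e6 * 64 = 64M float32 parameters
# (~256 MB); storage grows linearly with vocabulary size, which is why
# trillion-parameter recommenders are dominated by their embedding tables
# and why placing/sharding those tables across devices becomes a problem.
print(pooled.shape, table.nbytes / 1e6, "MB")
```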
“…We, on the other hand, implement only synchronous training. A very recent work [27] introduces a new hybrid sync-async algorithm to train recommender models; unlike that work, we focus only on synchronous training. [47] proposes methods for improving data processing while training recommender systems.…”
Section: Related Work
confidence: 99%
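
For context on the quoted contrast: the Persia paper cited as [27] mixes synchronization modes, updating the large embedding layer asynchronously while keeping the dense network synchronous. The following is a schematic Python sketch under that reading, with made-up names and a plain average standing in for a real all-reduce, not the paper's actual implementation:

```python
import numpy as np

def hybrid_step(emb_grads, dense_grads, emb_table, dense_params, lr=0.01):
    # Async half: each worker applies its sparse embedding gradients
    # immediately, possibly against slightly stale rows (no barrier).
    for row_id, g in emb_grads.items():
        emb_table[row_id] -= lr * g
    # Sync half: dense gradients are averaged across workers (a stand-in
    # for a collective all-reduce) before one consistent update is applied.
    dense_params -= lr * np.mean(dense_grads, axis=0)
    return emb_table, dense_params

# Toy usage: one sparse update plus dense gradients from two "workers".
emb_table = np.zeros((10, 4))
dense_params = np.zeros(3)
hybrid_step({2: np.ones(4)}, [np.ones(3), 3 * np.ones(3)],
            emb_table, dense_params)
```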
“…Among DNNs, convolutional neural networks (CNNs), one of the representative algorithms of deep learning, are a specialized kind of feedforward neural network with deep structure and convolution computation. They have been tremendously successful in computer vision applications such as image recognition and image segmentation because of their smart use of strategies including sparse interactions, parameter sharing, and equivariant representations [5,6]. Currently, deep CNNs are driven by high-performance processors such as graphics processing units (GPUs) and tensor processing units (TPUs) to perform a large number of computations such as additions and multiplications [7], which demand huge computation time and energy. However, as Moore's law approaches the limits of physics, it will be hard for electronic chips to keep up with the performance growth of artificial intelligence.…”
Section: Introduction
confidence: 99%
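
The "parameter sharing" the passage credits CNNs with can be made concrete with a back-of-the-envelope count: a convolution reuses one small kernel at every spatial position, so its parameter count is independent of the input size, unlike a fully connected layer. A minimal illustration with hypothetical sizes:

```python
# Single-channel H x W input; sizes are illustrative only.
H = W = 224
k = 3
conv_params = k * k              # one shared 3x3 kernel, reused everywhere
dense_params = (H * W) ** 2      # dense layer mapping image -> image
print(conv_params, dense_params) # 9 vs ~2.5e9 weights
```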