2019
DOI: 10.14778/3364324.3364325

Model slicing for supporting complex analytics with elastic inference cost and resource constraints

Abstract: Deep learning models have been used to support analytics beyond simple aggregation, where deeper and wider models have been shown to yield great results. These models consume a huge amount of memory and computational operations. However, most of the large-scale industrial applications are often computational budget constrained. In practice, the peak workload of inference service could be 10x higher than the average cases, with the presence of unpredictable extreme cases. Lots of computational resources could b…

citations
Cited by 19 publications
(15 citation statements)
references
References 56 publications
“…Model slicing [5] is a general technique to enable deep learning models to support elastic computation. Specifically, each layer of the model is divided into equal-sized contiguous computational groups.…”
Section: Elastic Inference
Citation type: mentioning
confidence: 99%
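The group division described in this statement can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the function name `active_units`, the rule of rounding down to a whole number of groups, and the choice to keep a leading prefix of groups are all assumptions made for illustration.

```python
import math


def active_units(width: int, num_groups: int, slice_rate: float) -> range:
    """Return the indices of the units kept at a given slice rate.

    Assumptions (hypothetical, for illustration only): the layer width
    divides evenly into `num_groups` contiguous groups, and slicing keeps
    a leading prefix of whole groups.
    """
    if width % num_groups != 0:
        raise ValueError("width must be divisible by num_groups")
    if not 0.0 < slice_rate <= 1.0:
        raise ValueError("slice_rate must be in (0, 1]")
    group_size = width // num_groups
    # Round down to a whole number of groups, but always keep at least one.
    kept_groups = max(1, math.floor(slice_rate * num_groups))
    return range(kept_groups * group_size)
```

Under these assumptions, a layer of width 64 split into 8 groups keeps its first 32 units at a slice rate of 0.5.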
“…As a result, during inference, we can support accuracy-efficiency trade-offs by dynamically slicing a subnet of a certain width, where only the parameters of the activated groups are involved in computation. Theoretically, the number of parameters and computation measured in FLOPs are both roughly quadratic to the slice rate 𝑟 [5], e.g., a slice rate of 0.5 can achieve up to four times speedup. Therefore, we can support elastic inference by introducing the model slicing technique to the training stage.…”
Section: Elastic Inference
Citation type: mentioning
confidence: 99%
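The quadratic relationship quoted in this statement can be checked with a short worked example. The sketch below assumes a single fully connected layer whose input and output widths are both scaled by the slice rate, and it ignores bias terms; the helper `dense_cost` is hypothetical and not part of the cited work.

```python
def dense_cost(n_in: int, n_out: int, slice_rate: float = 1.0) -> int:
    """Count the weights (equivalently, multiply-accumulates per input)
    of a dense layer when both widths are scaled by `slice_rate`.

    Assumption: input and output widths are sliced by the same rate and
    bias terms are ignored, so the cost is about slice_rate**2 of full.
    """
    kept_in = int(n_in * slice_rate)
    kept_out = int(n_out * slice_rate)
    return kept_in * kept_out


full = dense_cost(512, 512)        # 512 * 512 weights
half = dense_cost(512, 512, 0.5)   # 256 * 256 weights
speedup = full / half              # ratio of full cost to sliced cost
```

Here `speedup` comes out to 4.0, matching the "up to four times" figure for a slice rate of 0.5. A layer whose input or output stays at full width would scale only linearly in the slice rate, which is one reason the relationship is described as roughly quadratic.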
“…Shen et al. [61] proposed slicing CNN feature maps to understand the appearance and dynamic features in the videos. Cai et al. [9] proposed to slice a DNN into different groups that can be assembled elastically to support dynamic workload. However, these approaches are not related to program slicing that aims to understand the internal logic of a program.…”
Section: Program Analysis For Neural Network
Citation type: mentioning
confidence: 99%
“…In this work, we first carefully analyze how different types of user pairs interact on the WeChat platform and find useful interaction features for identifying relationship types. And then we summarize two major challenges in the edge classification task: The data sparsity problem [15] and the computation bottleneck in networks with billions of nodes [16].…”
Section: Introduction
Citation type: mentioning
confidence: 99%