2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
DOI: 10.1109/ccgrid51090.2021.00096
High Performance Serverless Architecture for Deep Learning Workflows

Cited by 8 publications (3 citation statements) | References 13 publications
“…Machine learning inference services are commonly latency-critical, and the auto-scaling ability of serverless computing deals well with bursty workloads [8]. Yang et al. [68] presented a solution called INFless that reduces the resources allocated to each serverless instance to reach optimal performance for inference services.…”
Section: Ensure Resource Scalability and Predictive Scaling (mentioning)
confidence: 99%
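The statement above describes serving deep learning inference from serverless functions so that the platform's auto-scaling absorbs bursty traffic. A minimal sketch of that pattern is shown below, assuming an AWS Lambda-style Python runtime and a TorchScript model bundled with the deployment package; the path, handler name, and request layout are illustrative assumptions, not details from the cited papers.

```python
# Sketch of a serverless inference handler (assumed Lambda-style runtime).
# MODEL_PATH and the JSON request shape are hypothetical.
import json
import torch

MODEL_PATH = "/opt/ml/model.pt"  # hypothetical location inside the package

# Loading the model at module scope pays the cost once per container
# (cold start) and reuses it across invocations, which is what lets the
# platform scale out instances for bursty inference traffic.
_model = torch.jit.load(MODEL_PATH)
_model.eval()

def handler(event, context):
    # The request body is assumed to carry a JSON-encoded feature vector.
    features = torch.tensor(json.loads(event["body"])["features"])
    with torch.no_grad():
        prediction = _model(features.unsqueeze(0)).squeeze(0).tolist()
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```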
“…For smaller batch sizes, the processing time increases linearly, but for larger batch sizes it increases exponentially in an on-premise environment [8]. Deese [22] used batch-mode (maximum-size) read and write requests within AWS and found a significant speed increase from batch write operations, but a relatively small benefit from batch reads.…”
Section: Batching (mentioning)
confidence: 99%
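To make the batch write versus per-item write comparison concrete, here is a short sketch using boto3's DynamoDB batch writer, which groups put requests into BatchWriteItem calls instead of issuing one round trip per item; the table name and item layout are assumptions for illustration, not taken from the cited study.

```python
# Sketch contrasting per-item and batched DynamoDB writes (boto3).
# The "results" table name and item contents are hypothetical.
import boto3

table = boto3.resource("dynamodb").Table("results")

def write_items_individually(items):
    # One network round trip per item: this is the slow path the
    # citing paper compares against.
    for item in items:
        table.put_item(Item=item)

def write_items_batched(items):
    # batch_writer() buffers puts and flushes them in grouped
    # BatchWriteItem requests, which is where the reported speedup
    # from batch write operations comes from.
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
```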