Proceedings of the 27th ACM Symposium on Operating Systems Principles 2019
DOI: 10.1145/3341301.3359654
Parity models

Abstract: Machine learning models are becoming the primary workhorses for many applications. Services deploy models through prediction serving systems that take in queries and return predictions by performing inference on models. Prediction serving systems are commonly run on many machines in cluster settings, and thus are prone to slowdowns and failures that inflate tail latency. Erasure coding is a popular technique for achieving resource-efficient resilience to data unavailability in storage and communication systems…

Cited by 26 publications (1 citation statement) · References 46 publications
“…Presently, research on optimizing model inference performance remains focused on single-cluster scenarios. Strategies such as traffic prediction, request consolidation, and memory tensor merging are explored to enhance resource utilization and efficiency [11, 17–21].…”
Section: Introduction and Observation
confidence: 99%