2021
DOI: 10.1609/aaai.v35i11.17181

Empowering Adaptive Early-Exit Inference with Latency Awareness

Abstract: With the capability of trading accuracy for latency on-the-fly, the technique of adaptive early-exit inference has emerged as a promising line of research to accelerate the deep learning inference. However, studies in this line of research commonly use a group of thresholds to control the accuracy-latency trade-off, where a thorough and general methodology on how to determine these thresholds has not been conducted yet, especially with regard to the common requirements of average inference latency. To address …
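
To make the role of the thresholds concrete, the following is a minimal sketch of threshold-gated early-exit inference. It is not the paper's method for choosing thresholds, which the truncated abstract does not show. The function names (`early_exit_predict`, `average_latency`), the use of maximum softmax probability as the confidence score, and the per-exit latency values are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    exit_logits: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> Tuple[int, int]:
    """Return (exit_index, predicted_class) for one input.

    exit_logits: logits produced at each exit, ordered shallow to deep.
    thresholds:  one confidence threshold per non-final exit (assumed
                 to be given; how to pick them is the open question).
    """
    if len(thresholds) != len(exit_logits) - 1:
        raise ValueError("need one threshold per non-final exit")
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)  # assumed confidence measure
        # Stop at the first exit that is confident enough; the final
        # exit always answers.
        if i == last or confidence >= thresholds[i]:
            return i, probs.index(confidence)
    raise ValueError("exit_logits must not be empty")


def average_latency(
    exit_indices: Sequence[int],
    exit_latencies: Sequence[float],
) -> float:
    """Mean latency over a batch, given the exit each sample took.

    exit_latencies[i] is the (hypothetical) cumulative cost of running
    the network up to and including exit i.
    """
    return sum(exit_latencies[i] for i in exit_indices) / len(exit_indices)
```

Lowering a threshold lets more inputs leave at a shallow exit, which reduces average latency at a possible cost in accuracy; raising it does the opposite. This is the accuracy-latency trade-off that the thresholds control.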

Cited by 8 publications (1 citation statement)
References: 42 publications

“…This dynamism grants the network more flexibility and efficiency in handling various resource budgets, realtime requirements, and device capacities while maintaining a good performance trade-off. Amongst the most promising techniques for DyNNs that appear suitable for addressing limited hardware resources, we find early exiting [7,8,9]. Early exiting was introduced in the context of image classification [10,11].…”
Section: Introduction (mentioning)
confidence: 99%