2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)
DOI: 10.23919/date48585.2020.9116235

Optimising Resource Management for Embedded Machine Learning

Abstract: Machine learning inference is increasingly being executed locally on mobile and embedded platforms, due to the clear advantages in latency, privacy and connectivity. In this paper, we present approaches for online resource management in heterogeneous multi-core systems and show how they can be applied to optimise the performance of machine learning workloads. Performance can be defined using platform-dependent (e.g. speed, energy) and platform-independent (accuracy, confidence) metrics. In particular, we show …
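
The abstract's split between platform-dependent metrics (speed, energy) and platform-independent metrics (accuracy, confidence) can be made concrete with a small sketch. The following is a minimal, hypothetical illustration in Python, not the paper's implementation: a runtime manager that, given a set of profiled operating points, picks the most accurate one that still meets a latency target. All names and numbers are invented for illustration.

from dataclasses import dataclass

# Hypothetical sketch -- not the paper's method. A runtime manager picks,
# from profiled operating points, the most accurate configuration that
# still meets the latency target, breaking ties by lower energy.

@dataclass
class OperatingPoint:
    cores: int          # cores allocated to the workload (platform-dependent)
    freq_mhz: int       # operating frequency (platform-dependent)
    model: str          # which DNN variant is deployed
    latency_ms: float   # profiled latency (platform-dependent metric)
    energy_mj: float    # profiled energy per inference (platform-dependent metric)
    accuracy: float     # top-1 accuracy (platform-independent metric)

def select_point(points, latency_target_ms):
    """Most accurate feasible point, ties broken by lower energy; None if infeasible."""
    feasible = [p for p in points if p.latency_ms <= latency_target_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p.accuracy, -p.energy_mj))

# Invented numbers, purely for illustration.
points = [
    OperatingPoint(4, 1100, "full",   48.0, 120.0, 0.76),
    OperatingPoint(2, 1100, "full",   85.0,  90.0, 0.76),
    OperatingPoint(4, 1100, "pruned", 30.0,  70.0, 0.72),
]
print(select_point(points, latency_target_ms=50.0))  # -> the 48 ms "full" point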

Cited by 12 publications (10 citation statements). References 28 publications.

“…These approaches produce a static DNN architecture with fixed parameters for the target application's performance requirements, based on measurements on fixed hardware resources. However, since the available hardware resources change dynamically at runtime, performance requirements can be violated [19,20]. Fig. 1 illustrates these problems using experimental results from a Jetson Xavier NX, where bar A represents an optimized DNN model executing on all GPU cores at 1.1 GHz to deliver a 50 ms target latency.…”
Section: Introduction
confidence: 99%
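
A first-order sanity check of the quoted Jetson Xavier NX scenario: if latency is assumed to scale inversely with allocated cores times clock frequency (a deliberate simplification; real GPU behaviour is more complex), a model tuned to 50 ms at design time misses its target once the frequency drops or cores are shared. The core counts and frequencies below are illustrative assumptions, not measurements from the cited paper.

# Deliberately simplified model: latency assumed inversely proportional to
# (allocated cores x clock frequency). All numbers are illustrative.
def estimated_latency_ms(base_ms, base_cores, base_ghz, cores, ghz):
    return base_ms * (base_cores * base_ghz) / (cores * ghz)

cases = {
    "A (design-time: all cores, 1.1 GHz)": estimated_latency_ms(50.0, 6, 1.1, 6, 1.1),
    "B (frequency drops to 0.6 GHz)":      estimated_latency_ms(50.0, 6, 1.1, 6, 0.6),
    "C (half the cores shared away)":      estimated_latency_ms(50.0, 6, 1.1, 3, 1.1),
}
for name, lat in cases.items():
    verdict = "meets" if lat <= 50.0 else "violates"
    print(f"bar {name}: {lat:.0f} ms -> {verdict} the 50 ms target")
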
“…Since both software performance requirements and hardware resource availability can change dynamically at runtime [19,20], various dynamic DNNs [21,23–25] have been proposed to address this issue. These dynamic DNNs contain various sub-networks that each have a different accuracy and latency.…”
Section: Introduction
confidence: 99%
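
As a generic illustration of the dynamic-DNN idea described in this statement (in the spirit of early-exit or slimmable networks), a runtime can keep a table of sub-networks with profiled accuracy/latency pairs and dispatch to the most accurate one that fits the current latency budget. This sketch is not taken from any of the cited works; the sub-network names and figures are hypothetical.

# Generic dynamic-DNN dispatch: choose the most accurate sub-network that
# fits the current latency budget. Names and figures are hypothetical;
# latencies would be re-profiled as resource availability changes.
sub_networks = [
    {"name": "exit-1", "accuracy": 0.62, "latency_ms": 12.0},
    {"name": "exit-2", "accuracy": 0.70, "latency_ms": 27.0},
    {"name": "exit-3", "accuracy": 0.75, "latency_ms": 49.0},
    {"name": "full",   "accuracy": 0.77, "latency_ms": 80.0},
]

def pick_subnetwork(nets, budget_ms):
    feasible = [n for n in nets if n["latency_ms"] <= budget_ms]
    # Fall back to the cheapest sub-network if nothing fits the budget.
    return max(feasible, key=lambda n: n["accuracy"]) if feasible \
        else min(nets, key=lambda n: n["latency_ms"])

print(pick_subnetwork(sub_networks, budget_ms=50.0))  # -> "exit-3"
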
“…These approaches, which are static DNNs, provide an optimal Transformer architecture for the target application's performance requirements based on measurements of fixed hardware resources. However, embedded devices often run many applications on several heterogeneous cores, and the resources a Transformer was optimized for may not be available at run-time [3]. Therefore, the performance requirements of the application can be violated.…”
Section: Introduction
confidence: 99%
“…Fig. 1 illustrates these problems using experimental results from a Jetson Xavier NX, where bar A represents an optimized DNN model executing on all GPU cores at 1.1 GHz to deliver a 50 ms target latency. However, under the same target latency, the design-time optimization is invalid if the operating frequency changes or the DNN shares GPU cores with other applications at runtime, as shown by bars B and C. Since both software performance requirements and hardware resource availability can change dynamically at runtime [19,20], various dynamic DNNs [21,23–25] have been proposed to address this issue. These dynamic DNNs contain various sub-networks that each have a different accuracy and latency.…”
Section: Introduction
confidence: 99%