2020
DOI: 10.1109/jiot.2020.2981684

PreVIous: A Methodology for Prediction of Visual Inference Performance on IoT Devices

Abstract: This paper presents PreVIous, a methodology to predict the performance of convolutional neural networks (CNNs) in terms of throughput and energy consumption on vision-enabled devices for the Internet of Things. CNNs typically constitute a massive computational load for such devices, which are characterized by scarce hardware resources to be shared among multiple concurrent tasks. Therefore, it is critical to select the optimal CNN architecture for a particular hardware platform according to prescribed applicat…
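The abstract above describes predicting a CNN's throughput and energy consumption on a target device. As a rough, hypothetical illustration only (the full methodology is not reproduced in this excerpt), the sketch below assumes per-layer time and energy estimates are already available and simply aggregates them into network-level figures; all names and numbers are made up.

def predict_network_performance(layer_estimates):
    """layer_estimates: list of (time_s, energy_j) tuples, one per CNN layer."""
    total_time_s = sum(t for t, _ in layer_estimates)
    total_energy_j = sum(e for _, e in layer_estimates)
    throughput_fps = 1.0 / total_time_s
    return throughput_fps, total_energy_j

# Hypothetical per-layer estimates for a small three-layer network:
estimates = [(0.012, 0.030), (0.025, 0.060), (0.004, 0.010)]
fps, joules = predict_network_performance(estimates)
print(f"predicted throughput: {fps:.1f} fps, energy per inference: {joules:.3f} J")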


Cited by 24 publications (13 citation statements)
References 59 publications
“…FastDeepIoT [18] uses execution time models based on linear model trees to predict the layer execution time on the devices Nexus 5 and Galaxy Nexus, and finally compresses VGGNet for both devices, reducing the neural network execution time by 48% to 78% and energy consumption by 37% to 69% compared with state-of-the-art compression algorithms. In PreVIous [19], the execution time models are based on linear regression, and for the devices Raspberry Pi 3 and Odroid-XU4 it reaches about 96% average accuracy for the layer-wise estimation. These results lead us to believe that the task of estimating layer execution times for task-optimized computing architectures is significantly more challenging than for CPUs.…”
Section: Related Work
Mentioning confidence: 99%
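As a concrete illustration of the layer-wise linear-regression modelling mentioned in the statement above, the following sketch fits a per-layer execution-time model from layer descriptors. The actual feature set, model structure, and measurements used by PreVIous are not reproduced here; everything below is synthetic.

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic per-layer characterization data for one target device.
# Columns: multiply-accumulate operations, input activations, output activations, parameters.
X = np.array([
    [1.2e8, 150_528, 802_816, 1_792],
    [9.2e8, 802_816, 401_408, 36_928],
    [3.7e9, 401_408, 200_704, 147_584],
])
y_ms = np.array([4.1, 18.7, 55.2])  # measured per-layer execution times (milliseconds)

# In practice one such model would be fitted per layer type (conv, fully connected, pooling, ...).
model = LinearRegression().fit(X, y_ms)

# Predict the execution time of an unseen layer from its descriptors.
new_layer = np.array([[1.8e9, 200_704, 200_704, 73_856]])
print(f"predicted layer time: {model.predict(new_layer)[0]:.1f} ms")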
“…As a result, there have been some recent attempts to predict network latency and performance on different hardware platforms. However, most of the work targets either server Graphics Processing Units (GPUs) [16], [17] or embedded Central Processing Units (CPUs) [18], [19], leaving out a wide range of hardware accelerators such as Field Programmable Gate Arrays (FPGAs) and hardware specifically designed for AI tasks, e.g., the Xilinx ZCU102 and Intel NCS2.…”
Section: Introduction
Mentioning confidence: 99%
“…However, it relies on the Nvidia System Management Interface (SMI), which is not available on Jetson platforms. PreVIous [12] presents a similar approach using linear regression models. It targets embedded CPU platforms such as a Raspberry Pi 3 and an Odroid-XU4 and reports an average error of 3.24% for the tested networks.…”
Section: Related Work
Mentioning confidence: 99%
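For illustration only: the 3.24% figure quoted above could correspond to a mean absolute percentage error over the tested networks. The exact metric is not specified in this excerpt, so the snippet below is an assumption, and the latency numbers are made up.

def mean_absolute_percentage_error(predicted_ms, measured_ms):
    """Average of |predicted - measured| / measured, expressed in percent."""
    errors = [abs(p - m) / m for p, m in zip(predicted_ms, measured_ms)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical predicted vs. measured whole-network latencies (milliseconds):
predicted = [812.0, 143.5, 95.2]
measured = [800.0, 148.0, 93.0]
print(f"average error: {mean_absolute_percentage_error(predicted, measured):.2f}%")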
“…To skip the time-consuming compiling step, DNN latency prediction techniques based on analytical or statistical models have been put forward. They target either large desktop-grade GPUs [10], [11] or embedded Central Processing Units (CPUs) [12], but not more powerful embedded devices. Methods like [10], designed for desktop GPUs, rely on the Nvidia System Management Interface (Nvidia SMI), which is not available on mobile GPUs.…”
Section: Introduction
Mentioning confidence: 99%