2023
DOI: 10.1109/jproc.2022.3226481

Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review

Abstract: Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying these highly accurate models for data-driven, learned, automatic, and practical machine learning (ML) solutions to end-user applications remains challenging. DL algorithms are often computationally expensive, power-hungry, and require large memory to process complex and iterative operations of millions of parameters. Hence, training and inference of DL models are typically…

Cited by 78 publications (10 citation statements)
References 380 publications (341 reference statements)
“…By leveraging techniques like compound scaling, which uniformly scales the network width, depth, and resolution, EfficientNet optimizes the model’s architecture to maximize accuracy while minimizing the number of parameters and computations. This enables real-time inference and efficient utilization of resources on edge devices, ensuring faster and more responsive image processing capabilities even with limited computing power [56]. Moreover, in the considered application, the input size of the available pretrained EfficientNet B5 models matches the resolution of our target images.…”
Section: Methods (mentioning)
confidence: 99%
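
The compound scaling mentioned in the statement above can be illustrated with a minimal sketch. It assumes the base coefficients reported in the original EfficientNet paper (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution); the function name `compound_scale` and the returned dictionary keys are hypothetical and chosen only for illustration. Released EfficientNet variants may use rounded or adjusted multipliers, so this shows the scaling rule rather than the configuration of any specific pretrained model.

```python
# Sketch of EfficientNet-style compound scaling (illustrative, not a model builder).
# ALPHA, BETA, GAMMA are the base coefficients reported in the EfficientNet paper;
# they roughly satisfy ALPHA * BETA**2 * GAMMA**2 ~= 2.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi: float) -> dict:
    """Return multipliers for depth, width, and resolution at scaling factor phi.

    `compound_scale` is a hypothetical helper name used only for this example.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # Convolution FLOPs grow roughly linearly with depth and quadratically
    # with width and input resolution.
    flops = depth * width ** 2 * resolution ** 2
    return {"depth": depth, "width": width, "resolution": resolution, "flops": flops}


if __name__ == "__main__":
    for phi in range(0, 4):
        m = compound_scale(phi)
        print(
            f"phi={phi}: depth x{m['depth']:.2f}, width x{m['width']:.2f}, "
            f"resolution x{m['resolution']:.2f}, FLOPs x{m['flops']:.2f}"
        )
```

Because a single factor phi drives all three dimensions, each unit increase in phi roughly doubles the compute cost, which is why the smaller variants are the usual candidates for resource-constrained edge devices.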
“…The review performed in [6] provides a comprehensive examination of tools and techniques for efficient edge inference, a key element in AI on edge devices. It discusses the challenges of deploying computationally expensive and power-hungry DL algorithms in end-user applications, especially on resource-constrained devices like mobile phones and wearables.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent advances in Edge computing allow artificial intelligence and other computations to be performed onboard the device. These computations can be real-time and run on resource-constrained platforms, thus reducing latency and power consumption and addressing privacy-related issues [71]. Still, computationally intensive tasks such as medical imaging that do not need real-time processing can be performed over cloud services.…”
Section: Considerations (mentioning)
confidence: 99%