2020
DOI: 10.1109/access.2020.3039278

CPU-Accelerator Co-Scheduling for CNN Acceleration at the Edge

Abstract: Convolutional neural networks (CNNs) are widely deployed for many artificial intelligence (AI) applications, such as object detection and image classification. Due to the burgeoning revolution in edge AI, CNN hardware accelerators are also being employed in resource-constrained edge devices to achieve better performance and energy efficiency at the edge. Although CNN accelerators enable fast and energy-efficient CNN inference at the edge, the remaining hardware resources on the edge devices except for the CN…

Cited by 19 publications (6 citation statements)
References 14 publications
“…As for inference tasks, considering the data and network loading overhead of GPUs (Ma et al 2019), the data-transfer overhead between GPUs and CPUs, the energy consumption of GPUs, the limited flexibility for inference tasks with little parallelism, and the high latency of GPUs, GPUs compare unfavorably with CPUs. CPUs are more suitable for deep learning inference workloads in many cases (Mittal et al 2021; Kim et al 2019), and numerous studies have been made to optimize and accelerate DNNs on CPUs (de Prado et al 2021; Kim et al 2020; Low et al 2020; Putro et al 2021). Thus, in this paper, we focus on the study of DNN inference performance on CPUs.…”
Section: Discussion
confidence: 99%
“…The system presents a high degree of flexibility and supports the dynamic deployment of ML algorithms, which demonstrates an efficient and competitive performance of the proposed hardware for accelerating AI-based inference at the edge. Another example is presented in [112] by Kim et al., who propose a co-scheduling method to accelerate the convolution-layer operations of CNN inference at the edge by exploiting parallelism in the CNN output channels. The developed FPGA-based prototype showed a global performance improvement of up to 200% and an energy reduction between 14.9% and 49.7%.…”
Section: Edge-AI Levels
confidence: 99%
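
The statement above describes co-scheduling convolution work between the CPU and an accelerator by splitting a layer's output channels. As a rough illustration of that idea (not the interface or scheduling policy from the cited paper), the Python sketch below assigns a fixed fraction of the output channels to a placeholder accelerator call while the CPU computes the remaining channels concurrently; the function `accelerator_conv2d` and the split ratio `accel_ratio` are hypothetical names introduced here for the sketch.

```python
# Illustrative sketch only: splitting a convolution's output channels between
# an accelerator and the CPU, in the spirit of the co-scheduling idea quoted
# above. The accelerator interface and the static split ratio are assumptions.
import threading
import numpy as np

def cpu_conv2d(x, w):
    """Naive direct convolution on the CPU (stride 1, no padding).
    x: (C_in, H, W), w: (C_out, C_in, K, K) -> (C_out, H-K+1, W-K+1)."""
    c_out, c_in, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out), dtype=x.dtype)
    for oc in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                y[oc, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[oc])
    return y

def accelerator_conv2d(x, w):
    """Placeholder for the hardware-accelerator call (hypothetical)."""
    return cpu_conv2d(x, w)  # stand-in so the sketch runs end to end

def co_scheduled_conv2d(x, w, accel_ratio=0.75):
    """Give the first `accel_ratio` fraction of output channels to the
    accelerator while the CPU computes the rest in parallel."""
    split = int(w.shape[0] * accel_ratio)
    result = {}

    def run_cpu():
        result["cpu"] = cpu_conv2d(x, w[split:])

    t = threading.Thread(target=run_cpu)
    t.start()
    result["accel"] = accelerator_conv2d(x, w[:split])
    t.join()
    return np.concatenate([result["accel"], result["cpu"]], axis=0)

if __name__ == "__main__":
    x = np.random.rand(3, 16, 16).astype(np.float32)
    w = np.random.rand(8, 3, 3, 3).astype(np.float32)
    assert np.allclose(co_scheduled_conv2d(x, w), cpu_conv2d(x, w), atol=1e-5)
```

In a real system the split would be tuned so that the accelerator's share and the CPU's share of the output channels finish at roughly the same time, which is where the reported performance and energy gains would come from.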
“…Since the greatest computational overhead of a CNN comes from the convolutional layers, it is necessary to accelerate their computation with a hardware accelerator. To address this overhead, there are three accelerator options for implementing the convolution process in a CNN: CPUs (central processing units), which provide multiply-and-add instructions [27,28,29]; GPUs (graphics processing units), which execute massively parallel operations [30,31]; and FPGAs (field-programmable gate arrays), which implement multiple operators in hardware [32,33]. Furthermore, with the increasing number of parameters in the fully connected layers of CNNs, the model size is growing significantly.…”
Section: Background and Definitions
confidence: 99%
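
To make the contrast in the quote above concrete, the short sketch below estimates multiply-and-add (MAC) counts and parameter counts for one convolutional layer and one fully connected layer; the layer shapes are hypothetical, VGG-like values chosen only for illustration and are not taken from the cited papers.

```python
# Back-of-the-envelope sketch (assumed layer shapes): convolutional layers
# dominate the MAC count, fully connected layers dominate the parameter count.
def conv_layer_cost(c_in, c_out, k, h_out, w_out):
    macs = c_out * h_out * w_out * c_in * k * k  # one MAC per kernel element per output pixel
    params = c_out * c_in * k * k
    return macs, params

def fc_layer_cost(n_in, n_out):
    macs = n_in * n_out  # one MAC per weight
    params = n_in * n_out
    return macs, params

if __name__ == "__main__":
    conv_macs, conv_params = conv_layer_cost(c_in=256, c_out=256, k=3, h_out=28, w_out=28)
    fc_macs, fc_params = fc_layer_cost(n_in=25088, n_out=4096)
    print(f"conv: {conv_macs / 1e6:.0f} M MACs, {conv_params / 1e6:.2f} M params")
    print(f"fc:   {fc_macs / 1e6:.0f} M MACs, {fc_params / 1e6:.2f} M params")
```

With these assumed shapes the convolutional layer needs hundreds of millions of MACs but well under a million parameters, while the fully connected layer shows the opposite profile, which is why convolution dominates compute time and the fully connected layers dominate model size.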