2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2019.00028

Pareto Optimal Design Space Exploration for Accelerated CNN on FPGA

Abstract: Convolutional Neural Networks (CNNs) are at the base of many applications, both in embedded and in server-class contexts. While Graphics Processing Units (GPUs) are predominantly used for training, solutions for inference often rely on Field Programmable Gate Arrays (FPGAs), since they are more flexible and cost-efficient in many scenarios. However, existing approaches fall short of accomplishing several conflicting goals, like efficiently using resources on multiple platforms while retaining deep configurability a…
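A minimal sketch of what the Pareto-optimal selection step of such a design space exploration amounts to (not the paper's implementation; the configuration fields and numbers below are hypothetical): each candidate accelerator configuration is scored on two conflicting objectives, estimated latency and DSP usage, and only the non-dominated points are kept.

```python
# Illustrative sketch (not from the paper): extract the Pareto front from a set
# of candidate accelerator configurations scored by estimated latency (lower is
# better) and DSP usage (lower is better). All fields and values are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class DesignPoint:
    unroll_x: int        # hypothetical spatial unroll factor
    unroll_c: int        # hypothetical channel unroll factor
    latency_ms: float    # estimated latency for one inference
    dsp_used: int        # estimated DSP blocks consumed

def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """a dominates b if it is no worse in both objectives and better in at least one."""
    no_worse = a.latency_ms <= b.latency_ms and a.dsp_used <= b.dsp_used
    better = a.latency_ms < b.latency_ms or a.dsp_used < b.dsp_used
    return no_worse and better

def pareto_front(points: list[DesignPoint]) -> list[DesignPoint]:
    """Keep only the non-dominated design points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

if __name__ == "__main__":
    candidates = [
        DesignPoint(1, 4, 9.0, 128),
        DesignPoint(2, 4, 5.1, 256),
        DesignPoint(2, 8, 3.2, 512),
        DesignPoint(4, 4, 5.0, 512),   # dominated by (2, 8): same DSPs, slower
    ]
    for p in pareto_front(candidates):
        print(p)
```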

Cited by 13 publications (8 citation statements). References 16 publications.
“…Similarly, dataflows exploiting the local computation of feature maps between CNN layers have already been employed in a variety of hardware accelerators. For example, dataflows that fuse the computation between subsets of subsequent layers within a CNN (Alwani et al., 2016) have been used in implementations of CNNs on FPGA-based accelerators (Reggiani et al., 2019). Moreover, such advanced dataflows have been proposed for heterogeneous architectures in order to improve throughput and avoid use of off-chip memory (Wei et al., 2018).…”
Section: Discussion (mentioning)
confidence: 99%
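A minimal sketch of the fused-layer idea referenced above (an illustration, not code from Alwani et al. or Reggiani et al.; shapes, tile size, and names are assumed): an output tile of the second convolutional layer is computed directly from the input region it depends on, so the first layer's intermediate feature map stays in a small local buffer instead of going to off-chip memory.

```python
# Sketch of a fused two-layer dataflow on a single output tile. The layer-1
# intermediate lives in a (tile+2) x (tile+2) local buffer only. All sizes are
# illustrative assumptions.

import numpy as np

def conv3x3(x, w):
    """Valid 3x3 convolution, single channel, stride 1 (reference helper)."""
    H, W = x.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * w)
    return out

def fused_two_layer_tile(x, w1, w2, ti, tj, tile=4):
    """Compute a (tile x tile) output tile of layer 2 at offset (ti, tj)."""
    # Receptive field of the layer-2 tile in the original input:
    # tile + 2 (layer-2 halo) + 2 (layer-1 halo) in each dimension.
    patch = x[ti:ti + tile + 4, tj:tj + tile + 4]
    local = conv3x3(patch, w1)          # (tile+2) x (tile+2), kept "on chip"
    return conv3x3(local, w2)           # tile x tile output of layer 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((16, 16))
    w1, w2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
    reference = conv3x3(conv3x3(x, w1), w2)           # layer-by-layer baseline
    tile = fused_two_layer_tile(x, w1, w2, ti=4, tj=4)
    assert np.allclose(tile, reference[4:8, 4:8])      # fused tile matches baseline
```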
“…Resource and performance models are proposed by Reggiani et al [169] for convolutional neural network (CNN) accelerators, to drive an automatic Pareto-optimal DSE, exploring network performance on different hardware platforms. These models are applied to convolutional cores, which are critical components of the design, directly affecting the overall latency and DSP utilization.…”
Section: A. Models (mentioning)
confidence: 99%
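A first-order analytical cost model of the kind such a DSE relies on might look as follows (a hedged sketch, not the model from [169]; the unroll knobs, one-DSP-per-MAC assumption, and clock frequency are illustrative): cycle count and DSP usage of a convolutional core are expressed in closed form over the design parameters, so many configurations can be scored without synthesizing each one.

```python
# Illustrative first-order model of one convolutional layer's latency and DSP
# usage as a function of two assumed unroll factors. Not the published model.

from math import ceil

def conv_layer_cost(H, W, C_in, C_out, K, p_out, p_in, freq_mhz=200.0):
    """Estimated cycles, DSPs, and latency for one conv layer.

    H, W, C_in, C_out, K : layer geometry (output height/width, channels, kernel size)
    p_out, p_in          : unroll factors over output and input channels (assumed knobs)
    """
    macs_per_cycle = p_out * p_in                     # parallel multiply-accumulates
    total_macs = H * W * C_out * C_in * K * K
    cycles = H * W * K * K * ceil(C_out / p_out) * ceil(C_in / p_in)
    dsp = macs_per_cycle                              # assume one DSP per MAC
    latency_ms = cycles / (freq_mhz * 1e3)
    return {"cycles": cycles, "dsp": dsp, "latency_ms": latency_ms,
            "utilization": total_macs / (cycles * macs_per_cycle)}

if __name__ == "__main__":
    # Sweep the two knobs and print the estimates a DSE pass would compare.
    for p_out in (4, 8, 16):
        for p_in in (4, 8):
            print(p_out, p_in, conv_layer_cost(56, 56, 64, 64, 3, p_out, p_in))
```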
“…However, their model is based on the assumption that the performance/area changes monotonically when an individual design parameter is modified, which is not a valid assumption, as we explained in Challenge 2 of Section 1. To increase the accuracy of the estimation model, a number of other studies restrict the target application to those that have a well-defined accelerator micro-architecture template [6,11,12,28,32,42], a specific application [39,45], or a particular computation pattern [7,19,25]; hence, they lose generality.…”
Section: Model-Based Techniques (mentioning)
confidence: 99%
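A small numeric example of that non-monotonicity (an illustration under assumed parameters, not taken from the cited studies): sweeping a single channel-unroll factor makes a crude area-delay product fluctuate because of ceiling effects, so a search that assumes monotonic behavior in one knob can step over the better configuration.

```python
# Why monotonicity in a single knob is not a safe assumption: with 96 channels,
# the cycles-x-DSP product (a rough area-delay proxy) does not change
# monotonically as one unroll factor grows, because ceil() padding wastes lanes
# whenever the factor does not divide the channel count. Numbers are hypothetical.

from math import ceil

CHANNELS = 96  # hypothetical channel count of one layer

print(f"{'unroll':>6} {'cycles':>7} {'dsp':>4} {'cycles*dsp':>11}")
for unroll in range(8, 41, 4):
    cycles = ceil(CHANNELS / unroll)   # passes over the channel dimension
    dsp = unroll                       # one DSP per parallel lane (assumed)
    print(f"{unroll:>6} {cycles:>7} {dsp:>4} {cycles * dsp:>11}")
# unroll=24 -> 4 cycles, product  96 (divides evenly)
# unroll=28 -> 4 cycles, product 112 (padding wastes lanes)
# unroll=32 -> 3 cycles, product  96 (better again): non-monotone in the knob
```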