2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2018.00032
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform

Abstract: Deep neural networks (DNNs) are used by many applications executed on a wide range of computer architectures, from IoT devices to supercomputers. These networks have a large memory footprint as well as substantial computational and communication needs. To ease the pressure on resources, research indicates that in many cases a low-precision representation (1-2 bits per parameter) of weights and other parameters can achieve similar accuracy while requiring fewer resources. Using quantized values enables the…
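As a rough illustration of the 1-2 bit quantization the abstract refers to, the sketch below binarizes a weight tensor with a per-tensor scale, in the spirit of schemes such as BinaryConnect/XNOR-Net. The function name and scaling choice are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def binarize_weights(w):
    """Quantize a float weight tensor to 1 bit per parameter.

    Sign-based binarization with a per-tensor scale (in the spirit of
    BinaryConnect / XNOR-Net): every weight becomes +1 or -1, and a
    single float alpha preserves the tensor's average magnitude.
    Illustrative sketch, not the paper's exact scheme.
    """
    alpha = float(np.mean(np.abs(w)))    # per-tensor scaling factor
    w_bin = np.where(w >= 0, 1.0, -1.0)  # the 1-bit codes
    return alpha, w_bin

# A 3x3 convolution kernel shrinks from nine 32-bit floats to
# nine 1-bit codes plus one scale factor.
w = np.random.randn(3, 3).astype(np.float32)
alpha, w_bin = binarize_weights(w)
w_approx = alpha * w_bin  # dequantized values used in the forward pass
```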

Citations: cited by 29 publications (21 citation statements)
References: 16 publications
“…Experiments show that our partitioning algorithms achieve higher efficiency than default synthesis strategies, except for frames of size 512 × 512 where the efficiency is unchanged. This is the case where default strategies perform equally well in terms of utilization since the image height and width are powers of 2 (refer back to Equation (13)). This confirms that modified partitioning strategies are required, according to requirements, in order to improve memory usage.…”
Section: Discussion of Results
Confidence: 99%
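The efficiency argument in this statement can be illustrated with a toy model: when a line of pixels is packed into fixed-depth memory banks, power-of-2 dimensions fill every bank exactly, while other sizes leave padding. The sketch below assumes a fixed BRAM depth of 512 words and does not reproduce Equation (13) of the citing paper.

```python
import math

BRAM_DEPTH = 512  # words per block RAM; device-dependent, chosen for illustration

def bram_utilization(line_width):
    """Fraction of allocated BRAM capacity holding real pixels when one
    image line of line_width pixels is block-partitioned across BRAMs
    of fixed depth. A toy model only: Equation (13) of the citing
    paper is not reproduced here.
    """
    brams = math.ceil(line_width / BRAM_DEPTH)  # BRAMs needed, rounded up
    return line_width / (brams * BRAM_DEPTH)

print(bram_utilization(512))  # 1.0   -> a power-of-2 width fills each BRAM exactly
print(bram_utilization(600))  # ~0.59 -> rounding up wastes almost half a BRAM
```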
“…Within FPGA processing sub-systems, algorithms evolve from typical software-suitable representations into more hardware-friendly ones [6,11] which can fully exploit data parallelism [11] through application-specific hardware architectures [3], often substantially different from the traditional Von Neumann model, such as dataflow [12,13] or biologically inspired processing [14]. These heterogeneous architectures are customized for FPGA implementation not just for performance (e.g., by exploiting binary logarithmic arithmetic for efficient multiplication/division [15]), but also for power efficiency (e.g., by static/dynamic frequency scaling across parallel datapaths for reduced power consumption [16]).…”
Section: Background and Related Work
Confidence: 99%
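The binary logarithmic arithmetic mentioned here is commonly realized with Mitchell's approximation, which turns multiplication into an addition of approximate logs. The sketch below shows that general idea; it is not the specific design of the cited reference [15].

```python
import math

def mitchell_log2(x):
    """Mitchell's piecewise-linear approximation of log2 for x >= 1:
    writing x = 2**k * (1 + f) with 0 <= f < 1, approximate log2(x) ~ k + f.
    In hardware, k comes from a priority encoder and f is the mantissa bits.
    """
    k = math.floor(math.log2(x))
    f = x / (2.0 ** k) - 1.0
    return k + f

def mitchell_multiply(a, b):
    """Approximate a * b by adding approximate logs, then taking the antilog.
    This replaces a hardware multiplier with an adder and shifters.
    """
    s = mitchell_log2(a) + mitchell_log2(b)
    k = math.floor(s)
    f = s - k
    return (2.0 ** k) * (1.0 + f)

print(mitchell_multiply(23, 17))  # ~384 versus the exact 391 (bounded error)
```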
“…Previous work [Baskin et al 2018; Chen et al 2017; Jouppi et al 2017; Kim et al 2017; Li et al 2017; Liang et al 2018; Meloni et al 2018; Moss et al 2017; Prost-Boucle et al 2017; Qiu et al 2016; Venkatesh et al 2017; Wang et al 2017; Zhang and Prasanna 2017] has mainly focused on general accelerators, implemented either as a sequence of instructions on fixed hardware, or accelerator platforms designed for linear algebra intensive computation. Performance comparisons with those in this paper are presented in Section 6.…”
Section: Hardware Accelerators of CNNs
Confidence: 99%
“…There are two main approaches to computing the convolution: either passing in inputs over multiple cycles and storing intermediate results while the convolution is computed, or buffering the pixels so that the entire set of inputs needed for the convolution is available simultaneously, such as the approach in Baskin et al [Baskin et al 2018] and…”
Section: Buffering
Confidence: 99%
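The second approach in this quote, buffering pixels so the whole window is available at once, is typically built from line buffers holding the most recent image rows. Below is a minimal software model of that idea, assuming a row-major pixel stream and a k x k window; it is not the cited paper's exact architecture.

```python
from collections import deque

def sliding_windows(pixel_stream, width, k=3):
    """Yield k x k convolution windows from a row-major pixel stream.

    A toy model of the line-buffer approach: the k most recent image
    rows are kept on chip, so every pixel of the k x k input window is
    available simultaneously (one window per cycle in hardware).
    Illustrative only, not the cited paper's exact architecture.
    """
    rows = deque(maxlen=k)  # the k most recently completed rows
    current = []
    for px in pixel_stream:
        current.append(px)
        if len(current) == width:      # a full image row has streamed in
            rows.append(current)
            current = []
            if len(rows) == k:         # enough rows for k x k windows
                for col in range(width - k + 1):
                    yield [[rows[r][col + c] for c in range(k)]
                           for r in range(k)]

# First window of a 5x5 test frame holds the top-left 3x3 block.
frame = range(25)
print(next(sliding_windows(frame, width=5)))
# [[0, 1, 2], [5, 6, 7], [10, 11, 12]]
```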